Linus Torvalds [Sun, 28 Aug 2022 21:36:27 +0000 (14:36 -0700)]
Merge tag 'bitmap-6.0-rc3' of github.com:/norov/linux
Pull bitmap fixes from Yury Norov:
"Fix the reported issues, and implements the suggested improvements,
for the version of the cpumask tests [1] that was merged with commit
c41e8866c28c ("lib/test: introduce cpumask KUnit test suite").
These changes include fixes for the tests, and better alignment with
the KUnit style guidelines"
* tag 'bitmap-6.0-rc3' of github.com:/norov/linux:
lib/cpumask_kunit: add tests file to MAINTAINERS
lib/cpumask_kunit: log mask contents
lib/test_cpumask: follow KUnit style guidelines
lib/test_cpumask: fix cpu_possible_mask last test
lib/test_cpumask: drop cpu_possible_mask full test
Linus Torvalds [Sun, 28 Aug 2022 17:44:04 +0000 (10:44 -0700)]
Merge tag 'for-6.0-rc3-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Fixes:
- check that subvolume is writable when changing xattrs from security
namespace
- fix memory leak in device lookup helper
- update generation of hole file extent item when merging holes
- fix space cache corruption and potential double allocations; this
is a rare bug but can be serious once it happens, stable backports
and analysis tool will be provided
- fix error handling when deleting root references
- fix crash due to assert when attempting to cancel suspended device
replace, add message what to do if mount fails due to missing
replace item
Regressions:
- don't merge pages into bio if their page offset is not contiguous
- don't allow large NOWAIT direct reads, this could lead to short
reads eg. in io_uring"
* tag 'for-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: add info when mount fails due to stale replace target
btrfs: replace: drop assert for suspended replace
btrfs: fix silent failure when deleting root reference
btrfs: fix space cache corruption and potential double allocations
btrfs: don't allow large NOWAIT direct reads
btrfs: don't merge pages into bio if their page offset is not contiguous
btrfs: update generation of hole file extent item when merging holes
btrfs: fix possible memory leak in btrfs_get_dev_args_from_path()
btrfs: check if root is readonly while setting security xattr
Linus Torvalds [Sun, 28 Aug 2022 17:35:16 +0000 (10:35 -0700)]
Merge tag '6.0-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cfis fixes from Steve French:
- two locking fixes (zero range, punch hole)
- DFS 9 fix (padding), affecting some servers
- three minor cleanup changes
* tag '6.0-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Add helper function to check smb1+ server
cifs: Use help macro to get the mid header size
cifs: Use help macro to get the header preamble size
cifs: skip extra NULL byte in filenames
smb3: missing inode locks in punch hole
smb3: missing inode locks in zero range
Linus Torvalds [Sun, 28 Aug 2022 17:10:23 +0000 (10:10 -0700)]
Merge tag 'x86-urgent-2022-08-28' of git://git./linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
- Fix PAT on Xen, which caused i915 driver failures
- Fix compat INT 80 entry crash on Xen PV guests
- Fix 'MMIO Stale Data' mitigation status reporting on older Intel CPUs
- Fix RSB stuffing regressions
- Fix ORC unwinding on ftrace trampolines
- Add Intel Raptor Lake CPU model number
- Fix (work around) a SEV-SNP bootloader bug providing bogus values in
boot_params->cc_blob_address, by ignoring the value on !SEV-SNP
bootups.
- Fix SEV-SNP early boot failure
- Fix the objtool list of noreturn functions and annotate snp_abort(),
which bug confused objtool on gcc-12.
- Fix the documentation for retbleed
* tag 'x86-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation/ABI: Mention retbleed vulnerability info file for sysfs
x86/sev: Mark snp_abort() noreturn
x86/sev: Don't use cc_platform_has() for early SEV-SNP calls
x86/boot: Don't propagate uninitialized boot_params->cc_blob_address
x86/cpu: Add new Raptor Lake CPU model number
x86/unwind/orc: Unwind ftrace trampolines with correct ORC entry
x86/nospec: Fix i386 RSB stuffing
x86/nospec: Unwreck the RSB stuffing
x86/bugs: Add "unknown" reporting for MMIO Stale Data
x86/entry: Fix entry_INT80_compat for Xen PV guests
x86/PAT: Have pat_enabled() properly reflect state when running on Xen
Linus Torvalds [Sun, 28 Aug 2022 17:05:42 +0000 (10:05 -0700)]
Merge tag 'perf-urgent-2022-08-28' of git://git./linux/kernel/git/tip/tip
Pull x86 perf fixes from Ingo Molnar:
"Misc fixes: an Arch-LBR fix, a PEBS enumeration fix, an Intel DS fix,
PEBS constraints fix on Alder Lake CPUs and an Intel uncore PMU fix"
* tag 'perf-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/uncore: Fix broken read_counter() for SNB IMC PMU
perf/x86/intel: Fix pebs event constraints for ADL
perf/x86/intel/ds: Fix precise store latency handling
perf/x86/core: Set pebs_capable and PMU_FL_PEBS_ALL for the Baseline
perf/x86/lbr: Enable the branch type for the Arch LBR by default
Linus Torvalds [Sun, 28 Aug 2022 16:58:00 +0000 (09:58 -0700)]
Merge tag 'perf-tools-fixes-for-v6.0-2022-08-27' of git://git./linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fixup setup of weak groups when using 'perf stat --repeat', add a
'perf test' for it.
- Fix memory leaks in 'perf sched record' detected with
-fsanitize=address.
- Fix build when PYTHON_CONFIG is user supplied.
- Capitalize topdown metrics' names in 'perf stat', so that the output,
sometimes parsed, matches the Intel SDM docs.
- Make sure the documentation for the save_type filter about Intel
systems with Arch LBR support (12th-Gen+ client or 4th-Gen Xeon+
server) reflects recent related kernel changes.
- Fix 'perf record' man page formatting of description of support to
hybrid systems.
- Update arm64´s KVM header from the kernel sources.
* tag 'perf-tools-fixes-for-v6.0-2022-08-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf stat: Capitalize topdown metrics' names
perf docs: Update the documentation for the save_type filter
perf sched: Fix memory leaks in __cmd_record detected with -fsanitize=address
perf record: Fix manpage formatting of description of support to hybrid systems
perf test: Stat test for repeat with a weak group
perf stat: Clear evsel->reset_group for each stat run
tools kvm headers arm64: Update KVM header from the kernel sources
perf python: Fix build when PYTHON_CONFIG is user supplied
Linus Torvalds [Sat, 27 Aug 2022 22:58:38 +0000 (15:58 -0700)]
Merge tag 'thermal-6.0-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull thermal control fixes from Rafael Wysocki:
"Fix two issues introduced recently and one driver problem leading to a
NULL pointer dereference in some cases.
Specifics:
- Add missing EXPORT_SYMBOL_GPL in the thermal core and add back the
required 'trips' property to the thermal zone DT bindings (Daniel
Lezcano)
- Prevent the int340x_thermal driver from crashing when a package
with a buffer of 0 length is returned by an ACPI control method
evaluated by it (Lee, Chun-Yi)"
* tag 'thermal-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal/int340x_thermal: handle data_vault when the value is ZERO_SIZE_PTR
dt-bindings: thermal: Fix missing required property
thermal/core: Add missing EXPORT_SYMBOL_GPL
Linus Torvalds [Sat, 27 Aug 2022 22:53:49 +0000 (15:53 -0700)]
Merge tag 'pm-6.0-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Make __resolve_freq() check the presence of the frequency table
instead of checking whether or not the ->target_index() callback is
implemented by the driver, because that need not be the case when
__resolve_freq() is used (Lukasz Luba)"
* tag 'pm-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: check only freq_table in __resolve_freq()
Linus Torvalds [Sat, 27 Aug 2022 22:47:02 +0000 (15:47 -0700)]
Merge tag 'acpi-6.0-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix issues introduced by recent changes related to the handling
of ACPI device properties and a coding mistake in the exit path of the
ACPI processor driver.
Specifics:
- Prevent acpi_thermal_cpufreq_exit() from attempting to remove
the same frequency QoS request multiple times (Riwen Lu)
- Fix type detection for integer ACPI device properties (Stefan
Binding)
- Avoid emitting false-positive warnings when processing ACPI
device properties and drop the useless default case from the
acpi_copy_property_array_uint() macro (Sakari Ailus)"
* tag 'acpi-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: property: Remove default association from integer maximum values
ACPI: property: Ignore already existing data node tags
ACPI: property: Fix type detection of unified integer reading functions
ACPI: processor: Remove freq Qos request for all CPUs
Linus Torvalds [Sat, 27 Aug 2022 22:40:51 +0000 (15:40 -0700)]
Merge tag 's390-6.0-2' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- Fix double free of guarded storage and runtime instrumentation
control blocks on fork() failure
- Fix triggering write fault when VMA does not allow VM_WRITE
* tag 's390-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: do not trigger write fault when vma does not allow VM_WRITE
s390: fix double free of GS and RI CBs on fork() failure
Linus Torvalds [Sat, 27 Aug 2022 22:38:00 +0000 (15:38 -0700)]
Merge tag 'for-linus-6.0-rc3-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- two minor cleanups
- a fix of the xen/privcmd driver avoiding a possible NULL dereference
in an error case
* tag 'for-linus-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/privcmd: fix error exit of privcmd_ioctl_dm_op()
xen: move from strlcpy with unused retval to strscpy
xen: x86: remove setting the obsolete config XEN_MAX_DOMAIN_MEMORY
Linus Torvalds [Sat, 27 Aug 2022 22:31:12 +0000 (15:31 -0700)]
Merge tag 'audit-pr-
20220826' of git://git./linux/kernel/git/pcmoore/audit
Pull audit fix from Paul Moore:
"Another small audit patch, this time to fix a bug where the return
codes were not properly set before the audit filters were run,
potentially resulting in missed audit records"
* tag 'audit-pr-
20220826' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: move audit_return_fixup before the filters
Linus Torvalds [Sat, 27 Aug 2022 16:57:58 +0000 (09:57 -0700)]
Merge tag 'fbdev-for-6.0-rc3' of git://git./linux/kernel/git/deller/linux-fbdev
Pull fbdev fixes and updates from Helge Deller:
"Mostly just small patches, with the exception of the bigger indenting
cleanups in the sisfb and radeonfb drivers.
Two patches should be mentioned though: A fix-up for fbdev if the
screen resize fails (by Shigeru Yoshida), and a potential divide by
zero fix in fb_pm2fb (by Letu Ren).
Summary:
Major fixes:
- Revert the changes for fbcon console when vc_resize() fails
[Shigeru Yoshida]
- Avoid a potential divide by zero error in fb_pm2fb [Letu Ren]
Minor fixes:
- Add missing pci_disable_device() in chipsfb_pci_init() [Yang
Yingliang]
- Fix tests for platform_get_irq() failure in omapfb [Yu Zhe]
- Destroy mutex on freeing struct fb_info in fbsysfs [Shigeru
Yoshida]
Cleanups:
- Move fbdev drivers from strlcpy to strscpy [Wolfram Sang]
- Indenting fixes, comment fixes, ... [Jiapeng Chong & Jilin Yuan]"
* tag 'fbdev-for-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
fbdev: fbcon: Properly revert changes when vc_resize() failed
fbdev: Move fbdev drivers from strlcpy to strscpy
fbdev: omap: Remove unnecessary print function dev_err()
fbdev: chipsfb: Add missing pci_disable_device() in chipsfb_pci_init()
fbdev: fbcon: Destroy mutex on freeing struct fb_info
fbdev: radeon: Clean up some inconsistent indenting
fbdev: sisfb: Clean up some inconsistent indenting
fbdev: fb_pm2fb: Avoid potential divide by zero error
fbdev: ssd1307fb: Fix repeated words in comments
fbdev: omapfb: Fix tests for platform_get_irq() failure
Mikulas Patocka [Fri, 26 Aug 2022 20:43:51 +0000 (16:43 -0400)]
provide arch_test_bit_acquire for architectures that define test_bit
Some architectures define their own arch_test_bit and they also need
arch_test_bit_acquire, otherwise they won't compile. We also clean up
the code by using the generic test_bit if that is equivalent to the
arch-specific version.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 8238b4579866 ("wait_on_bit: add an acquire memory barrier")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zhengjun Xing [Thu, 25 Aug 2022 01:54:58 +0000 (09:54 +0800)]
perf stat: Capitalize topdown metrics' names
Capitalize topdown metrics' names to follow the intel SDM.
Before:
# ./perf stat -a sleep 1
Performance counter stats for 'system wide':
228,094.05 msec cpu-clock # 225.026 CPUs utilized
842 context-switches # 3.691 /sec
224 cpu-migrations # 0.982 /sec
70 page-faults # 0.307 /sec
23,164,105 cycles # 0.000 GHz
29,403,446 instructions # 1.27 insn per cycle
5,268,185 branches # 23.097 K/sec
33,239 branch-misses # 0.63% of all branches
136,248,990 slots # 597.337 K/sec
32,976,450 topdown-retiring # 24.2% retiring
4,651,918 topdown-bad-spec # 3.4% bad speculation
26,148,695 topdown-fe-bound # 19.2% frontend bound
72,515,776 topdown-be-bound # 53.2% backend bound
6,008,540 topdown-heavy-ops # 4.4% heavy operations # 19.8% light operations
3,934,049 topdown-br-mispredict # 2.9% branch mispredict # 0.5% machine clears
16,655,439 topdown-fetch-lat # 12.2% fetch latency # 7.0% fetch bandwidth
41,635,972 topdown-mem-bound # 30.5% memory bound # 22.7% Core bound
1.
013634593 seconds time elapsed
After:
# ./perf stat -a sleep 1
Performance counter stats for 'system wide':
228,081.94 msec cpu-clock # 225.003 CPUs utilized
824 context-switches # 3.613 /sec
224 cpu-migrations # 0.982 /sec
67 page-faults # 0.294 /sec
22,647,423 cycles # 0.000 GHz
28,870,551 instructions # 1.27 insn per cycle
5,167,099 branches # 22.655 K/sec
32,383 branch-misses # 0.63% of all branches
133,411,074 slots # 584.926 K/sec
32,352,607 topdown-retiring # 24.3% Retiring
4,456,977 topdown-bad-spec # 3.3% Bad Speculation
25,626,487 topdown-fe-bound # 19.2% Frontend Bound
70,955,316 topdown-be-bound # 53.2% Backend Bound
5,834,844 topdown-heavy-ops # 4.4% Heavy Operations # 19.9% Light Operations
3,738,781 topdown-br-mispredict # 2.8% Branch Mispredict # 0.5% Machine Clears
16,286,803 topdown-fetch-lat # 12.2% Fetch Latency # 7.0% Fetch Bandwidth
40,802,069 topdown-mem-bound # 30.6% Memory Bound # 22.6% Core Bound
1.
013683125 seconds time elapsed
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220825015458.3252239-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Tue, 16 Aug 2022 12:56:12 +0000 (05:56 -0700)]
perf docs: Update the documentation for the save_type filter
Update the documentation to reflect the kernel changes.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220816125612.2042397-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Wed, 24 Aug 2022 14:57:33 +0000 (07:57 -0700)]
perf sched: Fix memory leaks in __cmd_record detected with -fsanitize=address
An array of strings is passed to cmd_record but not freed. As
cmd_record modifies the array, add another array as a copy that can be
mutated allowing the original array contents to all be freed.
Detected with -fsanitize=address.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220824145733.409005-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 18 Aug 2022 10:01:27 +0000 (03:01 -0700)]
perf record: Fix manpage formatting of description of support to hybrid systems
The Intel hybrid description is written in a different style than the
rest of the perf record man page. There were some new command line
options added after it which resulted in very strange section ordering.
Move the hybrid include last.
Also the sub sections in the hybrid document don't fit the record
manpage well (especially since it talks about all kinds of unrelated
commands). I left this for now, but would be better to separate this
properly in the different man pages.
It would be better to use sub sections for the other sections, but these
don't seem to be supported in AsciiDoc?
Some of the examples are still misrendered in the manpage with an
indented troff command, but I don't know how to fix that.
In any case it's now better than before.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220818100127.249401-1-ak@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Mon, 22 Aug 2022 21:33:52 +0000 (14:33 -0700)]
perf test: Stat test for repeat with a weak group
Breaking a weak group requires multiple passes of an evlist, with
multiple runs this can introduce bugs ultimately leading to
segfaults. Add a test to cover this.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220822213352.75721-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Mon, 22 Aug 2022 21:33:51 +0000 (14:33 -0700)]
perf stat: Clear evsel->reset_group for each stat run
If a weak group is broken then the reset_group flag remains set for
the next run. Having reset_group set means the counter isn't created
and ultimately a segfault.
A simple reproduction of this is:
# perf stat -r2 -e '{cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles}:W
which will be added as a test in the next patch.
Fixes: 4804e0111662d7d8 ("perf stat: Use affinity for opening events")
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220822213352.75721-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 21 Dec 2020 15:53:44 +0000 (12:53 -0300)]
tools kvm headers arm64: Update KVM header from the kernel sources
To pick the changes from:
ae3b1da95413614f ("KVM: arm64: Fix compile error due to sign extension")
That doesn't result in any changes in tooling (when built on x86), only
addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h'
diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/all/YwOMCCc4E79FuvDe@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Thu, 28 Jul 2022 09:39:46 +0000 (10:39 +0100)]
perf python: Fix build when PYTHON_CONFIG is user supplied
The previous change to Python autodetection had a small mistake where
the auto value was used to determine the Python binary, rather than the
user supplied value. The Python binary is only used for one part of the
build process, rather than the final linking, so it was producing
correct builds in most scenarios, especially when the auto detected
value matched what the user wanted, or the system only had a valid set
of Pythons.
Change it so that the Python binary path is derived from either the
PYTHON_CONFIG value or PYTHON value, depending on what is specified by
the user. This was the original intention.
This error was spotted in a build failure an odd cross compilation
environment after commit
4c41cb46a732fe82 ("perf python: Prefer
python3") was merged.
Fixes: 630af16eee495f58 ("perf tools: Use Python devtools for version autodetection rather than runtime")
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220728093946.1337642-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rafael J. Wysocki [Sat, 27 Aug 2022 13:07:58 +0000 (15:07 +0200)]
Merge branch 'thermal-core'
Merge thermal control core fixes for 6.0-rc3:
- Fix missing required property for thermal zone description (Daniel
Lezcano).
- Add missing export symbol for
thermal_zone_device_register_with_trips() (Daniel Lezcano).
* thermal-core:
dt-bindings: thermal: Fix missing required property
thermal/core: Add missing EXPORT_SYMBOL_GPL
Rafael J. Wysocki [Sat, 27 Aug 2022 12:43:18 +0000 (14:43 +0200)]
Merge branch 'acpi-processor' into acpi
* acpi-processor:
ACPI: processor: Remove freq Qos request for all CPUs
Stephane Eranian [Wed, 3 Aug 2022 16:00:31 +0000 (09:00 -0700)]
perf/x86/intel/uncore: Fix broken read_counter() for SNB IMC PMU
Existing code was generating bogus counts for the SNB IMC bandwidth counters:
$ perf stat -a -I 1000 -e uncore_imc/data_reads/,uncore_imc/data_writes/
1.
000327813 1,024.03 MiB uncore_imc/data_reads/
1.
000327813 20.73 MiB uncore_imc/data_writes/
2.
000580153 261,120.00 MiB uncore_imc/data_reads/
2.
000580153 23.28 MiB uncore_imc/data_writes/
The problem was introduced by commit:
07ce734dd8ad ("perf/x86/intel/uncore: Clean up client IMC")
Where the read_counter callback was replace to point to the generic
uncore_mmio_read_counter() function.
The SNB IMC counters are freerunnig 32-bit counters laid out contiguously in
MMIO. But uncore_mmio_read_counter() is using a readq() call to read from
MMIO therefore reading 64-bit from MMIO. Although this is okay for the
uncore_perf_event_update() function because it is shifting the value based
on the actual counter width to compute a delta, it is not okay for the
uncore_pmu_event_start() which is simply reading the counter and therefore
priming the event->prev_count with a bogus value which is responsible for
causing bogus deltas in the perf stat command above.
The fix is to reintroduce the custom callback for read_counter for the SNB
IMC PMU and use readl() instead of readq(). With the change the output of
perf stat is back to normal:
$ perf stat -a -I 1000 -e uncore_imc/data_reads/,uncore_imc/data_writes/
1.
000120987 296.94 MiB uncore_imc/data_reads/
1.
000120987 138.42 MiB uncore_imc/data_writes/
2.
000403144 175.91 MiB uncore_imc/data_reads/
2.
000403144 68.50 MiB uncore_imc/data_writes/
Fixes: 07ce734dd8ad ("perf/x86/intel/uncore: Clean up client IMC")
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20220803160031.1379788-1-eranian@google.com
Linus Torvalds [Fri, 26 Aug 2022 18:32:53 +0000 (11:32 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"A bumper crop of arm64 fixes for -rc3.
The largest change is fixing our parsing of the 'rodata=full' command
line option, which kstrtobool() started treating as 'rodata=false'.
The fix actually makes the parsing of that option much less fragile
and updates the documentation at the same time.
We still have a boot issue pending when KASLR is disabled at compile
time, but there's a fresh fix on the list which I'll send next week if
it holds up to testing.
Summary:
- Fix workaround for Cortex-A76 erratum #
1286807
- Add workaround for AMU erratum #
2457168 on Cortex-A510
- Drop reference to removed CONFIG_ARCH_RANDOM #define
- Fix parsing of the "rodata=full" cmdline option
- Fix a bunch of issues in the SME register state switching and sigframe code
- Fix incorrect extraction of the CTR_EL0.CWG register field
- Fix ACPI cache topology probing when the PPTT is not present
- Trivial comment and whitespace fixes"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/sme: Don't flush SVE register state when handling SME traps
arm64/sme: Don't flush SVE register state when allocating SME storage
arm64/signal: Flush FPSIMD register state when disabling streaming mode
arm64/signal: Raise limit on stack frames
arm64/cache: Fix cache_type_cwg() for register generation
arm64/sysreg: Guard SYS_FIELD_ macros for asm
arm64/sysreg: Directly include bitfield.h
arm64: cacheinfo: Fix incorrect assignment of signed error value to unsigned fw_level
arm64: errata: add detection for AMEVCNTR01 incrementing incorrectly
arm64: fix rodata=full
arm64: Fix comment typo
docs/arm64: elf_hwcaps: unify newlines in HWCAP lists
arm64: adjust KASLR relocation after ARCH_RANDOM removal
arm64: Fix match_list for erratum
1286807 on Arm Cortex-A76
Linus Torvalds [Fri, 26 Aug 2022 18:26:27 +0000 (11:26 -0700)]
Merge tag 'riscv-for-linus-6.0-rc3' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A handful of fixes for the Microchip device trees
- A pair of fixes to eliminate build warnings
* tag 'riscv-for-linus-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: dts: microchip: mpfs: remove pci axi address translation property
riscv: dts: microchip: mpfs: remove bogus card-detect-delay
riscv: dts: microchip: mpfs: remove ti,fifo-depth property
riscv: dts: microchip: mpfs: fix incorrect pcie child node name
riscv: traps: add missing prototype
riscv: signal: fix missing prototype warning
riscv: dts: microchip: correct L2 cache interrupts
Linus Torvalds [Fri, 26 Aug 2022 18:21:18 +0000 (11:21 -0700)]
Merge tag 'loongarch-fixes-6.0-1' of git://git./linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Fix a bunch of build errors/warnings, a poweroff error and an
unbalanced locking in do_page_fault()"
* tag 'loongarch-fixes-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: mm: Avoid unnecessary page fault retires on shared memory types
LoongArch: Add subword xchg/cmpxchg emulation
LoongArch: Cleanup headers to avoid circular dependency
LoongArch: Cleanup reset routines with new API
LoongArch: Fix build warnings in VDSO
LoongArch: Select PCI_QUIRKS to avoid build error
Linus Torvalds [Fri, 26 Aug 2022 18:15:37 +0000 (11:15 -0700)]
Merge tag 'drm-fixes-2022-08-26-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Weekly fixes, lots of amdgpu fixes mostly for IP blocks introduced in
6.0-rc1, otherwise vc4, nouveau fixes.
gem:
- Fix handle release leak
nouveau:
- Fix fencing when moving BO
vc4:
- HDMI fixes
amdgpu:
- GFX 11.0 fixes
- PSP XGMI handling fixes
- GFX9 fix for compute-only IPs
- Drop duplicated function call
- Fix warning due to missing header
- NBIO 7.7 fixes
- DCN 3.1.4 fixes
- SDMA 6.0 fixes
- SMU 13.0 fixes
- Arcturus GPUVM page table fix
- MMHUB 1.0 fix
amdkfd:
- GC 10.3.7 fix
radeon:
- Delayed work flush fix"
* tag 'drm-fixes-2022-08-26-1' of git://anongit.freedesktop.org/drm/drm: (21 commits)
drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly
drm/amdgpu: add MGCG perfmon setting for gfx11
drm/amdkfd: Fix isa version for the GC 10.3.7
drm/amdgpu: Fix page table setup on Arcturus
drm/amd/pm: update SMU 13.0.0 driver_if header
drm/amdgpu: add sdma instance check for gfx11 CGCG
drm/amd/display: enable PCON support for dcn314
drm/amdgpu: enable NBIO IP v7.7.0 Clock Gating
drm/amdgpu: add NBIO IP v7.7.0 Clock Gating support
drm/amdgpu: add TX_POWER_CTRL_1 macro definitions for NBIO IP v7.7.0
nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf
drm/radeon: add a force flush to delay work when radeon
drm/amd/display: Include missing header
drm/amdgpu: Remove the additional kfd pre reset call for sriov
drm/amdgpu: Check num_gfx_rings for gfx v9_0 rb setup.
drm/amdgpu: fix hive reference leak when adding xgmi device
drm/amdgpu: Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini
drm/amdgpu: enable GFXOFF allow control for GC IP v11.0.1
drm/gem: Fix GEM handle release errors
drm/vc4: hdmi: Rework power up
...
Linus Torvalds [Fri, 26 Aug 2022 18:05:54 +0000 (11:05 -0700)]
Merge tag 'block-6.0-2022-08-26' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- MD pull request via Song:
- Fix for clustered raid (Guoqing Jiang)
- req_op fix (Bart Van Assche)
- Fix race condition in raid recreate (David Sloan)
- loop configuration overflow fix (Siddh)
- Fix missing commit_rqs call for certain conditions (Yu)
* tag 'block-6.0-2022-08-26' of git://git.kernel.dk/linux-block:
md: call __md_stop_writes in md_stop
Revert "md-raid: destroy the bitmap after destroying the thread"
md: Flush workqueue md_rdev_misc_wq in md_alloc()
md/raid10: Fix the data type of an r10_sync_page_io() argument
loop: Check for overflow while configuring loop
blk-mq: fix io hung due to missing commit_rqs
Linus Torvalds [Fri, 26 Aug 2022 18:01:52 +0000 (11:01 -0700)]
Merge tag 'io_uring-6.0-2022-08-26' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
- Add missing header file to the MAINTAINERS entry for io_uring (Ammar)
- liburing and the kernel ship the same io_uring.h header, but one
change we've had for a long time only in liburing is to ensure it's
C++ safe. Add extern C around it, so we can more easily sync them in
the future (Ammar)
- Fix an off-by-one in the sync cancel added in this merge window (me)
- Error handling fix for passthrough (Kanchan)
- Fix for address saving for async execution for the zc tx support
(Pavel)
- Fix ordering for TCP zc notifications, so we always have them ordered
correctly between "data was sent" and "data was acked". This isn't
strictly needed with the notification slots, but we've been pondering
disabling the slot support for 6.0 - and if we do, then we do require
the ordering to be sane. Regardless of that, it's the sane thing to
do in terms of API (Pavel)
- Minor cleanup for indentation and lockdep annotation (Pavel)
* tag 'io_uring-6.0-2022-08-26' of git://git.kernel.dk/linux-block:
io_uring/net: save address for sendzc async execution
io_uring: conditional ->async_data allocation
io_uring/notif: order notif vs send CQEs
io_uring/net: fix indentation
io_uring/net: fix zc send link failing
io_uring/net: fix must_hold annotation
io_uring: fix submission-failure handling for uring-cmd
io_uring: fix off-by-one in sync cancelation file check
io_uring: uapi: Add `extern "C"` in io_uring.h for liburing
MAINTAINERS: Add `include/linux/io_uring_types.h`
Shigeru Yoshida [Thu, 18 Aug 2022 18:13:36 +0000 (03:13 +0900)]
fbdev: fbcon: Properly revert changes when vc_resize() failed
fbcon_do_set_font() calls vc_resize() when font size is changed.
However, if if vc_resize() failed, current implementation doesn't
revert changes for font size, and this causes inconsistent state.
syzbot reported unable to handle page fault due to this issue [1].
syzbot's repro uses fault injection which cause failure for memory
allocation, so vc_resize() failed.
This patch fixes this issue by properly revert changes for font
related date when vc_resize() failed.
Link: https://syzkaller.appspot.com/bug?id=3443d3a1fa6d964dd7310a0cb1696d165a3e07c4
Reported-by: syzbot+a168dbeaaa7778273c1b@syzkaller.appspotmail.com
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
CC: stable@vger.kernel.org # 5.15+
Linus Torvalds [Fri, 26 Aug 2022 17:29:56 +0000 (10:29 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Ten fixes.
Of the three core changes, the two large ones are a complete reversion
of the async rework and an ALUA timing rework (the latter shouldn't
affect non-ALUA paths).
The remaining patches are all small and all but one in drivers"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Revert "Rework asynchronous resume support"
scsi: core: Fix passthrough retry counter handling
scsi: ufs: core: Reduce the power mode change timeout
scsi: storvsc: Remove WQ_MEM_RECLAIM from storvsc_error_wq
scsi: ufs: host: ufs-exynos: Make fsd_ufs_drvs static
scsi: megaraid_sas: Remove unnecessary kfree()
scsi: megaraid_sas: Fix double kfree()
scsi: ufs: core: Enable link lost interrupt
scsi: core: Allow the ALUA transitioning state enough time
scsi: qla2xxx: Disable ATIO interrupt coalesce for quad port ISP27XX
Mikulas Patocka [Fri, 26 Aug 2022 13:17:08 +0000 (09:17 -0400)]
wait_on_bit: add an acquire memory barrier
There are several places in the kernel where wait_on_bit is not followed
by a memory barrier (for example, in drivers/md/dm-bufio.c:new_read).
On architectures with weak memory ordering, it may happen that memory
accesses that follow wait_on_bit are reordered before wait_on_bit and
they may return invalid data.
Fix this class of bugs by introducing a new function "test_bit_acquire"
that works like test_bit, but has acquire memory ordering semantics.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Airlie [Thu, 25 Aug 2022 23:56:53 +0000 (09:56 +1000)]
Merge tag 'amd-drm-fixes-6.0-2022-08-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.0-2022-08-25:
amdgpu:
- GFX 11.0 fixes
- PSP XGMI handling fixes
- GFX9 fix for compute-only IPs
- Drop duplicated function call
- Fix warning due to missing header
- NBIO 7.7 fixes
- DCN 3.1.4 fixes
- SDMA 6.0 fixes
- SMU 13.0 fixes
- Arcturus GPUVM page table fix
- MMHUB 1.0 fix
amdkfd:
- GC 10.3.7 fix
radeon:
- Delayed work flush fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220825181243.5853-1-alexander.deucher@amd.com
Dave Airlie [Thu, 25 Aug 2022 23:47:49 +0000 (09:47 +1000)]
Merge tag 'drm-misc-fixes-2022-08-25' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Short summary of fixes pull:
* gem: Fixes handle release leak
* nouveau: Fix fencing when moving BO
* vc4: HDMI fixes
* Backmerging for v6.0-rc1
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YwclSWheC+Ai+u+v@linux-uq9g
Palmer Dabbelt [Thu, 25 Aug 2022 23:37:17 +0000 (16:37 -0700)]
Merge branch 'riscv-variable_fixes_without_kvm' of git://git./linux/kernel/git/palmer/linux.git into fixes
This contains a pair of fixes for build-time warnings.
* 'riscv-variable_fixes_without_kvm' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git:
riscv: traps: add missing prototype
riscv: signal: fix missing prototype warning
Palmer Dabbelt [Thu, 25 Aug 2022 23:06:49 +0000 (16:06 -0700)]
Merge tag 'dt-fixes-for-palmer-6.0-rc3' of git://git./linux/kernel/git/conor/linux.git into fixes
Microchip RISC-V devicetree fixes for 6.0-rc3
Two sets of fixes this time around:
- A fix for the interrupt ordering of the l2-cache controller. If the
driver is enabled, it would spam the console /constantly/, rendering
the system useless.
- General cleanup for some bogus properties in the dt, part of my quest
for zero dtbs_check warnings.
On that note, the interrupt ordering adds a dtbs_check warning - but I
considered that fixing the potentially useless system was more of a
priority.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'dt-fixes-for-palmer-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git:
riscv: dts: microchip: mpfs: remove pci axi address translation property
riscv: dts: microchip: mpfs: remove bogus card-detect-delay
riscv: dts: microchip: mpfs: remove ti,fifo-depth property
riscv: dts: microchip: mpfs: fix incorrect pcie child node name
riscv: dts: microchip: correct L2 cache interrupts
Richard Guy Briggs [Thu, 25 Aug 2022 19:32:40 +0000 (15:32 -0400)]
audit: move audit_return_fixup before the filters
The success and return_code are needed by the filters. Move
audit_return_fixup() before the filters. This was causing syscall
auditing events to be missed.
Link: https://github.com/linux-audit/audit-kernel/issues/138
Cc: stable@vger.kernel.org
Fixes: 12c5e81d3fd0 ("audit: prepare audit_context for use in calling contexts beyond syscalls")
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: manual merge required]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Linus Torvalds [Thu, 25 Aug 2022 21:03:58 +0000 (14:03 -0700)]
Merge tag 'net-6.0-rc3' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from ipsec and netfilter (with one broken Fixes tag).
Current release - new code bugs:
- dsa: don't dereference NULL extack in dsa_slave_changeupper()
- dpaa: fix <1G ethernet on LS1046ARDB
- neigh: don't call kfree_skb() under spin_lock_irqsave()
Previous releases - regressions:
- r8152: fix the RX FIFO settings when suspending
- dsa: microchip: keep compatibility with device tree blobs with no
phy-mode
- Revert "net: macsec: update SCI upon MAC address change."
- Revert "xfrm: update SA curlft.use_time", comply with RFC 2367
Previous releases - always broken:
- netfilter: conntrack: work around exceeded TCP receive window
- ipsec: fix a null pointer dereference of dst->dev on a metadata dst
in xfrm_lookup_with_ifid
- moxa: get rid of asymmetry in DMA mapping/unmapping
- dsa: microchip: make learning configurable and keep it off while
standalone
- ice: xsk: prohibit usage of non-balanced queue id
- rxrpc: fix locking in rxrpc's sendmsg
Misc:
- another chunk of sysctl data race silencing"
* tag 'net-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
net: lantiq_xrx200: restore buffer if memory allocation failed
net: lantiq_xrx200: fix lock under memory pressure
net: lantiq_xrx200: confirm skb is allocated before using
net: stmmac: work around sporadic tx issue on link-up
ionic: VF initial random MAC address if no assigned mac
ionic: fix up issues with handling EAGAIN on FW cmds
ionic: clear broken state on generation change
rxrpc: Fix locking in rxrpc's sendmsg
net: ethernet: mtk_eth_soc: fix hw hash reporting for MTK_NETSYS_V2
MAINTAINERS: rectify file entry in BONDING DRIVER
i40e: Fix incorrect address type for IPv6 flow rules
ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter
net: Fix a data-race around sysctl_somaxconn.
net: Fix a data-race around netdev_unregister_timeout_secs.
net: Fix a data-race around gro_normal_batch.
net: Fix data-races around sysctl_devconf_inherit_init_net.
net: Fix data-races around sysctl_fb_tunnels_only_for_init_net.
net: Fix a data-race around netdev_budget_usecs.
net: Fix data-races around sysctl_max_skb_frags.
net: Fix a data-race around netdev_budget.
...
Jakub Kicinski [Thu, 25 Aug 2022 19:41:41 +0000 (12:41 -0700)]
Merge branch 'net-lantiq_xrx200-fix-errors-under-memory-pressure'
Aleksander Jan Bajkowski says:
====================
net: lantiq_xrx200: fix errors under memory pressure
This series fixes issues that can occur in the driver under memory pressure.
Situations when the system cannot allocate memory are rare, so the mentioned
bugs have been fixed recently. The patches have been tested on a BT Home
router with the Lantiq xRX200 chipset.
Changelog:
v3: - removed netdev_err() log from the first patch
v2:
- the second patch has been changed, so that under memory pressure situation
the driver will not receive packets indefinitely regardless of the NAPI budget,
- the third patch has been added.
====================
Link: https://lore.kernel.org/r/20220824215408.4695-1-olek2@wp.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Aleksander Jan Bajkowski [Wed, 24 Aug 2022 21:54:08 +0000 (23:54 +0200)]
net: lantiq_xrx200: restore buffer if memory allocation failed
In a situation where memory allocation fails, an invalid buffer address
is stored. When this descriptor is used again, the system panics in the
build_skb() function when accessing memory.
Fixes: 7ea6cd16f159 ("lantiq: net: fix duplicated skb in rx descriptor ring")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Aleksander Jan Bajkowski [Wed, 24 Aug 2022 21:54:07 +0000 (23:54 +0200)]
net: lantiq_xrx200: fix lock under memory pressure
When the xrx200_hw_receive() function returns -ENOMEM, the NAPI poll
function immediately returns an error.
This is incorrect for two reasons:
* the function terminates without enabling interrupts or scheduling NAPI,
* the error code (-ENOMEM) is returned instead of the number of received
packets.
After the first memory allocation failure occurs, packet reception is
locked due to disabled interrupts from DMA..
Fixes: fe1a56420cf2 ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Aleksander Jan Bajkowski [Wed, 24 Aug 2022 21:54:06 +0000 (23:54 +0200)]
net: lantiq_xrx200: confirm skb is allocated before using
xrx200_hw_receive() assumes build_skb() always works and goes straight
to skb_reserve(). However, build_skb() can fail under memory pressure.
Add a check in case build_skb() failed to allocate and return NULL.
Fixes: e015593573b3 ("net: lantiq_xrx200: convert to build_skb")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Wed, 24 Aug 2022 20:34:49 +0000 (22:34 +0200)]
net: stmmac: work around sporadic tx issue on link-up
This is a follow-up to the discussion in [0]. It seems to me that
at least the IP version used on Amlogic SoC's sometimes has a problem
if register MAC_CTRL_REG is written whilst the chip is still processing
a previous write. But that's just a guess.
Adding a delay between two writes to this register helps, but we can
also simply omit the offending second write. This patch uses the second
approach and is based on a suggestion from Qi Duan.
Benefit of this approach is that we can save few register writes, also
on not affected chip versions.
[0] https://www.spinics.net/lists/netdev/msg831526.html
Fixes: bfab27a146ed ("stmmac: add the experimental PCI support")
Suggested-by: Qi Duan <qi.duan@amlogic.com>
Suggested-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/e99857ce-bd90-5093-ca8c-8cd480b5a0a2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 25 Aug 2022 19:40:29 +0000 (12:40 -0700)]
Merge branch '10GbE' of git://git./linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-08-24 (ixgbe, i40e)
This series contains updates to ixgbe and i40e drivers.
Jake stops incorrect resetting of SYSTIME registers when starting
cyclecounter for ixgbe.
Sylwester corrects a check on source IP address when validating destination
for i40e.
* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
i40e: Fix incorrect address type for IPv6 flow rules
ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter
====================
Link: https://lore.kernel.org/r/20220824193748.874343-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 25 Aug 2022 19:40:17 +0000 (12:40 -0700)]
Merge branch 'ionic-bug-fixes'
Shannon Nelson says:
====================
ionic: bug fixes
These are a couple of maintenance bug fixes for the Pensando ionic
networking driver.
Mohamed takes care of a "plays well with others" issue where the
VF spec is a bit vague on VF mac addresses, but certain customers
have come to expect behavior based on other vendor drivers.
Shannon addresses a couple of corner cases seen in internal
stress testing.
====================
Link: https://lore.kernel.org/r/20220824165051.6185-1-snelson@pensando.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
R Mohamed Shah [Wed, 24 Aug 2022 16:50:51 +0000 (09:50 -0700)]
ionic: VF initial random MAC address if no assigned mac
Assign a random mac address to the VF interface station
address if it boots with a zero mac address in order to match
similar behavior seen in other VF drivers. Handle the errors
where the older firmware does not allow the VF to set its own
station address.
Newer firmware will allow the VF to set the station mac address
if it hasn't already been set administratively through the PF.
Setting it will also be allowed if the VF has trust.
Fixes: fbb39807e9ae ("ionic: support sr-iov operations")
Signed-off-by: R Mohamed Shah <mohamed@pensando.io>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shannon Nelson [Wed, 24 Aug 2022 16:50:50 +0000 (09:50 -0700)]
ionic: fix up issues with handling EAGAIN on FW cmds
In looping on FW update tests we occasionally see the
FW_ACTIVATE_STATUS command fail while it is in its EAGAIN loop
waiting for the FW activate step to finsh inside the FW. The
firmware is complaining that the done bit is set when a new
dev_cmd is going to be processed.
Doing a clean on the cmd registers and doorbell before exiting
the wait-for-done and cleaning the done bit before the sleep
prevents this from occurring.
Fixes: fbfb8031533c ("ionic: Add hardware init and device commands")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shannon Nelson [Wed, 24 Aug 2022 16:50:49 +0000 (09:50 -0700)]
ionic: clear broken state on generation change
There is a case found in heavy testing where a link flap happens just
before a firmware Recovery event and the driver gets stuck in the
BROKEN state. This comes from the driver getting interrupted by a FW
generation change when coming back up from the link flap, and the call
to ionic_start_queues() in ionic_link_status_check() fails. This can be
addressed by having the fw_up code clear the BROKEN bit if seen, rather
than waiting for a user to manually force the interface down and then
back up.
Fixes: 9e8eaf8427b6 ("ionic: stop watchdog when in broken state")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Howells [Wed, 24 Aug 2022 16:35:45 +0000 (17:35 +0100)]
rxrpc: Fix locking in rxrpc's sendmsg
Fix three bugs in the rxrpc's sendmsg implementation:
(1) rxrpc_new_client_call() should release the socket lock when returning
an error from rxrpc_get_call_slot().
(2) rxrpc_wait_for_tx_window_intr() will return without the call mutex
held in the event that we're interrupted by a signal whilst waiting
for tx space on the socket or relocking the call mutex afterwards.
Fix this by: (a) moving the unlock/lock of the call mutex up to
rxrpc_send_data() such that the lock is not held around all of
rxrpc_wait_for_tx_window*() and (b) indicating to higher callers
whether we're return with the lock dropped. Note that this means
recvmsg() will not block on this call whilst we're waiting.
(3) After dropping and regaining the call mutex, rxrpc_send_data() needs
to go and recheck the state of the tx_pending buffer and the
tx_total_len check in case we raced with another sendmsg() on the same
call.
Thinking on this some more, it might make sense to have different locks for
sendmsg() and recvmsg(). There's probably no need to make recvmsg() wait
for sendmsg(). It does mean that recvmsg() can return MSG_EOR indicating
that a call is dead before a sendmsg() to that call returns - but that can
currently happen anyway.
Without fix (2), something like the following can be induced:
WARNING: bad unlock balance detected!
5.16.0-rc6-syzkaller #0 Not tainted
-------------------------------------
syz-executor011/3597 is trying to release lock (&call->user_mutex) at:
[<
ffffffff885163a3>] rxrpc_do_sendmsg+0xc13/0x1350 net/rxrpc/sendmsg.c:748
but there are no more locks to release!
other info that might help us debug this:
no locks held by syz-executor011/3597.
...
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_unlock_imbalance_bug include/trace/events/lock.h:58 [inline]
__lock_release kernel/locking/lockdep.c:5306 [inline]
lock_release.cold+0x49/0x4e kernel/locking/lockdep.c:5657
__mutex_unlock_slowpath+0x99/0x5e0 kernel/locking/mutex.c:900
rxrpc_do_sendmsg+0xc13/0x1350 net/rxrpc/sendmsg.c:748
rxrpc_sendmsg+0x420/0x630 net/rxrpc/af_rxrpc.c:561
sock_sendmsg_nosec net/socket.c:704 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:724
____sys_sendmsg+0x6e8/0x810 net/socket.c:2409
___sys_sendmsg+0xf3/0x170 net/socket.c:2463
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2492
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
[Thanks to Hawkins Jiawei and Khalid Masum for their attempts to fix this]
Fixes: bc5e3a546d55 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals")
Reported-by: syzbot+7f0483225d0c94cb3441@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Tested-by: syzbot+7f0483225d0c94cb3441@syzkaller.appspotmail.com
cc: Hawkins Jiawei <yin31149@gmail.com>
cc: Khalid Masum <khalid.masum.92@gmail.com>
cc: Dan Carpenter <dan.carpenter@oracle.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/166135894583.600315.7170979436768124075.stgit@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Qu Huang [Tue, 23 Aug 2022 06:44:06 +0000 (14:44 +0800)]
drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly
The mmVM_L2_CNTL3 register is not assigned an initial value
Signed-off-by: Qu Huang <jinsdb@126.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Likun Gao [Tue, 23 Aug 2022 07:34:10 +0000 (15:34 +0800)]
drm/amdgpu: add MGCG perfmon setting for gfx11
Enable GFX11 MGCG perfmon setting.
V2: set rlc to saft mode before setting.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prike Liang [Wed, 24 Aug 2022 03:16:51 +0000 (11:16 +0800)]
drm/amdkfd: Fix isa version for the GC 10.3.7
Correct the isa version for handling KFD test.
Fixes: 7c4f4f197e0c ("drm/amdkfd: Add GC 10.3.6 and 10.3.7 KFD definitions")
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mukul Joshi [Fri, 19 Aug 2022 21:15:08 +0000 (17:15 -0400)]
drm/amdgpu: Fix page table setup on Arcturus
When translate_further is enabled, page table depth needs to
be updated. This was missing on Arcturus MMHUB init. This was
causing address translations to fail for SDMA user-mode queues.
Fixes: 352e683b72e7 ("drm/amdgpu: Enable translate_further to extend UTCL2 reach")
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Mon, 8 Aug 2022 02:41:26 +0000 (10:41 +0800)]
drm/amd/pm: update SMU 13.0.0 driver_if header
To fit the latest 78.53 PMFW.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tim Huang [Mon, 22 Aug 2022 05:30:44 +0000 (13:30 +0800)]
drm/amdgpu: add sdma instance check for gfx11 CGCG
For some ASICs, like GFX IP v11.0.1, only have one SDMA instance,
so not need to configure SDMA1_RLC_CGCG_CTRL for this case.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Roman Li [Mon, 22 Aug 2022 16:37:10 +0000 (12:37 -0400)]
drm/amd/display: enable PCON support for dcn314
[Why]
DCN314 supports PCON.
[How]
Explicitly enable it in dcn314 resources.
Signed-off-by: Roman Li <roman.li@amd.com>
Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tim Huang [Mon, 15 Aug 2022 05:50:46 +0000 (13:50 +0800)]
drm/amdgpu: enable NBIO IP v7.7.0 Clock Gating
Enable AMD_CG_SUPPORT_BIF_MGCG and AMD_CG_SUPPORT_BIF_LS support.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tim Huang [Mon, 15 Aug 2022 05:12:21 +0000 (13:12 +0800)]
drm/amdgpu: add NBIO IP v7.7.0 Clock Gating support
Add BIF Clock Gating MGCG and LS support for NBIO IP v7.7.0.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tim Huang [Mon, 15 Aug 2022 05:03:49 +0000 (13:03 +0800)]
drm/amdgpu: add TX_POWER_CTRL_1 macro definitions for NBIO IP v7.7.0
Add the BIF0_PCIE_TX_POWER_CTRL_1 register offset and mask macro
definitions for AMD_CG_SUPPORT_BIF_LS.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Linus Torvalds [Thu, 25 Aug 2022 17:52:16 +0000 (10:52 -0700)]
Merge tag 'cgroup-for-6.0-rc2-fixes-2' of git://git./linux/kernel/git/tj/cgroup
Pull another cgroup fix from Tejun Heo:
"Commit
4f7e7236435c ("cgroup: Fix threadgroup_rwsem <->
cpus_read_lock() deadlock") required the cgroup
core to grab cpus_read_lock() before invoking ->attach().
Unfortunately, it missed adding cpus_read_lock() in
cgroup_attach_task_all(). Fix it"
* tag 'cgroup-for-6.0-rc2-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all()
Tetsuo Handa [Thu, 25 Aug 2022 08:38:38 +0000 (17:38 +0900)]
cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all()
syzbot is hitting percpu_rwsem_assert_held(&cpu_hotplug_lock) warning at
cpuset_attach() [1], for commit
4f7e7236435ca0ab ("cgroup: Fix
threadgroup_rwsem <-> cpus_read_lock() deadlock") missed that
cpuset_attach() is also called from cgroup_attach_task_all().
Add cpus_read_lock() like what cgroup_procs_write_start() does.
Link: https://syzkaller.appspot.com/bug?extid=29d3a3b4d86c8136ad9e
Reported-by: syzbot <syzbot+29d3a3b4d86c8136ad9e@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 4f7e7236435ca0ab ("cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock")
Signed-off-by: Tejun Heo <tj@kernel.org>
Juergen Gross [Thu, 25 Aug 2022 14:19:18 +0000 (16:19 +0200)]
xen/privcmd: fix error exit of privcmd_ioctl_dm_op()
The error exit of privcmd_ioctl_dm_op() is calling unlock_pages()
potentially with pages being NULL, leading to a NULL dereference.
Additionally lock_pages() doesn't check for pin_user_pages_fast()
having been completely successful, resulting in potentially not
locking all pages into memory. This could result in sporadic failures
when using the related memory in user mode.
Fix all of that by calling unlock_pages() always with the real number
of pinned pages, which will be zero in case pages being NULL, and by
checking the number of pages pinned by pin_user_pages_fast() matching
the expected number of pages.
Cc: <stable@vger.kernel.org>
Fixes: ab520be8cd5d ("xen/privcmd: Add IOCTL_PRIVCMD_DM_OP")
Reported-by: Rustam Subkhankulov <subkhankulov@ispras.ru>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Link: https://lore.kernel.org/r/20220825141918.3581-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Salvatore Bonaccorso [Mon, 1 Aug 2022 09:15:30 +0000 (11:15 +0200)]
Documentation/ABI: Mention retbleed vulnerability info file for sysfs
While reporting for the AMD retbleed vulnerability was added in
6b80b59b3555 ("x86/bugs: Report AMD retbleed vulnerability")
the new sysfs file was not mentioned so far in the ABI documentation for
sysfs-devices-system-cpu. Fix that.
Fixes: 6b80b59b3555 ("x86/bugs: Report AMD retbleed vulnerability")
Signed-off-by: Salvatore Bonaccorso <carnil@debian.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220801091529.325327-1-carnil@debian.org
Borislav Petkov [Wed, 24 Aug 2022 15:13:26 +0000 (17:13 +0200)]
x86/sev: Mark snp_abort() noreturn
Mark both the function prototype and definition as noreturn in order to
prevent the compiler from doing transformations which confuse objtool
like so:
vmlinux.o: warning: objtool: sme_enable+0x71: unreachable instruction
This triggers with gcc-12.
Add it and sev_es_terminate() to the objtool noreturn tracking array
too. Sort it while at it.
Suggested-by: Michael Matz <matz@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220824152420.20547-1-bp@alien8.de
Pavel Begunkov [Wed, 24 Aug 2022 12:07:43 +0000 (13:07 +0100)]
io_uring/net: save address for sendzc async execution
We usually copy all bits that a request needs from the userspace for
async execution, so the userspace can keep them on the stack. However,
send zerocopy violates this pattern for addresses and may reloads it
e.g. from io-wq. Save the address if any in ->async_data as usual.
Reported-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d7512d7aa9abcd36e9afe1a4d292a24cb2d157e5.1661342812.git.asml.silence@gmail.com
[axboe: fold in incremental fix]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Gerald Schaefer [Wed, 17 Aug 2022 13:26:03 +0000 (15:26 +0200)]
s390/mm: do not trigger write fault when vma does not allow VM_WRITE
For non-protection pXd_none() page faults in do_dat_exception(), we
call do_exception() with access == (VM_READ | VM_WRITE | VM_EXEC).
In do_exception(), vma->vm_flags is checked against that before
calling handle_mm_fault().
Since commit
92f842eac7ee3 ("[S390] store indication fault optimization"),
we call handle_mm_fault() with FAULT_FLAG_WRITE, when recognizing that
it was a write access. However, the vma flags check is still only
checking against (VM_READ | VM_WRITE | VM_EXEC), and therefore also
calling handle_mm_fault() with FAULT_FLAG_WRITE in cases where the vma
does not allow VM_WRITE.
Fix this by changing access check in do_exception() to VM_WRITE only,
when recognizing write access.
Link: https://lkml.kernel.org/r/20220811103435.188481-3-david@redhat.com
Fixes: 92f842eac7ee3 ("[S390] store indication fault optimization")
Cc: <stable@vger.kernel.org>
Reported-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Brian Foster [Tue, 16 Aug 2022 15:54:07 +0000 (11:54 -0400)]
s390: fix double free of GS and RI CBs on fork() failure
The pointers for guarded storage and runtime instrumentation control
blocks are stored in the thread_struct of the associated task. These
pointers are initially copied on fork() via arch_dup_task_struct()
and then cleared via copy_thread() before fork() returns. If fork()
happens to fail after the initial task dup and before copy_thread(),
the newly allocated task and associated thread_struct memory are
freed via free_task() -> arch_release_task_struct(). This results in
a double free of the guarded storage and runtime info structs
because the fields in the failed task still refer to memory
associated with the source task.
This problem can manifest as a BUG_ON() in set_freepointer() (with
CONFIG_SLAB_FREELIST_HARDENED enabled) or KASAN splat (if enabled)
when running trinity syscall fuzz tests on s390x. To avoid this
problem, clear the associated pointer fields in
arch_dup_task_struct() immediately after the new task is copied.
Note that the RI flag is still cleared in copy_thread() because it
resides in thread stack memory and that is where stack info is
copied.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Fixes: 8d9047f8b967c ("s390/runtime instrumentation: simplify task exit handling")
Fixes: 7b83c6297d2fc ("s390/guarded storage: simplify task exit handling")
Cc: <stable@vger.kernel.org> # 4.15
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20220816155407.537372-1-bfoster@redhat.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Wolfram Sang [Thu, 18 Aug 2022 21:01:22 +0000 (23:01 +0200)]
xen: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.
Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Link: https://lore.kernel.org/r/20220818210122.7613-1-wsa+renesas@sang-engineering.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Lukas Bulwahn [Wed, 17 Aug 2022 04:43:33 +0000 (06:43 +0200)]
xen: x86: remove setting the obsolete config XEN_MAX_DOMAIN_MEMORY
Commit
c70727a5bc18 ("xen: allow more than 512 GB of RAM for 64 bit
pv-domains") from July 2015 replaces the config XEN_MAX_DOMAIN_MEMORY with
a new config XEN_512GB, but misses to adjust arch/x86/configs/xen.config.
As XEN_512GB defaults to yes, there is no need to explicitly set any config
in xen.config.
Just remove setting the obsolete config XEN_MAX_DOMAIN_MEMORY.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220817044333.22310-1-lukas.bulwahn@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Huacai Chen [Thu, 25 Aug 2022 11:34:59 +0000 (19:34 +0800)]
LoongArch: mm: Avoid unnecessary page fault retires on shared memory types
Commit
d92725256b4f22d0 ("mm: avoid unnecessary page fault retires on
shared memory types") modifies do_page_fault() to handle the VM_FAULT_
COMPLETED case, but forget to change for LoongArch, so fix it as other
architectures does.
Fixes: d92725256b4f22d0 ("mm: avoid unnecessary page fault retires on shared memory types")
Reviewed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Thu, 25 Aug 2022 11:34:59 +0000 (19:34 +0800)]
LoongArch: Add subword xchg/cmpxchg emulation
LoongArch only support 32-bit/64-bit xchg/cmpxchg in native. But percpu
operation, qspinlock and some drivers need 8-bit/16-bit xchg/cmpxchg. We
add subword xchg/cmpxchg emulation in this patch because the emulation
has better performance than the generic implementation (on NUMA system),
and it can fix some build errors meanwhile [1].
LoongArch's guarantee for forward progress (avoid many ll/sc happening
at the same time and no one succeeds):
We have the "exclusive access (with timeout) of ll" feature to avoid
simultaneous ll (which also blocks other memory load/store on the same
address), and the "random delay of sc" feature to avoid simultaneous
sc. It is a mandatory requirement for multi-core LoongArch processors
to implement such features, only except those single-core and dual-core
processors (they also don't support multi-chip interconnection).
Feature bits are introduced in CPUCFG3, bit 3 and bit 4 [2].
[1] https://lore.kernel.org/loongarch/CAAhV-H6vvkuOzy8OemWdYK3taj5Jn3bFX0ZTwE=twM8ywpBUYA@mail.gmail.com/T/#t
[2] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#_cpucfg
Reported-by: Sudip Mukherjee (Codethink) <sudipm.mukherjee@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Rui Wang <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Thu, 25 Aug 2022 11:34:59 +0000 (19:34 +0800)]
LoongArch: Cleanup headers to avoid circular dependency
When enable GENERIC_IOREMAP, there will be circular dependency to cause
build errors. The root cause is that pgtable.h shouldn't include io.h
but pgtable.h need some macros defined in io.h. So cleanup those macros
and remove the unnecessary inclusions, as other architectures do.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Thu, 25 Aug 2022 11:34:59 +0000 (19:34 +0800)]
LoongArch: Cleanup reset routines with new API
Cleanup reset routines by using new do_kernel_power_off() instead of old
pm_power_off(), and then simplify the whole file (reset.c) organization
by inlining some functions. This cleanup also fix a poweroff error if EFI
runtime is disabled.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Thu, 25 Aug 2022 11:34:59 +0000 (19:34 +0800)]
LoongArch: Fix build warnings in VDSO
Fix build warnings in VDSO as below:
arch/loongarch/vdso/vgettimeofday.c:9:5: warning: no previous prototype for '__vdso_clock_gettime' [-Wmissing-prototypes]
9 | int __vdso_clock_gettime(clockid_t clock,
| ^~~~~~~~~~~~~~~~~~~~
arch/loongarch/vdso/vgettimeofday.c:15:5: warning: no previous prototype for '__vdso_gettimeofday' [-Wmissing-prototypes]
15 | int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
| ^~~~~~~~~~~~~~~~~~~
arch/loongarch/vdso/vgettimeofday.c:21:5: warning: no previous prototype for '__vdso_clock_getres' [-Wmissing-prototypes]
21 | int __vdso_clock_getres(clockid_t clock_id,
| ^~~~~~~~~~~~~~~~~~~
arch/loongarch/vdso/vgetcpu.c:27:5: warning: no previous prototype for '__vdso_getcpu' [-Wmissing-prototypes]
27 | int __vdso_getcpu(unsigned int *cpu, unsigned int *node, struct getcpu_cache *unused)
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Thu, 25 Aug 2022 11:34:59 +0000 (19:34 +0800)]
LoongArch: Select PCI_QUIRKS to avoid build error
PCI_LOONGSON is a mandatory for LoongArch and it is selected in Kconfig
unconditionally, but its dependency PCI_QUIRKS is missing and may cause
a build error when "make randconfig":
arch/loongarch/pci/acpi.c: In function 'pci_acpi_setup_ecam_mapping':
>> arch/loongarch/pci/acpi.c:103:29: error: 'loongson_pci_ecam_ops' undeclared (first use in this function)
103 | ecam_ops = &loongson_pci_ecam_ops;
| ^~~~~~~~~~~~~~~~~~~~~
arch/loongarch/pci/acpi.c:103:29: note: each undeclared identifier is reported only once for each function it appears in
Kconfig warnings: (for reference only)
WARNING: unmet direct dependencies detected for PCI_LOONGSON
Depends on [n]: PCI [=y] && (MACH_LOONGSON64 [=y] || COMPILE_TEST [=y]) && (OF [=y] || ACPI [=y]) && PCI_QUIRKS [=n]
Selected by [y]:
- LOONGARCH [=y]
Fix it by selecting PCI_QUIRKS unconditionally, too.
Reported-by: kernel test robot <lkp@intel.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Sakari Ailus [Thu, 25 Aug 2022 11:17:15 +0000 (14:17 +0300)]
ACPI: property: Remove default association from integer maximum values
Remove the default association from integer maximum value checks. It is
not necessary and has caused a bug in other associations being unnoticed.
Fixes: 923044133367 ("ACPI: property: Unify integer value reading functions")
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sakari Ailus [Wed, 24 Aug 2022 11:59:56 +0000 (14:59 +0300)]
ACPI: property: Ignore already existing data node tags
ACPI node pointers are attached to data node handles, in order to resolve
string references to them. _DSD guide allows the same node to be reached
from multiple parent nodes, leading the node enumeration algorithm to each
such nodes more than once. As attached data already already exists,
attaching data with the same tag will fail. Address this problem by
ignoring nodes that have been already tagged.
Fixes: 1d52f10917a7 ("ACPI: property: Tie data nodes to acpi handles")
Reported-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stefan Binding [Fri, 12 Aug 2022 13:06:45 +0000 (14:06 +0100)]
ACPI: property: Fix type detection of unified integer reading functions
The current code expects the type of the value to be an integer type,
instead the value passed to the macro is a pointer.
Ensure the size comparison uses the correct pointer type to choose the
max value, instead of using the integer type.
Fixes: 923044133367 ("ACPI: property: Unify integer value reading functions")
Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lorenzo Bianconi [Tue, 23 Aug 2022 12:24:07 +0000 (14:24 +0200)]
net: ethernet: mtk_eth_soc: fix hw hash reporting for MTK_NETSYS_V2
Properly report hw rx hash for mt7986 chipset accroding to the new dma
descriptor layout.
Fixes: 197c9e9b17b11 ("net: ethernet: mtk_eth_soc: introduce support for mt7986 chipset")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/091394ea4e705fbb35f828011d98d0ba33808f69.1661257293.git.lorenzo@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Zhang Xiaoxu [Tue, 23 Aug 2022 12:52:02 +0000 (20:52 +0800)]
cifs: Add helper function to check smb1+ server
SMB1 server's header_preamble_size is not 0, add use is_smb1 function
to simplify the code, no actual functional changes.
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Zhang Xiaoxu [Tue, 23 Aug 2022 12:52:01 +0000 (20:52 +0800)]
cifs: Use help macro to get the mid header size
It's better to use MID_HEADER_SIZE because the unfolded expression
too long. No actual functional changes, minor readability improvement.
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Zhang Xiaoxu [Tue, 23 Aug 2022 12:52:00 +0000 (20:52 +0800)]
cifs: Use help macro to get the header preamble size
It's better to use HEADER_PREAMBLE_SIZE because the unfolded expression
too long. No actual functional changes, minor readability improvement.
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Jakub Kicinski [Thu, 25 Aug 2022 02:18:09 +0000 (19:18 -0700)]
Merge git://git./linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
1) Fix crash with malformed ebtables blob which do not provide all
entry points, from Florian Westphal.
2) Fix possible TCP connection clogging up with default 5-days
timeout in conntrack, from Florian.
3) Fix crash in nf_tables tproxy with unsupported chains, also from Florian.
4) Do not allow to update implicit chains.
5) Make table handle allocation per-netns to fix data race.
6) Do not truncated payload length and offset, and checksum offset.
Instead report EINVAl.
7) Enable chain stats update via static key iff no error occurs.
8) Restrict osf expression to ip, ip6 and inet families.
9) Restrict tunnel expression to netdev family.
10) Fix crash when trying to bind again an already bound chain.
11) Flowtable garbage collector might leave behind pending work to
delete entries. This patch comes with a previous preparation patch
as dependency.
12) Allow net.netfilter.nf_conntrack_frag6_high_thresh to be lowered,
from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_defrag_ipv6: allow nf_conntrack_frag6_high_thresh increases
netfilter: flowtable: fix stuck flows on cleanup due to pending work
netfilter: flowtable: add function to invoke garbage collection immediately
netfilter: nf_tables: disallow binding to already bound chain
netfilter: nft_tunnel: restrict it to netdev family
netfilter: nft_osf: restrict osf to ipv4, ipv6 and inet families
netfilter: nf_tables: do not leave chain stats enabled on error
netfilter: nft_payload: do not truncate csum_offset and csum_type
netfilter: nft_payload: report ERANGE for too long offset and length
netfilter: nf_tables: make table handle allocation per-netns friendly
netfilter: nf_tables: disallow updates of implicit chain
netfilter: nft_tproxy: restrict to prerouting hook
netfilter: conntrack: work around exceeded receive window
netfilter: ebtables: reject blobs that don't provide all entry points
====================
Link: https://lore.kernel.org/r/20220824220330.64283-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lukas Bulwahn [Wed, 24 Aug 2022 07:29:45 +0000 (09:29 +0200)]
MAINTAINERS: rectify file entry in BONDING DRIVER
Commit
c078290a2b76 ("selftests: include bonding tests into the kselftest
infra") adds the bonding tests in the directory:
tools/testing/selftests/drivers/net/bonding/
The file entry in MAINTAINERS for the BONDING DRIVER however refers to:
tools/testing/selftests/net/bonding/
Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a
broken file pattern.
Repair this file entry in BONDING DRIVER.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Link: https://lore.kernel.org/r/20220824072945.28606-1-lukas.bulwahn@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Wolfram Sang [Thu, 18 Aug 2022 21:01:17 +0000 (23:01 +0200)]
fbdev: Move fbdev drivers from strlcpy to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.
Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Jens Axboe [Wed, 24 Aug 2022 19:58:37 +0000 (13:58 -0600)]
Merge branch 'md-fixes' of https://git./linux/kernel/git/song/md into block-6.0
Pull MD fixes from Song:
"1. Fix for clustered raid, by Guoqing Jiang.
2. req_op fix, by Bart Van Assche.
3. Fix race condition in raid recreate, by David Sloan."
* 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
md: call __md_stop_writes in md_stop
Revert "md-raid: destroy the bitmap after destroying the thread"
md: Flush workqueue md_rdev_misc_wq in md_alloc()
md/raid10: Fix the data type of an r10_sync_page_io() argument
Jiapeng Chong [Wed, 24 Aug 2022 11:44:55 +0000 (19:44 +0800)]
fbdev: omap: Remove unnecessary print function dev_err()
The print function dev_err() is redundant because platform_get_irq()
already prints an error.
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=1957
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Yang Yingliang [Fri, 19 Aug 2022 08:57:52 +0000 (16:57 +0800)]
fbdev: chipsfb: Add missing pci_disable_device() in chipsfb_pci_init()
Add missing pci_disable_device() in error path in chipsfb_pci_init().
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Shigeru Yoshida [Sun, 21 Aug 2022 11:17:31 +0000 (20:17 +0900)]
fbdev: fbcon: Destroy mutex on freeing struct fb_info
It's needed to destroy bl_curve_mutex on freeing struct fb_info since
the mutex is embedded in the structure and initialized when it's
allocated.
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Jiapeng Chong [Fri, 19 Aug 2022 11:06:59 +0000 (19:06 +0800)]
fbdev: radeon: Clean up some inconsistent indenting
No functional modification involved.
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=1932
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Jiapeng Chong [Fri, 19 Aug 2022 11:04:14 +0000 (19:04 +0800)]
fbdev: sisfb: Clean up some inconsistent indenting
No functional modification involved.
drivers/video/fbdev/sis/sis_main.c:6165 sisfb_probe() warn: inconsistent indenting.
drivers/video/fbdev/sis/sis_main.c:4266 sisfb_post_300_rwtest() warn: inconsistent indenting.
drivers/video/fbdev/sis/sis_main.c:2388 SISDoSense() warn: inconsistent indenting.
drivers/video/fbdev/sis/sis_main.c:2531 SiS_Sense30x() warn: inconsistent indenting.
drivers/video/fbdev/sis/sis_main.c:2382 SISDoSense() warn: inconsistent indenting.
drivers/video/fbdev/sis/sis_main.c:2250 sisfb_sense_crt1() warn: inconsistent indenting.
drivers/video/fbdev/sis/sis_main.c:672 sisfb_validate_mode() warn: inconsistent indenting.
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=1934
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Letu Ren [Thu, 18 Aug 2022 10:44:24 +0000 (18:44 +0800)]
fbdev: fb_pm2fb: Avoid potential divide by zero error
In `do_fb_ioctl()` of fbmem.c, if cmd is FBIOPUT_VSCREENINFO, var will be
copied from user, then go through `fb_set_var()` and
`info->fbops->fb_check_var()` which could may be `pm2fb_check_var()`.
Along the path, `var->pixclock` won't be modified. This function checks
whether reciprocal of `var->pixclock` is too high. If `var->pixclock` is
zero, there will be a divide by zero error. So, it is necessary to check
whether denominator is zero to avoid crash. As this bug is found by
Syzkaller, logs are listed below.
divide error in pm2fb_check_var
Call Trace:
<TASK>
fb_set_var+0x367/0xeb0 drivers/video/fbdev/core/fbmem.c:1015
do_fb_ioctl+0x234/0x670 drivers/video/fbdev/core/fbmem.c:1110
fb_ioctl+0xdd/0x130 drivers/video/fbdev/core/fbmem.c:1189
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Letu Ren <fantasquex@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Jilin Yuan [Tue, 16 Aug 2022 13:07:13 +0000 (21:07 +0800)]
fbdev: ssd1307fb: Fix repeated words in comments
Delete the redundant word 'set'.
Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Yu Zhe [Fri, 12 Aug 2022 06:52:23 +0000 (14:52 +0800)]
fbdev: omapfb: Fix tests for platform_get_irq() failure
The platform_get_irq() returns negative error codes. It can't actually
return zero.
Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Sylwester Dziedziuch [Fri, 19 Aug 2022 10:45:52 +0000 (12:45 +0200)]
i40e: Fix incorrect address type for IPv6 flow rules
It was not possible to create 1-tuple flow director
rule for IPv6 flow type. It was caused by incorrectly
checking for source IP address when validating user provided
destination IP address.
Fix this by changing ip6src to correct ip6dst address
in destination IP address validation for IPv6 flow type.
Fixes: efca91e89b67 ("i40e: Add flow director support for IPv6")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jacob Keller [Tue, 2 Aug 2022 00:24:19 +0000 (17:24 -0700)]
ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter
The ixgbe_ptp_start_cyclecounter is intended to be called whenever the
cyclecounter parameters need to be changed.
Since commit
a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x
devices"), this function has cleared the SYSTIME registers and reset the
TSAUXC DISABLE_SYSTIME bit.
While these need to be cleared during ixgbe_ptp_reset, it is wrong to clear
them during ixgbe_ptp_start_cyclecounter. This function may be called
during both reset and link status change. When link changes, the SYSTIME
counter is still operating normally, but the cyclecounter should be updated
to account for the possibly changed parameters.
Clearing SYSTIME when link changes causes the timecounter to jump because
the cycle counter now reads zero.
Extract the SYSTIME initialization out to a new function and call this
during ixgbe_ptp_reset. This prevents the timecounter adjustment and avoids
an unnecessary reset of the current time.
This also restores the original SYSTIME clearing that occurred during
ixgbe_ptp_reset before the commit above.
Reported-by: Steve Payne <spayne@aurora.tech>
Reported-by: Ilya Evenbach <ievenbach@aurora.tech>
Fixes: a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Guoqing Jiang [Wed, 17 Aug 2022 12:05:14 +0000 (20:05 +0800)]
md: call __md_stop_writes in md_stop
From the link [1], we can see raid1d was running even after the path
raid_dtr -> md_stop -> __md_stop.
Let's stop write first in destructor to align with normal md-raid to
fix the KASAN issue.
[1]. https://lore.kernel.org/linux-raid/CAPhsuW5gc4AakdGNdF8ubpezAuDLFOYUO_sfMZcec6hQFm8nhg@mail.gmail.com/T/#m7f12bf90481c02c6d2da68c64aeed4779b7df74a
Fixes: 48df498daf62 ("md: move bitmap_destroy to the beginning of __md_stop")
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Song Liu <song@kernel.org>
Guoqing Jiang [Wed, 17 Aug 2022 12:05:13 +0000 (20:05 +0800)]
Revert "md-raid: destroy the bitmap after destroying the thread"
This reverts commit
e151db8ecfb019b7da31d076130a794574c89f6f. Because it
obviously breaks clustered raid as noticed by Neil though it fixed KASAN
issue for dm-raid, let's revert it and fix KASAN issue in next commit.
[1]. https://lore.kernel.org/linux-raid/
a6657e08-b6a7-358b-2d2a-
0ac37d49d23a@linux.dev/T/#m95ac225cab7409f66c295772483d091084a6d470
Fixes: e151db8ecfb0 ("md-raid: destroy the bitmap after destroying the thread")
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Song Liu <song@kernel.org>