Daniel Bristot de Oliveira [Wed, 24 Apr 2024 14:36:56 +0000 (16:36 +0200)]
 
rtla/timerlat: Make user-space threads the default
After ther -u addition, most of the known users are setting it. And
it makes sense, as it adds more information, and inherits the default
setup for the threads - e.g., cgroups configs.
Thus, if the user-space interface is available, enable -u. Otherwise,
use the in-kernel thread.
Add the -k option to allow the user to request kernel-threads.
Link: https://lkml.kernel.org/r/9241d3089de4091b124f780ed832a0e6646cadaa.1713968967.git.bristot@kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Daniel Bristot de Oliveira [Wed, 24 Apr 2024 14:36:55 +0000 (16:36 +0200)]
 
rtla: Add the --warm-up option
On many cases, the results right after the startup are different
from the rest of the execution, biasing the results. For example,
on osnoise, the scheduler might take some time to adapt to the new
busy-loop workload.
Add the --warm-up <seconds> option, adding a warm-up phase (in
seconds) where the workload is set, but the results are discarded.
Link: https://lkml.kernel.org/r/e682d5ce5af90f123bd13220f63d5c3d118a92be.1713968967.git.bristot@kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Daniel Bristot de Oliveira [Wed, 24 Apr 2024 14:36:54 +0000 (16:36 +0200)]
 
rtla/timerlat: Add a summary for hist mode
Like on rtla timerlat top, add an overall summary at the bottom
of timerlat hist. For instance:
  # timerlat hist -c 0-1 -d 10s -E 20
  # RTLA timerlat histogram
  # Time unit is microseconds (us)
  # Duration:   0 00:00:10
  Index   IRQ-000   Thr-000   IRQ-001   Thr-001
  6             1         0         0         0
  7             1         0         0         0
  8             1         0         1         0
  9             7         0         0         0
  10           16         0         0         0
  11            1         0         3         0
  15            0         0         3         0
  16            0         0        12         0
  17            0         0        28         0
  18            0         2        26         0
  19            1         1        80         1
  over:      9973      9998      9848     10000
  count:    10001     10001     10001     10001
  min:          6        18         8        19
  avg:        185       204        95       113
  max:        428       450       341       371
  ALL:        IRQ       Thr
  count:    20002     20002
  min:          6        18
  avg:        140       159
  max:        428       450
Link: https://lkml.kernel.org/r/a6bc06c798f72127edc57d1f99da8d57e1187cee.1713968967.git.bristot@kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Suggested-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Daniel Bristot de Oliveira [Wed, 24 Apr 2024 14:36:53 +0000 (16:36 +0200)]
 
rtla/timerlat: Add a summary for top mode
While the per-cpu values are the results to take into consideration, the
overall system values are also useful.
Add a summary at the bottom of rtla timerlat top showing the overall
results. For instance:
                                       Timer Latency
    0 00:00:10   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
  CPU COUNT      |      cur       min       avg       max |      cur       min       avg       max
    0 #10003     |      113        19       150       441 |      134        35       170       459
    1 #10003     |       63         8        99       462 |       84        15       119       481
    2 #10003     |        3         2        89       396 |       21         8       108       414
    3 #10002     |      206        11       210       394 |      223        21       228       415
  ---------------|----------------------------------------|---------------------------------------
  ALL #40011  e0 |                  2       137       462 |                  8       156       481
Link: https://lkml.kernel.org/r/5eb510d6faeb4ce745e09395196752df75a2dd1a.1713968967.git.bristot@kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Suggested-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Daniel Bristot de Oliveira [Wed, 24 Apr 2024 14:36:52 +0000 (16:36 +0200)]
 
rtla/timerlat: Use pretty formatting only on interactive tty
timerlat top does some background/font color formatting. While useful
on terminal, it breaks the output on other formats. For example, when
piping the output for pastebin tools, the format strings are printed
as characters. For instance:
  [2;37;40m                                     Timer Latency                                              [0;0;0m
    0 00:00:01   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
  [2;30;47mCPU COUNT      |      cur       min       avg       max |      cur       min       avg       max[0;0;0m
    0 #1013      |        1         0         1        54 |        5         2         4        57
    1 #1013      |        3         0         1        10 |        6         2         4        15
To avoid this problem, do the formatting only if running on a tty,
and in !quiet mode.
Link: https://lkml.kernel.org/r/8288e1544ceab21557d5dda93a0f00339497c649.1713968967.git.bristot@kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Daniel Bristot de Oliveira [Wed, 24 Apr 2024 14:36:51 +0000 (16:36 +0200)]
 
rtla/auto-analysis: Replace \t with spaces
When copying timerlat auto-analysis from a terminal to some web pages or
chats, the \t are being replaced with a single ' ' or '    ', breaking
the output.
For example:
  ## CPU 3 hit stop tracing, analyzing it ##
    IRQ handler delay:                        1.30 us (0.11 %)
    IRQ latency:           1.90 us
    Timerlat IRQ duration:         3.00 us (0.24 %)
    Blocking thread:       1223.16 us (99.00 %)
                     insync:4048         1223.16 us
    IRQ interference          4.93 us (0.40 %)
                local_timer:236        4.93 us
  ------------------------------------------------------------------------
     Thread latency:       1235.47 us (100%)
Replace \t with spaces to avoid this problem.
Link: https://lkml.kernel.org/r/ec7ed2b2809c22ab0dfc8eb7c805ab9cddc4254a.1713968967.git.bristot@kernel.org
Cc: stable@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Fixes: 27e348b221f6 ("rtla/timerlat: Add auto-analysis core")
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Daniel Bristot de Oliveira [Wed, 24 Apr 2024 14:36:50 +0000 (16:36 +0200)]
 
rtla/timerlat: Simplify "no value" printing on top
Instead of printing three times the same output, print it only once,
reducing lines and being sure that all no values have the same length.
It also fixes an extra '\n' when running the with kernel threads, like
here:
     =============== %< ==============
                                      Timer Latency
   0 00:00:01   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
 CPU COUNT      |      cur       min       avg       max |      cur       min       avg       max
   2 #0         |        -         -         -         - |      161       161       161       161
   3 #0         |        -         -         -         - |      161       161       161       161
   8 #1         |       54        54        54        54 |        -         -         -         -'\n'
 ---------------|----------------------------------------|---------------------------------------
 ALL #1      e0 |                 54        54        54 |                161       161       161
     =============== %< ==============
This '\n' should have been removed with the user-space support that
added another '\n' if not running with kernel threads.
Link: https://lkml.kernel.org/r/0a4d8085e7cd706733a5dc10a81ca38b82bd4992.1713968967.git.bristot@kernel.org
Cc: stable@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Fixes: cdca4f4e5e8e ("rtla/timerlat_top: Add timerlat user-space support")
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Linus Torvalds [Sun, 12 May 2024 21:12:29 +0000 (14:12 -0700)]
 
Linux 6.9
Linus Torvalds [Sun, 12 May 2024 20:01:59 +0000 (13:01 -0700)]
 
Merge tag 'kselftest-fix-vfork-2024-05-12' of git://git./linux/kernel/git/mic/linux
Pull Kselftest fixes from Mickaël Salaün:
 "Fix Kselftest's vfork() side effects.
  As reported by Kernel Test Robot and Sean Christopherson, some
  tests fail since v6.9-rc1 . This is due to the use of vfork() which
  introduced some side effects. Similarly, while making it more generic,
  a previous commit made some Landlock file system tests flaky, and
  subject to the host's file system mount configuration.
  This fixes all these side effects by replacing vfork() with clone3()
  and CLONE_VFORK, which is cleaner (no arbitrary shared memory) and
  makes the Kselftest framework more robust"
Link: https://lore.kernel.org/oe-lkp/202403291015.1fcfa957-oliver.sang@intel.com
Link: https://lore.kernel.org/r/ZjPelW6-AbtYvslu@google.com
Link: https://lore.kernel.org/r/20240511171445.904356-1-mic@digikod.net
* tag 'kselftest-fix-vfork-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  selftests/harness: Handle TEST_F()'s explicit exit codes
  selftests/harness: Fix vfork() side effects
  selftests/harness: Share _metadata between forked processes
  selftests/pidfd: Fix wrong expectation
  selftests/harness: Constify fixture variants
  selftests/landlock: Do not allocate memory in fixture data
  selftests/harness: Fix interleaved scheduling leading to race conditions
  selftests/harness: Fix fixture teardown
  selftests/landlock: Fix FS tests when run on a private mount point
  selftests/pidfd: Fix config for pidfd_setns_test
Linus Torvalds [Sun, 12 May 2024 19:15:39 +0000 (12:15 -0700)]
 
Merge tag 'for-linus-6.9' of git://git./virt/kvm/kvm
Pull kvm fix from Paolo Bonzini:
 - Fix NULL pointer read on s390 in ioctl(KVM_CHECK_EXTENSION) for
   /dev/kvm
* tag 'for-linus-6.9' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: Check kvm pointer when testing KVM_CAP_S390_HPAGE_1M
Linus Torvalds [Sun, 12 May 2024 16:09:27 +0000 (09:09 -0700)]
 
Merge tag 'edac_urgent_for_v6.9' of git://git./linux/kernel/git/ras/ras
Pull EDAC fix from Borislav Petkov:
 - Fix a race condition when clearing error count bits and toggling the
   error interrupt throug the same register, in synopsys_edac
* tag 'edac_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/synopsys: Fix ECC status and IRQ control race condition
Linus Torvalds [Sun, 12 May 2024 15:54:28 +0000 (08:54 -0700)]
 
Merge tag 'x86_urgent_for_v6.9' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
 - Add a new PCI ID which belongs to a new AMD CPU family 0x1a
 - Ensure that that last level cache ID is set in all cases, in the AMD
   CPU topology parsing code, in order to prevent invalid scheduling
   domain CPU masks
* tag 'x86_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/topology/amd: Ensure that LLC ID is initialized
  x86/amd_nb: Add new PCI IDs for AMD family 0x1a
Mickaël Salaün [Sat, 11 May 2024 17:14:45 +0000 (19:14 +0200)]
 
selftests/harness: Handle TEST_F()'s explicit exit codes
If TEST_F() explicitly calls exit(code) with code different than 0, then
_metadata->exit_code is set to this code (e.g. KVM_ONE_VCPU_TEST()).  We
need to keep in mind that _metadata->exit_code can be KSFT_SKIP while
the process exit code is 0.
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Drewry <wad@chromium.org>
Reported-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Closes: https://lore.kernel.org/r/ZjPelW6-AbtYvslu@google.com
Fixes: 0710a1a73fb4 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()")
Link: https://lore.kernel.org/r/20240511171445.904356-11-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Mickaël Salaün [Sat, 11 May 2024 17:14:44 +0000 (19:14 +0200)]
 
selftests/harness: Fix vfork() side effects
Setting the time namespace with CLONE_NEWTIME returns -EUSERS if the
calling thread shares memory with another thread (because of the shared
vDSO), which is the case when it is created with vfork().
Fix pidfd_setns_test by replacing test harness's vfork() call with a
clone3() call with CLONE_VFORK, and an explicit sharing of the
_metadata and self objects.
Replace _metadata->teardown_parent with a new FIXTURE_TEARDOWN_PARENT()
helper that can replace FIXTURE_TEARDOWN().  This is a cleaner approach
and it enables to selectively share the fixture data between the child
process running tests and the parent process running the fixture
teardown.  This also avoids updating several tests to not rely on the
self object's copy-on-write property (e.g. storing the returned value of
a fork() call).
Cc: Christian Brauner <brauner@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Günther Noack <gnoack@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Drewry <wad@chromium.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202403291015.1fcfa957-oliver.sang@intel.com
Fixes: 0710a1a73fb4 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()")
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-10-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Mickaël Salaün [Sat, 11 May 2024 17:14:43 +0000 (19:14 +0200)]
 
selftests/harness: Share _metadata between forked processes
Unconditionally share _metadata between all forked processes, which
enables to actually catch errors which were previously ignored.
This is required for a following commit replacing vfork() with clone3()
and CLONE_VFORK (i.e. not sharing the full memory) .  It should also be
useful to share _metadata to extend expectations to test process's
forks.  For instance, this change identified a wrong expectation in
pidfd_setns_test.
Because this _metadata is used by the new XFAIL_ADD(), use a global
pointer initialized in TEST_F().  This is OK because only XFAIL_ADD()
use it, and XFAIL_ADD() already depends on TEST_F().
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Will Drewry <wad@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-9-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Mickaël Salaün [Sat, 11 May 2024 17:14:42 +0000 (19:14 +0200)]
 
selftests/pidfd: Fix wrong expectation
Replace a wrong EXPECT_GT(self->child_pid_exited, 0) with EXPECT_GE(),
which will be actually tested on the parent and child sides with a
following commit.
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20240511171445.904356-8-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Mickaël Salaün [Sat, 11 May 2024 17:14:41 +0000 (19:14 +0200)]
 
selftests/harness: Constify fixture variants
FIXTURE_VARIANT_ADD() types are passed as const pointers to
FIXTURE_TEARDOWN().  Make that explicit by constifying the variants
declarations.
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Will Drewry <wad@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-7-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Mickaël Salaün [Sat, 11 May 2024 17:14:40 +0000 (19:14 +0200)]
 
selftests/landlock: Do not allocate memory in fixture data
Do not allocate self->dir_path in the test process because this would
not be visible in the FIXTURE_TEARDOWN() process when relying on
fork()/clone3() instead of vfork().
This change is required for a following commit removing vfork() call to
not break the layout3_fs.* test cases.
Cc: Günther Noack <gnoack@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-6-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Mickaël Salaün [Sat, 11 May 2024 17:14:39 +0000 (19:14 +0200)]
 
selftests/harness: Fix interleaved scheduling leading to race conditions
Fix a race condition when running several FIXTURE_TEARDOWN() managing
the same resource.  This fixes a race condition in the Landlock file
system tests when creating or unmounting the same directory.
Using clone3() with CLONE_VFORK guarantees that the child and grandchild
test processes are sequentially scheduled.  This is implemented with a
new clone3_vfork() helper replacing the fork() call.
This avoids triggering this error in __wait_for_test():
  Test ended in some other way [127]
Cc: Christian Brauner <brauner@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Günther Noack <gnoack@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Drewry <wad@chromium.org>
Fixes: 41cca0542d7c ("selftests/harness: Fix TEST_F()'s vfork handling")
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-5-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Mickaël Salaün [Sat, 11 May 2024 17:14:38 +0000 (19:14 +0200)]
 
selftests/harness: Fix fixture teardown
Make sure fixture teardowns are run when test cases failed, including
when _metadata->teardown_parent is set to true.
Make sure only one fixture teardown is run per test case, handling the
case where the test child forks.
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Shengyu Li <shengyu.li.evgeny@gmail.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Fixes: 72d7cb5c190b ("selftests/harness: Prevent infinite loop due to Assert in FIXTURE_TEARDOWN")
Fixes: 0710a1a73fb4 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()")
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-4-mic@digikod.net
Rule: add
Link: https://lore.kernel.org/stable/20240506165518.474504-4-mic%40digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Mickaël Salaün [Sat, 11 May 2024 17:14:37 +0000 (19:14 +0200)]
 
selftests/landlock: Fix FS tests when run on a private mount point
According to the test environment, the mount point of the test's working
directory may be shared or not, which changes the visibility of the
nested "tmp" mount point for the test's parent process calling
umount("tmp").
This was spotted while running tests in containers [1], where mount
points are private.
Cc: Günther Noack <gnoack@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Link: https://github.com/landlock-lsm/landlock-test-tools/pull/4
Fixes: 41cca0542d7c ("selftests/harness: Fix TEST_F()'s vfork handling")
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240511171445.904356-3-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Mickaël Salaün [Sat, 11 May 2024 17:14:36 +0000 (19:14 +0200)]
 
selftests/pidfd: Fix config for pidfd_setns_test
Required by switch_timens() to open /proc/self/ns/time_for_children.
CONFIG_GENERIC_VDSO_TIME_NS is not available on UML, so pidfd_setns_test
cannot be run successfully on this architecture.
Cc: Shuah Khan <skhan@linuxfoundation.org>
Fixes: 2b40c5db73e2 ("selftests/pidfd: add pidfd setns tests")
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20240511171445.904356-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Linus Torvalds [Fri, 10 May 2024 21:37:05 +0000 (14:37 -0700)]
 
Merge tag 'drm-fixes-2024-05-11' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "This should be the last set of fixes for 6.9, i915, xe and amdgpu are
  the bulk here, one of the previous nouveau fixes turned up an issue,
  so reverting it, otherwise one core and a couple of meson fixes.
  core:
   - fix connector debugging output
  i915:
   - Automate CCS Mode setting during engine resets
   - Fix audio time stamp programming for DP
   - Fix parsing backlight BDB data
  xe:
   - Fix use zero-length element array
   - Move more from system wq to ordered private wq
   - Do not ignore return for drmm_mutex_init
  amdgpu:
   - DCN 3.5 fix
   - MST DSC fixes
   - S0i3 fix
   - S4 fix
   - HDP MMIO mapping fix
   - Fix a regression in visible vram handling
  amdkfd:
   - Spatial partition fix
  meson:
   - dw-hdmi: power-up fixes
   - dw-hdmi: add badngap setting for g12
  nouveau:
   - revert SG_DEBUG fix that has a side effect"
* tag 'drm-fixes-2024-05-11' of https://gitlab.freedesktop.org/drm/kernel:
  Revert "drm/nouveau/firmware: Fix SG_DEBUG error with nvkm_firmware_ctor()"
  drm/amdgpu: Fix comparison in amdgpu_res_cpu_visible
  drm/amdkfd: don't allow mapping the MMIO HDP page with large pages
  drm/xe: Use ordered WQ for G2H handler
  drm/xe/guc: Check error code when initializing the CT mutex
  drm/xe/ads: Use flexible-array
  Revert "drm/amdkfd: Add partition id field to location_id"
  dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users
  drm/amd/display: MST DSC check for older devices
  drm/amd/display: Fix idle optimization checks for multi-display and dual eDP
  drm/amd/display: Fix DSC-re-computing
  drm/amd/display: Enable urgent latency adjustments for DCN35
  drm/connector: Add \n to message about demoting connector force-probes
  drm/i915/bios: Fix parsing backlight BDB data
  drm/i915/audio: Fix audio time stamp programming for DP
  drm/i915/gt: Automate CCS Mode setting during engine resets
  drm/meson: dw-hdmi: add bandgap setting for g12
  drm/meson: dw-hdmi: power up phy on device init
Linus Torvalds [Fri, 10 May 2024 21:16:03 +0000 (14:16 -0700)]
 
Merge tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git./linux/kernel/git/akpm/mm
Pull MM fixes from Andrew Morton:
 "18 hotfixes, 7 of which are cc:stable.
  More fixups for this cycle's page_owner updates. And a few userfaultfd
  fixes. Otherwise, random singletons - see the individual changelogs
  for details"
* tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mailmap: add entry for Barry Song
  selftests/mm: fix powerpc ARCH check
  mailmap: add entry for John Garry
  XArray: set the marks correctly when splitting an entry
  selftests/vDSO: fix runtime errors on LoongArch
  selftests/vDSO: fix building errors on LoongArch
  mm,page_owner: don't remove __GFP_NOLOCKDEP in add_stack_record_to_list
  fs/proc/task_mmu: fix uffd-wp confusion in pagemap_scan_pmd_entry()
  fs/proc/task_mmu: fix loss of young/dirty bits during pagemap scan
  mm/vmalloc: fix return value of vb_alloc if size is 0
  mm: use memalloc_nofs_save() in page_cache_ra_order()
  kmsan: compiler_types: declare __no_sanitize_or_inline
  lib/test_xarray.c: fix error assumptions on check_xa_multi_store_adv_add()
  tools: fix userspace compilation with new test_xarray changes
  MAINTAINERS: update URL's for KEYS/KEYRINGS_INTEGRITY and TPM DEVICE DRIVER
  mm: page_owner: fix wrong information in dump_page_owner
  maple_tree: fix mas_empty_area_rev() null pointer dereference
  mm/userfaultfd: reset ptes when close() for wr-protected ones
Dave Airlie [Fri, 10 May 2024 21:01:31 +0000 (07:01 +1000)]
 
Revert "drm/nouveau/firmware: Fix SG_DEBUG error with nvkm_firmware_ctor()"
This reverts commit 
52a6947bf576b97ff8e14bb0a31c5eaf2d0d96e2.
This causes loading failures in
[    0.367379] nouveau 0000:01:00.0: NVIDIA GP104 (
134000a1)
[    0.474499] nouveau 0000:01:00.0: bios: version 86.04.50.80.13
[    0.474620] nouveau 0000:01:00.0: pmu: firmware unavailable
[    0.474977] nouveau 0000:01:00.0: fb: 8192 MiB GDDR5
[    0.484371] nouveau 0000:01:00.0: sec2(acr): mbox 
00000001 00000000
[    0.484377] nouveau 0000:01:00.0: sec2(acr):load: boot failed: -5
[    0.484379] nouveau 0000:01:00.0: acr: init failed, -5
[    0.484466] nouveau 0000:01:00.0: init failed with -5
[    0.484468] nouveau: DRM-master:
00000000:
00000080: init failed with -5
[    0.484470] nouveau 0000:01:00.0: DRM-master: Device allocation failed: -5
[    0.485078] nouveau 0000:01:00.0: probe with driver nouveau failed with error -50
I tried tracking it down but ran out of time this week, will revisit next week.
Reported-by: Dan Moulding <dan@danm.net>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 10 May 2024 20:55:38 +0000 (06:55 +1000)]
 
Merge tag 'drm-misc-fixes-2024-05-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
core:
- fix connector debugging output
meson:
- dw-hdmi: power-up fixes
- dw-hdmi: add badngap setting for g12
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240510072027.GA9131@linux.fritz.box
Linus Torvalds [Fri, 10 May 2024 21:01:00 +0000 (14:01 -0700)]
 
Merge tag 'gpio-fixes-for-v6.9' of git://git./linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
 "Some last-minute fixes for this release from the GPIO subsystem.
  The first two address a regression in performance reported to me after
  the conversion to using SRCU in GPIOLIB that was merged during the
  v6.9 merge window. The second patch is not technically a fix but since
  after the first one we no longer need to use a per-descriptor SRCU
  struct, I think it's worth to simplify the code before it gets
  released on Sunday.
  The next two commits fix two memory issues: one use-after-free bug and
  one instance of possibly leaking kernel stack memory to user-space.
  Summary:
   - fix a performance regression in GPIO requesting and releasing after
     the conversion to SRCU
   - fix a use-after-free bug due to a race-condition
   - fix leaking stack memory to user-space in a GPIO uABI corner case"
* tag 'gpio-fixes-for-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpiolib: cdev: fix uninitialised kfifo
  gpiolib: cdev: Fix use after free in lineinfo_changed_notify
  gpiolib: use a single SRCU struct for all GPIO descriptors
  gpiolib: fix the speed of descriptor label setting with SRCU
Dave Airlie [Fri, 10 May 2024 20:41:13 +0000 (06:41 +1000)]
 
Merge tag 'amd-drm-fixes-6.9-2024-05-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.9-2024-05-10:
amdgpu:
- DCN 3.5 fix
- MST DSC fixes
- S0i3 fix
- S4 fix
- HDP MMIO mapping fix
- Fix a regression in visible vram handling
amdkfd:
- Spatial partition fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240510171110.1394940-1-alexander.deucher@amd.com
Barry Song [Mon, 6 May 2024 04:20:09 +0000 (16:20 +1200)]
 
mailmap: add entry for Barry Song
Include a .mailmap entry to synchronize with both my past and current
emails.  Among them, three business mailboxes are dead.
Link: https://lkml.kernel.org/r/20240506042009.10854-1-21cnbao@gmail.com
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Michael Ellerman [Mon, 6 May 2024 11:58:25 +0000 (21:58 +1000)]
 
selftests/mm: fix powerpc ARCH check
In commit 
0518dbe97fe6 ("selftests/mm: fix cross compilation with LLVM")
the logic to detect the machine architecture in the Makefile was changed
to use ARCH, and only fallback to uname -m if ARCH is unset.  However the
tests of ARCH were not updated to account for the fact that ARCH is
"powerpc" for powerpc builds, not "ppc64".
Fix it by changing the checks to look for "powerpc", and change the
uname -m logic to convert "ppc64.*" into "powerpc".
With that fixed the following tests now build for powerpc again:
 * protection_keys
 * va_high_addr_switch
 * virtual_address_range
 * write_to_hugetlbfs
Link: https://lkml.kernel.org/r/20240506115825.66415-1-mpe@ellerman.id.au
Fixes: 0518dbe97fe6 ("selftests/mm: fix cross compilation with LLVM")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mark Brown <broonie@kernel.org>
Cc: <stable@vger.kernel.org>	[6.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Linus Torvalds [Fri, 10 May 2024 17:24:16 +0000 (10:24 -0700)]
 
Merge tag 'block-6.9-
20240510' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
 - NVMe pull request via Keith:
     - nvme target fixes (Sagi, Dan, Maurizo)
     - new vendor quirk for broken MSI (Sean)
 - Virtual boundary fix for a regression in this merge window (Ming)
* tag 'block-6.9-
20240510' of git://git.kernel.dk/linux:
  nvmet-rdma: fix possible bad dereference when freeing rsps
  nvmet: prevent sprintf() overflow in nvmet_subsys_nsid_exists()
  nvmet: make nvmet_wq unbound
  nvmet-auth: return the error code to the nvmet_auth_ctrl_hash() callers
  nvme-pci: Add quirk for broken MSIs
  block: set default max segment size in case of virt_boundary
Linus Torvalds [Fri, 10 May 2024 17:20:49 +0000 (10:20 -0700)]
 
Merge tag 'spi-fix-v6.9-rc7' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "Two device specific fixes here, one avoiding glitches on chip select
  with the STM32 driver and one for incorrectly configured clocks on the
  Microchip QSPI controller"
* tag 'spi-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: microchip-core-qspi: fix setting spi bus clock rate
  spi: stm32: enable controller before asserting CS
Linus Torvalds [Fri, 10 May 2024 17:18:31 +0000 (10:18 -0700)]
 
Merge tag 'regulator-fix-v6.9-rc7' of git://git./linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
 "Two fixes here, one from Johan which fixes error handling when we
  attempt to create duplicate debugfs files and one for an incorrect
  specification of ramp_delay with the rtq2208"
* tag 'regulator-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: core: fix debugfs creation regression
  regulator: rtq2208: Fix the BUCK ramp_delay range to maximum of 16mVstep/us
Linus Torvalds [Fri, 10 May 2024 17:15:58 +0000 (10:15 -0700)]
 
Merge tag 'timers-urgent-2024-05-10' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
 "Fix possible (but unlikely) out-of-bounds access in the timer
  migration per-CPU-init code"
* tag 'timers-urgent-2024-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timers/migration: Prevent out of bounds access on failure
Linus Torvalds [Fri, 10 May 2024 17:10:21 +0000 (10:10 -0700)]
 
Merge tag 'iommu-fixes-v6.9-rc7' of git://git./linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
 - Fix offset miscalculation on ARM-SMMU driver
 - AMD IOMMU fix for initializing state of untrusted devices
* tag 'iommu-fixes-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/arm-smmu: Use the correct type in nvidia_smmu_context_fault()
  iommu/amd: Enhance def_domain_type to handle untrusted device
Michel Dänzer [Wed, 8 May 2024 13:19:16 +0000 (15:19 +0200)]
 
drm/amdgpu: Fix comparison in amdgpu_res_cpu_visible
It incorrectly claimed a resource isn't CPU visible if it's located at
the very end of CPU visible VRAM.
Fixes: a6ff969fe9cb ("drm/amdgpu: fix visible VRAM handling during faults")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3343
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reported-and-Tested-by: Jeremy Day <jsday@noreason.ca>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
Alex Deucher [Sun, 14 Apr 2024 17:06:39 +0000 (13:06 -0400)]
 
drm/amdkfd: don't allow mapping the MMIO HDP page with large pages
We don't get the right offset in that case.  The GPU has
an unused 4K area of the register BAR space into which you can
remap registers.  We remap the HDP flush registers into this
space to allow userspace (CPU or GPU) to flush the HDP when it
updates VRAM.  However, on systems with >4K pages, we end up
exposing PAGE_SIZE of MMIO space.
Fixes: d8e408a82704 ("drm/amdkfd: Expose HDP registers to user space")
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Thomas Gleixner [Wed, 8 May 2024 19:53:47 +0000 (21:53 +0200)]
 
x86/topology/amd: Ensure that LLC ID is initialized
The original topology evaluation code initialized cpu_data::topo::llc_id
with the die ID initialy and then eventually overwrite it with information
gathered from a CPUID leaf.
The conversion analysis failed to spot that particular detail and omitted
this initial assignment under the assumption that each topology evaluation
path will set it up. That assumption is mostly correct, but turns out to be
wrong in case that the CPUID leaf 0x80000006 does not provide a LLC ID.
In that case, LLC ID is invalid and as a consequence the setup of the
scheduling domain CPU masks is incorrect which subsequently causes the
scheduler core to complain about it during CPU hotplug:
  BUG: arch topology borken
       the CLS domain not a subset of the MC domain
Cure it by reusing legacy_set_llc() and assigning the die ID if the LLC ID
is invalid after all possible parsers have been tried.
Fixes: f7fb3b2dd92c ("x86/cpu: Provide an AMD/HYGON specific topology parser")
Reported-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Link: https://lore.kernel.org/r/PUZPR04MB63168AC442C12627E827368581292@PUZPR04MB6316.apcprd04.prod.outlook.com
Kent Gibson [Fri, 10 May 2024 06:53:42 +0000 (14:53 +0800)]
 
gpiolib: cdev: fix uninitialised kfifo
If a line is requested with debounce, and that results in debouncing
in software, and the line is subsequently reconfigured to enable edge
detection then the allocation of the kfifo to contain edge events is
overlooked.  This results in events being written to and read from an
uninitialised kfifo.  Read events are returned to userspace.
Initialise the kfifo in the case where the software debounce is
already active.
Fixes: 65cff7046406 ("gpiolib: cdev: support setting debounce")
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Link: https://lore.kernel.org/r/20240510065342.36191-1-warthog618@gmail.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Shyam Sundar S K [Fri, 10 May 2024 11:18:28 +0000 (16:48 +0530)]
 
x86/amd_nb: Add new PCI IDs for AMD family 0x1a
Add the new PCI Device IDs to the MISC IDs list to support new
generation of AMD 1Ah family 70h Models of processors.
  [ bp: Massage commit message. ]
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240510111829.969501-1-Shyam-sundar.S-k@amd.com
Jason Gunthorpe [Thu, 9 May 2024 17:45:51 +0000 (14:45 -0300)]
 
iommu/arm-smmu: Use the correct type in nvidia_smmu_context_fault()
This was missed because of the function pointer indirection.
nvidia_smmu_context_fault() is also installed as a irq function, and the
'void *' was changed to a struct arm_smmu_domain. Since the iommu_domain
is embedded at a non-zero offset this causes nvidia_smmu_context_fault()
to miscompute the offset. Fixup the types.
  Unable to handle kernel NULL pointer dereference at virtual address 
0000000000000120
  Mem abort info:
    ESR = 0x0000000096000004
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x04: level 0 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
  user pgtable: 4k pages, 48-bit VAs, pgdp=
0000000107c9f000
  [
0000000000000120] pgd=
0000000000000000, p4d=
0000000000000000
  Internal error: Oops: 
0000000096000004 [#1] SMP
  Modules linked in:
  CPU: 1 PID: 47 Comm: kworker/u25:0 Not tainted 6.9.0-0.rc7.58.eln136.aarch64 #1
  Hardware name: Unknown NVIDIA Jetson Orin NX/NVIDIA Jetson Orin NX, BIOS 3.1-
32827747 03/19/2023
  Workqueue: events_unbound deferred_probe_work_func
  pstate: 
604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : nvidia_smmu_context_fault+0x1c/0x158
  lr : __free_irq+0x1d4/0x2e8
  sp : 
ffff80008044b6f0
  x29: 
ffff80008044b6f0 x28: 
ffff000080a60b18 x27: 
ffffd32b5172e970
  x26: 
0000000000000000 x25: 
ffff0000802f5aac x24: 
ffff0000802f5a30
  x23: 
ffff0000802f5b60 x22: 
0000000000000057 x21: 
0000000000000000
  x20: 
ffff0000802f5a00 x19: 
ffff000087d4cd80 x18: 
ffffffffffffffff
  x17: 
6234362066666666 x16: 
6630303078302d30 x15: 
ffff00008156d888
  x14: 
0000000000000000 x13: 
ffff0000801db910 x12: 
ffff00008156d6d0
  x11: 
0000000000000003 x10: 
ffff0000801db918 x9 : 
ffffd32b50f94d9c
  x8 : 
1fffe0001032fda1 x7 : 
ffff00008197ed00 x6 : 
000000000000000f
  x5 : 
000000000000010e x4 : 
000000000000010e x3 : 
0000000000000000
  x2 : 
ffffd32b51720cd8 x1 : 
ffff000087e6f700 x0 : 
0000000000000057
  Call trace:
   nvidia_smmu_context_fault+0x1c/0x158
   __free_irq+0x1d4/0x2e8
   free_irq+0x3c/0x80
   devm_free_irq+0x64/0xa8
   arm_smmu_domain_free+0xc4/0x158
   iommu_domain_free+0x44/0xa0
   iommu_deinit_device+0xd0/0xf8
   __iommu_group_remove_device+0xcc/0xe0
   iommu_bus_notifier+0x64/0xa8
   notifier_call_chain+0x78/0x148
   blocking_notifier_call_chain+0x4c/0x90
   bus_notify+0x44/0x70
   device_del+0x264/0x3e8
   pci_remove_bus_device+0x84/0x120
   pci_remove_root_bus+0x5c/0xc0
   dw_pcie_host_deinit+0x38/0xe0
   tegra_pcie_config_rp+0xc0/0x1f0
   tegra_pcie_dw_probe+0x34c/0x700
   platform_probe+0x70/0xe8
   really_probe+0xc8/0x3a0
   __driver_probe_device+0x84/0x160
   driver_probe_device+0x44/0x130
   __device_attach_driver+0xc4/0x170
   bus_for_each_drv+0x90/0x100
   __device_attach+0xa8/0x1c8
   device_initial_probe+0x1c/0x30
   bus_probe_device+0xb0/0xc0
   deferred_probe_work_func+0xbc/0x120
   process_one_work+0x194/0x490
   worker_thread+0x284/0x3b0
   kthread+0xf4/0x108
   ret_from_fork+0x10/0x20
  Code: 
a9b97bfd 910003fd a9025bf5 f85a0035 (
b94122a1)
Cc: stable@vger.kernel.org
Fixes: e0976331ad11 ("iommu/arm-smmu: Pass arm_smmu_domain to internal functions")
Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
Closes: https://lore.kernel.org/all/jto5e3ili4auk6sbzpnojdvhppgwuegir7mpd755anfhwcbkfz@2u5gh7bxb4iv
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/0-v1-24ce064de41f+4ac-nvidia_smmu_fault_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Dave Airlie [Fri, 10 May 2024 00:06:02 +0000 (10:06 +1000)]
 
Merge tag 'drm-xe-fixes-2024-05-09' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
- Fix use zero-length element array
- Move more from system wq to ordered private wq
- Do not ignore return for drmm_mutex_init
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c3rduifdp5wipkljdpuq4x6uowkc2uyzgdoft4txvp6mgvzjaj@7zw7c6uw4wrf
Dave Airlie [Thu, 9 May 2024 22:34:14 +0000 (08:34 +1000)]
 
Merge tag 'drm-intel-fixes-2024-05-08' of https://anongit.freedesktop.org/git/drm/drm-intel into drm-fixes
- Automate CCS Mode setting during engine resets (Andi)
- Fix audio time stamp programming for DP (Chaitanya)
- Fix parsing backlight BDB data (Karthikeyan)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZjvTVEmQeVKVB2jx@intel.com
Jens Axboe [Thu, 9 May 2024 17:49:18 +0000 (11:49 -0600)]
 
Merge tag 'nvme-6.9-2024-05-09' of git://git.infradead.org/nvme into block-6.9
Pull NVMe fixes from Keith:
"nvme fixes for Linux 6.9
 - nvme target fixes (Sagi, Dan, Maurizo)
 - new vendor quirk for broken MSI (Sean)"
* tag 'nvme-6.9-2024-05-09' of git://git.infradead.org/nvme:
  nvmet-rdma: fix possible bad dereference when freeing rsps
  nvmet: prevent sprintf() overflow in nvmet_subsys_nsid_exists()
  nvmet: make nvmet_wq unbound
  nvmet-auth: return the error code to the nvmet_auth_ctrl_hash() callers
  nvme-pci: Add quirk for broken MSIs
Linus Torvalds [Thu, 9 May 2024 17:17:22 +0000 (10:17 -0700)]
 
Merge tag 'hwmon-for-v6.9-rc8' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
 - pmbus/ucd9000: Increase chip access delay to avoid random access
   errors
 - corsair-cpro: Protect kernel code against parallel hidraw access from
   userspace
* tag 'hwmon-for-v6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pmbus/ucd9000) Increase delay from 250 to 500us
  hwmon: (corsair-cpro) Protect ccp->wait_input_report with a spinlock
  hwmon: (corsair-cpro) Use complete_all() instead of complete() in ccp_raw_event()
  hwmon: (corsair-cpro) Use a separate buffer for sending commands
Lakshmi Yadlapati [Tue, 7 May 2024 19:46:03 +0000 (14:46 -0500)]
 
hwmon: (pmbus/ucd9000) Increase delay from 250 to 500us
Following the failure observed with a delay of 250us, experiments were
conducted with various delays. It was found that a delay of 350us
effectively mitigated the issue.
To provide a more optimal solution while still allowing a margin for
stability, the delay is being adjusted to 500us.
Signed-off-by: Lakshmi Yadlapati <lakshmiy@us.ibm.com>
Link: https://lore.kernel.org/r/20240507194603.1305750-1-lakshmiy@us.ibm.com
Fixes: 8d655e6523764 ("hwmon: (ucd90320) Add minimum delay between bus accesses")
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Linus Torvalds [Thu, 9 May 2024 15:48:57 +0000 (08:48 -0700)]
 
Merge tag 'net-6.9-rc8' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from bluetooth and IPsec.
  The bridge patch is actually a follow-up to a recent fix in the same
  area. We have a pending v6.8 AF_UNIX regression; it should be solved
  soon, but not in time for this PR.
  Current release - regressions:
   - eth: ks8851: Queue RX packets in IRQ handler instead of disabling
     BHs
   - net: bridge: fix corrupted ethernet header on multicast-to-unicast
  Current release - new code bugs:
   - xfrm: fix possible bad pointer derferencing in error path
  Previous releases - regressionis:
   - core: fix out-of-bounds access in ops_init
   - ipv6:
      - fix potential uninit-value access in __ip6_make_skb()
      - fib6_rules: avoid possible NULL dereference in fib6_rule_action()
   - tcp: use refcount_inc_not_zero() in tcp_twsk_unique().
   - rtnetlink: correct nested IFLA_VF_VLAN_LIST attribute validation
   - rxrpc: fix congestion control algorithm
   - bluetooth:
      - l2cap: fix slab-use-after-free in l2cap_connect()
      - msft: fix slab-use-after-free in msft_do_close()
   - eth: hns3: fix kernel crash when devlink reload during
     initialization
   - eth: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21
     family
  Previous releases - always broken:
   - xfrm: preserve vlan tags for transport mode software GRO
   - tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets
   - eth: hns3: keep using user config after hardware reset"
* tag 'net-6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
  net: dsa: mv88e6xxx: read cmode on mv88e6320/21 serdes only ports
  net: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family
  net: hns3: fix kernel crash when devlink reload during initialization
  net: hns3: fix port vlan filter not disabled issue
  net: hns3: use appropriate barrier function after setting a bit value
  net: hns3: release PTP resources if pf initialization failed
  net: hns3: change type of numa_node_mask as nodemask_t
  net: hns3: direct return when receive a unknown mailbox message
  net: hns3: using user configure after hardware reset
  net/smc: fix neighbour and rtable leak in smc_ib_find_route()
  ipv6: prevent NULL dereference in ip6_output()
  hsr: Simplify code for announcing HSR nodes timer setup
  ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action()
  dt-bindings: net: mediatek: remove wrongly added clocks and SerDes
  rxrpc: Only transmit one ACK per jumbo packet received
  rxrpc: Fix congestion control algorithm
  selftests: test_bridge_neigh_suppress.sh: Fix failures due to duplicate MAC
  ipv6: Fix potential uninit-value access in __ip6_make_skb()
  net: phy: marvell-88q2xxx: add support for Rev B1 and B2
  appletalk: Improve handling of broadcast packets
  ...
Linus Torvalds [Thu, 9 May 2024 15:44:13 +0000 (08:44 -0700)]
 
Merge tag 'for-linus' of git://git./linux/kernel/git/rmk/linux
Pull ARM fix from Russell King:
 - clear stale KASan stack poison when a CPU resumes
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
  ARM: 9381/1: kasan: clear stale stack poison
Johan Hovold [Thu, 9 May 2024 13:33:04 +0000 (15:33 +0200)]
 
regulator: core: fix debugfs creation regression
regulator_get() may sometimes be called more than once for the same
consumer device, something which before commit 
dbe954d8f163 ("regulator:
core: Avoid debugfs: Directory ...  already present! error") resulted in
errors being logged.
A couple of recent commits broke the handling of such cases so that
attributes are now erroneously created in the debugfs root directory the
second time a regulator is requested and the log is filled with errors
like:
	debugfs: File 'uA_load' in directory '/' already present!
	debugfs: File 'min_uV' in directory '/' already present!
	debugfs: File 'max_uV' in directory '/' already present!
	debugfs: File 'constraint_flags' in directory '/' already present!
on any further calls.
Fixes: 2715bb11cfff ("regulator: core: Fix more error checking for debugfs_create_dir()")
Fixes: 08880713ceec ("regulator: core: Streamline debugfs operations")
Cc: stable@vger.kernel.org
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20240509133304.8883-1-johan+linaro@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Linus Torvalds [Thu, 9 May 2024 15:39:10 +0000 (08:39 -0700)]
 
Merge tag 'pull-fixes' of git://git./linux/kernel/git/viro/vfs
Pull dentry leak fix from Al Viro:
 "Dentry leak fix in the qibfs driver that I forgot to send a pull
  request for ;-/
  My apologies - it actually sat in vfs.git#fixes for more than two
  months..."
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  qibfs: fix dentry leak
Matthew Brost [Mon, 6 May 2024 03:47:58 +0000 (20:47 -0700)]
 
drm/xe: Use ordered WQ for G2H handler
System work queues are shared, use a dedicated work queue for G2H
processing to avoid G2H processing getting block behind system tasks.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240506034758.3697397-1-matthew.brost@intel.com
(cherry picked from commit 
50aec9665e0babd62b9eee4e613d9a1ef8d2b7de)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Daniele Ceraolo Spurio [Thu, 21 Mar 2024 19:55:12 +0000 (12:55 -0700)]
 
drm/xe/guc: Check error code when initializing the CT mutex
The initialization via drmm_mutex_init can fail, so we need to check the
return code and escalate the failure.
The mutex initialization has been moved after all the other init steps
that can't fail, so we're always guaranteed to have those done and don't
have to check in the cleanup code.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240321195512.274210-1-daniele.ceraolospurio@intel.com
(cherry picked from commit 
b4abeb5545bb3ddcdda3c19067680ad0b2259be4)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Lucas De Marchi [Mon, 6 May 2024 14:19:17 +0000 (07:19 -0700)]
 
drm/xe/ads: Use flexible-array
Zero-length arrays are deprecated and flexible arrays should be
used instead: https://www.kernel.org/doc/html/v6.9-rc7/process/deprecated.html#zero-length-and-one-element-arrays
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Closes: https://lore.kernel.org/r/202405051824.AmjAI5Pg-lkp@intel.com/
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240506141917.205714-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 
ee7284230644e21fef0e38fc5bf8f907b6bb7f7c)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Zhongqiu Han [Sun, 5 May 2024 14:11:56 +0000 (22:11 +0800)]
 
gpiolib: cdev: Fix use after free in lineinfo_changed_notify
The use-after-free issue occurs as follows: when the GPIO chip device file
is being closed by invoking gpio_chrdev_release(), watched_lines is freed
by bitmap_free(), but the unregistration of lineinfo_changed_nb notifier
chain failed due to waiting write rwsem. Additionally, one of the GPIO
chip's lines is also in the release process and holds the notifier chain's
read rwsem. Consequently, a race condition leads to the use-after-free of
watched_lines.
Here is the typical stack when issue happened:
[free]
gpio_chrdev_release()
  --> bitmap_free(cdev->watched_lines)                  <-- freed
  --> blocking_notifier_chain_unregister()
    --> down_write(&nh->rwsem)                          <-- waiting rwsem
          --> __down_write_common()
            --> rwsem_down_write_slowpath()
                  --> schedule_preempt_disabled()
                    --> schedule()
[use]
st54spi_gpio_dev_release()
  --> gpio_free()
    --> gpiod_free()
      --> gpiod_free_commit()
        --> gpiod_line_state_notify()
          --> blocking_notifier_call_chain()
            --> down_read(&nh->rwsem);                  <-- held rwsem
            --> notifier_call_chain()
              --> lineinfo_changed_notify()
                --> test_bit(xxxx, cdev->watched_lines) <-- use after free
The side effect of the use-after-free issue is that a GPIO line event is
being generated for userspace where it shouldn't. However, since the chrdev
is being closed, userspace won't have the chance to read that event anyway.
To fix the issue, call the bitmap_free() function after the unregistration
of lineinfo_changed_nb notifier chain.
Fixes: 51c1064e82e7 ("gpiolib: add new ioctl() for monitoring changes in line info")
Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
Link: https://lore.kernel.org/r/20240505141156.2944912-1-quic_zhonhan@quicinc.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Bartosz Golaszewski [Tue, 7 May 2024 17:24:14 +0000 (19:24 +0200)]
 
gpiolib: use a single SRCU struct for all GPIO descriptors
We used a per-descriptor SRCU struct in order to not impose a wait with
synchronize_srcu() for descriptor X on read-only operations of
descriptor Y. Now that we no longer call synchronize_srcu() on
descriptor label change but only when releasing descriptor resources, we
can use a single SRCU structure for all GPIO descriptors in a given chip.
Suggested-by: "Paul E. McKenney" <paulmck@kernel.org>
Acked-by: "Paul E. McKenney" <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20240507172414.28513-1-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Steffen Bätz [Wed, 8 May 2024 07:29:44 +0000 (09:29 +0200)]
 
net: dsa: mv88e6xxx: read cmode on mv88e6320/21 serdes only ports
On the mv88e6320 and 6321 switch family, port 0/1 are serdes only ports.
Modified the mv88e6352_get_port4_serdes_cmode function to pass a port
number since the register set of the 6352 is equal on the 6320/21.
Signed-off-by: Steffen Bätz <steffen@innosonix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20240508072944.54880-3-steffen@innosonix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Steffen Bätz [Wed, 8 May 2024 07:29:43 +0000 (09:29 +0200)]
 
net: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family
As of commit 
de5c9bf40c45 ("net: phylink: require supported_interfaces to
be filled")
Marvell 
88e6320/21 switches fail to be probed:
...
mv88e6085 
30be0000.ethernet-1:00: phylink: error: empty supported_interfaces
error creating PHYLINK: -22
...
The problem stems from the use of mv88e6185_phylink_get_caps() to get
the device capabilities.
Since there are serdes only ports 0/1 included, create a new dedicated
phylink_get_caps for the 6320 and 6321 to properly support their
set of capabilities.
Fixes: de5c9bf40c45 ("net: phylink: require supported_interfaces to be filled")
Signed-off-by: Steffen Bätz <steffen@innosonix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20240508072944.54880-2-steffen@innosonix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 9 May 2024 08:47:34 +0000 (10:47 +0200)]
 
Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver'
Jijie Shao says:
====================
There are some bugfix for the HNS3 ethernet driver
====================
Link: https://lore.kernel.org/r/20240507134224.2646246-1-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Yonglong Liu [Tue, 7 May 2024 13:42:24 +0000 (21:42 +0800)]
 
net: hns3: fix kernel crash when devlink reload during initialization
The devlink reload process will access the hardware resources,
but the register operation is done before the hardware is initialized.
So, processing the devlink reload during initialization may lead to kernel
crash.
This patch fixes this by registering the devlink after
hardware initialization.
Fixes: cd6242991d2e ("net: hns3: add support for registering devlink for VF")
Fixes: 93305b77ffcb ("net: hns3: fix kernel crash when devlink reload during pf initialization")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Yonglong Liu [Tue, 7 May 2024 13:42:23 +0000 (21:42 +0800)]
 
net: hns3: fix port vlan filter not disabled issue
According to hardware limitation, for device support modify
VLAN filter state but not support bypass port VLAN filter,
it should always disable the port VLAN filter. but the driver
enables port VLAN filter when initializing, if there is no
VLAN(except VLAN 0) id added, the driver will disable it
in service task. In most time, it works fine. But there is
a time window before the service task shceduled and net device
being registered. So if user adds VLAN at this time, the driver
will not update the VLAN filter state,  and the port VLAN filter
remains enabled.
To fix the problem, if support modify VLAN filter state but not
support bypass port VLAN filter, set the port vlan filter to "off".
Fixes: 184cd221a863 ("net: hns3: disable port VLAN filter when support function level VLAN filter control")
Fixes: 2ba306627f59 ("net: hns3: add support for modify VLAN filter state")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Peiyang Wang [Tue, 7 May 2024 13:42:22 +0000 (21:42 +0800)]
 
net: hns3: use appropriate barrier function after setting a bit value
There is a memory barrier in followed case. When set the port down,
hclgevf_set_timmer will set DOWN in state. Meanwhile, the service task has
different behaviour based on whether the state is DOWN. Thus, to make sure
service task see DOWN, use smp_mb__after_atomic after calling set_bit().
          CPU0                        CPU1
========================== ===================================
hclgevf_set_timer_task()    hclgevf_periodic_service_task()
  set_bit(DOWN,state)         test_bit(DOWN,state)
pf also has this issue.
Fixes: ff200099d271 ("net: hns3: remove unnecessary work in hclgevf_main")
Fixes: 1c6dfe6fc6f7 ("net: hns3: remove mailbox and reset work in hclge_main")
Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Peiyang Wang [Tue, 7 May 2024 13:42:21 +0000 (21:42 +0800)]
 
net: hns3: release PTP resources if pf initialization failed
During the PF initialization process, hclge_update_port_info may return an
error code for some reason. At this point,  the ptp initialization has been
completed. To void memory leaks, the resources that are applied by ptp
should be released. Therefore, when hclge_update_port_info returns an error
code, hclge_ptp_uninit is called to release the corresponding resources.
Fixes: eaf83ae59e18 ("net: hns3: add querying fec ability from firmware")
Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Peiyang Wang [Tue, 7 May 2024 13:42:20 +0000 (21:42 +0800)]
 
net: hns3: change type of numa_node_mask as nodemask_t
It provides nodemask_t to describe the numa node mask in kernel. To
improve transportability, change the type of numa_node_mask as nodemask_t.
Fixes: 38caee9d3ee8 ("net: hns3: Add support of the HNAE3 framework")
Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jian Shen [Tue, 7 May 2024 13:42:19 +0000 (21:42 +0800)]
 
net: hns3: direct return when receive a unknown mailbox message
Currently, the driver didn't return when receive a unknown
mailbox message, and continue checking whether need to
generate a response. It's unnecessary and may be incorrect.
Fixes: bb5790b71bad ("net: hns3: refactor mailbox response scheme between PF and VF")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Peiyang Wang [Tue, 7 May 2024 13:42:18 +0000 (21:42 +0800)]
 
net: hns3: using user configure after hardware reset
When a reset occurring, it's supposed to recover user's configuration.
Currently, the port info(speed, duplex and autoneg) is stored in hclge_mac
and will be scheduled updated. Consider the case that reset was happened
consecutively. During the first reset, the port info is configured with
a temporary value cause the PHY is reset and looking for best link config.
Second reset start and use pervious configuration which is not the user's.
The specific process is as follows:
+------+               +----+                +----+
| USER |               | PF |                | HW |
+---+--+               +-+--+                +-+--+
    |  ethtool --reset   |                     |
    +------------------->|    reset command    |
    |  ethtool --reset   +-------------------->|
    +------------------->|                     +---+
    |                    +---+                 |   |
    |                    |   |reset currently  |   | HW RESET
    |                    |   |and wait to do   |   |
    |                    |<--+                 |   |
    |                    | send pervious cfg   |<--+
    |                    | (1000M FULL AN_ON)  |
    |                    +-------------------->|
    |                    | read cfg(time task) |
    |                    | (10M HALF AN_OFF)   +---+
    |                    |<--------------------+   | cfg take effect
    |                    |    reset command    |<--+
    |                    +-------------------->|
    |                    |                     +---+
    |                    | send pervious cfg   |   | HW RESET
    |                    | (10M HALF AN_OFF)   |<--+
    |                    +-------------------->|
    |                    | read cfg(time task) |
    |                    |  (10M HALF AN_OFF)  +---+
    |                    |<--------------------+   | cfg take effect
    |                    |                     |   |
    |                    | read cfg(time task) |<--+
    |                    |  (10M HALF AN_OFF)  |
    |                    |<--------------------+
    |                    |                     |
    v                    v                     v
To avoid aboved situation, this patch introduced req_speed, req_duplex,
req_autoneg to store user's configuration and it only be used after
hardware reset and to recover user's configuration
Fixes: f5f2b3e4dcc0 ("net: hns3: add support for imp-controlled PHYs")
Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Wen Gu [Tue, 7 May 2024 12:53:31 +0000 (20:53 +0800)]
 
net/smc: fix neighbour and rtable leak in smc_ib_find_route()
In smc_ib_find_route(), the neighbour found by neigh_lookup() and rtable
resolved by ip_route_output_flow() are not released or put before return.
It may cause the refcount leak, so fix it.
Link: https://lore.kernel.org/r/20240506015439.108739-1-guwen@linux.alibaba.com
Fixes: e5c4744cfb59 ("net/smc: add SMC-Rv2 connection establishment")
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240507125331.2808-1-guwen@linux.alibaba.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Conor Dooley [Wed, 8 May 2024 15:46:51 +0000 (16:46 +0100)]
 
spi: microchip-core-qspi: fix setting spi bus clock rate
Before ORing the new clock rate with the control register value read
from the hardware, the existing clock rate needs to be masked off as
otherwise the existing value will interfere with the new one.
CC: stable@vger.kernel.org
Fixes: 8596124c4c1b ("spi: microchip-core-qspi: Add support for microchip fpga qspi controllers")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Link: https://lore.kernel.org/r/20240508-fox-unpiloted-b97e1535627b@spud
Signed-off-by: Mark Brown <broonie@kernel.org>
Eric Dumazet [Tue, 7 May 2024 16:18:42 +0000 (16:18 +0000)]
 
ipv6: prevent NULL dereference in ip6_output()
According to syzbot, there is a chance that ip6_dst_idev()
returns NULL in ip6_output(). Most places in IPv6 stack
deal with a NULL idev just fine, but not here.
syzbot reported:
general protection fault, probably for non-canonical address 0xdffffc00000000bc: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x00000000000005e0-0x00000000000005e7]
CPU: 0 PID: 9775 Comm: syz-executor.4 Not tainted 
6.9.0-rc5-syzkaller-00157-g6a30653b604a #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
 RIP: 0010:ip6_output+0x231/0x3f0 net/ipv6/ip6_output.c:237
Code: 3c 1e 00 49 89 df 74 08 4c 89 ef e8 19 58 db f7 48 8b 44 24 20 49 89 45 00 49 89 c5 48 8d 9d e0 05 00 00 48 89 d8 48 c1 e8 03 <42> 0f b6 04 38 84 c0 4c 8b 74 24 28 0f 85 61 01 00 00 8b 1b 31 ff
RSP: 0018:
ffffc9000927f0d8 EFLAGS: 
00010202
RAX: 
00000000000000bc RBX: 
00000000000005e0 RCX: 
0000000000040000
RDX: 
ffffc900131f9000 RSI: 
0000000000004f47 RDI: 
0000000000004f48
RBP: 
0000000000000000 R08: 
ffffffff8a1f0b9a R09: 
1ffffffff1f51fad
R10: 
dffffc0000000000 R11: 
fffffbfff1f51fae R12: 
ffff8880293ec8c0
R13: 
ffff88805d7fc000 R14: 
1ffff1100527d91a R15: 
dffffc0000000000
FS:  
00007f135c6856c0(0000) GS:
ffff8880b9400000(0000) knlGS:
0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 
0000000080050033
CR2: 
0000000020000080 CR3: 
0000000064096000 CR4: 
00000000003506f0
DR0: 
0000000000000000 DR1: 
0000000000000000 DR2: 
0000000000000000
DR3: 
0000000000000000 DR6: 
00000000fffe0ff0 DR7: 
0000000000000400
Call Trace:
 <TASK>
  NF_HOOK include/linux/netfilter.h:314 [inline]
  ip6_xmit+0xefe/0x17f0 net/ipv6/ip6_output.c:358
  sctp_v6_xmit+0x9f2/0x13f0 net/sctp/ipv6.c:248
  sctp_packet_transmit+0x26ad/0x2ca0 net/sctp/output.c:653
  sctp_packet_singleton+0x22c/0x320 net/sctp/outqueue.c:783
  sctp_outq_flush_ctrl net/sctp/outqueue.c:914 [inline]
  sctp_outq_flush+0x6d5/0x3e20 net/sctp/outqueue.c:1212
  sctp_side_effects net/sctp/sm_sideeffect.c:1198 [inline]
  sctp_do_sm+0x59cc/0x60c0 net/sctp/sm_sideeffect.c:1169
  sctp_primitive_ASSOCIATE+0x95/0xc0 net/sctp/primitive.c:73
  __sctp_connect+0x9cd/0xe30 net/sctp/socket.c:1234
  sctp_connect net/sctp/socket.c:4819 [inline]
  sctp_inet_connect+0x149/0x1f0 net/sctp/socket.c:4834
  __sys_connect_file net/socket.c:2048 [inline]
  __sys_connect+0x2df/0x310 net/socket.c:2065
  __do_sys_connect net/socket.c:2075 [inline]
  __se_sys_connect net/socket.c:2072 [inline]
  __x64_sys_connect+0x7a/0x90 net/socket.c:2072
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: 778d80be5269 ("ipv6: Add disable_ipv6 sysctl to disable IPv6 operaion on specific interface.")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20240507161842.773961-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lukasz Majewski [Tue, 7 May 2024 11:12:14 +0000 (13:12 +0200)]
 
hsr: Simplify code for announcing HSR nodes timer setup
Up till now the code to start HSR announce timer, which triggers sending
supervisory frames, was assuming that hsr_netdev_notify() would be called
at least twice for hsrX interface. This was required to have different
values for old and current values of network device's operstate.
This is problematic for a case where hsrX interface is already in the
operational state when hsr_netdev_notify() is called, so timer is not
configured to trigger and as a result the hsrX is not sending supervisory
frames to HSR ring.
This error has been discovered when hsr_ping.sh script was run. To be
more specific - for the hsr1 and hsr2 the hsr_netdev_notify() was
called at least twice with different IF_OPER_{LOWERDOWN|DOWN|UP} states
assigned in hsr_check_carrier_and_operstate(hsr). As a result there was
no issue with sending supervisory frames.
However, with hsr3, the notify function was called only once with
operstate set to IF_OPER_UP and timer responsible for triggering
supervisory frames was not fired.
The solution is to use netif_oper_up() and netif_running() helper
functions to assess if network hsrX device is up.
Only then, when the timer is not already pending, it is started.
Otherwise it is deactivated.
Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Signed-off-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240507111214.3519800-1-lukma@denx.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Tue, 7 May 2024 16:31:45 +0000 (16:31 +0000)]
 
ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action()
syzbot is able to trigger the following crash [1],
caused by unsafe ip6_dst_idev() use.
Indeed ip6_dst_idev() can return NULL, and must always be checked.
[1]
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 31648 Comm: syz-executor.0 Not tainted 6.9.0-rc4-next-
20240417-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
 RIP: 0010:__fib6_rule_action net/ipv6/fib6_rules.c:237 [inline]
 RIP: 0010:fib6_rule_action+0x241/0x7b0 net/ipv6/fib6_rules.c:267
Code: 02 00 00 49 8d 9f d8 00 00 00 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 f9 32 bf f7 48 8b 1b 48 89 d8 48 c1 e8 03 <42> 80 3c 20 00 74 08 48 89 df e8 e0 32 bf f7 4c 8b 03 48 89 ef 4c
RSP: 0018:
ffffc9000fc1f2f0 EFLAGS: 
00010246
RAX: 
0000000000000000 RBX: 
0000000000000000 RCX: 
1a772f98c8186700
RDX: 
0000000000000003 RSI: 
ffffffff8bcac4e0 RDI: 
ffffffff8c1f9760
RBP: 
ffff8880673fb980 R08: 
ffffffff8fac15ef R09: 
1ffffffff1f582bd
R10: 
dffffc0000000000 R11: 
fffffbfff1f582be R12: 
dffffc0000000000
R13: 
0000000000000080 R14: 
ffff888076509000 R15: 
ffff88807a029a00
FS:  
00007f55e82ca6c0(0000) GS:
ffff8880b9400000(0000) knlGS:
0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 
0000000080050033
CR2: 
0000001b31d23000 CR3: 
0000000022b66000 CR4: 
00000000003506f0
DR0: 
0000000000000000 DR1: 
0000000000000000 DR2: 
0000000000000000
DR3: 
0000000000000000 DR6: 
00000000fffe0ff0 DR7: 
0000000000000400
Call Trace:
 <TASK>
  fib_rules_lookup+0x62c/0xdb0 net/core/fib_rules.c:317
  fib6_rule_lookup+0x1fd/0x790 net/ipv6/fib6_rules.c:108
  ip6_route_output_flags_noref net/ipv6/route.c:2637 [inline]
  ip6_route_output_flags+0x38e/0x610 net/ipv6/route.c:2649
  ip6_route_output include/net/ip6_route.h:93 [inline]
  ip6_dst_lookup_tail+0x189/0x11a0 net/ipv6/ip6_output.c:1120
  ip6_dst_lookup_flow+0xb9/0x180 net/ipv6/ip6_output.c:1250
  sctp_v6_get_dst+0x792/0x1e20 net/sctp/ipv6.c:326
  sctp_transport_route+0x12c/0x2e0 net/sctp/transport.c:455
  sctp_assoc_add_peer+0x614/0x15c0 net/sctp/associola.c:662
  sctp_connect_new_asoc+0x31d/0x6c0 net/sctp/socket.c:1099
  __sctp_connect+0x66d/0xe30 net/sctp/socket.c:1197
  sctp_connect net/sctp/socket.c:4819 [inline]
  sctp_inet_connect+0x149/0x1f0 net/sctp/socket.c:4834
  __sys_connect_file net/socket.c:2048 [inline]
  __sys_connect+0x2df/0x310 net/socket.c:2065
  __do_sys_connect net/socket.c:2075 [inline]
  __se_sys_connect net/socket.c:2072 [inline]
  __x64_sys_connect+0x7a/0x90 net/socket.c:2072
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: 5e5f3f0f8013 ("[IPV6] ADDRCONF: Convert ipv6_get_saddr() to ipv6_dev_get_saddr().")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240507163145.835254-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Golle [Tue, 7 May 2024 12:20:43 +0000 (13:20 +0100)]
 
dt-bindings: net: mediatek: remove wrongly added clocks and SerDes
Several clocks as well as both sgmiisys phandles were added by mistake
to the Ethernet bindings for MT7988. Also, the total number of clocks
didn't match with the actual number of items listed.
This happened because the vendor driver which served as a reference uses
a high number of syscon phandles to access various parts of the SoC
which wasn't acceptable upstream. Hence several parts which have never
previously been supported (such SerDes PHY and USXGMII PCS) are going to
be implemented by separate drivers. As a result the device tree will
look much more sane.
Quickly align the bindings with the upcoming reality of the drivers
actually adding support for the remaining Ethernet-related features of
the MT7988 SoC.
Fixes: c94a9aabec36 ("dt-bindings: net: mediatek,net: add mt7988-eth binding")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/1569290b21cc787a424469ed74456a7e976b102d.1715084326.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lijo Lazar [Tue, 23 Apr 2024 11:22:40 +0000 (16:52 +0530)]
 
Revert "drm/amdkfd: Add partition id field to location_id"
This reverts commit 
c37ce764cd492f044dcdbb39616298f02b0dbc7f.
RCCL library is currently not treating spatial partitions differently,
hence this change is causing issues. Revert temporarily till RCCL
implementation is ready for spatial partitions.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Limonciello [Thu, 2 May 2024 18:32:17 +0000 (13:32 -0500)]
 
dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users
Limit the workaround introduced by commit 
31729e8c21ec ("drm/amd/pm: fixes
a random hang in S4 for SMU v13.0.4/11") to only run in the s4 path.
Cc: Tim Huang <Tim.Huang@amd.com>
Fixes: 31729e8c21ec ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3351
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Agustin Gutierrez [Thu, 25 Apr 2024 14:37:36 +0000 (10:37 -0400)]
 
drm/amd/display: MST DSC check for older devices
[Why]
Some older MST hubs do not report DPCD registers according to
specification.
[How]
This change re-applies commit 
c53655545141 ("drm/amd/display: dsc mst
re-compute pbn for changes on hub").
With an additional check for these older MST devices.
Reviewed-by: Swapnil Patel <swapnil.patel@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Agustin Gutierrez <agustin.gutierrez@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Kazlauskas [Thu, 25 Apr 2024 15:26:59 +0000 (11:26 -0400)]
 
drm/amd/display: Fix idle optimization checks for multi-display and dual eDP
[Why]
Idle optimizations are blocked if there's more than one eDP connector
on the board - blocking S0i3 and IPS2 for static screen.
[How]
Fix the checks to correctly detect number of active eDP.
Also restrict the eDP support to panels that have correct feature
support.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Agustin Gutierrez [Fri, 19 Apr 2024 17:53:52 +0000 (13:53 -0400)]
 
drm/amd/display: Fix DSC-re-computing
[Why]
This fixes a bug introduced by commit 
c53655545141 ("drm/amd/display: dsc
mst re-compute pbn for changes on hub").
The change caused light-up issues with a second display that required
DSC on some MST docks.
[How]
Use Virtual DPCD for DSC caps in MST case.
[Limitations]
This change only affects MST DSC devices that follow specifications
additional changes are required to check for old MST DSC devices such as
ones which do not check for Virtual DPCD registers.
Reviewed-by: Swapnil Patel <swapnil.patel@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Agustin Gutierrez <agustin.gutierrez@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicholas Susanto [Wed, 24 Apr 2024 17:34:11 +0000 (13:34 -0400)]
 
drm/amd/display: Enable urgent latency adjustments for DCN35
[Why]
Underflow occurs when running Netflix in a 4k144 eDP + 4k60 HDMI FRL
setup. It is caused by latency varying based on the DCFCLK/FCLK state.
[How]
Enable urgent latency adjustment and match the reference to existing
ASIC that also see increased latency at low FCLK.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Nicholas Susanto <nicholas.susanto@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Linus Torvalds [Wed, 8 May 2024 17:39:53 +0000 (10:39 -0700)]
 
Merge tag '6.9-rc7-ksmbd-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
 "Five ksmbd server fixes, all also for stable
   - Three fixes related to SMB3 leases (fixes two xfstests, and a
     locking issue)
   - Unitialized variable fix
   - Socket creation fix when bindv6only is set"
* tag '6.9-rc7-ksmbd-fixes' of git://git.samba.org/ksmbd:
  ksmbd: do not grant v2 lease if parent lease key and epoch are not set
  ksmbd: use rwsem instead of rwlock for lease break
  ksmbd: avoid to send duplicate lease break notifications
  ksmbd: off ipv6only for both ipv4/ipv6 binding
  ksmbd: fix uninitialized symbol 'share' in smb2_tree_connect()
Linus Torvalds [Wed, 8 May 2024 17:33:55 +0000 (10:33 -0700)]
 
Merge tag 'fuse-fixes-6.9-final' of git://git./linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
 "Two one-liner fixes for issues introduced in -rc1"
* tag 'fuse-fixes-6.9-final' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  virtiofs: include a newline in sysfs tag
  fuse: verify zero padding in fuse_backing_map
Linus Torvalds [Wed, 8 May 2024 17:30:13 +0000 (10:30 -0700)]
 
Merge tag 'exfat-for-6.9-rc8' of git://git./linux/kernel/git/linkinjeon/exfat
Pull exfat fixes from Namjae Jeon:
 - Fix xfstests generic/013 test failure with dirsync mount option
 - Initialize the reserved fields of deleted file and stream extension
   dentries to zero
* tag 'exfat-for-6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: zero the reserved fields of file and stream extension dentries
  exfat: fix timing of synchronizing bitmap and inode
Linus Torvalds [Wed, 8 May 2024 17:23:18 +0000 (10:23 -0700)]
 
Merge tag 'bcachefs-2024-05-07.2' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs fixes from Kent Overstreet:
 - Various syzbot fixes; mainly small gaps in validation
 - Fix an integer overflow in fiemap() which was preventing filefrag
   from returning the full list of extents
 - Fix a refcounting bug on the device refcount, turned up by new
   assertions in the development branch
 - Fix a device removal/readd bug; write_super() was repeatedly dropping
   and retaking bch_dev->io_ref references
* tag 'bcachefs-2024-05-07.2' of https://evilpiepirate.org/git/bcachefs:
  bcachefs: Add missing sched_annotate_sleep() in bch2_journal_flush_seq_async()
  bcachefs: Fix race in bch2_write_super()
  bcachefs: BCH_SB_LAYOUT_SIZE_BITS_MAX
  bcachefs: Add missing skcipher_request_set_callback() call
  bcachefs: Fix snapshot_t() usage in bch2_fs_quota_read_inode()
  bcachefs: Fix shift-by-64 in bformat_needs_redo()
  bcachefs: Guard against unknown k.k->type in __bkey_invalid()
  bcachefs: Add missing validation for superblock section clean
  bcachefs: Fix assert in bch2_alloc_v4_invalid()
  bcachefs: fix overflow in fiemap
  bcachefs: Add a better limit for maximum number of buckets
  bcachefs: Fix lifetime issue in device iterator helpers
  bcachefs: Fix bch2_dev_lookup() refcounting
  bcachefs: Initialize bch_write_op->failed in inline data path
  bcachefs: Fix refcount put in sb_field_resize error path
  bcachefs: Inodes need extra padding for varint_decode_fast()
  bcachefs: Fix early error path in bch2_fs_btree_key_cache_exit()
  bcachefs: bucket_pos_to_bp_noerror()
  bcachefs: don't free error pointers
  bcachefs: Fix a scheduler splat in __bch2_next_write_buffer_flush_journal_buf()
Linus Torvalds [Wed, 8 May 2024 17:15:40 +0000 (10:15 -0700)]
 
Merge tag 'soc-fixes-6.9-3' of git://git./linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
 "These are a couple of last minute fixes that came in over the previous
  week, addressing:
   - A pin configuration bug on a qualcomm board that caused issues with
     ethernet and mmc
   - Two minor code fixes for misleading console output in the microchip
     firmware driver
   - A build warning in the sifive cache driver"
* tag 'soc-fixes-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  firmware: microchip: clarify that sizes and addresses are in hex
  firmware: microchip: don't unconditionally print validation success
  arm64: dts: qcom: sa8155p-adp: fix SDHC2 CD pin configuration
  cache: sifive_ccache: Silence unused variable warning
Linus Torvalds [Wed, 8 May 2024 16:37:58 +0000 (09:37 -0700)]
 
Merge tag 'pci-v6.9-fixes-2' of git://git./linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
 - Update kernel-parameters doc to describe "pcie_aspm=off" more
   accurately (Bjorn Helgaas)
 - Restore the parent's (not the child's) ASPM state to the parent
   during resume, which fixes a reboot during resume (Kai-Heng Feng)
* tag 'pci-v6.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/ASPM: Restore parent state to parent, child state to child
  PCI/ASPM: Clarify that pcie_aspm=off means leave ASPM untouched
Jakub Kicinski [Wed, 8 May 2024 15:05:12 +0000 (08:05 -0700)]
 
Merge branch 'rxrpc-miscellaneous-fixes'
David Howells says:
====================
rxrpc: Miscellaneous fixes (part)
Here some miscellaneous fixes for AF_RXRPC:
 (1) Fix the congestion control algorithm to start cwnd at 4 and to not cut
     ssthresh when the peer cuts its rwind size.
 (2) Only transmit a single ACK for all the DATA packets glued together
     into a jumbo packet to reduce the number of ACKs being generated.
====================
Link: https://lore.kernel.org/r/20240503150749.1001323-1-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Howells [Fri, 3 May 2024 15:07:40 +0000 (16:07 +0100)]
 
rxrpc: Only transmit one ACK per jumbo packet received
Only generate one ACK packet for all the subpackets in a jumbo packet.  If
we would like to generate more than one ACK, we prioritise them base on
their reason code, in the order, highest first:
   OutOfSeq > NoSpace > ExceedsWin > Duplicate > Requested > Delay > Idle
For the first four, we reference the lowest offending subpacket; for the
last three, the highest.
This reduces the number of ACKs we end up transmitting to one per UDP
packet transmitted to reduce network loading and packet parsing.
Fixes: 5d7edbc9231e ("rxrpc: Get rid of the Rx ring")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Reviewed-by: Jeffrey Altman <jaltman@auristor.com <mailto:jaltman@auristor.com>>
Link: https://lore.kernel.org/r/20240503150749.1001323-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Howells [Fri, 3 May 2024 15:07:39 +0000 (16:07 +0100)]
 
rxrpc: Fix congestion control algorithm
Make the following fixes to the congestion control algorithm:
 (1) Don't vary the cwnd starting value by the size of RXRPC_TX_SMSS since
     that's currently held constant - set to the size of a jumbo subpacket
     payload so that we can create jumbo packets on the fly.  The current
     code invariably picks 3 as the starting value.
     Further, the starting cwnd needs to be an even number because we ack
     every other packet, so set it to 4.
 (2) Don't cut ssthresh when we see an ACK come from the peer with a
     receive window (rwind) less than ssthresh.  ssthresh keeps track of
     characteristics of the connection whereas rwind may be reduced by the
     peer for any reason - and may be reduced to 0.
Fixes: 1fc4fa2ac93d ("rxrpc: Fix congestion management")
Fixes: 0851115090a3 ("rxrpc: Reduce ssthresh to peer's receive window")
Signed-off-by: David Howells <dhowells@redhat.com>
Suggested-by: Simon Wilkinson <sxw@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Reviewed-by: Jeffrey Altman <jaltman@auristor.com <mailto:jaltman@auristor.com>>
Link: https://lore.kernel.org/r/20240503150749.1001323-2-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Tue, 7 May 2024 11:30:33 +0000 (14:30 +0300)]
 
selftests: test_bridge_neigh_suppress.sh: Fix failures due to duplicate MAC
When creating the topology for the test, three veth pairs are created in
the initial network namespace before being moved to one of the network
namespaces created by the test.
On systems where systemd-udev uses MACAddressPolicy=persistent (default
since systemd version 242), this will result in some net devices having
the same MAC address since they were created with the same name in the
initial network namespace. In turn, this leads to arping / ndisc6
failing since packets are dropped by the bridge's loopback filter.
Fix by creating each net device in the correct network namespace instead
of moving it there from the initial network namespace.
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20240426074015.251854d4@kernel.org/
Fixes: 7648ac72dcd7 ("selftests: net: Add bridge neighbor suppression test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20240507113033.1732534-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sagi Grimberg [Wed, 8 May 2024 07:53:06 +0000 (10:53 +0300)]
 
nvmet-rdma: fix possible bad dereference when freeing rsps
It is possible that the host connected and saw a cm established
event and started sending nvme capsules on the qp, however the
ctrl did not yet see an established event. This is why the
rsp_wait_list exists (for async handling of these cmds, we move
them to a pending list).
Furthermore, it is possible that the ctrl cm times out, resulting
in a connect-error cm event. in this case we hit a bad deref [1]
because in nvmet_rdma_free_rsps we assume that all the responses
are in the free list.
We are freeing the cmds array anyways, so don't even bother to
remove the rsp from the free_list. It is also guaranteed that we
are not racing anything when we are releasing the queue so no
other context accessing this array should be running.
[1]:
--
Workqueue: nvmet-free-wq nvmet_rdma_free_queue_work [nvmet_rdma]
[...]
pc : nvmet_rdma_free_rsps+0x78/0xb8 [nvmet_rdma]
lr : nvmet_rdma_free_queue_work+0x88/0x120 [nvmet_rdma]
 Call trace:
 nvmet_rdma_free_rsps+0x78/0xb8 [nvmet_rdma]
 nvmet_rdma_free_queue_work+0x88/0x120 [nvmet_rdma]
 process_one_work+0x1ec/0x4a0
 worker_thread+0x48/0x490
 kthread+0x158/0x160
 ret_from_fork+0x10/0x18
--
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Dan Carpenter [Wed, 8 May 2024 07:43:04 +0000 (10:43 +0300)]
 
nvmet: prevent sprintf() overflow in nvmet_subsys_nsid_exists()
The nsid value is a u32 that comes from nvmet_req_find_ns().  It's
endian data and we're on an error path and both of those raise red
flags.  So let's make this safer.
1) Make the buffer large enough for any u32.
2) Remove the unnecessary initialization.
3) Use snprintf() instead of sprintf() for even more safety.
4) The sprintf() function returns the number of bytes printed, not
   counting the NUL terminator. It is impossible for the return value to
   be <= 0 so delete that.
Fixes: 505363957fad ("nvmet: fix nvme status code when namespace is disabled")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Shigeru Yoshida [Mon, 6 May 2024 14:11:29 +0000 (23:11 +0900)]
 
ipv6: Fix potential uninit-value access in __ip6_make_skb()
As it was done in commit 
fc1092f51567 ("ipv4: Fix uninit-value access in
__ip_make_skb()") for IPv4, check FLOWI_FLAG_KNOWN_NH on fl6->flowi6_flags
instead of testing HDRINCL on the socket to avoid a race condition which
causes uninit-value access.
Fixes: ea30388baebc ("ipv6: Fix an uninit variable access bug in __ip6_make_skb()")
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gregor Herburger [Mon, 6 May 2024 06:24:33 +0000 (08:24 +0200)]
 
net: phy: marvell-88q2xxx: add support for Rev B1 and B2
Different revisions of the Marvell 88q2xxx phy needs different init
sequences.
Add init sequence for Rev B1 and Rev B2. Rev B2 init sequence skips one
register write.
Tested-by: Dimitri Fedrau <dima.fedrau@gmail.com>
Signed-off-by: Gregor Herburger <gregor.herburger@ew.tq-group.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vincent Duvert [Sun, 5 May 2024 18:54:57 +0000 (11:54 -0700)]
 
appletalk: Improve handling of broadcast packets
When a broadcast AppleTalk packet is received, prefer queuing it on the
socket whose address matches the address of the interface that received
the packet (and is listening on the correct port). Userspace
applications that handle such packets will usually send a response on
the same socket that received the packet; this fix allows the response
to be sent on the correct interface.
If a socket matching the interface's address is not found, an arbitrary
socket listening on the correct port will be used, if any. This matches
the implementation's previous behavior.
Fixes atalkd's responses to network information requests when multiple
network interfaces are configured to use AppleTalk.
Link: https://lore.kernel.org/netdev/20200722113752.1218-2-vincent.ldev@duvert.net/
Link: https://gist.github.com/VinDuv/4db433b6dce39d51a5b7847ee749b2a4
Signed-off-by: Vincent Duvert <vincent.ldev@duvert.net>
Signed-off-by: Doug Brown <doug@schmorgal.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Sun, 5 May 2024 18:42:38 +0000 (20:42 +0200)]
 
net: bridge: fix corrupted ethernet header on multicast-to-unicast
The change from skb_copy to pskb_copy unfortunately changed the data
copying to omit the ethernet header, since it was pulled before reaching
this point. Fix this by calling __skb_push/pull around pskb_copy.
Fixes: 59c878cbcdd8 ("net: bridge: fix multicast-to-unicast with fraglist GSO")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Levi Yun [Mon, 6 May 2024 04:10:59 +0000 (05:10 +0100)]
 
timers/migration: Prevent out of bounds access on failure
When tmigr_setup_groups() fails the level 0 group allocation, then the
cleanup derefences index -1 of the local stack array.
Prevent this by checking the loop condition first.
Fixes: 7ee988770326 ("timers: Implement the hierarchical pull model")
Signed-off-by: Levi Yun <ppbuk5246@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Link: https://lore.kernel.org/r/20240506041059.86877-1-ppbuk5246@gmail.com
Brian Foster [Thu, 25 Apr 2024 10:44:00 +0000 (06:44 -0400)]
 
virtiofs: include a newline in sysfs tag
The internal tag string doesn't contain a newline. Append one when
emitting the tag via sysfs.
[Stefan] Orthogonal to the newline issue, sysfs_emit(buf, "%s", fs->tag) is
needed to prevent format string injection.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Fixes: a8f62f50b4e4 ("virtiofs: export filesystem tags through sysfs")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Gregory Detal [Mon, 6 May 2024 15:35:28 +0000 (17:35 +0200)]
 
mptcp: only allow set existing scheduler for net.mptcp.scheduler
The current behavior is to accept any strings as inputs, this results in
an inconsistent result where an unexisting scheduler can be set:
  # sysctl -w net.mptcp.scheduler=notdefault
  net.mptcp.scheduler = notdefault
This patch changes this behavior by checking for existing scheduler
before accepting the input.
Fixes: e3b2870b6d22 ("mptcp: add a new sysctl scheduler")
Cc: stable@vger.kernel.org
Signed-off-by: Gregory Detal <gregory.detal@gmail.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Tested-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240506-upstream-net-20240506-mptcp-sched-exist-v1-1-2ed1529e521e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tetsuo Handa [Sun, 5 May 2024 10:36:49 +0000 (19:36 +0900)]
 
nfc: nci: Fix kcov check in nci_rx_work()
Commit 
7e8cdc97148c ("nfc: Add KCOV annotations") added
kcov_remote_start_common()/kcov_remote_stop() pair into nci_rx_work(),
with an assumption that kcov_remote_stop() is called upon continue of
the for loop. But commit 
d24b03535e5e ("nfc: nci: Fix uninit-value in
nci_dev_up and nci_ntf_packet") forgot to call kcov_remote_stop() before
break of the for loop.
Reported-by: syzbot <syzbot+0438378d6f157baae1a2@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=0438378d6f157baae1a2
Fixes: d24b03535e5e ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet")
Suggested-by: Andrey Konovalov <andreyknvl@gmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/6d10f829-5a0c-405a-b39a-d7266f3a1a0b@I-love.SAKURA.ne.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Bonzini [Tue, 7 May 2024 17:01:39 +0000 (13:01 -0400)]
 
Merge tag 'kvm-s390-master-6.9-1' of git://git./linux/kernel/git/kvms390/linux into HEAD
KVM: s390: Fix for 6.9
Fix wild read on capability check.
Douglas Anderson [Thu, 2 May 2024 22:32:35 +0000 (15:32 -0700)]
 
drm/connector: Add \n to message about demoting connector force-probes
The debug print clearly lacks a \n at the end. Add it.
Fixes: 8f86c82aba8b ("drm/connector: demote connector force-probes for non-master clients")
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502153234.1.I2052f01c8d209d9ae9c300b87c6e4f60bd3cc99e@changeid
Bartosz Golaszewski [Tue, 7 May 2024 12:13:46 +0000 (14:13 +0200)]
 
gpiolib: fix the speed of descriptor label setting with SRCU
Commit 
1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
caused a massive drop in performance of requesting GPIO lines due to the
call to synchronize_srcu() on each label change. Rework the code to not
wait until all read-only users are done with reading the label but
instead atomically replace the label pointer and schedule its release
after all read-only critical sections are done.
To that end wrap the descriptor label in a struct that also contains the
rcu_head struct required for deferring tasks using call_srcu() and stop
using kstrdup_const() as we're required to allocate memory anyway. Just
allocate enough for the label string and rcu_head in one go.
Reported-by: Neil Armstrong <neil.armstrong@linaro.org>
Closes: https://lore.kernel.org/linux-gpio/CAMRc=Mfig2oooDQYTqo23W3PXSdzhVO4p=G4+P8y1ppBOrkrJQ@mail.gmail.com/
Fixes: 1f2bcb8c8ccd ("gpio: protect the descriptor label with SRCU")
Suggested-by: "Paul E. McKenney" <paulmck@kernel.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD
Acked-by: "Paul E. McKenney" <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20240507121346.16969-1-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>