ptp: vclock: use mutex to fix "sleep on atomic" bug
author Íñigo Huguet <ihuguet@redhat.com>
Tue, 21 Feb 2023 13:06:16 +0000 (14:06 +0100)
committer Jakub Kicinski <kuba@kernel.org>
Thu, 23 Feb 2023 05:23:48 +0000 (21:23 -0800)
commit 67d93ffc0f3c47094750bde6d62e7c5765dc47a6
tree a2e688d8de0990c9bd3497512f31b3925ff7af22
parent 5b7c4cabbb65f5c469464da6c5f614cbd7f730f2

vclocks were using spinlocks to protect access to their timecounter and
cyclecounter. Access to timecounter/cyclecounter is backed by the same
driver callbacks that are used for non-virtual PHCs, but the use of the
spinlock imposes a new limitation that didn't exist previously: the
callbacks now run in atomic context, so they mustn't sleep.

Some drivers, like sfc or ice, may sleep in these callbacks, causing
errors like "BUG: scheduling while atomic: ptp5/25223/0x00000002".

Fix it by replacing the vclock's spinlock with a mutex. This fixes the
mentioned bug and doesn't introduce longer delays.
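
As a rough user-space analogy (not the kernel change itself), the sketch
below shows why a sleeping lock suits this case: the holder may block
inside the callback while other readers simply wait their turn. All names
here (demo_vclock, demo_read_cycles, demo_gettime, demo_run) are
hypothetical, and a pthread mutex stands in for the kernel's mutex.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

#define DEMO_READERS 4   /* parallel readers, loosely like 4 phc2sys runs */
#define DEMO_READS   10  /* reads per reader */

/* Hypothetical stand-in for a vclock: a lock plus the state it protects. */
struct demo_vclock {
	pthread_mutex_t lock;  /* sleeping lock: the holder is allowed to block */
	uint64_t cycles;       /* stands in for timecounter/cyclecounter state */
};

/* Hypothetical driver callback that sleeps, as some real drivers may do. */
static uint64_t demo_read_cycles(struct demo_vclock *vc)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */

	nanosleep(&ts, NULL);
	return ++vc->cycles;
}

/* Read the clock under the mutex; sleeping in the callback is legal here. */
static uint64_t demo_gettime(struct demo_vclock *vc)
{
	uint64_t val;

	pthread_mutex_lock(&vc->lock);
	val = demo_read_cycles(vc);
	pthread_mutex_unlock(&vc->lock);
	return val;
}

static void *demo_reader(void *arg)
{
	struct demo_vclock *vc = arg;

	for (int i = 0; i < DEMO_READS; i++)
		demo_gettime(vc);
	return NULL;
}

/* Run the readers in parallel and return the final counter value. */
static uint64_t demo_run(void)
{
	struct demo_vclock vc = { .cycles = 0 };
	pthread_t threads[DEMO_READERS];
	uint64_t result;

	pthread_mutex_init(&vc.lock, NULL);
	for (int i = 0; i < DEMO_READERS; i++) {
		int rc = pthread_create(&threads[i], NULL, demo_reader, &vc);

		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < DEMO_READERS; i++)
		pthread_join(threads[i], NULL);
	result = vc.cycles;
	pthread_mutex_destroy(&vc.lock);
	return result;
}
```

Calling demo_run() from a main() should return DEMO_READERS * DEMO_READS,
since every increment happens under the lock. A spinlock would also keep
the count correct in user space, but in the kernel it would put the
sleeping callback in atomic context, which is the bug described above.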

I've tested synchronizing various combinations of clocks:
- vclock->sysclock
- sysclock->vclock
- vclock->vclock
- hardware PHC in different NIC -> vclock
- 4 vclocks with 4 parallel phc2sys processes, with lockdep enabled

In all cases, the delays reported by phc2sys are in the same range of
values as before applying the patch.

Link: https://lore.kernel.org/netdev/69d0ff33-bd32-6aa5-d36c-fbdc3c01337c@redhat.com/
Fixes: 5d43f951b1ac ("ptp: add ptp virtual clock driver framework")
Reported-by: Yalin Li <yalli@redhat.com>
Suggested-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://lore.kernel.org/r/20230221130616.21837-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/ptp/ptp_private.h
drivers/ptp/ptp_vclock.c