io_uring: Fix sqpoll utilization check racing with dying sqpoll
Commit
3fcb9d17206e ("io_uring/sqpoll: statistics of the true
utilization of sq threads"), currently in Jens for-next branch, peeks at
io_sq_data->thread to report utilization statistics. But, If
io_uring_show_fdinfo races with sqpoll terminating, even though we hold
the ctx lock, sqd->thread might be NULL and we hit the Oops below.
Note that we could technically just protect the getrusage() call and the
sq total/work time calculations. But showing some sq
information (pid/cpu) and not other information (utilization) is more
confusing than not reporting anything, IMO. So let's hide it all if we
happen to race with a dying sqpoll.
This can be triggered consistently in my vm setup running
sqpoll-cancel-hang.t in a loop.
BUG: kernel NULL pointer dereference, address:
00000000000007b0
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 16587 Comm: systemd-coredum Not tainted
6.8.0-rc3-g3fcb9d17206e-dirty #69
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 2/2/2022
RIP: 0010:getrusage+0x21/0x3e0
Code: 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 d1 48 89 e5 41 57 41 56 41 55 41 54 49 89 fe 41 52 53 48 89 d3 48 83 ec 30 <4c> 8b a7 b0 07 00 00 48 8d 7a 08 65 48 8b 04 25 28 00 00 00 48 89
RSP: 0018:
ffffa166c671bb80 EFLAGS:
00010282
RAX:
00000000000040ca RBX:
ffffa166c671bc60 RCX:
ffffa166c671bc60
RDX:
ffffa166c671bc60 RSI:
0000000000000000 RDI:
0000000000000000
RBP:
ffffa166c671bbe0 R08:
ffff9448cc3930c0 R09:
0000000000000000
R10:
ffffa166c671bd50 R11:
ffffffff9ee89260 R12:
0000000000000000
R13:
ffff9448ce099480 R14:
0000000000000000 R15:
ffff9448cff5b000
FS:
00007f786e225900(0000) GS:
ffff94493bc00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00000000000007b0 CR3:
000000010d39c000 CR4:
0000000000750ef0
PKRU:
55555554
Call Trace:
<TASK>
? __die_body+0x1a/0x60
? page_fault_oops+0x154/0x440
? srso_alias_return_thunk+0x5/0xfbef5
? do_user_addr_fault+0x174/0x7c0
? srso_alias_return_thunk+0x5/0xfbef5
? exc_page_fault+0x63/0x140
? asm_exc_page_fault+0x22/0x30
? getrusage+0x21/0x3e0
? seq_printf+0x4e/0x70
io_uring_show_fdinfo+0x9db/0xa10
? srso_alias_return_thunk+0x5/0xfbef5
? vsnprintf+0x101/0x4d0
? srso_alias_return_thunk+0x5/0xfbef5
? seq_vprintf+0x34/0x50
? srso_alias_return_thunk+0x5/0xfbef5
? seq_printf+0x4e/0x70
? seq_show+0x16b/0x1d0
? __pfx_io_uring_show_fdinfo+0x10/0x10
seq_show+0x16b/0x1d0
seq_read_iter+0xd7/0x440
seq_read+0x102/0x140
vfs_read+0xae/0x320
? srso_alias_return_thunk+0x5/0xfbef5
? __do_sys_newfstat+0x35/0x60
ksys_read+0xa5/0xe0
do_syscall_64+0x50/0x110
entry_SYSCALL_64_after_hwframe+0x6e/0x76
RIP: 0033:0x7f786ec1db4d
Code: e8 46 e3 01 00 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 80 3d d9 ce 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 5b c3 66 2e 0f 1f 84 00 00 00 00 00 48 83 ec
RSP: 002b:
00007ffcb361a4b8 EFLAGS:
00000246 ORIG_RAX:
0000000000000000
RAX:
ffffffffffffffda RBX:
000055a4c8fe42f0 RCX:
00007f786ec1db4d
RDX:
0000000000000400 RSI:
000055a4c8fe48a0 RDI:
0000000000000006
RBP:
00007f786ecfb0b0 R08:
00007f786ecfb2a8 R09:
0000000000000001
R10:
0000000000000000 R11:
0000000000000246 R12:
00007f786ecfaf60
R13:
000055a4c8fe42f0 R14:
0000000000000000 R15:
00007ffcb361a628
</TASK>
Modules linked in:
CR2:
00000000000007b0
---[ end trace
0000000000000000 ]---
RIP: 0010:getrusage+0x21/0x3e0
Code: 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 d1 48 89 e5 41 57 41 56 41 55 41 54 49 89 fe 41 52 53 48 89 d3 48 83 ec 30 <4c> 8b a7 b0 07 00 00 48 8d 7a 08 65 48 8b 04 25 28 00 00 00 48 89
RSP: 0018:
ffffa166c671bb80 EFLAGS:
00010282
RAX:
00000000000040ca RBX:
ffffa166c671bc60 RCX:
ffffa166c671bc60
RDX:
ffffa166c671bc60 RSI:
0000000000000000 RDI:
0000000000000000
RBP:
ffffa166c671bbe0 R08:
ffff9448cc3930c0 R09:
0000000000000000
R10:
ffffa166c671bd50 R11:
ffffffff9ee89260 R12:
0000000000000000
R13:
ffff9448ce099480 R14:
0000000000000000 R15:
ffff9448cff5b000
FS:
00007f786e225900(0000) GS:
ffff94493bc00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00000000000007b0 CR3:
000000010d39c000 CR4:
0000000000750ef0
PKRU:
55555554
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x1ce00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Fixes: 3fcb9d17206e ("io_uring/sqpoll: statistics of the true utilization of sq threads")
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20240309003256.358-1-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>