scsi: qedf: Add check to synchronize abort and flush
A race condition was observed between qedf_cleanup_fcport() and
qedf_process_error_detect()->qedf_initiate_abts():
[2069091.203145] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[2069091.213100] IP: [<ffffffffc0666cc6>] qedf_process_error_detect+0x96/0x130 [qedf]
[2069091.223391] PGD 1943049067 PUD 194304e067 PMD 0
[2069091.233420] Oops: 0000 [#1] SMP
[2069091.361820] CPU: 1 PID: 14751 Comm: kworker/1:46 Kdump: loaded Tainted: P OE ------------ 3.10.0-1160.25.1.el7.x86_64 #1
[2069091.388474] Hardware name: HPE Synergy 480 Gen10/Synergy 480 Gen10 Compute Module, BIOS I42 04/08/2020
[2069091.402148] Workqueue: qedf_io_wq qedf_fp_io_handler [qedf]
[2069091.415780] task: ffff9bb9f5190000 ti: ffff9bacaef9c000 task.ti: ffff9bacaef9c000
[2069091.429590] RIP: 0010:[<ffffffffc0666cc6>] [<ffffffffc0666cc6>] qedf_process_error_detect+0x96/0x130 [qedf]
[2069091.443666] RSP: 0018:ffff9bacaef9fdb8 EFLAGS: 00010246
[2069091.457692] RAX: 0000000000000000 RBX: ffff9bbbbbfb18a0 RCX: ffffffffc0672310
[2069091.471997] RDX: 00000000000005de RSI: ffffffffc066e7f0 RDI: ffff9beb3f4538d8
[2069091.486130] RBP: ffff9bacaef9fdd8 R08: 0000000000006000 R09: 0000000000006000
[2069091.500321] R10: 0000000000001551 R11: ffffb582996ffff8 R12: ffffb5829b39cc18
[2069091.514779] R13: ffff9badab380c28 R14: ffffd5827f643900 R15: 0000000000000040
[2069091.529472] FS: 0000000000000000(0000) GS:ffff9beb3f440000(0000) knlGS:0000000000000000
[2069091.543926] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2069091.558942] CR2: 0000000000000030 CR3: 000000193b9a2000 CR4: 00000000007607e0
[2069091.573424] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[2069091.587876] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[2069091.602007] PKRU: 00000000
[2069091.616010] Call Trace:
[2069091.629902] [<ffffffffc0663969>] qedf_process_cqe+0x109/0x2e0 [qedf]
[2069091.643941] [<ffffffffc0663b66>] qedf_fp_io_handler+0x26/0x60 [qedf]
[2069091.657948] [<ffffffff85ebddcf>] process_one_work+0x17f/0x440
[2069091.672111] [<ffffffff85ebeee6>] worker_thread+0x126/0x3c0
[2069091.686057] [<ffffffff85ebedc0>] ? manage_workers.isra.26+0x2a0/0x2a0
[2069091.700033] [<ffffffff85ec5da1>] kthread+0xd1/0xe0
[2069091.713891] [<ffffffff85ec5cd0>] ? insert_kthread_work+0x40/0x40
Add a check in qedf_process_error_detect(). When a flush is active, let the
commands be completed from the cleanup context.
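The idea can be illustrated with a minimal userspace sketch. This is not the driver code: the types and names below (sketch_port, sketch_req, sketch_begin_flush(), sketch_error_detect()) are hypothetical stand-ins, and a C11 atomic flag is assumed in place of whatever per-port state the driver actually tests. The sketch only shows the error-detect path declining to start an abort once the cleanup path has marked the port as flushing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's structures; not the real qedf types. */
struct sketch_port {
	atomic_bool flushing;	/* set by the cleanup path while it flushes I/O */
};

struct sketch_req {
	struct sketch_port *port;
	bool abort_issued;	/* records whether an abort was started */
};

enum sketch_result {
	SKETCH_ABORT_SENT,	/* error-detect path started an abort */
	SKETCH_LEFT_TO_FLUSH	/* request left for the cleanup context */
};

/* Cleanup side: mark the port as flushing before tearing down requests. */
static void sketch_begin_flush(struct sketch_port *port)
{
	atomic_store(&port->flushing, true);
}

/* Error-detect side: skip the abort while a flush is active on the port. */
static enum sketch_result sketch_error_detect(struct sketch_req *req)
{
	assert(req != NULL && req->port != NULL);

	if (atomic_load(&req->port->flushing))
		return SKETCH_LEFT_TO_FLUSH;	/* cleanup context completes it */

	req->abort_issued = true;	/* stands in for qedf_initiate_abts() */
	return SKETCH_ABORT_SENT;
}
```

In this sketch a request that hits the error path before sketch_begin_flush() is aborted as usual, while one that hits it afterwards is left untouched for the cleanup context. A flag test on its own does not order the two paths against each other in every interleaving, so it should be read as an illustration of the check rather than a complete synchronization scheme.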
Link: https://lore.kernel.org/r/20210624171802.598-1-jhasan@marvell.com
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>