RDMA/mlx5: Fix crash when unbind multiport slave
authorMaor Gottlieb <maorg@nvidia.com>
Tue, 10 Aug 2021 09:25:11 +0000 (12:25 +0300)
committerJason Gunthorpe <jgg@nvidia.com>
Thu, 19 Aug 2021 12:59:20 +0000 (09:59 -0300)
Fix the below crash when deleting a slave from the unaffiliated list
twice. First time when the slave is bound to the master and the second
when the slave is unloaded.

Fix it by checking if slave is unaffiliated (doesn't have ib device)
before removing from the list.

  RIP: 0010:mlx5r_mp_remove+0x4e/0xa0 [mlx5_ib]
  Call Trace:
   auxiliary_bus_remove+0x18/0x30
   __device_release_driver+0x177/x220
   device_release_driver+0x24/0x30
   bus_remove_device+0xd8/0x140
   device_del+0x18a/0x3e0
   mlx5_rescan_drivers_locked+0xa9/0x210 [mlx5_core]
   mlx5_unregister_device+0x34/0x60 [mlx5_core]
   mlx5_uninit_one+0x32/0x100 [mlx5_core]
   remove_one+0x6e/0xe0 [mlx5_core]
   pci_device_remove+0x36/0xa0
   __device_release_driver+0x177/0x220
   device_driver_detach+0x3c/0xa0
   unbind_store+0x113/0x130
   kernfs_fop_write_iter+0x110/0x1a0
   new_sync_write+0x116/0x1a0
   vfs_write+0x1ba/0x260
   ksys_write+0x5f/0xe0
   do_syscall_64+0x3d/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 93f8244431ad ("RDMA/mlx5: Convert mlx5_ib to use auxiliary bus")
Link: https://lore.kernel.org/r/17ec98989b0ba88f7adfbad68eb20bce8d567b44.1628587493.git.leonro@nvidia.com
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
drivers/infiniband/hw/mlx5/main.c

index 094c976b1eedf73298a1f13efa6a6e3d869ce758..2507051f7b897a00bb363af5a3e60b455f34582a 100644 (file)
@@ -4454,7 +4454,8 @@ static void mlx5r_mp_remove(struct auxiliary_device *adev)
        mutex_lock(&mlx5_ib_multiport_mutex);
        if (mpi->ibdev)
                mlx5_ib_unbind_slave_port(mpi->ibdev, mpi);
-       list_del(&mpi->list);
+       else
+               list_del(&mpi->list);
        mutex_unlock(&mlx5_ib_multiport_mutex);
        kfree(mpi);
 }