habanalabs: flush EQ workers in hard reset
author Oded Gabbay <oded.gabbay@gmail.com>
Sun, 17 Nov 2019 15:41:57 +0000 (17:41 +0200)
committer Oded Gabbay <oded.gabbay@gmail.com>
Thu, 21 Nov 2019 09:35:47 +0000 (11:35 +0200)

During hard-reset, the driver can receive multiple events from the H/W. For
each event, the driver queues work on the event queue (EQ) workqueue to
handle it. Some of these handlers read or write device registers.

In case of hard-reset, we must prevent reads/writes to the registers during
the reset operation because the device might get stuck if that happens.

Therefore, flush the EQ workers before resetting the device (in hard-reset
only). Additional events won't arrive as we synced and disabled the
interrupts.
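
The ordering described above (stop new events, wait for in-flight handlers,
then reset) can be illustrated with a small userspace analogue. This is only
a sketch using POSIX threads, not driver code: struct fake_device,
eq_handler(), flush_handlers() and simulate_hard_reset() are hypothetical
names made up for the illustration, and the wait on a pending counter merely
stands in for what flush_workqueue() does for hdev->eq_wq in the kernel.

```c
#include <pthread.h>
#include <stdbool.h>

#define NUM_EVENTS 4	/* arbitrary number of simulated H/W events */

/* Hypothetical stand-in for the device state; not the driver's hl_device. */
struct fake_device {
	pthread_mutex_t lock;
	pthread_cond_t idle;	/* signalled when pending drops to zero */
	int pending;		/* handlers queued or still running */
	bool in_reset;		/* true while the simulated reset runs */
	int unsafe_accesses;	/* "register" accesses seen during reset */
};

/* One simulated event handler; the flag check stands in for register I/O. */
static void *eq_handler(void *arg)
{
	struct fake_device *dev = arg;

	pthread_mutex_lock(&dev->lock);
	if (dev->in_reset)
		dev->unsafe_accesses++;
	if (--dev->pending == 0)
		pthread_cond_broadcast(&dev->idle);
	pthread_mutex_unlock(&dev->lock);
	return NULL;
}

/* Block until every handler that was already started has finished. */
static void flush_handlers(struct fake_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	while (dev->pending > 0)
		pthread_cond_wait(&dev->idle, &dev->lock);
	pthread_mutex_unlock(&dev->lock);
}

/* Returns the number of handler accesses that overlapped the reset. */
int simulate_hard_reset(void)
{
	struct fake_device dev = { .pending = 0, .in_reset = false,
				   .unsafe_accesses = 0 };
	pthread_t threads[NUM_EVENTS];
	int started = 0, result;

	pthread_mutex_init(&dev.lock, NULL);
	pthread_cond_init(&dev.idle, NULL);

	for (int i = 0; i < NUM_EVENTS; i++) {
		pthread_mutex_lock(&dev.lock);
		dev.pending++;
		pthread_mutex_unlock(&dev.lock);
		if (pthread_create(&threads[started], NULL, eq_handler, &dev)) {
			/* Handler never started, so undo the accounting. */
			pthread_mutex_lock(&dev.lock);
			dev.pending--;
			pthread_mutex_unlock(&dev.lock);
			break;
		}
		started++;
	}

	/* No new events are generated past this point (the analogue of
	 * interrupts being synced and disabled), so the flush is final.
	 */
	flush_handlers(&dev);

	pthread_mutex_lock(&dev.lock);
	dev.in_reset = true;	/* the reset itself would happen here */
	pthread_mutex_unlock(&dev.lock);

	for (int i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	pthread_mutex_lock(&dev.lock);
	dev.in_reset = false;
	pthread_mutex_unlock(&dev.lock);

	result = dev.unsafe_accesses;
	pthread_cond_destroy(&dev.idle);
	pthread_mutex_destroy(&dev.lock);
	return result;
}
```

Calling simulate_hard_reset() from a main() and building with -pthread
should return 0, because every handler completes before in_reset is set.
Removing the flush_handlers() call makes a non-zero result possible, which
is the race this patch closes.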

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
drivers/misc/habanalabs/device.c

index 80205d8584ce60a6f8ed20f6e6dab3d56e5ee423..b155e95490761271262d46c4c46adface12c561a 100644
@@ -887,13 +887,19 @@ again:
        /* Go over all the queues, release all CS and their jobs */
        hl_cs_rollback_all(hdev);
 
-       /* Kill processes here after CS rollback. This is because the process
-        * can't really exit until all its CSs are done, which is what we
-        * do in cs rollback
-        */
-       if (hard_reset)
+       if (hard_reset) {
+               /* Kill processes here after CS rollback. This is because the
+                * process can't really exit until all its CSs are done, which
+                * is what we do in cs rollback
+                */
                device_kill_open_processes(hdev);
 
+               /* Flush the Event queue workers to make sure no other thread is
+                * reading or writing to registers during the reset
+                */
+               flush_workqueue(hdev->eq_wq);
+       }
+
        /* Release kernel context */
        if ((hard_reset) && (hl_ctx_put(hdev->kernel_ctx) == 1))
                hdev->kernel_ctx = NULL;