habanalabs: fail collective wait when not supported
author Ofir Bitton <obitton@habana.ai>
Thu, 2 Sep 2021 06:47:53 +0000 (09:47 +0300)
committer Oded Gabbay <ogabbay@kernel.org>
Tue, 14 Sep 2021 12:00:04 +0000 (15:00 +0300)
The collective wait operation is only needed when NIC ports are
available. All ports are currently disabled in the upstream driver,
so reject a collective wait CS submission when no port is enabled.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
drivers/misc/habanalabs/common/command_submission.c

index deb080830ecb718467d37eb72f099396997a3cb0..5b7de857fbc1a7960f42ef098d5f796457ffc8a8 100644
@@ -1995,6 +1995,15 @@ static int cs_ioctl_signal_wait(struct hl_fpriv *hpriv, enum hl_cs_type cs_type,
                        goto free_cs_chunk_array;
                }
 
+               if (!hdev->nic_ports_mask) {
+                       atomic64_inc(&ctx->cs_counters.validation_drop_cnt);
+                       atomic64_inc(&cntr->validation_drop_cnt);
+                       dev_err(hdev->dev,
+                               "Collective operations not supported when NIC ports are disabled");
+                       rc = -EINVAL;
+                       goto free_cs_chunk_array;
+               }
+
                collective_engine_id = chunk->collective_engine_id;
        }
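
The guard added in the hunk above can be tried outside the kernel with a minimal user-space sketch. Everything named `sketch_*` here is hypothetical and only stands in for the driver's `hl_device`, context, and counter structures. Plain increments replace `atomic64_inc()` and `fprintf()` replaces `dev_err()`, so this illustrates the control flow only and is not the driver's implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's drop counters. */
struct sketch_counters {
	uint64_t validation_drop_cnt;
};

/* Hypothetical stand-in for the device and context state the check reads. */
struct sketch_device {
	uint64_t nic_ports_mask;         /* bitmask of enabled NIC ports */
	struct sketch_counters ctx_cntr; /* per-context counters */
	struct sketch_counters dev_cntr; /* device-wide aggregate counters */
};

/*
 * Reject a collective wait when no NIC port is enabled.
 * Returns 0 if the request may proceed, -EINVAL otherwise.
 * The kernel code uses atomic64_inc(); plain increments are an
 * assumption that only holds for this single-threaded sketch.
 */
static int sketch_validate_collective_wait(struct sketch_device *dev)
{
	if (!dev->nic_ports_mask) {
		dev->ctx_cntr.validation_drop_cnt++;
		dev->dev_cntr.validation_drop_cnt++;
		fprintf(stderr,
			"Collective operations not supported when NIC ports are disabled\n");
		return -EINVAL;
	}

	return 0;
}
```

With `nic_ports_mask` set to zero the helper returns `-EINVAL` and bumps both counters once. With any bit set it returns 0 and leaves the counters untouched.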