habanalabs: define soft-reset as inference op
author Oded Gabbay <ogabbay@kernel.org>
Thu, 30 Sep 2021 08:36:07 +0000 (11:36 +0300)
committer Oded Gabbay <ogabbay@kernel.org>
Mon, 18 Oct 2021 09:05:46 +0000 (12:05 +0300)
commit a00f1f571e50eb33c5b89db8ac7cd2d684da2943
tree b5718d85181ba8502e5391d900405c6cd5ff97df
parent dd08335fb909e62bd290117f34490ef4e577b554

Soft-reset is the procedure where we reset only the compute/DMA engines
of the device, without requiring the current user-space process to
release the device.

This type of reset can happen when a TDR event occurs (a workload got
stuck) or on a root request through sysfs.

This is only relevant for inference ASICs, as there is no real-world
use-case for it in training, because training runs on multiple
devices.

In addition, we also do (in certain ASICs) a reset upon device release.
That reset uses the same code as the soft-reset.

Therefore, to differentiate between the two resets, rename the
soft-reset support to "inference soft-reset", which makes the code
more self-explanatory.
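
The distinction described above can be sketched roughly as follows. This is
only an illustration of the idea, not the driver's code: every type, field
and function name in it (reset_request, asic_caps,
engines_only_reset_allowed) is hypothetical. It assumes that both resets
share one engines-only code path, with each one gated by its own capability.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not the driver's definitions. */
enum reset_request {
	RESET_REQ_TDR,            /* a workload got stuck */
	RESET_REQ_SYSFS_ROOT,     /* root asked for a soft-reset through sysfs */
	RESET_REQ_DEVICE_RELEASE, /* the user process released the device */
};

struct asic_caps {
	bool inference_soft_reset;      /* inference ASIC, soft-reset allowed */
	bool reset_upon_device_release; /* ASIC is reset when it is released */
};

/*
 * Both resets run the same compute/DMA-only reset code, but each one is
 * gated by its own capability, so the two cases stay distinguishable.
 */
static bool engines_only_reset_allowed(const struct asic_caps *caps,
				       enum reset_request req)
{
	switch (req) {
	case RESET_REQ_TDR:
	case RESET_REQ_SYSFS_ROOT:
		/* Inference soft-reset: the user process keeps the device. */
		return caps->inference_soft_reset;
	case RESET_REQ_DEVICE_RELEASE:
		/* Reset upon release: unrelated to inference soft-reset. */
		return caps->reset_upon_device_release;
	}
	return false;
}
```

With separate flags, an ASIC that only resets upon release does not appear
to accept a soft-reset request from TDR or sysfs.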

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
drivers/misc/habanalabs/common/device.c
drivers/misc/habanalabs/common/habanalabs.h
drivers/misc/habanalabs/common/sysfs.c
drivers/misc/habanalabs/goya/goya.c