From: Ohad Sharabi Date: Mon, 20 Dec 2021 11:30:35 +0000 (+0200) Subject: habanalabs: handle skip multi-CS if handling not done X-Git-Url: http://git.maquefel.me/?a=commitdiff_plain;h=60bf3bfb5a37965fc33fa00f19a2074dd48077c5;p=linux.git habanalabs: handle skip multi-CS if handling not done This patch fixes issue in which we have timeout for multi-CS although the CS in the list actually completed. Example scenario (the two threads marked as WAIT for the thread that handles the wait_for_multi_cs and CMPL as the thread that signal completion for both CS and multi-CS): 1. Submit CS with sequence X 2. [WAIT]: call wait_for_multi_cs with single CS X 3. [CMPL]: CS X do invoke complete_all for both CS and multi-CS (multi_cs_completion_done still false) 4. [WAIT]: enter poll_fences, reinit the completion and find the CS as completed when asking on the fence but multi_cs_done is still false it returns that no CS actually completed 5. [CMPL]: set multi_cs_handling_done as true 6. [WAIT]: wait for completion but no CS to awake the wait context and hence wait till timeout Solution: if CS detected as completed in poll_fences but multi_cs_done is still false invoke complete_all to the multi-CS completion and so it will not go to sleep in wait_for_completion but rather will have a "second chance" to wait for multi_cs_completion_done. Signed-off-by: Ohad Sharabi Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- diff --git a/drivers/misc/habanalabs/common/command_submission.c b/drivers/misc/habanalabs/common/command_submission.c index 7073fa6b9f0f5..d39343f90bc2a 100644 --- a/drivers/misc/habanalabs/common/command_submission.c +++ b/drivers/misc/habanalabs/common/command_submission.c @@ -2453,9 +2453,19 @@ static int hl_cs_poll_fences(struct multi_cs_data *mcs_data, struct multi_cs_com * returns to user indicating CS completed before it finished * all of its mcs handling, to avoid race the next time the * user waits for mcs. + * note: when reaching this case fence is definitely not NULL + * but NULL check was added to overcome static analysis */ - if (!fence->mcs_handling_done) + if (fence && !fence->mcs_handling_done) { + /* + * in case multi CS is completed but MCS handling not done + * we "complete" the multi CS to prevent it from waiting + * until time-out and the "multi-CS handling done" will have + * another chance at the next iteration + */ + complete_all(&mcs_compl->completion); break; + } mcs_data->completion_bitmap |= BIT(i); /*