NVMe shares a tagset between the fabric queue and the admin queue, or
between connect_q and the NS queues, so hctx_may_queue() can be called
when allocating requests for these queues.
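
For context, when a tagset is shared across queues, hctx_may_queue()
applies a fair-share limit on tag allocation. A simplified sketch of
that logic follows; field and helper names track mainline blk-mq, but
this is illustrative (and assumes block-layer-private headers), not the
verbatim upstream function:

    #include <linux/blk-mq.h>

    /*
     * Simplified sketch of the fair-share check in hctx_may_queue(): when
     * the tagset is shared, each active queue only gets roughly its
     * fraction of the tag space. Illustrative only, not the exact code.
     */
    static inline bool hctx_may_queue_sketch(struct blk_mq_hw_ctx *hctx,
                                             struct sbitmap_queue *bt)
    {
            unsigned int depth, users;

            /* Unshared tagsets are never throttled. */
            if (!(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED))
                    return true;

            users = atomic_read(&hctx->tags->active_queues);
            if (!users)
                    return true;

            /* Give each active queue an equal share, with a small minimum. */
            depth = max((bt->sb.depth + users - 1) / users, 4U);

            /* Refuse allocation once this hctx already holds its share. */
            return atomic_read(&hctx->nr_active) < depth;
    }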
Tags can be reserved in these tagsets. Before error recovery there are
often many in-flight requests which can't be completed, and a new
reserved request may be needed in the error recovery path. However,
hctx_may_queue() can keep returning false because there are too many
in-flight requests that can't be completed during error handling. As a
result, nothing can proceed.
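
For illustration (exact call sites differ per driver), the recovery
path ends up asking the allocator for one of these reserved tags, which
is what puts BLK_MQ_REQ_RESERVED on the allocation data seen by
__blk_mq_get_tag(). The helper below is hypothetical, not a real driver
function:

    #include <linux/blk-mq.h>

    /*
     * Illustrative helper only: during controller recovery a fabrics
     * driver allocates its connect / keep-alive commands from the
     * reserved pool so they are not starved by stuck normal I/O.
     */
    static struct request *alloc_recovery_request(struct request_queue *q,
                                                  unsigned int op)
    {
            /* Reserved + nowait: fail fast rather than sleep in recovery. */
            return blk_mq_alloc_request(q, op,
                            BLK_MQ_REQ_RESERVED | BLK_MQ_REQ_NOWAIT);
    }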
Fix this issue by always allowing reserved tag allocation in
hctx_may_queue(). This is reasonable because reserved tags are supposed
to always be available.
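
Reserved tags come from a separate bitmap (breserved_tags), carved out
when the tagset is created, which is why they can be expected to be
available even when normal tags are exhausted. A hedged sketch of how a
driver sets them aside (the counts are examples, not taken from any
specific driver):

    #include <linux/blk-mq.h>
    #include <linux/string.h>

    /*
     * Example only: carving out reserved tags when building a tagset.
     * The block layer keeps breserved_tags separate from bitmap_tags, so
     * reserved allocations never compete with normal I/O for tag space.
     */
    static int example_setup_tagset(struct blk_mq_tag_set *set,
                                    const struct blk_mq_ops *ops)
    {
            memset(set, 0, sizeof(*set));
            set->ops           = ops;
            set->nr_hw_queues  = 1;
            set->queue_depth   = 128;  /* normal I/O tags */
            set->reserved_tags = 2;    /* e.g. connect + keep-alive */
            set->numa_node     = NUMA_NO_NODE;
            return blk_mq_alloc_tag_set(set);
    }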
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cc: David Milburn <dmilburn@redhat.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
 static int __blk_mq_get_tag(struct blk_mq_alloc_data *data,
                            struct sbitmap_queue *bt)
 {
-       if (!data->q->elevator && !hctx_may_queue(data->hctx, bt))
+       if (!data->q->elevator && !(data->flags & BLK_MQ_REQ_RESERVED) &&
+                       !hctx_may_queue(data->hctx, bt))
                return BLK_MQ_NO_TAG;
 
        if (data->shallow_depth)
 
        if (blk_mq_tag_is_reserved(rq->mq_hctx->sched_tags, rq->internal_tag)) {
                bt = rq->mq_hctx->tags->breserved_tags;
                tag_offset = 0;
+       } else {
+               if (!hctx_may_queue(rq->mq_hctx, bt))
+                       return false;
        }
 
-       if (!hctx_may_queue(rq->mq_hctx, bt))
-               return false;
        tag = __sbitmap_queue_get(bt);
        if (tag == BLK_MQ_NO_TAG)
                return false;