drm/i915/gt: Restart the heartbeat timer when forcing a pulse
author John Harrison <John.C.Harrison@Intel.com>
Wed, 10 Jan 2024 21:02:16 +0000 (13:02 -0800)
committer John Harrison <John.C.Harrison@Intel.com>
Thu, 15 Feb 2024 01:17:35 +0000 (17:17 -0800)
The context persistence code, among other things, sends very high
priority heartbeat pulses to ensure that any leaked context can still be
pre-empted, so that it is only a minor denial of service rather than a
total one. Unfortunately, it did not restart the heartbeat worker with a
fresh timeout. Thus, if a persistent context happened to be closed just
before the heartbeat was due to fire anyway, the forced pulse would get
a negligible execution time. And because the forced pulse is already at
very high priority, the worker thread's next step on finding it
incomplete is a reset. That means a potentially innocent system can be
reset at random when attempting to close a context. So, force a
re-schedule of the worker thread with the appropriate timeout.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240110210216.4125092-1-John.C.Harrison@Intel.com
drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c

index 1a8e2b7db0138f4928482f3ce8d1b42ff5b30cc3..4ae2fa0b61dd46edc6198dcff1803c0358002196 100644
@@ -290,6 +290,9 @@ static int __intel_engine_pulse(struct intel_engine_cs *engine)
        heartbeat_commit(rq, &attr);
        GEM_BUG_ON(rq->sched.attr.priority < I915_PRIORITY_BARRIER);
 
+       /* Ensure the forced pulse gets a full period to execute */
+       next_heartbeat(engine);
+
        return 0;
 }
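
The timing problem described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct hb_model and the hb_model_* helpers are hypothetical names that do not exist in i915, the 2500 ms period used in the usage note is an arbitrary example value, and the "rearm" step merely stands in for what the patch describes next_heartbeat() as doing (rescheduling the worker with a fresh timeout).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in i915. */
struct hb_model {
	unsigned int period_ms;    /* heartbeat interval */
	unsigned int next_tick_ms; /* absolute time at which the worker next runs */
};

/*
 * Model of re-arming the worker with a fresh timeout. This stands in for
 * the rescheduling that the patch attributes to next_heartbeat().
 */
static void hb_model_rearm(struct hb_model *hb, unsigned int now_ms)
{
	hb->next_tick_ms = now_ms + hb->period_ms;
}

/*
 * Submit a forced pulse at now_ms. Returns how long the pulse has to
 * execute before the worker next checks on it.
 */
static unsigned int hb_model_force_pulse(struct hb_model *hb,
					 unsigned int now_ms, bool rearm)
{
	/* The model only covers pulses forced before the pending tick. */
	assert(now_ms <= hb->next_tick_ms);

	if (rearm)
		hb_model_rearm(hb, now_ms);

	return hb->next_tick_ms - now_ms;
}
```

In this model, forcing a pulse 1 ms before a pending tick without re-arming leaves the pulse 1 ms to execute, whereas re-arming gives it a full period_ms regardless of when the context was closed.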