I've recently been seeing some unexplained GSP errors on my RTX 6000 from
failed aux transactions:
[ 132.915867] nouveau 0000:1f:00.0: gsp: cli:0xc1d00002 obj:0x00730000
ctrl cmd:0x00731341 failed: 0x0000ffff
While the cause of these is not yet clear, these messages made me notice
that the aux transactions causing these transactions were succeeding - not
failing. As it turns out, this is because we're currently not returning the
correct variable when r535_dp_aux_xfer() hits an error - causing us to
never propagate GSP errors for failed aux transactions to userspace.
So, let's fix that.
Fixes: 4ae3a20102b2 ("nouveau/gsp: don't free ctrl messages on errors")
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240315212104.776936-1-lyude@redhat.com
ret = nvkm_gsp_rm_ctrl_push(&disp->rm.objcom, &ctrl, sizeof(*ctrl));
if (ret) {
nvkm_gsp_rm_ctrl_done(&disp->rm.objcom, ctrl);
- return PTR_ERR(ctrl);
+ return ret;
}
memcpy(data, ctrl->data, size);