migration/multifd: Add a synchronization point for channel creation
author Fabiano Rosas <farosas@suse.de>
Tue, 6 Feb 2024 21:51:18 +0000 (18:51 -0300)
committer Peter Xu <peterx@redhat.com>
Wed, 7 Feb 2024 01:53:18 +0000 (09:53 +0800)
commit 93fa9dc2e0522c54b813dee0898a5feb98b624c9
tree 6a4108182ba047c9c218aed8ccfa56b3349afd01
parent 2576ae488ef9aa692486157df7d8b410919cd219

It is possible that one of the multifd channels fails to be created at
multifd_new_send_channel_async() while the rest of the channel
creation tasks are still in flight.

This could lead to multifd_save_cleanup() executing the
qemu_thread_join() loop too early, without waiting for the threads
which haven't been created yet. The cleanup would then free resources
that the newly created threads later try to access, causing a crash.

Add a synchronization point after which there will be no further
attempts at thread creation, so that calling multifd_save_cleanup()
past that point properly waits for all the threads.
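
The idea can be illustrated with a minimal standalone sketch. This is
not the QEMU code: it uses POSIX threads and unnamed semaphores instead
of QEMU's thread and semaphore wrappers, and every name in it
(channels_created, channel_create_task, setup_then_cleanup, and so on)
is a hypothetical stand-in chosen for illustration. Each asynchronous
creation task posts the semaphore once, whether it succeeded or failed,
and the setup path waits for one post per channel before cleanup is
allowed to join anything.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>

#define MAX_CHANNELS 16

/* Hypothetical per-channel state; not QEMU's actual structure. */
typedef struct {
    pthread_t thread;
    bool thread_created;
} Channel;

typedef struct {
    int id;
    bool fail;              /* simulate a channel that fails to be created */
} CreateTask;

static Channel channels[MAX_CHANNELS];
static CreateTask tasks[MAX_CHANNELS];
static sem_t channels_created;

/* Stand-in for the per-channel send thread. */
static void *send_thread(void *opaque)
{
    (void)opaque;
    return NULL;
}

/* Asynchronous creation task: posts exactly once, on success or failure. */
static void *channel_create_task(void *opaque)
{
    CreateTask *t = opaque;
    Channel *c = &channels[t->id];

    if (!t->fail) {
        c->thread_created =
            pthread_create(&c->thread, NULL, send_thread, c) == 0;
    }
    sem_post(&channels_created);
    return NULL;
}

/* Returns the number of send threads that cleanup joined. */
int setup_then_cleanup(int n, int fail_index)
{
    int joined = 0;

    assert(n >= 0 && n <= MAX_CHANNELS);
    sem_init(&channels_created, 0, 0);

    for (int i = 0; i < n; i++) {
        pthread_t helper;

        channels[i].thread_created = false;
        tasks[i].id = i;
        tasks[i].fail = (i == fail_index);
        if (pthread_create(&helper, NULL, channel_create_task, &tasks[i])) {
            sem_post(&channels_created);   /* task never ran: count it */
        } else {
            pthread_detach(helper);
        }
    }

    /* Synchronization point: one wait per channel, retrying on EINTR. */
    for (int i = 0; i < n; i++) {
        while (sem_wait(&channels_created) == -1 && errno == EINTR) {
        }
    }

    /* Past this point no task can create a thread, so joining is safe. */
    for (int i = 0; i < n; i++) {
        if (channels[i].thread_created) {
            pthread_join(channels[i].thread, NULL);
            joined++;
        }
    }

    sem_destroy(&channels_created);
    return joined;
}
```

With four channels and a simulated failure on channel 2,
setup_then_cleanup(4, 2) should join the three threads that were
actually created, provided the underlying pthread_create() calls
succeed. Without the wait loop, the join loop could run while some
creation tasks are still in flight and miss their threads.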

A note about performance: Prior to this patch, if a channel took too
long to be established, other channels could finish connecting first
and already start taking load. Now we're bounded by the
slowest-connecting channel.

Reported-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240206215118.6171-7-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
migration/multifd.c