This seems to have worked just fine so far on weakly-ordered
architectures, but I don't see anything that prevents the reordering
(requesting thread in the left column, vCPU thread in the right
column) from:
    store 1 to exit_request
    store 1 to tcg_exit_req
                                 load tcg_exit_req
                                 store 0 to tcg_exit_req
                                 load exit_request
                                 store 0 to exit_request
    store 1 to exit_request
    store 1 to tcg_exit_req
to this:
    store 1 to exit_request
    store 1 to tcg_exit_req
                                 load tcg_exit_req
                                 load exit_request
    store 1 to exit_request
    store 1 to tcg_exit_req
                                 store 0 to tcg_exit_req
                                 store 0 to exit_request
therefore losing a request. It's possible that other memory barriers
(e.g. in rcu_read_unlock) are hiding it, but better safe than
sorry.
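
As an illustration only (not QEMU code), here is a minimal sketch of
the pattern using C11 atomics in place of QEMU's primitives; the
requester()/cpu_thread() names are made up.  The vCPU side is a store
(the zeroing of tcg_exit_req) followed by a load of exit_request, and
store->load reordering is allowed even on strongly-ordered hosts, so a
read barrier is not enough:

    #include <stdatomic.h>
    #include <stdbool.h>

    static atomic_int exit_request;   /* set by the requesting thread */
    static atomic_int tcg_exit_req;   /* polled by generated code */

    /* Requesting thread: ask the vCPU thread to leave the TB loop. */
    static void requester(void)
    {
        atomic_store_explicit(&exit_request, 1, memory_order_relaxed);
        atomic_store_explicit(&tcg_exit_req, 1, memory_order_relaxed);
    }

    /* vCPU thread, after generated code saw tcg_exit_req != 0. */
    static bool cpu_thread(void)
    {
        /* Acknowledge the request (the zeroing done in cpu_tb_exec). */
        atomic_store_explicit(&tcg_exit_req, 0, memory_order_relaxed);

        /*
         * Without a full barrier the store above can be reordered
         * after the load below; a read barrier (smp_rmb) only orders
         * load->load, hence the full fence, which plays the role of
         * smp_mb() here.
         */
        atomic_thread_fence(memory_order_seq_cst);

        return atomic_load_explicit(&exit_request, memory_order_relaxed) != 0;
    }
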
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* have set something else (eg exit_request or
* interrupt_request) which we will handle
* next time around the loop. But we need to
- * ensure the tcg_exit_req read in generated code
+ * ensure the zeroing of tcg_exit_req (see cpu_tb_exec)
* comes before the next read of cpu->exit_request
* or cpu->interrupt_request.
*/
- smp_rmb();
+ smp_mb();
*last_tb = NULL;
break;
case TB_EXIT_ICOUNT_EXPIRED: