Eric Blake [Thu, 12 Oct 2017 03:47:20 +0000 (22:47 -0500)]
qemu-io: Relax 'alloc' now that block-status doesn't assert
Previously, the alloc command required that input parameters be
sector-aligned and clamped to 32 bits, because the underlying
bdrv_is_allocated used a 32-bit parameter and asserted aligned
inputs. But now that we have fixed block status to report a
64-bit bytes value, and to properly round requests on behalf of
guests, we can pass any values, and can use qemu-io to add
coverage that our rounding is correct regardless of the guest
alignment constraints.
Update iotest 177 to intentionally probe block status at
unaligned boundaries as well as with a bytes value that does not
map to 32-bit sectors, which also required tweaking the image
prep to leave an unallocated portion to the image under test.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:19 +0000 (22:47 -0500)]
qcow2: Reduce is_zero() rounding
Now that bdrv_is_allocated accepts non-aligned inputs, we can
remove the TODO added in earlier refactoring.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:18 +0000 (22:47 -0500)]
block: Reduce bdrv_aligned_preadv() rounding
Now that bdrv_is_allocated accepts non-aligned inputs, we can
remove the TODO added in commit
d6a644bb.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:17 +0000 (22:47 -0500)]
block: Align block status requests
Any device that has request_alignment greater than 512 should be
unable to report status at a finer granularity; it may also be
simpler for such devices to be guaranteed that the block layer
has rounded things out to the granularity boundary (the way the
block layer already rounds all other I/O out). Besides, getting
the code correct for super-sector alignment also benefits us
for the fact that our public interface now has byte granularity,
even though none of our drivers have byte-level callbacks.
Add an assertion in blkdebug that proves that the block layer
never requests status of unaligned sections, similar to what it
does on other requests (while still keeping the generic helper
in place for when future patches add a throttle driver). Note
that iotest 177 already covers this (it would fail if you use
just the blkdebug.c hunk without the io.c changes). Meanwhile,
we can drop assertions in callers that no longer have to pass
in sector-aligned addresses.
There is a mid-function scope added for 'count' and 'longret',
for a couple of reasons: first, an upcoming patch will add an
'if' statement that checks whether a driver has an old- or
new-style callback, and can conveniently use the same scope for
less indentation churn at that time. Second, since we are
trying to get rid of sector-based computations, wrapping things
in a scope makes it easier to group and see what will be
deleted in a final cleanup patch once all drivers have been
converted to the new-style callback.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:16 +0000 (22:47 -0500)]
qemu-img: Change img_compare() to be byte-based
In the continuing quest to make more things byte-based, change
the internal iteration of img_compare(). We can finally drop the
TODO assertions added earlier, now that the entire algorithm is
byte-based and no longer has to shift from bytes to sectors.
Most of the change is mechanical ('total_sectors' becomes
'total_size', 'sector_num' becomes 'offset', 'nb_sectors' becomes
'chunk', 'progress_base' goes from sectors to bytes); some of it
is also a cleanup (sectors_to_bytes() is now unused, loss of
variable 'count' added earlier in commit
51b0a488).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:15 +0000 (22:47 -0500)]
qemu-img: Change img_rebase() to be byte-based
In the continuing quest to make more things byte-based, change
the internal iteration of img_rebase(). We can finally drop the
TODO assertion added earlier, now that the entire algorithm is
byte-based and no longer has to shift from bytes to sectors.
Most of the change is mechanical ('num_sectors' becomes 'size',
'sector' becomes 'offset', 'n' goes from sectors to bytes); some
of it is also a cleanup (use of MIN() instead of open-coding,
loss of variable 'count' added earlier in commit
d6a644bb).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:14 +0000 (22:47 -0500)]
qemu-img: Change compare_sectors() to be byte-based
In the continuing quest to make more things byte-based, change
compare_sectors(), renaming it to compare_buffers() in the
process. Note that one caller (qemu-img compare) only cares
about the first difference, while the other (qemu-img rebase)
cares about how many consecutive sectors have the same
equal/different status; however, this patch does not bother to
micro-optimize the compare case to avoid the comparisons of
sectors beyond the first mismatch. Both callers are always
passing valid buffers in, so the initial check for buffer size
can be turned into an assertion.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:13 +0000 (22:47 -0500)]
qemu-img: Change check_empty_sectors() to byte-based
Continue on the quest to make more things byte-based instead of
sector-based.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:12 +0000 (22:47 -0500)]
qemu-img: Drop redundant error message in compare
If a read error is encountered during 'qemu-img compare', we
were printing the "Error while reading offset ..." message twice;
this was because our helper function was awkward, printing output
on some but not all paths. Fix it to consistently report errors
on all paths, so that the callers do not risk a redundant message,
and update the testsuite for the improved output.
Further simplify the code by hoisting the conversion from an error
message to an exit code into the helper function, rather than
repeating that logic at all callers (yes, the helper function is
now less generic, but it's a net win in lines of code).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:11 +0000 (22:47 -0500)]
qemu-img: Add find_nonzero()
During 'qemu-img compare', when we are checking that an allocated
portion of one file is all zeros, we don't need to waste time
computing how many additional sectors after the first non-zero
byte are also non-zero. Create a new helper find_nonzero() to do
the check for a first non-zero sector, and rebase
check_empty_sectors() to use it.
The new interface intentionally uses bytes in its interface, even
though it still crawls the buffer a sector at a time; it is robust
to a partial sector at the end of the buffer.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:10 +0000 (22:47 -0500)]
qemu-img: Speed up compare on pre-allocated larger file
Compare the following images with all-zero contents:
$ truncate --size 1M A
$ qemu-img create -f qcow2 -o preallocation=off B 1G
$ qemu-img create -f qcow2 -o preallocation=metadata C 1G
On my machine, the difference is noticeable for pre-patch speeds,
with more than an order of magnitude in difference caused by the
choice of preallocation in the qcow2 file:
$ time ./qemu-img compare -f raw -F qcow2 A B
Warning: Image size mismatch!
Images are identical.
real 0m0.014s
user 0m0.007s
sys 0m0.007s
$ time ./qemu-img compare -f raw -F qcow2 A C
Warning: Image size mismatch!
Images are identical.
real 0m0.341s
user 0m0.144s
sys 0m0.188s
Why? Because bdrv_is_allocated() returns false for image B but
true for image C, throwing away the fact that both images know
via lseek(SEEK_HOLE) that the entire image still reads as zero.
From there, qemu-img ends up calling bdrv_pread() for every byte
of the tail, instead of quickly looking for the next allocation.
The solution: use block_status instead of is_allocated, giving:
$ time ./qemu-img compare -f raw -F qcow2 A C
Warning: Image size mismatch!
Images are identical.
real 0m0.014s
user 0m0.011s
sys 0m0.003s
which is on par with the speeds for no pre-allocation.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:09 +0000 (22:47 -0500)]
qemu-img: Simplify logic in img_compare()
As long as we are querying the status for a chunk smaller than
the known image size, we are guaranteed that a successful return
will have set pnum to a non-zero size (pnum is zero only for
queries beyond the end of the file). Use that to slightly
simplify the calculation of the current chunk size being compared.
Likewise, we don't have to shrink the amount of data operated on
until we know we have to read the file, and therefore have to fit
in the bounds of our buffer. Also, note that 'total_sectors_over'
is equivalent to 'progress_base'.
With these changes in place, sectors_to_process() is now dead code,
and can be removed.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:08 +0000 (22:47 -0500)]
block: Convert bdrv_get_block_status_above() to bytes
We are gradually moving away from sector-based interfaces, towards
byte-based. In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.
Changing the name of the function from bdrv_get_block_status_above()
to bdrv_block_status_above() ensures that the compiler enforces that
all callers are updated. Likewise, since it a byte interface allows
an offset mapping that might not be sector aligned, split the mapping
out of the return value and into a pass-by-reference parameter. For
now, the io.c layer still assert()s that all uses are sector-aligned,
but that can be relaxed when a later patch implements byte-based
block status in the drivers.
For the most part this patch is just the addition of scaling at the
callers followed by inverse scaling at bdrv_block_status(), plus
updates for the new split return interface. But some code,
particularly bdrv_block_status(), gets a lot simpler because it no
longer has to mess with sectors. Likewise, mirror code no longer
computes s->granularity >> BDRV_SECTOR_BITS, and can therefore drop
an assertion about alignment because the loop no longer depends on
alignment (never mind that we don't really have a driver that
reports sub-sector alignments, so it's not really possible to test
the effect of sub-sector mirroring). Fix a neighboring assertion to
use is_power_of_2 while there.
For ease of review, bdrv_get_block_status() was tackled separately.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:07 +0000 (22:47 -0500)]
block: Switch bdrv_co_get_block_status_above() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Convert another internal
type (no semantic change), and rename it to match the corresponding
public function rename.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:06 +0000 (22:47 -0500)]
block: Switch bdrv_common_block_status_above() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Convert another internal
function (no semantic change).
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:05 +0000 (22:47 -0500)]
block: Switch BdrvCoGetBlockStatusData to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Convert another internal
type (no semantic change), and rename it to match the corresponding
public function rename.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:04 +0000 (22:47 -0500)]
block: Switch bdrv_co_get_block_status() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Convert another internal
function (no semantic change); and as with its public counterpart,
rename to bdrv_co_block_status() and split the offset return, to
make the compiler enforce that we catch all uses. For now, we
assert that callers and the return value still use aligned data,
but ultimately, this will be the function where we hand off to a
byte-based driver callback, and will eventually need to add logic
to ensure we round calls according to the driver's
request_alignment then touch up the result handed back to the
caller, to start permitting a caller to pass unaligned offsets.
Note that we are now prepared to accepts 'bytes' larger than INT_MAX;
this is okay as long as we clamp things internally before violating
any 32-bit limits, and makes no difference to how a client will
use the information (clients looping over the entire file must
already be prepared for consecutive calls to return the same status,
as drivers are already free to return shorter-than-maximal status
due to any other convenient split points, such as when the L2 table
crosses cluster boundaries in qcow2).
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:03 +0000 (22:47 -0500)]
block: Convert bdrv_get_block_status() to bytes
We are gradually moving away from sector-based interfaces, towards
byte-based. In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.
Changing the name of the function from bdrv_get_block_status() to
bdrv_block_status() ensures that the compiler enforces that all
callers are updated. For now, the io.c layer still assert()s that
all callers are sector-aligned, but that can be relaxed when a later
patch implements byte-based block status in the drivers.
There was an inherent limitation in returning the offset via the
return value: we only have room for BDRV_BLOCK_OFFSET_MASK bits, which
means an offset can only be mapped for sector-aligned queries (or,
if we declare that non-aligned input is at the same relative position
modulo 512 of the answer), so the new interface also changes things to
return the offset via output through a parameter by reference rather
than mashed into the return value. We'll have some glue code that
munges between the two styles until we finish converting all uses.
For the most part this patch is just the addition of scaling at the
callers followed by inverse scaling at bdrv_block_status(), coupled
with the tweak in calling convention. But some code, particularly
bdrv_is_allocated(), gets a lot simpler because it no longer has to
mess with sectors.
For ease of review, bdrv_get_block_status_above() will be tackled
separately.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:02 +0000 (22:47 -0500)]
qemu-img: Switch get_block_status() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Continue by converting
an internal function (no semantic change), and simplifying its
caller accordingly.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:01 +0000 (22:47 -0500)]
block: Switch bdrv_make_zero() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Change the internal
loop iteration of zeroing a device to track by bytes instead of
sectors (although we are still guaranteed that we iterate by steps
that are sector-aligned).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:47:00 +0000 (22:47 -0500)]
qcow2: Switch is_zero_sectors() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Convert another internal
function (no semantic change), and rename it to is_zero() in the
process.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:46:59 +0000 (22:46 -0500)]
block: Make bdrv_round_to_clusters() signature more useful
In the process of converting sector-based interfaces to bytes,
I'm finding it easier to represent a byte count as a 64-bit
integer at the block layer (even if we are internally capped
by SIZE_MAX or even INT_MAX for individual transactions, it's
still nicer to not have to worry about truncation/overflow
issues on as many variables). Update the signature of
bdrv_round_to_clusters() to uniformly use int64_t, matching
the signature already chosen for bdrv_is_allocated and the
fact that off_t is also a signed type, then adjust clients
according to the required fallout (even where the result could
now exceed 32 bits, no client is directly assigning the result
into a 32-bit value without breaking things into a loop first).
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:46:58 +0000 (22:46 -0500)]
block: Add flag to avoid wasted work in bdrv_is_allocated()
Not all callers care about which BDS owns the mapping for a given
range of the file, or where the zeroes lie within that mapping. In
particular, bdrv_is_allocated() cares more about finding the
largest run of allocated data from the guest perspective, whether
or not that data is consecutive from the host perspective, and
whether or not the data reads as zero. Therefore, doing subsequent
refinements such as checking how much of the format-layer
allocation also satisfies BDRV_BLOCK_ZERO at the protocol layer is
wasted work - in the best case, it just costs extra CPU cycles
during a single bdrv_is_allocated(), but in the worst case, it
results in a smaller *pnum, and forces callers to iterate through
more status probes when visiting the entire file for even more
extra CPU cycles.
This patch only optimizes the block layer (no behavior change when
want_zero is true, but skip unnecessary effort when it is false).
Then when subsequent patches tweak the driver callback to be
byte-based, we can also pass this hint through to the driver.
Tweak BdrvCoGetBlockStatusData to declare arguments in parameter
order, rather than mixing things up (minimizing padding is not
necessary here).
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 Oct 2017 03:46:57 +0000 (22:46 -0500)]
block: Allow NULL file for bdrv_get_block_status()
Not all callers care about which BDS owns the mapping for a given
range of the file. This patch merely simplifies the callers by
consolidating the logic in the common call point, while guaranteeing
a non-NULL file to all the driver callbacks, for no semantic change.
The only caller that does not care about pnum is bdrv_is_allocated,
as invoked by vvfat; we can likewise add assertions that the rest
of the stack does not have to worry about a NULL pnum.
Furthermore, this will also set the stage for a future cleanup: when
a caller does not care about which BDS owns an offset, it would be
nice to allow the driver to optimize things to not have to return
BDRV_BLOCK_OFFSET_VALID in the first place. In the case of fragmented
allocation (for example, it's fairly easy to create a qcow2 image
where consecutive guest addresses are not at consecutive host
addresses), the current contract requires bdrv_get_block_status()
to clamp *pnum to the limit where host addresses are no longer
consecutive, but allowing a NULL file means that *pnum could be
set to the full length of known-allocated data.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kevin Wolf [Tue, 17 Oct 2017 15:08:51 +0000 (17:08 +0200)]
qemu-iotests: Test backing_fmt with backing node reference
This changes test case 191 to include a backing image that has
backing_fmt set in the image file, but is referenced by node name in the
qemu command line.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Peter Krempa [Thu, 12 Oct 2017 14:14:10 +0000 (16:14 +0200)]
block: don't add 'driver' to options when referring to backing via node name
When referring to a backing file of an image via node name
bdrv_open_backing_file would add the 'driver' option to the option list
filling it with the backing format driver. This breaks construction of
the backing chain via -blockdev, as bdrv_open_inherit reports an error
if both 'reference' and 'options' are provided.
$ qemu-img create -f raw /tmp/backing.raw 64M
$ qemu-img create -f qcow2 -F raw -b /tmp/backing.raw /tmp/test.qcow2
$ qemu-system-x86_64 \
-blockdev driver=file,filename=/tmp/backing.raw,node-name=backing \
-blockdev driver=qcow2,file.driver=file,file.filename=/tmp/test.qcow2,node-name=root,backing=backing
qemu-system-x86_64: -blockdev driver=qcow2,file.driver=file,file.filename=/tmp/test.qcow2,node-name=root,backing=backing: Could not open backing file: Cannot reference an existing block device with additional options or a new filename
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Peter Maydell [Wed, 25 Oct 2017 15:38:57 +0000 (16:38 +0100)]
Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-
20171025' into staging
TCG patch queue
# gpg: Signature made Wed 25 Oct 2017 10:30:18 BST
# gpg: using RSA key 0x64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tcg-
20171025: (51 commits)
translate-all: exit from tb_phys_invalidate if qht_remove fails
tcg: Initialize cpu_env generically
tcg: enable multiple TCG contexts in softmmu
tcg: introduce regions to split code_gen_buffer
translate-all: use qemu_protect_rwx/none helpers
osdep: introduce qemu_mprotect_rwx/none
tcg: allocate optimizer temps with tcg_malloc
tcg: distribute profiling counters across TCGContext's
tcg: introduce **tcg_ctxs to keep track of all TCGContext's
gen-icount: fold exitreq_label into TCGContext
tcg: define tcg_init_ctx and make tcg_ctx a pointer
tcg: take tb_ctx out of TCGContext
translate-all: report correct avg host TB size
exec-all: rename tb_free to tb_remove
translate-all: use a binary search tree to track TBs in TBContext
tcg: Remove CF_IGNORE_ICOUNT
tcg: Add CF_LAST_IO + CF_USE_ICOUNT to CF_HASH_MASK
cpu-exec: lookup/generate TB outside exclusive region during step_atomic
tcg: check CF_PARALLEL instead of parallel_cpus
target/sparc: check CF_PARALLEL instead of parallel_cpus
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 25 Oct 2017 14:24:08 +0000 (15:24 +0100)]
Merge remote-tracking branch 'remotes/juanquintela/tags/migration/
20171023' into staging
migration/next for
20171023
# gpg: Signature made Mon 23 Oct 2017 17:05:14 BST
# gpg: using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg: aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* remotes/juanquintela/tags/migration/
20171023: (21 commits)
migration: Improve migration thread error handling
qapi: Fix grammar in x-multifd-page-count descriptions
migration: add bitmap for received page
migration: introduce qemu_ufd_copy_ioctl helper
migration: postcopy_place_page factoring out
migration: new ram_init_bitmaps()
migration: clean up xbzrle cache init/destroy
migration: provide ram_state_cleanup
migration: provide ram_state_init()
migration: pause-before-switchover for postcopy
migration: allow cancel to unpause
migrate: HMP migate_continue
migration: migrate-continue
migration: Wait for semaphore before completing migration
migration: Add 'pre-switchover' and 'device' statuses
migration: Add 'pause-before-switchover' capability
migration: Make cache_init() take an error parameter
migration: Move xbzrle cache resize error handling to xbzrle_cache_resize
migration: Make cache size elements use the right types
migratiom: Remove max_item_age parameter
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Emilio G. Cota [Thu, 19 Oct 2017 20:31:54 +0000 (16:31 -0400)]
translate-all: exit from tb_phys_invalidate if qht_remove fails
Two or more threads might race while invalidating the same TB. We currently
do not check for this at all despite taking tb_lock, which means we would
wrongly invalidate the same TB more than once. This bug has actually been
hit by users: I recently saw a report on IRC, although I have yet to see
the corresponding test case.
Fix this by using qht_remove as the synchronization point; if it fails,
that means the TB has already been invalidated, and therefore there
is nothing left to do in tb_phys_invalidate.
Note that this solution works now that we still have tb_lock, and will
continue working once we remove tb_lock.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <
1508445114-4717-1-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 10 Oct 2017 21:34:37 +0000 (14:34 -0700)]
tcg: Initialize cpu_env generically
This is identical for each target. So, move the initialization to
common code. Move the variable itself out of tcg_ctx and name it
cpu_env to minimize changes within targets.
This also means we can remove tcg_global_reg_new_{ptr,i32,i64},
since there are no longer global-register temps created by targets.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Wed, 19 Jul 2017 22:57:58 +0000 (18:57 -0400)]
tcg: enable multiple TCG contexts in softmmu
This enables parallel TCG code generation. However, we do not take
advantage of it yet since tb_lock is still held during tb_gen_code.
In user-mode we use a single TCG context; see the documentation
added to tcg_region_init for the rationale.
Note that targets do not need any conversion: targets initialize a
TCGContext (e.g. defining TCG globals), and after this initialization
has finished, the context is cloned by the vCPU threads, each of
them keeping a separate copy.
TCG threads claim one entry in tcg_ctxs[] by atomically increasing
n_tcg_ctxs. Do not be too annoyed by the subsequent atomic_read's
of that variable and tcg_ctxs; they are there just to play nice with
analysis tools such as thread sanitizer.
Note that we do not allocate an array of contexts (we allocate
an array of pointers instead) because when tcg_context_init
is called, we do not know yet how many contexts we'll use since
the bool behind qemu_tcg_mttcg_enabled() isn't set yet.
Previous patches folded some TCG globals into TCGContext. The non-const
globals remaining are only set at init time, i.e. before the TCG
threads are spawned. Here is a list of these set-at-init-time globals
under tcg/:
Only written by tcg_context_init:
- indirect_reg_alloc_order
- tcg_op_defs
Only written by tcg_target_init (called from tcg_context_init):
- tcg_target_available_regs
- tcg_target_call_clobber_regs
- arm: arm_arch, use_idiv_instructions
- i386: have_cmov, have_bmi1, have_bmi2, have_lzcnt,
have_movbe, have_popcnt
- mips: use_movnz_instructions, use_mips32_instructions,
use_mips32r2_instructions, got_sigill (tcg_target_detect_isa)
- ppc: have_isa_2_06, have_isa_3_00, tb_ret_addr
- s390: tb_ret_addr, s390_facilities
- sparc: qemu_ld_trampoline, qemu_st_trampoline (build_trampolines),
use_vis3_instructions
Only written by tcg_prologue_init:
- 'struct jit_code_entry one_entry'
- aarch64: tb_ret_addr
- arm: tb_ret_addr
- i386: tb_ret_addr, guest_base_flags
- ia64: tb_ret_addr
- mips: tb_ret_addr, bswap32_addr, bswap32u_addr, bswap64_addr
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Fri, 7 Jul 2017 23:24:20 +0000 (19:24 -0400)]
tcg: introduce regions to split code_gen_buffer
This is groundwork for supporting multiple TCG contexts.
The naive solution here is to split code_gen_buffer statically
among the TCG threads; this however results in poor utilization
if translation needs are different across TCG threads.
What we do here is to add an extra layer of indirection, assigning
regions that act just like pages do in virtual memory allocation.
(BTW if you are wondering about the chosen naming, I did not want
to use blocks or pages because those are already heavily used in QEMU).
We use a global lock to serialize allocations as well as statistics
reporting (we now export the size of the used code_gen_buffer with
tcg_code_size()). Note that for the allocator we could just use
a counter and atomic_inc; however, that would complicate the gathering
of tcg_code_size()-like stats. So given that the region operations are
not a fast path, a lock seems the most reasonable choice.
The effectiveness of this approach is clear after seeing some numbers.
I used the bootup+shutdown of debian-arm with '-tb-size 80' as a benchmark.
Note that I'm evaluating this after enabling per-thread TCG (which
is done by a subsequent commit).
* -smp 1, 1 region (entire buffer):
qemu: flush code_size=
83885014 nb_tbs=154739 avg_tb_size=357
qemu: flush code_size=
83884902 nb_tbs=153136 avg_tb_size=363
qemu: flush code_size=
83885014 nb_tbs=152777 avg_tb_size=364
qemu: flush code_size=
83884950 nb_tbs=150057 avg_tb_size=373
qemu: flush code_size=
83884998 nb_tbs=150234 avg_tb_size=373
qemu: flush code_size=
83885014 nb_tbs=154009 avg_tb_size=360
qemu: flush code_size=
83885014 nb_tbs=151007 avg_tb_size=370
qemu: flush code_size=
83885014 nb_tbs=151816 avg_tb_size=367
That is, 8 flushes.
* -smp 8, 32 regions (80/32 MB per region) [i.e. this patch]:
qemu: flush code_size=
76328008 nb_tbs=141040 avg_tb_size=356
qemu: flush code_size=
75366534 nb_tbs=138000 avg_tb_size=361
qemu: flush code_size=
76864546 nb_tbs=140653 avg_tb_size=361
qemu: flush code_size=
76309084 nb_tbs=135945 avg_tb_size=375
qemu: flush code_size=
74581856 nb_tbs=132909 avg_tb_size=375
qemu: flush code_size=
73927256 nb_tbs=135616 avg_tb_size=360
qemu: flush code_size=
78629426 nb_tbs=142896 avg_tb_size=365
qemu: flush code_size=
76667052 nb_tbs=138508 avg_tb_size=368
Again, 8 flushes. Note how buffer utilization is not 100%, but it
is close. Smaller region sizes would yield higher utilization,
but we want region allocation to be rare (it acquires a lock), so
we do not want to go too small.
* -smp 8, static partitioning of 8 regions (10 MB per region):
qemu: flush code_size=
21936504 nb_tbs=40570 avg_tb_size=354
qemu: flush code_size=
11472174 nb_tbs=20633 avg_tb_size=370
qemu: flush code_size=
11603976 nb_tbs=21059 avg_tb_size=365
qemu: flush code_size=
23254872 nb_tbs=41243 avg_tb_size=377
qemu: flush code_size=
28289496 nb_tbs=52057 avg_tb_size=358
qemu: flush code_size=
43605160 nb_tbs=78896 avg_tb_size=367
qemu: flush code_size=
45166552 nb_tbs=82158 avg_tb_size=364
qemu: flush code_size=
63289640 nb_tbs=116494 avg_tb_size=358
qemu: flush code_size=
51389960 nb_tbs=93937 avg_tb_size=362
qemu: flush code_size=
59665928 nb_tbs=107063 avg_tb_size=372
qemu: flush code_size=
38380824 nb_tbs=68597 avg_tb_size=374
qemu: flush code_size=
44884568 nb_tbs=79901 avg_tb_size=376
qemu: flush code_size=
50782632 nb_tbs=90681 avg_tb_size=374
qemu: flush code_size=
39848888 nb_tbs=71433 avg_tb_size=372
qemu: flush code_size=
64708840 nb_tbs=119052 avg_tb_size=359
qemu: flush code_size=
49830008 nb_tbs=90992 avg_tb_size=362
qemu: flush code_size=
68372408 nb_tbs=123442 avg_tb_size=368
qemu: flush code_size=
33555560 nb_tbs=59514 avg_tb_size=378
qemu: flush code_size=
44748344 nb_tbs=80974 avg_tb_size=367
qemu: flush code_size=
37104248 nb_tbs=67609 avg_tb_size=364
That is, 20 flushes. Note how a static partitioning approach uses
the code buffer poorly, leading to many unnecessary flushes.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Sat, 15 Jul 2017 06:38:57 +0000 (02:38 -0400)]
translate-all: use qemu_protect_rwx/none helpers
The helpers require the address and size to be page-aligned, so
do that before calling them.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Sat, 15 Jul 2017 06:28:47 +0000 (02:28 -0400)]
osdep: introduce qemu_mprotect_rwx/none
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Wed, 19 Jul 2017 18:32:24 +0000 (14:32 -0400)]
tcg: allocate optimizer temps with tcg_malloc
Groundwork for supporting multiple TCG contexts.
While at it, also allocate temps_used directly as a bitmap of the
required size, instead of using a bitmap of TCG_MAX_TEMPS via
TCGTempSet.
Performance-wise we lose about 1.12% in a translation-heavy workload
such as booting+shutting down debian-arm:
Performance counter stats for 'taskset -c 0 arm-softmmu/qemu-system-arm \
-machine type=virt -nographic -smp 1 -m 4096 \
-netdev user,id=unet,hostfwd=tcp::2222-:22 \
-device virtio-net-device,netdev=unet \
-drive file=die-on-boot.qcow2,id=myblock,index=0,if=none \
-device virtio-blk-device,drive=myblock \
-kernel kernel.img -append console=ttyAMA0 root=/dev/vda1 \
-name arm,debug-threads=on -smp 1' (10 runs):
exec time (s) Relative slowdown wrt original (%)
---------------------------------------------------------------
original 20.
213321616 0.
tcg_malloc 20.
441130078 1.
1270214
TCGContext 20.
477846517 1.
3086662
g_malloc 20.
780527895 2.
8061013
The other two alternatives shown in the table are:
- TCGContext: embed temps[TCG_MAX_TEMPS] and TCGTempSet used_temps
in TCGContext. This is simple enough but it isn't faster than using
tcg_malloc; moreover, it wastes memory.
- g_malloc: allocate/deallocate both temps and used_temps every time
tcg_optimize is executed.
Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Wed, 5 Jul 2017 23:35:06 +0000 (19:35 -0400)]
tcg: distribute profiling counters across TCGContext's
This is groundwork for supporting multiple TCG contexts.
To avoid scalability issues when profiling info is enabled, this patch
makes the profiling info counters distributed via the following changes:
1) Consolidate profile info into its own struct, TCGProfile, which
TCGContext also includes. Note that tcg_table_op_count is brought
into TCGProfile after dropping the tcg_ prefix.
2) Iterate over the TCG contexts in the system to obtain the total counts.
This change also requires updating the accessors to TCGProfile fields to
use atomic_read/set whenever there may be conflicting accesses (as defined
in C11) to them.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Wed, 12 Jul 2017 22:26:40 +0000 (18:26 -0400)]
tcg: introduce **tcg_ctxs to keep track of all TCGContext's
Groundwork for supporting multiple TCG contexts.
Note that having n_tcg_ctxs is unnecessary. However, it is
convenient to have it, since it will simplify iterating over the
array: we'll have just a for loop instead of having to iterate
over a NULL-terminated array (which would require n+1 elems)
or having to check with ifdef's for usermode/softmmu.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Tue, 4 Jul 2017 17:54:21 +0000 (13:54 -0400)]
gen-icount: fold exitreq_label into TCGContext
Groundwork for supporting multiple TCG contexts.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Wed, 12 Jul 2017 21:15:52 +0000 (17:15 -0400)]
tcg: define tcg_init_ctx and make tcg_ctx a pointer
Groundwork for supporting multiple TCG contexts.
The core of this patch is this change to tcg/tcg.h:
> -extern TCGContext tcg_ctx;
> +extern TCGContext tcg_init_ctx;
> +extern TCGContext *tcg_ctx;
Note that for now we set *tcg_ctx to whatever TCGContext is passed
to tcg_context_init -- in this case &tcg_init_ctx.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Sat, 24 Jun 2017 00:04:43 +0000 (20:04 -0400)]
tcg: take tb_ctx out of TCGContext
Groundwork for supporting multiple TCG contexts.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Sat, 24 Jun 2017 00:57:44 +0000 (20:57 -0400)]
translate-all: report correct avg host TB size
Since commit
6e3b2bfd6 ("tcg: allocate TB structs before the
corresponding translated code") we are not fully utilizing
code_gen_buffer for translated code, and therefore are
incorrectly reporting the amount of translated code as well as
the average host TB size. Address this by:
- Making the conscious choice of misreporting the total translated code;
doing otherwise would mislead users into thinking "-tb-size" is not
honoured.
- Expanding tb_tree_stats to accurately count the bytes of translated code on
the host, and using this for reporting the average tb host size,
as well as the expansion ratio.
In the future we might want to consider reporting the accurate numbers for
the total translated code, together with a "bookkeeping/overhead" field to
account for the TB structs.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Wed, 12 Jul 2017 18:40:28 +0000 (14:40 -0400)]
exec-all: rename tb_free to tb_remove
We don't really free anything in this function anymore; we just remove
the TB from the binary search tree.
Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Fri, 23 Jun 2017 23:00:11 +0000 (19:00 -0400)]
translate-all: use a binary search tree to track TBs in TBContext
This is a prerequisite for supporting multiple TCG contexts, since
we will have threads generating code in separate regions of
code_gen_buffer.
For this we need a new field (.size) in struct tb_tc to keep
track of the size of the translated code. This field uses a size_t
to avoid adding a hole to the struct, although really an unsigned
int would have been enough.
The comparison function we use is optimized for the common case:
insertions. Profiling shows that upon booting debian-arm, 98%
of comparisons are between existing tb's (i.e. a->size and b->size
are both !0), which happens during insertions (and removals, but
those are rare). The remaining cases are lookups. From reading the glib
sources we see that the first key is always the lookup key. However,
the code does not assume this to always be the case because this
behaviour is not guaranteed in the glib docs. However, we embed
this knowledge in the code as a branch hint for the compiler.
Note that tb_free does not free space in the code_gen_buffer anymore,
since we cannot easily know whether the tb is the last one inserted
in code_gen_buffer. The next patch in this series renames tb_free
to tb_remove to reflect this.
Performance-wise, lookups in tb_find_pc are the same as before:
O(log n). However, insertions are O(log n) instead of O(1), which
results in a small slowdown when booting debian-arm:
Performance counter stats for 'build/arm-softmmu/qemu-system-arm \
-machine type=virt -nographic -smp 1 -m 4096 \
-netdev user,id=unet,hostfwd=tcp::2222-:22 \
-device virtio-net-device,netdev=unet \
-drive file=img/arm/jessie-arm32.qcow2,id=myblock,index=0,if=none \
-device virtio-blk-device,drive=myblock \
-kernel img/arm/aarch32-current-linux-kernel-only.img \
-append console=ttyAMA0 root=/dev/vda1 \
-name arm,debug-threads=on -smp 1' (10 runs):
- Before:
8048.598422 task-clock (msec) # 0.931 CPUs utilized ( +- 0.28% )
16,974 context-switches # 0.002 M/sec ( +- 0.12% )
0 cpu-migrations # 0.000 K/sec
10,125 page-faults # 0.001 M/sec ( +- 1.23% )
35,144,901,879 cycles # 4.367 GHz ( +- 0.14% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
65,758,252,643 instructions # 1.87 insns per cycle ( +- 0.33% )
10,871,298,668 branches # 1350.707 M/sec ( +- 0.41% )
192,322,212 branch-misses # 1.77% of all branches ( +- 0.32% )
8.
640869419 seconds time elapsed ( +- 0.57% )
- After:
8146.242027 task-clock (msec) # 0.923 CPUs utilized ( +- 1.23% )
17,016 context-switches # 0.002 M/sec ( +- 0.40% )
0 cpu-migrations # 0.000 K/sec
18,769 page-faults # 0.002 M/sec ( +- 0.45% )
35,660,956,120 cycles # 4.378 GHz ( +- 1.22% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
65,095,366,607 instructions # 1.83 insns per cycle ( +- 1.73% )
10,803,480,261 branches # 1326.192 M/sec ( +- 1.95% )
195,601,289 branch-misses # 1.81% of all branches ( +- 0.39% )
8.
828660235 seconds time elapsed ( +- 0.38% )
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Fri, 13 Oct 2017 19:22:28 +0000 (12:22 -0700)]
tcg: Remove CF_IGNORE_ICOUNT
Now that we have curr_cflags, we can include CF_USE_ICOUNT
early and then remove it as necessary.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Fri, 13 Oct 2017 19:15:06 +0000 (12:15 -0700)]
tcg: Add CF_LAST_IO + CF_USE_ICOUNT to CF_HASH_MASK
These flags are used by target/*/translate.c,
and affect code generation.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Fri, 14 Jul 2017 21:56:30 +0000 (17:56 -0400)]
cpu-exec: lookup/generate TB outside exclusive region during step_atomic
Now that all code generation has been converted to check CF_PARALLEL, we can
generate !CF_PARALLEL code without having yet set !parallel_cpus --
and therefore without having to be in the exclusive region during
cpu_exec_step_atomic.
While at it, merge cpu_exec_step into cpu_exec_step_atomic.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Sun, 16 Jul 2017 19:13:52 +0000 (15:13 -0400)]
tcg: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
The tb->cflags field is not passed to tcg generation functions. So
we add a field to TCGContext, storing there a copy of tb->cflags.
Most architectures have <= 32 registers, which results in a 4-byte hole
in TCGContext. Use this hole for the new field.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Fri, 14 Jul 2017 22:47:23 +0000 (18:47 -0400)]
target/sparc: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Thu, 20 Jul 2017 00:08:38 +0000 (20:08 -0400)]
target/sh4: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Fri, 14 Jul 2017 22:43:35 +0000 (18:43 -0400)]
target/s390x: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Fri, 14 Jul 2017 22:37:21 +0000 (18:37 -0400)]
target/m68k: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Fri, 14 Jul 2017 22:30:49 +0000 (18:30 -0400)]
target/i386: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Fri, 14 Jul 2017 22:29:47 +0000 (18:29 -0400)]
target/hppa: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Fri, 14 Jul 2017 22:20:49 +0000 (18:20 -0400)]
target/arm: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Wed, 19 Jul 2017 00:46:52 +0000 (20:46 -0400)]
tcg: convert tb->cflags reads to tb_cflags(tb)
Convert all existing readers of tb->cflags to tb_cflags, so that we
use atomic_read and therefore avoid undefined behaviour in C11.
Note that the remaining setters/getters of the field are protected
by tb_lock, and therefore do not need conversion.
Luckily all readers access the field via 'tb->cflags' (so no foo.cflags,
bar->cflags in the code base), which makes the conversion easily
scriptable:
FILES=$(git grep 'tb->cflags' target include/exec/gen-icount.h \
accel/tcg/translator.c | cut -f1 -d':' | sort | uniq)
perl -pi -e 's/([^.>])tb->cflags/$1tb_cflags(tb)/g' $FILES
perl -pi -e 's/([a-z->.]*)(->|\.)tb->cflags/tb_cflags($1$2tb)/g' $FILES
Then manually fixed the few errors that checkpatch reported.
Compile-tested for all targets.
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Fri, 13 Oct 2017 18:22:57 +0000 (11:22 -0700)]
tcg: Include CF_COUNT_MASK in CF_HASH_MASK
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Fri, 13 Oct 2017 17:50:02 +0000 (10:50 -0700)]
tcg: Add CPUState cflags_next_tb
We were generating code during tb_invalidate_phys_page_range,
check_watchpoint, cpu_io_recompile, and (seemingly) discarding
the TB, assuming that it would magically be picked up during
the next iteration through the cpu_exec loop.
Instead, record the desired cflags in CPUState so that we request
the proper TB so that there is no more magic.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emilio G. Cota [Tue, 11 Jul 2017 18:29:37 +0000 (14:29 -0400)]
tcg: define CF_PARALLEL and use it for TB hashing along with CF_COUNT_MASK
This will enable us to decouple code translation from the value
of parallel_cpus at any given time. It will also help us minimize
TB flushes when generating code via EXCP_ATOMIC.
Note that the declaration of parallel_cpus is brought to exec-all.h
to be able to define there the "curr_cflags" inline.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Fri, 20 Oct 2017 19:08:19 +0000 (12:08 -0700)]
tcg: Use offsets not indices for TCGv_*
Using the offset of a temporary, relative to TCGContext, rather than
its index means that we don't use 0. That leaves offset 0 free for
a NULL representation without having to leave index 0 unused.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 16 Oct 2017 02:02:42 +0000 (19:02 -0700)]
qom: Introduce CPUClass.tcg_initialize
Move target cpu tcg initialization to common code,
called from cpu_exec_realizefn.
Acked-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Fri, 20 Oct 2017 03:27:27 +0000 (20:27 -0700)]
tcg: Remove TCGV_EQUAL*
When we used structures for TCGv_*, we needed a macro in order to
perform a comparison. Now that we use pointers, this is just clutter.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Fri, 20 Oct 2017 07:30:24 +0000 (00:30 -0700)]
tcg: Remove GET_TCGV_* and MAKE_TCGV_*
The GET and MAKE functions weren't really specific enough.
We now have a full complement of functions that convert exactly
between temporaries, arguments, tcgv pointers, and indices.
The target/sparc change is also a bug fix, which would have affected
a host that defines TCG_TARGET_HAS_extr[lh]_i64_i32, i.e. MIPS64.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Fri, 20 Oct 2017 07:05:45 +0000 (00:05 -0700)]
tcg: Introduce temp_tcgv_{i32,i64,ptr}
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Sun, 15 Oct 2017 20:27:56 +0000 (13:27 -0700)]
tcg: Introduce tcgv_{i32,i64,ptr}_{arg,temp}
Transform TCGv_* to an "argument" or a temporary.
For now, an argument is simply the temporary index.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Sun, 15 Oct 2017 19:19:01 +0000 (12:19 -0700)]
tcg: Push tcg_ctx into tcg_gen_callN
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Sun, 15 Oct 2017 18:50:16 +0000 (11:50 -0700)]
tcg: Push tcg_ctx into generator functions
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 20 Jun 2017 20:43:15 +0000 (13:43 -0700)]
tcg: Use per-temp state data in optimize
While we're touching many of the lines anyway, adjust the naming
of the functions to better distinguish when "TCGArg" vs "TCGTemp"
should be used.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Wed, 9 Nov 2016 15:03:16 +0000 (16:03 +0100)]
tcg: Remove unused TCG_CALL_DUMMY_TCGV
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Wed, 9 Nov 2016 14:25:09 +0000 (15:25 +0100)]
tcg: Change temp_allocate_frame arg to TCGTemp
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Wed, 2 Nov 2016 17:21:44 +0000 (11:21 -0600)]
tcg: Avoid loops against variable bounds
Copy s->nb_globals or s->nb_temps to a local variable for the purposes
of iteration. This should allow the compiler to use low-overhead
looping constructs on some hosts.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Tue, 1 Nov 2016 21:56:04 +0000 (15:56 -0600)]
tcg: Use per-temp state data in liveness
This avoids having to allocate external memory for each temporary.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Tue, 20 Jun 2017 19:24:57 +0000 (12:24 -0700)]
tcg: Introduce temp_arg, export temp_idx
At the same time, drop the TCGContext argument and use tcg_ctx instead.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Tue, 20 Jun 2017 18:34:55 +0000 (11:34 -0700)]
tcg: Return NULL temp for TCG_CALL_DUMMY_ARG
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Wed, 2 Nov 2016 17:20:15 +0000 (11:20 -0600)]
tcg: Add temp_global bit to TCGTemp
This avoids needing to test the index of a temp against nb_globals.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Tue, 20 Jun 2017 06:18:10 +0000 (23:18 -0700)]
tcg: Introduce arg_temp
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Thu, 8 Dec 2016 21:42:08 +0000 (13:42 -0800)]
tcg: Propagate TCGOp down to allocators
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Thu, 8 Dec 2016 21:12:08 +0000 (13:12 -0800)]
tcg: Propagate args to op->args in tcg.c
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Thu, 8 Dec 2016 20:28:42 +0000 (12:28 -0800)]
tcg: Propagate args to op->args in optimizer
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Thu, 8 Dec 2016 18:52:57 +0000 (10:52 -0800)]
tcg: Merge opcode arguments into TCGOp
Rather than have a separate buffer of 10*max_ops entries,
give each opcode 10 entries. The result is actually a bit
smaller and should have slightly more cache locality.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Peter Maydell [Tue, 24 Oct 2017 15:55:56 +0000 (16:55 +0100)]
Merge remote-tracking branch 'remotes/kraxel/tags/input-
20171023-pull-request' into staging
input: fixes for ui input code and ps/2 keyboard (mostly sysrq key)
# gpg: Signature made Mon 23 Oct 2017 10:19:22 BST
# gpg: using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/input-
20171023-pull-request:
ui: pull in latest keycodemapdb
ui: normalize the 'sysrq' key into the 'print' key
ps2: fix scancodes sent for Ctrl+Pause key combination
ps2: fix scancodess sent for Pause key in AT set 1
ps2: fix scancodes sent for Shift/Ctrl+Print key combination
ps2: fix scancodes sent for Alt-Print key combination (aka SysRq)
ui: use correct union field for key number
ui: fix crash with sendkey and raw key numbers
input: use hex in ps2 keycode trace events
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 24 Oct 2017 15:05:57 +0000 (16:05 +0100)]
Merge remote-tracking branch 'remotes/kraxel/tags/usb-
20171023-pull-request' into staging
usb: ccid fix.
# gpg: Signature made Mon 23 Oct 2017 09:45:00 BST
# gpg: using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/usb-
20171023-pull-request:
usb-ccid: remove needless migration state code
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 24 Oct 2017 11:03:52 +0000 (12:03 +0100)]
Merge remote-tracking branch 'remotes/kraxel/tags/fixes-
20171023-pull-request' into staging
fixes for the fallout of the recent ui and keymap merges.
# gpg: Signature made Mon 23 Oct 2017 09:02:24 BST
# gpg: using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/fixes-
20171023-pull-request:
scripts: don't throw away stderr when checking out git submodules
ui: add qemu-keymap and shader to .gitignore
configure: disable qemu-keymap for linux-user qemu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 24 Oct 2017 09:50:49 +0000 (10:50 +0100)]
Merge remote-tracking branch 'remotes/shorne/tags/openrisc-
20171021-smp-pr' into staging
OpenRISC SMP patchset
20171021
# gpg: Signature made Fri 20 Oct 2017 22:51:16 BST
# gpg: using RSA key 0xC3B31C2D5E6627E4
# gpg: Good signature from "Stafford Horne <shorne@gmail.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: D9C4 7354 AEF8 6C10 3A25 EFF1 C3B3 1C2D 5E66 27E4
* remotes/shorne/tags/openrisc-
20171021-smp-pr:
openrisc: Only kick cpu on timeout, not on update
openrisc: Initial SMP support
openrisc/cputimer: Perparation for Multicore
target/openrisc: Make coreid and numcores variable
openrisc/ompic: Add OpenRISC Multicore PIC (OMPIC)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Juan Quintela [Tue, 5 Sep 2017 10:50:22 +0000 (12:50 +0200)]
migration: Improve migration thread error handling
We now report errors also when we finish migration, not only on info
migrate. We plan to use this error from several places, and we want
the first error to happen to win, so we add an mutex to order it.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Juan Quintela [Wed, 27 Sep 2017 08:52:11 +0000 (10:52 +0200)]
qapi: Fix grammar in x-multifd-page-count descriptions
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Alexey Perevalov [Thu, 5 Oct 2017 11:13:20 +0000 (14:13 +0300)]
migration: add bitmap for received page
This patch adds ability to track down already received
pages, it's necessary for calculation vCPU block time in
postcopy migration feature, and for recovery after
postcopy migration failure.
Also it's necessary to solve shared memory issue in
postcopy livemigration. Information about received pages
will be transferred to the software virtual bridge
(e.g. OVS-VSWITCHD), to avoid fallocate (unmap) for
already received pages. fallocate syscall is required for
remmaped shared memory, due to remmaping itself blocks
ioctl(UFFDIO_COPY, ioctl in this case will end with EEXIT
error (struct page is exists after remmap).
Bitmap is placed into RAMBlock as another postcopy/precopy
related bitmaps.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Alexey Perevalov [Thu, 5 Oct 2017 11:13:19 +0000 (14:13 +0300)]
migration: introduce qemu_ufd_copy_ioctl helper
Just for placing auxilary operations inside helper,
auxilary operations like: track received pages,
notify about copying operation in futher patches.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Alexey Perevalov [Thu, 5 Oct 2017 11:13:18 +0000 (14:13 +0300)]
migration: postcopy_place_page factoring out
Need to mark copied pages as closer as possible to the place where it
tracks down. That will be necessary in futher patch.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Peter Xu [Thu, 19 Oct 2017 06:32:00 +0000 (14:32 +0800)]
migration: new ram_init_bitmaps()
Rearrange the bitmap initialization and the first sync. Since at it,
make sure the locks are taken/released in correct order (I moved RCU
unlock upper - though it may not affect much).
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Peter Xu [Thu, 19 Oct 2017 06:31:59 +0000 (14:31 +0800)]
migration: clean up xbzrle cache init/destroy
Let's further simplify ram_init_all() and ram_save_cleanup() by abstract
all the XBZRLE related codes into their own functions.
When allocating xbzrle cache, we are always very careful on -ENOMEM;
which makes sense. Replacing the last g_malloc0() with g_try_malloc0(),
then refactor the logic a bit.
This patch should be fixing some memory leaks when some memory
allocation failed for XBZRLE in the past.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Peter Xu [Thu, 19 Oct 2017 06:31:58 +0000 (14:31 +0800)]
migration: provide ram_state_cleanup
There are two Mutexes that are created but not yet destroyed for
RAMState. Fix that.
Since we are at it, provide helper function to clean up RAMState.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Peter Xu [Thu, 19 Oct 2017 06:31:57 +0000 (14:31 +0800)]
migration: provide ram_state_init()
The old ram_state_init() is not really initializing the RAMState only,
but including lots of other stuff that is RAM-related. Renaming it to
ram_init_all(). Instead, provide a real ram_state_init().
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Fri, 20 Oct 2017 09:05:56 +0000 (10:05 +0100)]
migration: pause-before-switchover for postcopy
Add pause-before-switchover support for postcopy.
After starting postcopy it will transition
active->pre-switchover->postcopy_active
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Fri, 20 Oct 2017 09:05:55 +0000 (10:05 +0100)]
migration: allow cancel to unpause
If a migration_cancel is issued during the new paused state,
kick the pause_sem to get to unpause so it can cancel.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Fri, 20 Oct 2017 09:05:54 +0000 (10:05 +0100)]
migrate: HMP migate_continue
HMP equivalent to the just added migrate-continue
Unpause a migrate paused at a given state.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Fri, 20 Oct 2017 09:05:53 +0000 (10:05 +0100)]
migration: migrate-continue
A new qmp command allows the caller to continue from a given
paused state.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Fri, 20 Oct 2017 09:05:52 +0000 (10:05 +0100)]
migration: Wait for semaphore before completing migration
Wait for a semaphore before completing the migration,
if the previously added capability was enabled.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Fri, 20 Oct 2017 09:05:51 +0000 (10:05 +0100)]
migration: Add 'pre-switchover' and 'device' statuses
Add two statuses for use when the 'pause-before-switchover'
capability is enabled.
'pre-switchover' is the state that we wait in for management
to allow us to continue.
'device' is the state we enter while serialising the devices
after management gives us the OK.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Dr. David Alan Gilbert [Fri, 20 Oct 2017 09:05:50 +0000 (10:05 +0100)]
migration: Add 'pause-before-switchover' capability
When 'pause-before-switchover' is enabled, the outgoing migration
will pause before invalidating the block devices and serializing
the device state.
At this point the management layer gets the chance to clean up any
device jobs or other device users before the migration completes.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Juan Quintela [Fri, 6 Oct 2017 20:30:45 +0000 (22:30 +0200)]
migration: Make cache_init() take an error parameter
Once there, take a total size instead of the size of the pages. We
move the check that the new_size is bigger than one page from
xbzrle_cache_resize().
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Fix typo spotted by Peter Xu