Peter Maydell [Tue, 26 May 2015 10:31:03 +0000 (11:31 +0100)]
Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging
# gpg: Signature made Fri May 22 20:58:44 2015 BST using RSA key ID
AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: FAEB 9711 A12C F475 812F 18F2 88A9 064D 1835 61EB
# Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76 CBD0 7DEF 8106 AAFC 390E
* remotes/jnsnow/tags/ide-pull-request:
ahci: do not remap clb/fis unconditionally
macio: move unaligned DMA write code into separate pmac_dma_write() function
macio: move unaligned DMA read code into separate pmac_dma_read() function
qtest: pre-buffer hex nibs
libqos/ahci: Swap memread/write with bufread/write
qtest: add memset to qtest protocol
qtest: Add base64 encoded read/write
qtest: allow arbitrarily long sends
qtest/ahci: add migrate halted dma test
qtest/ahci: add halted dma test
qtest/ahci: add flush migrate test
qtest/ahci: add migrate dma test
qtest/ahci: Add migration test
ich9/ahci: Enable Migration
libqos: Add migration helpers
libqos/ahci: Fix sector set method
libqos/ahci: Add halted command helpers
glib: remove stale compat functions
configure: require glib 2.22
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
John Snow [Fri, 22 May 2015 18:13:44 +0000 (14:13 -0400)]
ahci: do not remap clb/fis unconditionally
This continues the IOMMU fix from 2.3, where we should not attempt
to remap the CLB or FIS RX buffers if the AHCI device is currently
running.
The same applies to migration: keep our mitts off these registers
unless the device is supposed to be on.
Does not impact backwards compatibility for the AHCI device.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id:
1431470173-30847-2-git-send-email-jsnow@redhat.com
Mark Cave-Ayland [Fri, 22 May 2015 18:13:44 +0000 (14:13 -0400)]
macio: move unaligned DMA write code into separate pmac_dma_write() function
Similarly switch the macio IDE routines over to use the new function and
tidy-up the remaining code as required.
[Maintainer edit: printf format codes adjusted for 32/64bit. --js]
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: John Snow <jsnow@redhat.com>
Message-id:
1425939893-14404-3-git-send-email-mark.cave-ayland@ilande.co.uk
Signed-off-by: John Snow <jsnow@redhat.com>
Mark Cave-Ayland [Fri, 22 May 2015 18:13:44 +0000 (14:13 -0400)]
macio: move unaligned DMA read code into separate pmac_dma_read() function
This considerably helps simplify the complexity of the macio read routines and
by switching macio CDROM accesses to use the new code, fixes the issue with
the CDROM device being detected intermittently by Darwin/OS X.
[Maintainer edit: printf format codes adjusted for 32/64bit. --js]
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ailande.co.uk>
Acked-by: John Snow <jsnow@redhat.com>
Message-id:
1425939893-14404-2-git-send-email-mark.cave-ayland@ilande.co.uk
Signed-off-by: John Snow <jsnow@redhat.com>
John Snow [Fri, 22 May 2015 18:13:44 +0000 (14:13 -0400)]
qtest: pre-buffer hex nibs
Instead of converting each byte one-at-a-time and then sending each byte
over the wire, use sprintf() to pre-compute all of the hex nibs into a
single buffer, then send the entire buffer all at once.
This gives a moderate speed boost to memread() and memwrite() functions.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id:
1431021095-7558-2-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:44 +0000 (14:13 -0400)]
libqos/ahci: Swap memread/write with bufread/write
Where it makes sense, use the new faster primitives.
For generally small reads/writes such as for the PRDT
and FIS packets, stick with the more wasteful but
easier to debug memread/memwrite.
For ahci-test (before migration tests):
With this patch:
real 0m3.675s
user 0m2.582s
sys 0m1.718s
Without any qtest protocol improvements:
real 0m14.171s
user 0m12.072s
sys 0m12.527s
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id:
1430864578-22072-6-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:44 +0000 (14:13 -0400)]
qtest: add memset to qtest protocol
Previously, memset was just a frontend to write() and only
stupidly sent the pattern many times across the wire.
Let's not discuss who stupidly wrote it like that in the first place.
(Hint: It was me.)
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id:
1430864578-22072-4-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:44 +0000 (14:13 -0400)]
qtest: Add base64 encoded read/write
For larger pieces of data that won't need to be debugged and
viewing the hex nibbles is unlikely to be useful, we can encode
data using base64 instead of encoding each byte as %02x, which
leads to some space savings and faster reads/writes.
For now, the default is left as hex nibbles in memwrite() and memread().
For the purposes of making qtest io easier to read and debug, some
callers may want to specify using the old encoding format for small
patches of data where the savings from base64 wouldn't be that profound.
memwrite/memread use a data encoding that takes 2x the size of the original
buffer, but base64 uses "only" (4/3)x, so for larger buffers we can save a
decent amount of time and space.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id:
1430864578-22072-3-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:43 +0000 (14:13 -0400)]
qtest: allow arbitrarily long sends
qtest currently has a static buffer of size 1024 that if we
overflow, ignores the additional data silently which leads
to hangs or stream failures.
Use glib's string facilities to allow arbitrarily long data,
but split this off into a new function, qtest_sendf.
Static data can still be sent using qtest_send, which avoids
the malloc/copy overhead.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id:
1430864578-22072-2-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:43 +0000 (14:13 -0400)]
qtest/ahci: add migrate halted dma test
Test migrating a halted DMA transaction.
Resume, then test data integrity.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id:
1430417242-11859-10-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:43 +0000 (14:13 -0400)]
qtest/ahci: add halted dma test
If we're going to test the migration of halted DMA jobs,
we should probably check to make sure we can resume them
locally as a first step.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id:
1430417242-11859-9-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:43 +0000 (14:13 -0400)]
qtest/ahci: add flush migrate test
Use blkdebug to inject an error on first flush, then attempt to flush
on the first guest. When the error halts the VM, migrate to the
second VM, and attempt to resume the command.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id:
1430417242-11859-8-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:43 +0000 (14:13 -0400)]
qtest/ahci: add migrate dma test
Write to one guest, migrate, and then read from the other.
adjust ahci_io to clear any buffers it creates, so that we
can use ahci_io safely on both guests knowing we are using
empty buffers and not accidentally re-using data.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id:
1430417242-11859-7-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:43 +0000 (14:13 -0400)]
qtest/ahci: Add migration test
Notes:
* The migration is performed on QOSState objects.
* The migration is performed in such a way that it does not assume
consistency between the allocators attached to each. That is to say,
you can use each QOSState object completely independently and then at
an arbitrary point decide to migrate, and the destination object will
now be consistent with the memory within the source guest. The source
object that was migrated from will have a completely blank allocator.
ahci-test.c:
- verify_state is added
- ahci_migrate is added as a frontend to migrate
- test_migrate_sanity test case is added.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id:
1430417242-11859-6-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:43 +0000 (14:13 -0400)]
ich9/ahci: Enable Migration
Lift the flag preventing the migration of the ICH9/AHCI devices.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id:
1430417242-11859-5-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:43 +0000 (14:13 -0400)]
libqos: Add migration helpers
libqos.c:
-set_context for addressing which commands go where
-migrate performs the actual migration
malloc.c:
- Structure of the allocator is adjusted slightly with
a second-tier malloc to make swapping around the allocators
easy when we "migrate" the lists from the source to the destination.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id:
1430417242-11859-4-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:42 +0000 (14:13 -0400)]
libqos/ahci: Fix sector set method
|| probably does not mean the same thing as |.
Additionally, allow users to submit a prd_size of 0
to indicate that they'd like to continue using the default.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id:
1430417242-11859-3-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:42 +0000 (14:13 -0400)]
libqos/ahci: Add halted command helpers
Sometimes we want a command to halt the VM instead
of complete successfully, so it'd be nice to let the
libqos/ahci functions cope with such scenarios.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id:
1430417242-11859-2-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:42 +0000 (14:13 -0400)]
glib: remove stale compat functions
Since we're bumping the version to 2.22+,
remove the now-stale compat functions.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id:
1431469140-22208-2-git-send-email-jsnow@redhat.com
John Snow [Fri, 22 May 2015 18:13:42 +0000 (14:13 -0400)]
configure: require glib 2.22
This provides g_ptr_array_new_with_free_func, as well as a few
other functions that we've been hacking around in glib-compat.h.
Cleaning up the compatibility headers will come later.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id:
1431469140-22208-2-git-send-email-jsnow@redhat.com
Peter Maydell [Fri, 22 May 2015 16:20:09 +0000 (17:20 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer core and image format patches
# gpg: Signature made Fri May 22 16:21:03 2015 BST using RSA key ID
C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream: (22 commits)
MAINTAINERS: Split "Block QAPI, monitor, command line" off core
MAINTAINERS: Add header files to Block Layer Core section
tests: add test case for encrypted qcow2 read/write
qemu-io: prompt for encryption keys when required
util: allow \n to terminate password input
util: move read_password method out of qemu-img into osdep/oslib
qcow2/qcow: protect against uninitialized encryption key
qemu-iotests: Make debugging python tests easier
qemu-iotests: qemu-img info on afl VMDK image with a huge capacity
block: Detect multiplication overflow in bdrv_getlength
qemu-io: Use getopt() correctly
qcow2: style fixes in qcow2-cache.c
qcow2: make qcow2_cache_put() a void function
qcow2: use a hash to look for entries in the L2 cache
qcow2: remove qcow2_cache_find_entry_to_replace()
qcow2: use an LRU algorithm to replace entries from the L2 cache
qcow2: simplify qcow2_cache_put() and qcow2_cache_entry_mark_dirty()
qcow2: use one single memory block for the L2/refcount cache tables
vmdk: Fix overflow if l1_size is 0x20000000
vmdk: Fix next_cluster_sector for compressed write
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Fri, 22 May 2015 15:22:42 +0000 (16:22 +0100)]
Merge remote-tracking branch 'remotes/bkoppelmann/tags/pull-tricore-
20150522' into staging
TriCore v1.6.1 ISA and missing v1.6 instructions
# gpg: Signature made Fri May 22 16:02:45 2015 BST using RSA key ID
6B69CA14
# gpg: Good signature from "Bastian Koppelmann <kbastian@mail.uni-paderborn.de>"
* remotes/bkoppelmann/tags/pull-tricore-
20150522:
target-tricore: add RR_DIV and RR_DIV_U instructions of the v1.6 ISA
target-tricore: add FRET instructions of the v1.6 ISA
target-tricore: add FCALL instructions of the v1.6 ISA
target-tricore: add SYS_RESTORE instruction of the v1.6 ISA
target-tricore: add RR_CRC32 instruction of the v1.6.1 ISA
target-tricore: add SWAPMSK instructions of the v1.6.1 ISA
target-tricore: add CMPSWP instructions of the v1.6.1 ISA
target-tricore: Add SRC_MOV_E instruction of the v1.6 ISA
target-tricore: introduce ISA v1.6.1 feature
target-tricore: Add ISA v1.3.1 cpu and fix tc1796 to using v1.3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Markus Armbruster [Wed, 20 May 2015 11:23:46 +0000 (13:23 +0200)]
MAINTAINERS: Split "Block QAPI, monitor, command line" off core
Kevin and Stefan asked me to take care of this part.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kevin Wolf [Wed, 20 May 2015 10:03:17 +0000 (12:03 +0200)]
MAINTAINERS: Add header files to Block Layer Core section
Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Daniel P. Berrange [Tue, 12 May 2015 16:09:22 +0000 (17:09 +0100)]
tests: add test case for encrypted qcow2 read/write
Add a simple test case for qemu-iotests that covers read/write
with encrypted qcow2 files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Daniel P. Berrange [Tue, 12 May 2015 16:09:21 +0000 (17:09 +0100)]
qemu-io: prompt for encryption keys when required
The qemu-io tool does not check if the image is encrypted so
historically would silently corrupt the sectors by writing
plain text data into them instead of cipher text. The earlier
commit turns this mistake into a fatal abort, so check for
encryption and prompt for key when required.
This enables us to add unit tests to ensure we don't break
the ability of qemu-img to convert existing encrypted qcow2
files into a non-encrypted format.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Daniel P. Berrange [Tue, 12 May 2015 16:09:20 +0000 (17:09 +0100)]
util: allow \n to terminate password input
The qemu_read_password() method looks for \r to terminate the
reading of the a password. This is what will be seen when
reading the password from a TTY. When scripting though, it is
useful to be able to send the password via a pipe, in which
case we must look for \n to terminate password input.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Daniel P. Berrange [Tue, 12 May 2015 16:09:19 +0000 (17:09 +0100)]
util: move read_password method out of qemu-img into osdep/oslib
The qemu-img.c file has a read_password() method impl that is
used to prompt for passwords on the console, with impls for
POSIX and Windows. This will be needed by qemu-io.c too, so
move it into the QEMU osdep/oslib files where it can be shared
without code duplication
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Daniel P. Berrange [Tue, 12 May 2015 16:09:18 +0000 (17:09 +0100)]
qcow2/qcow: protect against uninitialized encryption key
When a qcow[2] file is opened, if the header reports an
encryption method, this is used to set the 'crypt_method_header'
field on the BDRVQcow[2]State struct, and the 'encrypted' flag
in the BDRVState struct.
When doing I/O operations, the 'crypt_method' field on the
BDRVQcow[2]State struct is checked to determine if encryption
needs to be applied.
The crypt_method_header value is copied into crypt_method when
the bdrv_set_key() method is called.
The QEMU code which opens a block device is expected to always
do a check
if (bdrv_is_encrypted(bs)) {
bdrv_set_key(bs, ....key...);
}
If code forgets to do this, then 'crypt_method' is never set
and so when I/O is performed, QEMU writes plain text data
into a sector which is expected to contain cipher text, or
when reading, will return cipher text instead of plain
text.
Change the qcow[2] code to consult bs->encrypted when deciding
whether encryption is required, and assert(s->crypt_method)
to protect against cases where the caller forgets to set the
encryption key.
Also put an assert in the set_key methods to protect against
the case where the caller sets an encryption key on a block
device that does not have encryption
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fam Zheng [Mon, 18 May 2015 01:39:12 +0000 (09:39 +0800)]
qemu-iotests: Make debugging python tests easier
Adding "-d" option. The output goes to "tee" so it appears in your
console. Also, raise the verbosity of unnitest runner.
When testing a topic branch, it's possible that a bug introduced by a
code change makes the python test case hang, with debug output, it is
much easier to locate the problem.
This can also be helpful if you want to watch the progress of a python
test, it offers you a way to sense the speed of each test case method
you're writing.
Note: because there is no easy way to get *both* the verbose output and
the output expected by ./check comparison, the case would always fail
with an "output mismatch". The sole purpose of using this option is
giving developers a quick way to debug when things go wrong.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fam Zheng [Fri, 15 May 2015 08:36:06 +0000 (16:36 +0800)]
qemu-iotests: qemu-img info on afl VMDK image with a huge capacity
The image is contributed by Richard W.M. Jones.
Cc: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fam Zheng [Fri, 15 May 2015 08:36:05 +0000 (16:36 +0800)]
block: Detect multiplication overflow in bdrv_getlength
Bogus image may have a large total_sectors that will overflow the
multiplication. For cleanness, fix the return code so the error message
will be meaningful.
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Tue, 12 May 2015 15:10:56 +0000 (09:10 -0600)]
qemu-io: Use getopt() correctly
POSIX says getopt() returns -1 on completion. While Linux happens
to define EOF as -1, this definition is not required by POSIX, and
there may be platforms where checking for EOF instead of -1 would
lead to an infinite loop.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Mon, 11 May 2015 12:54:59 +0000 (15:54 +0300)]
qcow2: style fixes in qcow2-cache.c
Fix pointer declaration to make it consistent with the rest of the
code.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Mon, 11 May 2015 12:54:58 +0000 (15:54 +0300)]
qcow2: make qcow2_cache_put() a void function
This function never receives an invalid table pointer, so we can make
it void and remove all the error checking code.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Mon, 11 May 2015 12:54:57 +0000 (15:54 +0300)]
qcow2: use a hash to look for entries in the L2 cache
The current cache algorithm traverses the array starting always from
the beginning, so the average number of comparisons needed to perform
a lookup is proportional to the size of the array.
By using a hash of the offset as the starting point, lookups are
faster and independent from the array size.
The hash is computed using the cluster number of the table, multiplied
by 4 to make it perform better when there are collisions.
In my tests, using a cache with 2048 entries, this reduces the average
number of comparisons per lookup from 430 to 2.5.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Mon, 11 May 2015 12:54:56 +0000 (15:54 +0300)]
qcow2: remove qcow2_cache_find_entry_to_replace()
A cache miss means that the whole array was traversed and the entry
we were looking for was not found, so there's no need to traverse it
again in order to select an entry to replace.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Mon, 11 May 2015 12:54:55 +0000 (15:54 +0300)]
qcow2: use an LRU algorithm to replace entries from the L2 cache
The current algorithm to evict entries from the cache gives always
preference to those in the lowest positions. As the size of the cache
increases, the chances of the later elements of being removed decrease
exponentially.
In a scenario with random I/O and lots of cache misses, entries in
positions 8 and higher are rarely (if ever) evicted. This can be seen
even with the default cache size, but with larger caches the problem
becomes more obvious.
Using an LRU algorithm makes the chances of being removed from the
cache independent from the position.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Mon, 11 May 2015 12:54:54 +0000 (15:54 +0300)]
qcow2: simplify qcow2_cache_put() and qcow2_cache_entry_mark_dirty()
Since all tables are now stored together, it is possible to obtain
the position of a particular table directly from its address, so the
operation becomes O(1).
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Mon, 11 May 2015 12:54:53 +0000 (15:54 +0300)]
qcow2: use one single memory block for the L2/refcount cache tables
The qcow2 L2/refcount cache contains one separate table for each cache
entry. Doing one allocation per table adds unnecessary overhead and it
also requires us to store the address of each table separately.
Since the size of the cache is constant during its lifetime, it's
better to have an array that contains all the tables using one single
allocation.
In my tests measuring freshly created caches with sizes 128MB (L2) and
32MB (refcount) this uses around 10MB of RAM less.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fam Zheng [Tue, 5 May 2015 09:28:13 +0000 (17:28 +0800)]
vmdk: Fix overflow if l1_size is 0x20000000
Richard Jones caught this bug with afl fuzzer.
In fact, that's the only possible value to overflow (extent->l1_size =
0x20000000) l1_size:
l1_size = extent->l1_size * sizeof(long) => 0x80000000;
g_try_malloc returns NULL because l1_size is interpreted as negative
during type casting from 'int' to 'gsize', which yields a enormous
value. Hence, by coincidence, we get a "not too bad" behavior:
qemu-img: Could not open '/tmp/afl6.img': Could not open
'/tmp/afl6.img': Cannot allocate memory
Values larger than 0x20000000 will be refused by the validation in
vmdk_add_extent.
Values smaller than 0x20000000 will not overflow l1_size.
Cc: qemu-stable@nongnu.org
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fam Zheng [Wed, 6 May 2015 12:23:46 +0000 (20:23 +0800)]
vmdk: Fix next_cluster_sector for compressed write
This fixes the bug introduced by commit
c6ac36e (vmdk: Optimize cluster
allocation).
Sometimes, write_len could be larger than cluster size, because it
contains both data and marker. We must advance next_cluster_sector in
this case, otherwise the image gets corrupted.
Cc: qemu-stable@nongnu.org
Reported-by: Antoni Villalonga <qemu-list@friki.cat>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Christoph Hellwig [Thu, 30 Apr 2015 09:44:17 +0000 (11:44 +0200)]
nvme: support NVME_VOLATILE_WRITE_CACHE feature
The SCSI emulation in the Linux NVMe driver really wants to know
if a device has a volatile write cache. Given that qemu has moved
away from a model where we report the backing store WCE bit to
one where the WCE bit is supposed to be part of the migratable
guest-visible state we always return 1 here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kevin Wolf [Wed, 6 May 2015 11:21:51 +0000 (13:21 +0200)]
qcow2: Flush pending discards before allocating cluster
Before a freed cluster can be reused, pending discards for this cluster
must be processed.
The original assumption was that this was not a problem because discards
are only cached during discard/write zeroes operations, which are
synchronous so that no concurrent write requests can cause cluster
allocations.
However, the discard/write zeroes operation itself can allocate a new L2
table (and it has to in order to put zero flags there), so make sure we
can cope with the situation.
This fixes https://bugs.launchpad.net/bugs/
1349972.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Bastian Koppelmann [Mon, 11 May 2015 12:59:55 +0000 (14:59 +0200)]
target-tricore: add RR_DIV and RR_DIV_U instructions of the v1.6 ISA
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Bastian Koppelmann [Thu, 7 May 2015 20:46:50 +0000 (22:46 +0200)]
target-tricore: add FRET instructions of the v1.6 ISA
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Bastian Koppelmann [Thu, 7 May 2015 20:38:16 +0000 (22:38 +0200)]
target-tricore: add FCALL instructions of the v1.6 ISA
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Bastian Koppelmann [Thu, 7 May 2015 19:25:42 +0000 (21:25 +0200)]
target-tricore: add SYS_RESTORE instruction of the v1.6 ISA
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Bastian Koppelmann [Thu, 7 May 2015 17:55:37 +0000 (19:55 +0200)]
target-tricore: add RR_CRC32 instruction of the v1.6.1 ISA
This instruction was introduced by the new Aurix platform.
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Bastian Koppelmann [Wed, 6 May 2015 18:57:10 +0000 (20:57 +0200)]
target-tricore: add SWAPMSK instructions of the v1.6.1 ISA
Those instruction were introduced in the new Aurix platform.
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Bastian Koppelmann [Wed, 6 May 2015 18:47:39 +0000 (20:47 +0200)]
target-tricore: add CMPSWP instructions of the v1.6.1 ISA
Those instruction were introduced in the new Aurix platform.
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Bastian Koppelmann [Wed, 6 May 2015 18:22:45 +0000 (20:22 +0200)]
target-tricore: Add SRC_MOV_E instruction of the v1.6 ISA
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Bastian Koppelmann [Wed, 6 May 2015 18:18:41 +0000 (20:18 +0200)]
target-tricore: introduce ISA v1.6.1 feature
The aurix platform contains of several different cpu models and uses
the 1.6.1 ISA. This patch changes the generic aurix model to the more
specific tc27x cpu model and sets specific features.
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Bastian Koppelmann [Sun, 22 Mar 2015 12:49:12 +0000 (12:49 +0000)]
target-tricore: Add ISA v1.3.1 cpu and fix tc1796 to using v1.3
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Peter Maydell [Fri, 22 May 2015 12:25:40 +0000 (13:25 +0100)]
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Fri May 22 10:00:53 2015 BST using RSA key ID
81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/block-pull-request: (38 commits)
block: get_block_status: use "else" when testing the opposite condition
qemu-iotests: Test unaligned sub-block zero write
block: Fix NULL deference for unaligned write if qiov is NULL
Revert "block: Fix unaligned zero write"
block: align bounce buffers to page
block: minimal bounce buffer alignment
block: return EPERM on writes or discards to read-only devices
configure: Add workaround for ccache and clang
configure: silence glib unknown attribute __alloc_size__
configure: factor out supported flag check
configure: handle clang -nopie argument warning
block/parallels: improve image writing performance further
block/parallels: optimize linear image expansion
block/parallels: add prealloc-mode and prealloc-size open paramemets
block/parallels: delay writing to BAT till bdrv_co_flush_to_os
block/parallels: create bat_entry_off helper
block/parallels: improve image reading performance
iotests, parallels: check for incorrectly closed image in tests
block/parallels: implement incorrect close detection
block/parallels: implement parallels_check method of block driver
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Fri, 22 May 2015 11:30:13 +0000 (12:30 +0100)]
Revert "target-alpha: Add vector implementation for CMPBGE"
This reverts commit
32ad48abd74a997220b841e4e913edeb267aa362.
Unfortunately the SSE2 code here fails to compile on some versions
of gcc:
target-alpha/int_helper.c:77:24: error: invalid operands to binary >=
(have '__vector(16) unsigned char' and '__vector(16) unsigned char')
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Fri, 22 May 2015 09:06:33 +0000 (10:06 +0100)]
Merge remote-tracking branch 'remotes/rth/tags/pull-axp-
20150521' into staging
Rewrite fp exceptions
# gpg: Signature made Thu May 21 18:35:52 2015 BST using RSA key ID
4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg: aka "Richard Henderson <rth@redhat.com>"
# gpg: aka "Richard Henderson <rth@twiddle.net>"
* remotes/rth/tags/pull-axp-
20150521:
target-alpha: Add vector implementation for CMPBGE
target-alpha: Rewrite helper_zapnot
target-alpha: Raise IOV from CVTQL
target-alpha: Suppress underflow from CVTTQ if DNZ
target-alpha: Raise EXC_M_INV properly for fp inputs
target-alpha: Disallow literal operand to 1C.30 to 1C.37
target-alpha: Implement WH64EN
target-alpha: Fix integer overflow checking insns
target-alpha: Fix cvttq vs inf
target-alpha: Fix cvttq vs large integers
target-alpha: Raise IOV from CVTTQ
target-alpha: Set EXC_M_SWC for exceptions from /S insns
target-alpha: Set fpcr_exc_status even for disabled exceptions
target-alpha: Tidy FPCR representation
target-alpha: Set PC correctly for floating-point exceptions
target-alpha: Forget installed round mode after MT_FPCR
target-alpha: Rename floating-point subroutines
target-alpha: Move VAX helpers to a new file
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Paolo Bonzini [Thu, 14 May 2015 10:35:02 +0000 (12:35 +0200)]
block: get_block_status: use "else" when testing the opposite condition
A bit of Boolean algebra (and common sense) tells us that the
second "if" here is looking for blocks that are not allocated.
This is the opposite of the "if" that sets BDRV_BLOCK_ALLOCATED,
and thus it can use an "else".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id:
1431599702-10431-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Fam Zheng [Wed, 13 May 2015 13:12:01 +0000 (13:12 +0000)]
qemu-iotests: Test unaligned sub-block zero write
Test zero write in byte range 512~1024 for 4k alignment.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id:
1431522721-3266-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Fam Zheng [Wed, 13 May 2015 13:12:00 +0000 (13:12 +0000)]
block: Fix NULL deference for unaligned write if qiov is NULL
For zero write, callers pass in NULL qiov (qemu-io "write -z" or
scsi-disk "write same").
Commit
fc3959e466 fixed bdrv_co_write_zeroes which is the common case
for this bug, but it still exists in bdrv_aio_write_zeroes. A simpler
fix would be in bdrv_co_do_pwritev which is the NULL dereference point
and covers both cases.
So don't access it in bdrv_co_do_pwritev in this case, use three aligned
writes.
[Initialize ret to 0 in bdrv_co_do_zero_pwritev() to avoid uninitialized
variable warning with gcc 4.9.2.
--Stefan]
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1431522721-3266-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Fam Zheng [Wed, 13 May 2015 13:11:59 +0000 (13:11 +0000)]
Revert "block: Fix unaligned zero write"
This reverts commit
fc3959e4669a1c2149b91ccb05101cfc7ae1fc05.
The core write code already handles the case, so remove this
duplication.
Because commit
61007b316 moved the touched code from block.c to
block/io.c, the change is manually reverted.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id:
1431522721-3266-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 12 May 2015 14:30:56 +0000 (17:30 +0300)]
block: align bounce buffers to page
The following sequence
int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
for (i = 0; i < 100000; i++)
write(fd, buf, 4096);
performs 5% better if buf is aligned to 4096 bytes.
The difference is quite reliable.
On the other hand we do not want at the moment to enforce bounce
buffering if guest request is aligned to 512 bytes.
The patch changes default bounce buffer optimal alignment to
MAX(page size, 4k). 4k is chosen as maximal known sector size on real
HDD.
The justification of the performance improve is quite interesting.
From the kernel point of view each request to the disk was split
by two. This could be seen by blktrace like this:
9,0 11 1 0.
000000000 11151 Q WS
312737792 + 1023 [qemu-img]
9,0 11 2 0.
000007938 11151 Q WS
312738815 + 8 [qemu-img]
9,0 11 3 0.
000030735 11151 Q WS
312738823 + 1016 [qemu-img]
9,0 11 4 0.
000032482 11151 Q WS
312739839 + 8 [qemu-img]
9,0 11 5 0.
000041379 11151 Q WS
312739847 + 1016 [qemu-img]
9,0 11 6 0.
000042818 11151 Q WS
312740863 + 8 [qemu-img]
9,0 11 7 0.
000051236 11151 Q WS
312740871 + 1017 [qemu-img]
9,0 5 1 0.
169071519 11151 Q WS
312741888 + 1023 [qemu-img]
After the patch the pattern becomes normal:
9,0 6 1 0.
000000000 12422 Q WS
314834944 + 1024 [qemu-img]
9,0 6 2 0.
000038527 12422 Q WS
314835968 + 1024 [qemu-img]
9,0 6 3 0.
000072849 12422 Q WS
314836992 + 1024 [qemu-img]
9,0 6 4 0.
000106276 12422 Q WS
314838016 + 1024 [qemu-img]
and the amount of requests sent to disk (could be calculated counting
number of lines in the output of blktrace) is reduced about 2 times.
Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
does his job well and real requests comes properly aligned (to page).
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id:
1431441056-26198-3-git-send-email-den@openvz.org
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 12 May 2015 14:30:55 +0000 (17:30 +0300)]
block: minimal bounce buffer alignment
The patch introduces new concept: minimal memory alignment for bounce
buffers. Original so called "optimal" value is actually minimal required
value for aligment. It should be used for validation that the IOVec
is properly aligned and bounce buffer is not required.
Though, from the performance point of view, it would be better if
bounce buffer or IOVec allocated by QEMU will be aligned stricter.
The patch does not change any alignment value yet.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id:
1431441056-26198-2-git-send-email-den@openvz.org
CC: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini [Thu, 7 May 2015 15:45:48 +0000 (17:45 +0200)]
block: return EPERM on writes or discards to read-only devices
This is the behavior in the operating system, for example Linux's
blkdev_write_iter has the following:
if (bdev_read_only(I_BDEV(bd_inode)))
return -EPERM;
This does not apply to opening a device for read/write, when the
device only supports read-only operation. In this case any of
EACCES, EPERM or EROFS is acceptable depending on why writing is
not possible.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id:
1431013548-22492-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
John Snow [Wed, 25 Mar 2015 22:57:39 +0000 (18:57 -0400)]
configure: Add workaround for ccache and clang
Test if ccache is interfering with semantic analysis of macros,
disable its habit of trying to compile already pre-processed
versions of code if so. ccache attempts to save time by compiling
pre-processed versions of code, but this disturbs clang's static
analysis enough to produce false positives.
ccache allows us to disable this feature, opting instead to
compile the original version instead of its preprocessed version.
This makes ccache much slower for cache misses, but at least it
becomes usable with QEMU/clang.
This workaround only activates for users using ccache AND clang,
and only if their configuration is observed to be producing warnings.
You may need to clear your ccache for builds started without -Werror,
as those may continue to produce warnings from the cache.
Thanks to Peter Eisentraut for his writeup on the issue:
http://peter.eisentraut.org/blog/2014/12/01/ccache-and-clang-part-3/
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id:
1427324259-1481-5-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
John Snow [Wed, 25 Mar 2015 22:57:38 +0000 (18:57 -0400)]
configure: silence glib unknown attribute __alloc_size__
The glib headers use GCC attributes. Unfortunately the __GNUC__ and
__GNUC_MINOR__ version macros are also defined by clang, but clang
doesn't support the same attributes as GCC.
clang 3.5.0 does not support the __alloc_size__ attribute:
https://github.com/llvm-mirror/clang/commit/
c047507a9a79e89fc8339e074fa72822a7e7ea73
The following warning is produced:
gstrfuncs.h:257:44: warning: unknown attribute '__alloc_size__' ignored [-Wunknown-attributes]
G_GNUC_MALLOC G_GNUC_ALLOC_SIZE(2);
gmacros.h:67:45: note: expanded from macro 'G_GNUC_ALLOC_SIZE'
#define G_GNUC_ALLOC_SIZE(x) __attribute__((__alloc_size__(x)))
This patch checks whether glib headers cause warnings and disables
-Wunknown-attributes if it is able to.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id:
1427324259-1481-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
John Snow [Wed, 25 Mar 2015 22:57:37 +0000 (18:57 -0400)]
configure: factor out supported flag check
Factor out the function that checks if a compiler
flag is supported or not.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id:
1427324259-1481-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Wed, 25 Mar 2015 22:57:36 +0000 (18:57 -0400)]
configure: handle clang -nopie argument warning
gcc 4.9.2 treats -nopie as an error:
cc: error: unrecognized command line option ‘-nopie’
clang 3.5.0 treats -nopie as a warning:
clang: warning: argument unused during compilation: '-nopie'
The causes ./configure to fail with clang:
ERROR: configure test passed without -Werror but failed with -Werror.
Make the -nopie test use -Werror so that compile_prog works for both gcc
and clang.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id:
1427324259-1481-2-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:47:00 +0000 (10:47 +0300)]
block/parallels: improve image writing performance further
Try to perform IO for the biggest continuous block possible.
All blocks abscent in the image are accounted in the same type
and preallocation is made for all of them at once.
The performance for sequential write is increased from 200 Mb/sec to
235 Mb/sec on my SSD HDD.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-28-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:59 +0000 (10:46 +0300)]
block/parallels: optimize linear image expansion
Plain image expansion spends a lot of time to update image file size.
This seriously affects the performance. The following simple test
qemu_img create -f parallels -o cluster_size=64k ./1.hds 64G
qemu_io -n -c "write -P 0x11 0 1024M" ./1.hds
could be improved if the format driver will pre-allocate some space
in the image file with a reasonable chunk.
This patch preallocates 128 Mb using bdrv_write_zeroes, which should
normally use fallocate() call inside. Fallback to older truncate()
could be used as a fallback using image open options thanks to the
previous patch.
The benefit is around 15%.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Karan <rkagan@parallels.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-27-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:58 +0000 (10:46 +0300)]
block/parallels: add prealloc-mode and prealloc-size open paramemets
This is preparational commit for tweaks in Parallels image expansion.
The idea is that enlarge via truncate by one data block is slow. It
would be much better to use fallocate via bdrv_write_zeroes and
expand by some significant amount at once.
Original idea with sequential file writing to the end of the file without
fallocate/truncate would be slower than this approach if the image is
expanded with several operations:
- each image expanding means file metadata update, i.e. filesystem
journal write. Truncate/write to newly truncated space update file
metadata twice thus truncate removal helps. With fallocate call
inside bdrv_write_zeroes file metadata is updated only once and
this should happen infrequently thus this approach is the best one
for the image expansion
- tail writes are ordered, i.e. the guest IO queue could not be sent
immediately to the host introducing additional IO delays
This patch just adds proper parameters into BDRVParallelsState and
performs options parsing in parallels_open.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-26-git-send-email-den@openvz.org
CC: Roman Kagan <rkagan@parallels.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:57 +0000 (10:46 +0300)]
block/parallels: delay writing to BAT till bdrv_co_flush_to_os
The idea is that we do not need to immediately sync BAT to the image as
from the guest point of view there is a possibility that IO is lost
even in the physical controller until flush command was finished.
bdrv_co_flush_to_os is exactly the right place for this purpose.
Technically the patch uses loaded BAT data as a cache and performs
actual on-disk metadata updates in parallels_co_flush_to_os callback.
This patch speed ups
qemu-img create -f parallels -o cluster_size=64k ./1.hds 64G
qemu-io -f parallels -c "write -P 0x11 0 1024k" 1.hds
writing from 50-60 Mb/sec to 80-90 Mb/sec on rotational media and
from 160 Mb/sec to 190 Mb/sec on SSD disk.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-25-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:56 +0000 (10:46 +0300)]
block/parallels: create bat_entry_off helper
calculate offset of the BAT entry in the image file.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-24-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:55 +0000 (10:46 +0300)]
block/parallels: improve image reading performance
Try to perform IO for the biggest continuous block possible.
The performance for sequential read is increased from 220 Mb/sec to
360 Mb/sec for continous image on my SSD HDD.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-23-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:54 +0000 (10:46 +0300)]
iotests, parallels: check for incorrectly closed image in tests
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-22-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:53 +0000 (10:46 +0300)]
block/parallels: implement incorrect close detection
The software driver must set inuse field in Parallels header to
0x746F6E59 when the image is opened in read-write mode. The presence of
this magic in the header on open forces image consistency check.
There is an unfortunate trick here. We can not check for inuse in
parallels_check as this will happen too late. It is possible to do
that for simple check, but during the fix this would always report
an error as the image was opened in BDRV_O_RDWR mode. Thus we save
the flag in BDRVParallelsState for this.
On the other hand, nothing should be done to clear inuse in
parallels_check. Generic close will do the job right.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-21-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:52 +0000 (10:46 +0300)]
block/parallels: implement parallels_check method of block driver
The check is very simple at the moment. It calculates necessary stats
and fix only the following errors:
- space leak at the end of the image. This would happens due to
preallocation
- clusters outside the image are zeroed. Nothing else could be done here
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-20-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:51 +0000 (10:46 +0300)]
block/parallels: move parallels_open/probe to the very end of the file
This will help to avoid forward declarations for upcoming parallels_check
Some very obvious formatting fixes were made to the moved code to make
checkpatch happy.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-19-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:50 +0000 (10:46 +0300)]
block/parallels: read parallels image header and BAT into single buffer
This metadata cache would allow to properly batch BAT updates to disk
in next patches. These updates will be properly aligned to avoid
read-modify-write transactions on block level.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-18-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:49 +0000 (10:46 +0300)]
block/parallels: keep BAT bitmap data in little endian in memory
This will allow to use this data as buffer to BAT update directly
without any intermediate buffers.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-17-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:48 +0000 (10:46 +0300)]
block/parallels: create bat2sect helper
deduplicate copy/paste arithmetcs
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-16-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:47 +0000 (10:46 +0300)]
block/parallels: rename catalog_ names to bat_
BAT means 'block allocation table'. Thus this name is clean and shorter
on writing.
Some obvious formatting fixes in the old code were made to make checkpatch
happy.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-15-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:46 +0000 (10:46 +0300)]
parallels: change copyright information in the image header
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-14-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:45 +0000 (10:46 +0300)]
iotests, parallels: test for newly created parallels image via qemu-img
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-13-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:44 +0000 (10:46 +0300)]
block/parallels: support parallels image creation
Do not even care to create WithoutFreeSpace image, it is obsolete.
Always create WithouFreSpacExt one.
The code also does not spend a lot of efforts to fill cylinders and
heads fields, they are not used actually in a real life neither in
QEMU nor in Parallels products.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-12-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:43 +0000 (10:46 +0300)]
iotests, parallels: test for write into Parallels image
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-11-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:42 +0000 (10:46 +0300)]
block/parallels: _co_writev callback for Parallels format
Support write on Parallels images. The code is almost the same as one
in the previous patch implemented scatter-gather IO for read.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-10-git-send-email-den@openvz.org
CC: Roman Kagan <rkagan@parallels.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:41 +0000 (10:46 +0300)]
block/parallels: mark parallels format driver as zero inited
From the guest point of view unallocated blocks are zeroed.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-9-git-send-email-den@openvz.org
CC: Roman Kagan <rkagan@parallels.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:40 +0000 (10:46 +0300)]
block/parallels: replace magic constants 4, 64 with proper sizeofs
simple purification..
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-8-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:39 +0000 (10:46 +0300)]
block/parallels: provide _co_readv routine for parallels format driver
Main approach is taken from qcow2_co_readv.
The patch drops coroutine lock for the duration of IO operation and
peforms normal scatter-gather IO using standard QEMU backend.
The patch also adds comment about locking considerations in the driver.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id:
1430207220-24458-7-git-send-email-den@openvz.org
CC: Roman Kagan <rkagan@parallels.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Roman Kagan [Tue, 28 Apr 2015 07:46:38 +0000 (10:46 +0300)]
block/parallels: add get_block_status
Implement VFS method for get_block_status to Parallels format driver.
qemu_co_mutex_lock is not necessary yet (the driver is read-only) but
will be necessary very soon when write will be supported.
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id:
1430207220-24458-6-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Roman Kagan [Tue, 28 Apr 2015 07:46:37 +0000 (10:46 +0300)]
block/parallels: read up to cluster end in one go
Teach parallels_read() to do reads in coarser granularity than just a
single sector: if requested, read up to the cluster end in one go.
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id:
1430207220-24458-5-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Roman Kagan [Tue, 28 Apr 2015 07:46:36 +0000 (10:46 +0300)]
block/parallels: switch to bdrv_read
Switch the .bdrv_read method implementation from using bdrv_pread() to
bdrv_read() on the underlying file, since the latter is subject to i/o
throttling while the former is not.
Besides, since bdrv_read() operates in sectors rather than bytes, adjust
the helper functions to do so too.
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id:
1430207220-24458-4-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:35 +0000 (10:46 +0300)]
block/parallels: rename parallels_header to ParallelsHeader
this follows QEMU coding convention
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id:
1430207220-24458-3-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Tue, 28 Apr 2015 07:46:34 +0000 (10:46 +0300)]
iotests, parallels: quote TEST_IMG in 076 test to be path-safe
suggested by Jeff Cody
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id:
1430207220-24458-2-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Richard Henderson [Mon, 18 Aug 2014 17:19:06 +0000 (10:19 -0700)]
target-alpha: Add vector implementation for CMPBGE
While conditionalized on SSE2, it's a "portable" gcc generic vector
implementation, which could be enabled on other hosts.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Richard Henderson [Tue, 15 Jul 2014 19:07:05 +0000 (12:07 -0700)]
target-alpha: Rewrite helper_zapnot
This form produces significantly smaller code on x86_64.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Peter Maydell [Thu, 21 May 2015 08:07:19 +0000 (09:07 +0100)]
Merge remote-tracking branch 'remotes/kraxel/tags/pull-vnc-
20150520-1' into staging
vnc: misc fixes.
# gpg: Signature made Wed May 20 09:32:45 2015 BST using RSA key ID
D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
* remotes/kraxel/tags/pull-vnc-
20150520-1:
qemu-sockets: Report explicit error if unlink fails
vnc: Tweak error when init fails
vnc: Don't assert if opening unix socket fails
ui: remove check for failure of qemu_acl_init()
Strip brackets from vnc host
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cole Robinson [Tue, 5 May 2015 15:07:19 +0000 (11:07 -0400)]
qemu-sockets: Report explicit error if unlink fails
Consider this case:
$ ls -ld ~/root-owned/
drwx--x--x. 2 root root 4096 Apr 29 12:55 /home/crobinso/root-owned/
$ ls -l ~/root-owned/foo.sock
-rwxrwxrwx. 1 crobinso crobinso 0 Apr 29 12:55 /home/crobinso/root-owned/foo.sock
$ qemu-system-x86_64 -vnc unix:~/root-owned/foo.sock
qemu-system-x86_64: -vnc unix:/home/crobinso/root-owned/foo.sock: Failed to start VNC server: Failed to bind socket to /home/crobinso/root-owned/foo.sock: Address already in use
...which is techinically true, but the real error is that we failed to
unlink. So report it.
This may seem pathological but it's a real possibility via libvirt.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Cole Robinson [Tue, 5 May 2015 15:07:18 +0000 (11:07 -0400)]
vnc: Tweak error when init fails
Before:
qemu-system-x86_64: -display vnc=unix:/root/foo.sock: Failed to start VNC server on `(null)': Failed to bind socket to /root/foo.sock: Permission denied
After:
qemu-system-x86_64: -display vnc=unix:/root/foo.sock: Failed to start VNC server: Failed to bind socket to /root/foo.sock: Permission denied
Rather than tweak the string possibly show unix: value as well,
just drop the explicit display reporting. We already get the cli
string in the error message, that should be sufficient.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>