Jan Beulich [Mon, 23 May 2016 06:44:57 +0000 (00:44 -0600)]
xen/blkif: avoid double access to any shared ring request fields
Commit f9e98e5d7a ("xen/blkif: Avoid double access to
src->nr_segments") didn't go far enough: src->operation is also being
used twice. And nothing was done to prevent the compiler from using the
source side of the copy done by blk_get_request() (granted that's very
unlikely).
Move the barrier()s up, and add another one to blk_get_request().
Note that for completing XSA-155, the barrier() getting added to
blk_get_request() would suffice, and hence the changes to xen_blkif.h
are more like just cleanup. And since, as said, the unpatched code
getting compiled to something vulnerable is very unlikely (and not
observed in practice), this isn't being viewed as a new security issue.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Peter Maydell [Mon, 13 Jun 2016 12:05:02 +0000 (13:05 +0100)]
Merge remote-tracking branch 'remotes/berrange/tags/qcrypto-next-2016-06-13-v1' into staging
Merge qcrypto-next 2016/06/13 v1
# gpg: Signature made Mon 13 Jun 2016 12:43:22 BST
# gpg: using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/qcrypto-next-2016-06-13-v1:
crypto: aes: always rename internal symbols
crypto: assert that qcrypto_hash_digest_len is in range
crypto: remove temp files on completion of secrets test
TLS: provide slightly more information when TLS certificate loading fails
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Mike Frysinger [Mon, 6 Jun 2016 22:05:35 +0000 (18:05 -0400)]
crypto: aes: always rename internal symbols
OpenSSL's libcrypto always defines AES symbols with the same names as
qemu's local aes code. This is problematic when enabling at least curl
as that frequently also uses libcrypto. It might not be noticed when
running, but if you try to statically link, everything falls down.
An example snippet:
LINK qemu-nbd
.../libcrypto.a(aes-x86_64.o): In function 'AES_encrypt':
(.text+0x460): multiple definition of 'AES_encrypt'
crypto/aes.o:aes.c:(.text+0x670): first defined here
.../libcrypto.a(aes-x86_64.o): In function 'AES_decrypt':
(.text+0x9f0): multiple definition of 'AES_decrypt'
crypto/aes.o:aes.c:(.text+0xb30): first defined here
.../libcrypto.a(aes-x86_64.o): In function 'AES_cbc_encrypt':
(.text+0xf90): multiple definition of 'AES_cbc_encrypt'
crypto/aes.o:aes.c:(.text+0xff0): first defined here
collect2: error: ld returned 1 exit status
.../qemu-2.6.0/rules.mak:105: recipe for target 'qemu-nbd' failed
make: *** [qemu-nbd] Error 1
The aes.h header has redefines already for FreeBSD, but go ahead and
enable that for everyone since there's no real good reason to not use
a namespace all the time.
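A minimal sketch of the renaming technique, assuming a QEMU_ prefix for the internal symbol. The function body is a toy stand-in, not a cipher, and its signature is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Rename before any declaration, so that every later use of the public
 * name expands to the prefixed symbol. The prefix is illustrative. */
#define AES_encrypt QEMU_AES_encrypt

/* Toy stand-in, NOT AES: XORs a 16-byte block with a one-byte key. */
void AES_encrypt(const unsigned char *in, unsigned char *out,
                 unsigned char key)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < 16; i++) {
        out[i] = in[i] ^ key;
    }
}
```

The object file built from this exports QEMU_AES_encrypt only, so it no longer collides with a libcrypto that defines AES_encrypt, while callers that include the same header keep writing AES_encrypt().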
Signed-off-by: Mike Frysinger <vapier@chromium.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Paolo Bonzini [Fri, 20 May 2016 09:09:54 +0000 (11:09 +0200)]
crypto: assert that qcrypto_hash_digest_len is in range
Otherwise unintended results could happen. For example,
Coverity reports a division by zero in qcrypto_afsplit_hash.
While this cannot really happen, it shows that the contract
of qcrypto_hash_digest_len can be improved.
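A sketch of the tightened contract, using hypothetical demo_* names and an illustrative algorithm list rather than the real qcrypto tables:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical algorithm IDs, for illustration only. */
enum demo_hash_alg {
    DEMO_HASH_MD5,
    DEMO_HASH_SHA1,
    DEMO_HASH_SHA256,
    DEMO_HASH__MAX
};

static const size_t demo_digest_len[DEMO_HASH__MAX] = {
    [DEMO_HASH_MD5]    = 16,
    [DEMO_HASH_SHA1]   = 20,
    [DEMO_HASH_SHA256] = 32,
};

/* Asserting the range means callers never see a length of zero. */
static size_t demo_hash_digest_len(enum demo_hash_alg alg)
{
    assert((int)alg >= 0 && (int)alg < DEMO_HASH__MAX);
    return demo_digest_len[alg];
}

/* Number of digest-sized blocks needed to cover len bytes; the division
 * is safe because the digest length is known to be non-zero. */
static size_t demo_blocks_needed(enum demo_hash_alg alg, size_t len)
{
    size_t digest = demo_hash_digest_len(alg);
    return (len + digest - 1) / digest;
}
```

With the assertion in place, a static analyzer can see that the divisor is never zero on any path that returns.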
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Daniel P. Berrange [Tue, 26 Apr 2016 09:59:09 +0000 (10:59 +0100)]
crypto: remove temp files on completion of secrets test
The secret object tests left some temporary files on disk
when completing. Ensure they are unlinked, and rename them
to make it more obvious where they come from.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Alex Bligh [Tue, 5 Apr 2016 19:33:48 +0000 (20:33 +0100)]
TLS: provide slightly more information when TLS certificate loading fails
Give slightly more information when certificate loading fails.
Rather than have no information, you now get gnutls's only slightly
less unhelpful error messages.
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Peter Maydell [Mon, 13 Jun 2016 11:18:17 +0000 (12:18 +0100)]
Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20160613-tag' into staging
Xen 2016/06/13
# gpg: Signature made Mon 13 Jun 2016 11:53:18 BST
# gpg: using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90
* remotes/sstabellini/tags/xen-20160613-tag:
Introduce "xen-load-devices-state"
exec: Fix qemu_ram_block_from_host for Xen
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Wen Congyang [Fri, 3 Jun 2016 09:58:34 +0000 (17:58 +0800)]
Introduce "xen-load-devices-state"
Introduce a "xen-load-devices-state" QAPI command that can be used to
load the state of all devices, but not the RAM or the block devices of
the VM.
Until now we only had the HMP commands savevm/loadvm and the QMP command
xen-save-devices-state.
We use this new command for COLO:
1. suspend both primary vm and secondary vm
2. sync the state
3. resume both primary vm and secondary vm
In that case, we need to be able to update the state of all devices at any time.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Anthony PERARD [Thu, 9 Jun 2016 15:56:17 +0000 (16:56 +0100)]
exec: Fix qemu_ram_block_from_host for Xen
Since f615f39 (exec: remove ram_addr argument from
qemu_ram_block_from_host), migration under Xen is likely to fail, with a
SEGV of QEMU. But the commit only reveals a bug with the calculation of
the offset value in qemu_ram_block_from_host().
This patch calculates the offset from the ram_addr as
qemu_ram_addr_from_host() will later calculate the ram_addr from the
offset.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Peter Maydell [Mon, 13 Jun 2016 09:12:44 +0000 (10:12 +0100)]
Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20160611' into staging
TB hashing improvements
# gpg: Signature made Sun 12 Jun 2016 01:12:50 BST
# gpg: using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg: aka "Richard Henderson <rth@redhat.com>"
# gpg: aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC 16A4 AD12 70CC 4DD0 279B
* remotes/rth/tags/pull-tcg-20160611:
translate-all: add tb hash bucket info to 'info jit' dump
tb hash: track translated blocks with qht
qht: add test-qht-par to invoke qht-bench from 'check' target
qht: add qht-bench, a performance benchmark
qht: add test program
qht: QEMU's fast, resizable and scalable Hash Table
qdist: add test program
qdist: add module to represent frequency distributions of data
tb hash: hash phys_pc, pc, and flags with xxhash
exec: add tb_hash_func5, derived from xxhash
qemu-thread: add simple test-and-set spinlock
include/processor.h: define cpu_relax()
seqlock: rename write_lock/unlock to write_begin/end
seqlock: remove optional mutex
compiler.h: add QEMU_ALIGNED() to enforce struct alignment
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:33 +0000 (14:55 -0400)]
translate-all: add tb hash bucket info to 'info jit' dump
Examples:
- Good hashing, i.e. tb_hash_func5(phys_pc, pc, flags):
TB count 715135/2684354
[...]
TB hash buckets 388775/524288 (74.15% head buckets used)
TB hash occupancy 33.04% avg chain occ. Histogram: [0,10)%|▆ █ ▅▁▃▁▁|[90,100]%
TB hash avg chain 1.017 buckets. Histogram: 1|█▁▁|3
- Not-so-good hashing, i.e. tb_hash_func5(phys_pc, pc, 0):
TB count 712636/2684354
[...]
TB hash buckets 344924/524288 (65.79% head buckets used)
TB hash occupancy 31.64% avg chain occ. Histogram: [0,10)%|█ ▆ ▅▁▃▁▂|[90,100]%
TB hash avg chain 1.047 buckets. Histogram: 1|█▁▁▁|4
- Bad hashing, i.e. tb_hash_func5(phys_pc, 0, 0):
TB count 702818/2684354
[...]
TB hash buckets 112741/524288 (21.50% head buckets used)
TB hash occupancy 10.15% avg chain occ. Histogram: [0,10)%|█ ▁ ▁▁▁▁▁|[90,100]%
TB hash avg chain 2.107 buckets. Histogram: [1.0,10.2)|█▁▁▁▁▁▁▁▁▁|[83.8,93.0]
- Good hashing, but no auto-resize:
TB count 715634/2684354
TB hash buckets 8192/8192 (100.00% head buckets used)
TB hash occupancy 98.30% avg chain occ. Histogram: [95.3,95.8)%|▁▁▃▄▃▄▁▇▁█|[99.5,100.0]%
TB hash avg chain 22.070 buckets. Histogram: [15.0,16.7)|▁▂▅▄█▅▁▁▁▁|[30.3,32.0]
Acked-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-16-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:32 +0000 (14:55 -0400)]
tb hash: track translated blocks with qht
Having a fixed-size hash table for keeping track of all translation blocks
is suboptimal: some workloads are just too big or too small to get maximum
performance from the hash table. The MRU promotion policy helps improve
performance when the hash table is a little undersized, but it cannot
make up for severely undersized hash tables.
Furthermore, frequent MRU promotions result in writes that are a scalability
bottleneck. For scalability, lookups should only perform reads, not writes.
This is not a big deal for now, but it will become one once MTTCG matures.
The appended patch fixes these issues by using qht as the implementation of
the TB hash table. This solution is superior to other alternatives considered,
namely:
- master: implementation in QEMU before this patchset
- xxhash: before this patch, i.e. fixed buckets + xxhash hashing + MRU.
- xxhash-rcu: fixed buckets + xxhash + RCU list + MRU.
MRU is implemented here by adding an intermediate struct
that contains the u32 hash and a pointer to the TB; this
allows us, on an MRU promotion, to copy said struct (that is not
at the head), and put this new copy at the head. After a grace
period, the original non-head struct can be eliminated, and
after another grace period, freed.
- qht-fixed-nomru: fixed buckets + xxhash + qht without auto-resize +
no MRU for lookups; MRU for inserts.
The appended solution is the following:
- qht-dyn-nomru: dynamic number of buckets + xxhash + qht w/ auto-resize +
no MRU for lookups; MRU for inserts.
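To make the "MRU for lookups" versus "no MRU for lookups" distinction concrete, here is a small sketch on a plain singly linked bucket chain. The demo_* names are hypothetical, and this is neither the qht code nor the pre-qht TB hash code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_node {
    uint32_t hash;
    struct demo_node *next;
};

/*
 * Lookup with MRU promotion: a hit that is not already at the head is
 * unlinked and re-inserted at the head. Note that a lookup therefore
 * writes to the bucket.
 */
static struct demo_node *demo_lookup_mru(struct demo_node **head,
                                         uint32_t hash)
{
    assert(head != NULL);
    struct demo_node **prev = head;

    for (struct demo_node *n = *head; n != NULL;
         prev = &n->next, n = n->next) {
        if (n->hash == hash) {
            if (prev != head) {
                *prev = n->next;   /* unlink from current position */
                n->next = *head;   /* re-insert at the head */
                *head = n;
            }
            return n;
        }
    }
    return NULL;
}

/* Lookup without promotion: only reads, never modifies the chain. */
static const struct demo_node *demo_lookup_ro(const struct demo_node *head,
                                              uint32_t hash)
{
    for (const struct demo_node *n = head; n != NULL; n = n->next) {
        if (n->hash == hash) {
            return n;
        }
    }
    return NULL;
}
```

The first variant shortens later searches for hot entries in an overlong chain, at the price of stores on the lookup path; the second keeps lookups read-only, which is what matters once several threads look up concurrently.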
The plots below compare the considered solutions. The Y axis shows the
boot time (in seconds) of a debian jessie image with arm-softmmu; the X axis
sweeps the number of buckets (or initial number of buckets for qht-autoresize).
The plots in PNG format (and with errorbars) can be seen here:
http://imgur.com/a/Awgnq
Each test runs 5 times, and the entire QEMU process is pinned to a
single core for repeatability of results.
Host: Intel Xeon E5-2690
[ASCII plot lost in extraction. Y axis: boot time in seconds, 20 to 28.
X axis: log2 number of buckets, 14 to 24. Series: master (A), xxhash (B),
xxhash-rcu (C), qht-fixed-nomru (D), qht-dyn-mru (E), qht-dyn-nomru (F).]
Host: Intel i7-4790K
[ASCII plot lost in extraction. Y axis: boot time in seconds, 9.5 to 14.5.
X axis: log2 number of buckets, 14 to 24. Series: master (A), xxhash (B),
xxhash-rcu (C), qht-fixed-nomru (D), qht-dyn-mru (E), qht-dyn-nomru (F).]
Note that the original point before this patch series is X=15 for "master";
its low sensitivity to the increased number of buckets is due to the
poor hashing function in master.
xxhash-rcu has significant overhead due to the constant churn of allocating
and deallocating intermediate structs for implementing MRU. An alternative
would be to consider failed lookups as "maybe not there", and then
acquire the external lock (tb_lock in this case) to really confirm that
there was indeed a failed lookup. This, however, would not be enough
to implement dynamic resizing--this is more complex: see
"Resizable, Scalable, Concurrent Hash Tables via Relativistic
Programming" by Triplett, McKenney and Walpole. This solution was
discarded due to the very coarse RCU read critical sections that we have
in MTTCG; resizing requires waiting for readers after every pointer update,
and resizes require many pointer updates, so this would quickly become
prohibitive.
qht-fixed-nomru shows that MRU promotion is advisable for undersized
hash tables.
However, qht-dyn-mru shows that MRU promotion is not important if the
hash table is properly sized: there is virtually no difference in
performance between qht-dyn-nomru and qht-dyn-mru.
Before this patch, we're at X=15 on "xxhash"; after this patch, we're at
X=15 @ qht-dyn-nomru. This patch thus matches the best performance that we
can achieve with optimum sizing of the hash table, while keeping the hash
table scalable for readers.
The improvement we get before and after this patch for booting debian jessie
with arm-softmmu is:
- Intel Xeon E5-2690: 10.5% less time
- Intel i7-4790K: 5.2% less time
We could get this same improvement _for this particular workload_ by
statically increasing the size of the hash table. But this would hurt
workloads that do not need a large hash table. The dynamic (upward)
resizing allows us to start small and enlarge the hash table as needed.
A quick note on downsizing: the table is resized back to 2**15 buckets
on every tb_flush; this makes sense because it is not guaranteed that the
table will reach the same number of TBs later on (e.g. most bootup code is
thrown away after boot); it makes sense to grow the hash table as
more code blocks are translated. This also avoids the complication of
having to build downsizing hysteresis logic into qht.
Reviewed-by: Sergey Fedorov <serge.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-15-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:31 +0000 (14:55 -0400)]
qht: add test-qht-par to invoke qht-bench from 'check' target
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-14-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:30 +0000 (14:55 -0400)]
qht: add qht-bench, a performance benchmark
This serves as a performance benchmark as well as a stress test
for QHT. We can tweak quite a number of things, including the
number of resize threads and how frequently resizes are triggered.
A performance comparison of QHT vs CLHT[1] and ck_hs[2] using
this same benchmark program can be found here:
http://imgur.com/a/0Bms4
The tests are run on a 64-core AMD Opteron 6376, pinning threads
to cores favoring same-socket cores. For each run, qht-bench is
invoked with:
$ tests/qht-bench -d $duration -n $n -u $u -g $range
where $duration is in seconds, $n is the number of threads,
$u is the update rate (0.0 to 100.0), and $range is the number
of keys.
Note that ck_hs's performance drops significantly as writes go
up, since it requires an external lock (I used a ck_spinlock)
around every write.
Also, note that CLHT, instead of using a seqlock, relies on an
allocator that does not ever return the same address during the
same read-critical section. This gives it a slight performance
advantage over QHT on read-heavy workloads, since the seqlock
writes aren't there.
[1] CLHT: https://github.com/LPD-EPFL/CLHT
https://infoscience.epfl.ch/record/207109/files/ascy_asplos15.pdf
[2] ck_hs: http://concurrencykit.org/
http://backtrace.io/blog/blog/2015/03/13/workload-specialization/
A few of those plots are shown in text here, since that site
might not be online forever. Throughput is in Mops/s on the Y axis.
200K keys, 0 % updates
[ASCII plot lost in extraction. Y axis: throughput in Mops/s, 0 to 450.
X axis: number of threads, 1 to 64. Series: qht (E), clht (H), ck (N).]
200K keys, 1 % updates
[ASCII plot lost in extraction. Y axis: throughput in Mops/s, 0 to 350.
X axis: number of threads, 1 to 64. Series: qht (E), clht (H), ck (N).]
200K keys, 20 % updates
[ASCII plot lost in extraction. Y axis: throughput in Mops/s, 0 to 300.
X axis: number of threads, 1 to 64. Series: qht (E), clht (H), ck (N).]
200K keys, 100 % updates
[ASCII plot lost in extraction. Y axis: throughput in Mops/s, 0 to 160.
X axis: number of threads, 1 to 64. Series: qht (E), clht (H), ck (N).]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-13-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:29 +0000 (14:55 -0400)]
qht: add test program
Acked-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-12-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:28 +0000 (14:55 -0400)]
qht: QEMU's fast, resizable and scalable Hash Table
This is a fast, scalable chained hash table with optional auto-resizing, allowing
reads that are concurrent with reads, and reads/writes that are concurrent
with writes to separate buckets.
A hash table with these features will be necessary for the scalability
of the ongoing MTTCG work; before those changes arrive we can already
benefit from the single-threaded speedup that qht also provides.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-11-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:27 +0000 (14:55 -0400)]
qdist: add test program
Acked-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-10-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:26 +0000 (14:55 -0400)]
qdist: add module to represent frequency distributions of data
Sometimes it is useful to have a quick histogram to represent a certain
distribution -- for example, when investigating a performance regression
in a hash table due to inadequate hashing.
The appended patch allows us to easily represent a distribution using Unicode
characters. Further, the data structure keeping track of the distribution
is so simple that obtaining its values for off-line processing is trivial.
Example, taking the last 10 commits to QEMU:
Characters in commit title  Count
-----------------------------------
                        39      1
                        48      1
                        53      1
                        54      2
                        57      1
                        61      1
                        67      1
                        78      1
                        80      1
qdist_init(&dist);
qdist_inc(&dist, 39);
[...]
qdist_inc(&dist, 80);
char *str = qdist_pr(&dist, 9, QDIST_PR_LABELS);
// -> [39.0,43.6)▂▂ █▂ ▂ ▄[75.4,80.0]
g_free(str);
str = qdist_pr(&dist, 4, QDIST_PR_LABELS);
// -> [39.0,49.2)▁█▁▁[69.8,80.0]
g_free(str);
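The labels in the example output (for instance [39.0,49.2) for four bins over 39..80) are consistent with equal-width bins. Assuming that, the grouping step can be sketched as below; the demo_* helpers are hypothetical and are not the qdist implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Map x in [min, max] to one of n_bins equal-width bins. */
static size_t demo_bin_of(double x, double min, double max, size_t n_bins)
{
    assert(n_bins > 0 && min < max && x >= min && x <= max);
    size_t bin = (size_t)((x - min) / (max - min) * (double)n_bins);
    /* x == max would land one past the end; put it in the last bin. */
    return bin < n_bins ? bin : n_bins - 1;
}

/* Count how many samples fall into each bin. */
static void demo_histogram(const double *samples, size_t n,
                           double min, double max,
                           size_t *counts, size_t n_bins)
{
    for (size_t i = 0; i < n_bins; i++) {
        counts[i] = 0;
    }
    for (size_t i = 0; i < n; i++) {
        counts[demo_bin_of(samples[i], min, max, n_bins)]++;
    }
}
```

For the ten title lengths listed above and four bins, this yields a second bin that is twice as full as the other three, which agrees with the shape of the four-bin string shown.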
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-9-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:25 +0000 (14:55 -0400)]
tb hash: hash phys_pc, pc, and flags with xxhash
For some workloads such as arm bootup, tb_phys_hash is performance-critical.
This is due to the high frequency of accesses to the hash table, caused
by (frequent) TLB flushes that wipe out the cpu-private tb_jmp_cache's.
More info:
https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg05098.html
To dig further into this I modified an arm image booting debian jessie to
immediately shut down after boot. Analysis revealed that quite a bit of time
is unnecessarily spent in tb_phys_hash: the cause is poor hashing that
results in very uneven loading of chains in the hash table's buckets;
the longest observed chain had ~550 elements.
The appended patch addresses this with two changes:
1) Use xxhash as the hash table's hash function. xxhash is a fast,
high-quality hashing function.
2) Feed the hashing function with not just tb_phys, but also pc and flags.
This improves performance over using just tb_phys for hashing, since that
resulted in some hash buckets having many TBs, while others got very few;
with these changes, the longest observed chain on a single hash bucket is
brought down from ~550 to ~40.
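The effect of feeding all three fields to the hash can be illustrated with a toy mixer. To be clear, this is NOT xxhash and NOT tb_hash_func5; the constants, the 32-bit field widths and the demo_* names are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

static inline uint32_t demo_rotl32(uint32_t x, unsigned r)
{
    return (x << r) | (x >> (32 - r));
}

/*
 * Toy three-input mixer. Each step is invertible for a fixed running
 * state (xor, rotate, multiply by an odd constant), so two inputs that
 * differ in exactly one field always hash differently.
 */
static uint32_t demo_hash3(uint32_t phys_pc, uint32_t pc, uint32_t flags)
{
    const uint32_t prime = 2654435761u;   /* odd multiplier */
    uint32_t h = 0x9e3779b9u;             /* arbitrary seed */

    h = demo_rotl32(h ^ phys_pc, 13) * prime;
    h = demo_rotl32(h ^ pc, 13) * prime;
    h = demo_rotl32(h ^ flags, 13) * prime;

    /* Final avalanche: xorshifts and an odd multiply, all invertible. */
    h ^= h >> 15;
    h *= 2246822519u;
    h ^= h >> 13;
    return h;
}

/* Select a bucket in a table whose size is a power of two. */
static uint32_t demo_bucket(uint32_t hash, uint32_t n_buckets)
{
    assert(n_buckets != 0 && (n_buckets & (n_buckets - 1)) == 0);
    return hash & (n_buckets - 1);
}
```

If only phys_pc were hashed, every TB translated from the same physical address with different pc or flags would necessarily share a bucket; mixing in the other two fields gives such TBs a chance to land elsewhere.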
Tests show that the other element checked for in tb_find_physical,
cs_base, is always a match when tb_phys+pc+flags are a match,
so hashing cs_base is wasteful. It could be that this is an ARM-only
thing, though. UPDATE:
On Tue, Apr 05, 2016 at 08:41:43 -0700, Richard Henderson wrote:
> The cs_base field is only used by i386 (in 16-bit modes), and sparc (for a TB
> consisting of only a delay slot).
> It may well still turn out to be reasonable to ignore cs_base for hashing.
BTW, after this change the hash table should not be called "tb_phys_hash"
anymore; this is addressed later in this series.
This change gives consistent bootup time improvements. I tested two
host machines:
- Intel Xeon E5-2690: 11.6% less time
- Intel i7-4790K: 19.2% less time
Increasing the number of hash buckets yields further improvements. However,
using a larger, fixed number of buckets can degrade performance for other
workloads that do not translate as many blocks (600K+ for debian-jessie arm
bootup). This is dealt with later in this series.
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-8-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:24 +0000 (14:55 -0400)]
exec: add tb_hash_func5, derived from xxhash
This will be used by upcoming changes for hashing the tb hash.
Add this into a separate file to include the copyright notice from
xxhash.
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-7-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Guillaume Delbergue [Wed, 8 Jun 2016 18:55:23 +0000 (14:55 -0400)]
qemu-thread: add simple test-and-set spinlock
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Signed-off-by: Guillaume Delbergue <guillaume.delbergue@greensocs.com>
[Rewritten. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Emilio's additions: use TAS instead of atomic_xchg; emit acquire/release
barriers; return bool from trylock; call cpu_relax() while spinning;
optimize for uncontended locks by acquiring the lock with TAS instead
of TATAS; add qemu_spin_locked().]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-6-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:22 +0000 (14:55 -0400)]
include/processor.h: define cpu_relax()
Taken from the linux kernel.
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-5-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:21 +0000 (14:55 -0400)]
seqlock: rename write_lock/unlock to write_begin/end
It is a more appropriate name, now that the mutex embedded
in the seqlock is gone.
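A sketch of why begin/end fits better than lock/unlock: without a mutex, the write side only brackets an update by bumping a sequence counter, and writers have to be serialized by the caller. The code below uses C11 atomics and hypothetical demo_* names; it is not the QEMU seqlock header.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct demo_seqlock {
    atomic_uint sequence;   /* odd while an update is in progress */
};

/* Example payload; fields are relaxed atomics to avoid data races. */
struct demo_pair {
    struct demo_seqlock sl;
    atomic_int a, b;
};

static inline void demo_seqlock_write_begin(struct demo_seqlock *sl)
{
    unsigned s = atomic_load_explicit(&sl->sequence, memory_order_relaxed);
    assert((s & 1) == 0);   /* caller must already serialize writers */
    atomic_store_explicit(&sl->sequence, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
}

static inline void demo_seqlock_write_end(struct demo_seqlock *sl)
{
    unsigned s = atomic_load_explicit(&sl->sequence, memory_order_relaxed);
    atomic_store_explicit(&sl->sequence, s + 1, memory_order_release);
}

static inline unsigned demo_seqlock_read_begin(struct demo_seqlock *sl)
{
    /* Masking the low bit makes a read that starts mid-update retry. */
    return atomic_load_explicit(&sl->sequence, memory_order_acquire) & ~1u;
}

static inline bool demo_seqlock_read_retry(struct demo_seqlock *sl,
                                           unsigned start)
{
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&sl->sequence, memory_order_relaxed) != start;
}

static void demo_pair_init(struct demo_pair *p)
{
    atomic_init(&p->sl.sequence, 0);
    atomic_init(&p->a, 0);
    atomic_init(&p->b, 0);
}

static void demo_pair_write(struct demo_pair *p, int a, int b)
{
    demo_seqlock_write_begin(&p->sl);
    atomic_store_explicit(&p->a, a, memory_order_relaxed);
    atomic_store_explicit(&p->b, b, memory_order_relaxed);
    demo_seqlock_write_end(&p->sl);
}

static void demo_pair_read(struct demo_pair *p, int *a, int *b)
{
    unsigned start;
    do {
        start = demo_seqlock_read_begin(&p->sl);
        *a = atomic_load_explicit(&p->a, memory_order_relaxed);
        *b = atomic_load_explicit(&p->b, memory_order_relaxed);
    } while (demo_seqlock_read_retry(&p->sl, start));
}
```

Readers never block and never write; they simply retry when the sequence changed underneath them.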
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-4-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:20 +0000 (14:55 -0400)]
seqlock: remove optional mutex
This option is unused; besides, it bloats the struct when not needed.
Let's just let writers define their own locks elsewhere.
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-3-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emilio G. Cota [Wed, 8 Jun 2016 18:55:19 +0000 (14:55 -0400)]
compiler.h: add QEMU_ALIGNED() to enforce struct alignment
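As an illustration of what such a macro typically does, here is a sketch built on the GCC/Clang aligned attribute. The macro name DEMO_ALIGNED, the struct and the value 64 are assumptions for the example; the actual definition is the one in compiler.h.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed form of an alignment macro on GCC/Clang. */
#define DEMO_ALIGNED(X) __attribute__((aligned(X)))

/* Hypothetical bucket padded and aligned to a 64-byte cache line, so
 * that two buckets never share a line. */
struct demo_bucket {
    int lock;
    unsigned hashes[4];
} DEMO_ALIGNED(64);

static_assert(_Alignof(struct demo_bucket) == 64,
              "bucket must be cache-line aligned");
static_assert(sizeof(struct demo_bucket) % 64 == 0,
              "bucket size must be a multiple of its alignment");
```

Aligning a frequently written struct to the cache line size is a common way to avoid false sharing between neighbouring array elements.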
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-2-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Peter Maydell [Fri, 10 Jun 2016 14:47:17 +0000 (15:47 +0100)]
Merge remote-tracking branch 'remotes/kraxel/tags/pull-ui-20160610-1' into staging
ui: misc bug fixes.
# gpg: Signature made Fri 10 Jun 2016 10:56:06 BST
# gpg: using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/pull-ui-20160610-1:
console: ignore ui_info updates which don't actually update something
ui/console-gl: Add support for big endian display surfaces
gtk: fix vte version check
ui: fix regression in printing VNC host/port on startup
vnc: drop unused depth arg for set_pixel_format
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Gerd Hoffmann [Mon, 30 May 2016 08:41:13 +0000 (10:41 +0200)]
console: ignore ui_info updates which don't actually update something
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464597673-26464-1-git-send-email-kraxel@redhat.com
Thomas Huth [Mon, 6 Jun 2016 20:01:01 +0000 (22:01 +0200)]
ui/console-gl: Add support for big endian display surfaces
This is required for running QEMU on big endian hosts (like
PowerPC machines) that use RGB instead of BGR byte ordering.
Ticket: https://bugs.launchpad.net/qemu/+bug/1581796
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1465243261-26731-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Olaf Hering [Wed, 8 Jun 2016 21:43:52 +0000 (21:43 +0000)]
gtk: fix vte version check
vte_terminal_set_encoding takes 3 args since 0.38.0.
This fixes commit fba958c6 ("gtk: implement set_echo")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Message-id: 20160608214352.32669-1-olaf@aepfle.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Daniel P. Berrange [Wed, 8 Jun 2016 10:42:56 +0000 (11:42 +0100)]
ui: fix regression in printing VNC host/port on startup
If VNC is chosen as the compile time default display backend,
QEMU will print the host/port it listens on at startup.
Previously this would look like
VNC server running on '::1:5900'
but in 04d2529da27db512dcbd5e99d0e26d333f16efcc the ':' was
accidentally replaced with a ';'. This puts the ':' back.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1465382576-25552-1-git-send-email-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Gerd Hoffmann [Mon, 6 Jun 2016 09:18:45 +0000 (11:18 +0200)]
vnc: drop unused depth arg for set_pixel_format
Spotted by Coverity.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1465204725-31562-1-git-send-email-kraxel@redhat.com
Peter Maydell [Tue, 17 May 2016 14:18:07 +0000 (15:18 +0100)]
target-i386: Move user-mode exception actions out of user-exec.c
The exception_action() function in user-exec.c is just a call to
cpu_loop_exit() for every target CPU except i386. Since this
function is only called if the target's handle_mmu_fault() hook has
indicated an MMU fault, and that hook is only called from the
handle_cpu_signal() code path, we can simply move the x86-specific
setup into that hook, which allows us to remove the TARGET_I386
ifdef from user-exec.c.
Of the actions that were done by the call to raise_interrupt_err():
* cpu_svm_check_intercept_param() is a no-op in user mode
* check_exception() is a no-op since double faults are impossible
for user-mode
* assignments to cs->exception_index and env->error_code are no-ops
* assigning to env->exception_next_eip is unnecessary because it
is not used unless env->exception_is_int is true
* cpu_loop_exit_restore() is equivalent to cpu_loop_exit() since
pc is 0
which leaves just setting env->exception_is_int as the action that
needs to be added to x86_cpu_handle_mmu_fault().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-7-git-send-email-peter.maydell@linaro.org
Peter Maydell [Tue, 17 May 2016 14:18:06 +0000 (15:18 +0100)]
target-i386: Add comment about do_interrupt_user() next_eip argument
Add a comment to do_interrupt_user() along the same lines as the
existing one for do_interrupt_all() noting that the next_eip
argument is not used unless is_int is true or intno is EXCP_SYSCALL.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-6-git-send-email-peter.maydell@linaro.org
Peter Maydell [Tue, 17 May 2016 14:18:05 +0000 (15:18 +0100)]
user-exec: Don't reextract sigmask from usercontext pointer
Extracting the old signal mask from the usercontext pointer passed to
a signal handler is a pain because it is OS and CPU dependent.
Since we've already done it once and passed it to handle_cpu_signal(),
there's no need to do it again in cpu_exit_tb_from_sighandler().
This then means we don't need to pass a usercontext pointer in to
handle_cpu_signal() at all.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-5-git-send-email-peter.maydell@linaro.org
Peter Maydell [Tue, 17 May 2016 14:18:04 +0000 (15:18 +0100)]
cpu-exec: Rename cpu_resume_from_signal() to cpu_loop_exit_noexc()
The function cpu_resume_from_signal() is now always called with a
NULL puc argument, and is rather misnamed since it is never called
from a signal handler. It is essentially forcing an exit to the
top level cpu loop but without raising any exception, so rename
it to cpu_loop_exit_noexc() and drop the useless unused argument.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-4-git-send-email-peter.maydell@linaro.org
Peter Maydell [Tue, 17 May 2016 14:18:03 +0000 (15:18 +0100)]
user-exec: Push resume-from-signal code out to handle_cpu_signal()
Since the only caller of page_unprotect() which might cause it to
need to call cpu_resume_from_signal() is handle_cpu_signal() in
the user-mode code, push the longjump handling out to that function.
Since this is the only caller of cpu_resume_from_signal() which
passes a non-NULL puc argument, split the non-NULL handling into
a new cpu_exit_tb_from_sighandler() function. This allows us
to merge the softmmu and usermode implementations of the
cpu_resume_from_signal() function, which are now identical.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-3-git-send-email-peter.maydell@linaro.org
Peter Maydell [Tue, 17 May 2016 14:18:02 +0000 (15:18 +0100)]
translate-all.c: Don't pass puc, locked to tb_invalidate_phys_page()
The user-mode-only function tb_invalidate_phys_page() is only
called from two places:
* page_unprotect(), which passes in a non-zero pc, a puc pointer
and the value 'true' for the locked argument
* page_set_flags(), which passes in a zero pc, a NULL puc pointer
and a 'false' locked argument
If the pc is non-zero then we may call cpu_resume_from_signal(),
which does a longjmp out of the calling code (and out of the
signal handler); this is to cover the case of a target CPU with
"precise self-modifying code" (currently only x86) executing
a store instruction which modifies code in the same TB as the
store itself. Rather than doing the longjump directly here,
return a flag to the caller which indicates whether the current
TB was modified, and move the longjump to page_unprotect.
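The shape of that change can be illustrated with a minimal, self-contained sketch. It is not the QEMU code: every name below (cpu_loop_env, invalidate_page, unprotect_page, run_once) is hypothetical, and the "current TB" check is reduced to a boolean argument. The point is only that the callee reports a flag and the caller decides whether to longjmp.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU loop's jump buffer. */
static jmp_buf cpu_loop_env;

/*
 * Hypothetical callee: instead of calling longjmp() itself, it only reports
 * whether the TB containing 'pc' was invalidated. A zero pc means "not
 * called from translated code", so nothing can be the current TB.
 */
static bool invalidate_page(uintptr_t pc, bool page_holds_current_tb)
{
    return pc != 0 && page_holds_current_tb;
}

/* Hypothetical caller: it now owns the decision to leave via longjmp(). */
static void unprotect_page(uintptr_t pc, bool page_holds_current_tb)
{
    if (invalidate_page(pc, page_holds_current_tb)) {
        longjmp(cpu_loop_env, 1);   /* does not return */
    }
}

/* Returns true if the call exited through the jump buffer. */
static bool run_once(uintptr_t pc, bool page_holds_current_tb)
{
    if (setjmp(cpu_loop_env) != 0) {
        return true;                /* arrived here via longjmp() */
    }
    unprotect_page(pc, page_holds_current_tb);
    return false;                   /* normal return */
}
```

Keeping the non-local exit in a single caller makes the callee an ordinary function that always returns, which is easier to reason about when locks are held.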
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-2-git-send-email-peter.maydell@linaro.org
xiaoqiang zhao [Wed, 8 Jun 2016 02:30:45 +0000 (10:30 +0800)]
hw/arm: virt uart fix
Commit f0d1d2c115dffc1fbaf954d0b449db05c5eb79b1
("hw/char: QOM'ify pl011 model") breaks the qemu-system-arm virt machine
if the option '-machine secure=on' is provided.
The function create_uart is called twice, so make the CharDriverState pointer
a parameter to create_uart instead of hardcoding it.
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Tested-by: Jerome Forissier <jerome.forissier@linaro.org>
Message-id: 1465353045-26323-1-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 8 Jun 2016 17:34:32 +0000 (18:34 +0100)]
Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20160608' into staging
linux-user pull request for June 2016
# gpg: Signature made Wed 08 Jun 2016 14:27:14 BST
# gpg: using RSA key 0xB44890DEDE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg: aka "Riku Voipio <riku.voipio@linaro.org>"
* remotes/riku/tags/pull-linux-user-20160608: (44 commits)
linux-user: In fork_end(), remove correct CPUs from CPU list
linux-user: Special-case ERESTARTSYS in target_strerror()
linux-user: Make target_strerror() return 'const char *'
linux-user: Correct signedness of target_flock l_start and l_len fields
linux-user: Use safe_syscall wrapper for ioctl
linux-user: Use safe_syscall wrapper for accept and accept4 syscalls
linux-user: Use safe_syscall wrapper for semop
linux-user: Use safe_syscall wrapper for epoll_wait syscalls
linux-user: Use safe_syscall wrapper for poll and ppoll syscalls
linux-user: Use safe_syscall wrapper for sleep syscalls
linux-user: Use safe_syscall wrapper for rt_sigtimedwait syscall
linux-user: Use safe_syscall wrapper for flock
linux-user: Use safe_syscall wrapper for mq_timedsend and mq_timedreceive
linux-user: Use safe_syscall wrapper for msgsnd and msgrcv
linux-user: Use safe_syscall wrapper for send* and recv* syscalls
linux-user: Use safe_syscall wrapper for connect syscall
linux-user: Use safe_syscall wrapper for readv and writev syscalls
linux-user: Fix error conversion in 64-bit fadvise syscall
linux-user: Fix NR_fadvise64 and NR_fadvise64_64 for 32-bit guests
linux-user: Fix handling of arm_fadvise64_64 syscall
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Conflicts:
configure
scripts/qemu-binfmt-conf.sh
Peter Maydell [Wed, 8 Jun 2016 16:17:16 +0000 (17:17 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Wed 08 Jun 2016 09:31:38 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream: (31 commits)
qemu-img bench: Add --flush-interval
qemu-img bench: Implement -S (step size)
qemu-img bench: Make start offset configurable
qemu-img bench: Sequential writes
qemu-img bench
block: Don't emulate natively supported pwritev flags
blockdev: clean up error handling in do_open_tray
block: Fix bdrv_all_delete_snapshot() error handling
qcow2: avoid extra flushes in qcow2
raw-posix: Fetch max sectors for host block device
block: assert that bs->request_alignment is a power of 2
migration/block: Convert saving to BlockBackend
migration/block: Convert load to BlockBackend
block: Kill bdrv_co_write_zeroes()
vmdk: Convert to bdrv_co_pwrite_zeroes()
raw_bsd: Convert to bdrv_co_pwrite_zeroes()
raw-posix: Convert to bdrv_co_pwrite_zeroes()
qed: Convert to bdrv_co_pwrite_zeroes()
gluster: Convert to bdrv_co_pwrite_zeroes()
blkreplay: Convert to bdrv_co_pwrite_zeroes()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 8 Jun 2016 15:31:53 +0000 (16:31 +0100)]
Merge remote-tracking branch 'remotes/famz/tags/pull-docker-20160608' into staging
Docker testing fixes by Paolo.
# gpg: Signature made Wed 08 Jun 2016 08:20:54 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/pull-docker-20160608:
tests/docker: build all targets in test-clang
tests/docker: support travis test with fedora image
tests/docker: remove unused feature "ccache"
tests/docker: fix test-mingw
tests/docker: make test-full build all targets, not none
tests/docker: fix make-archive-maybe
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 8 Jun 2016 15:04:52 +0000 (16:04 +0100)]
Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2016-07-07-tag' into staging
qemu-ga patch queue
* add unit tests for guest-exec command set
# gpg: Signature made Tue 07 Jun 2016 21:43:33 BST
# gpg: using RSA key 0x3353C9CEF108B584
# gpg: Good signature from "Michael Roth <flukshun@gmail.com>"
# gpg: aka "Michael Roth <mdroth@utexas.edu>"
# gpg: aka "Michael Roth <mdroth@linux.vnet.ibm.com>"
* remotes/mdroth/tags/qga-pull-2016-07-07-tag:
tests: start a /qga/guest-exec test
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 8 Jun 2016 13:45:28 +0000 (14:45 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* max-ram-below-4g improvement (Gerd)
* escc fix (xiaoqiang)
* ESP fix (Prasad)
* scsi-disk tweaks/fix (me)
* Makefile dependency fixes (me)
* PKGVERSION improvement (Fam)
* -vnc man improvement (Robert)
# gpg: Signature made Tue 07 Jun 2016 18:06:22 BST
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
* remotes/bonzini/tags/for-upstream:
vnc: list the 'to' parameter of '-vnc' in the qemu man page
scsi-disk: add missing break
Makefile: Derive "PKGVERSION" from "git describe" by default
Makefile: add dependency on scripts/hxtool
Makefile: add dependency on scripts/make_device_config.sh
Makefile: add dependency on scripts/create_config
Makefile: Add a "FORCE" target
scsi: megasas: null terminate bios version buffer
scsi: mark TYPE_SCSI_DISK_BASE as abstract
scsi: esp: check TI buffer index before read/write
hw/char: QOM'ify escc.c (fix)
pc: allow raising low memory via max-ram-below-4g option
tests: Rename tests/Makefile to tests/Makefile.include
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 7 Jun 2016 16:31:04 +0000 (17:31 +0100)]
linux-user: In fork_end(), remove correct CPUs from CPU list
In fork_end(), we must fix the list of current CPUs to match the fact
that the child of the fork has only one thread. Unfortunately we were
removing the wrong CPUs from the list, which meant that if the child
subsequently did an exclusive operation it would deadlock in
start_exclusive() waiting for a sibling CPU which didn't exist.
In particular this could cause hangs doing git submodule init
operations, as reported in https://bugs.launchpad.net/qemu/+bug/955379
comment #47.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Peter Maydell [Mon, 6 Jun 2016 18:58:19 +0000 (19:58 +0100)]
linux-user: Special-case ERESTARTSYS in target_strerror()
Since TARGET_ERESTARTSYS and TARGET_ESIGRETURN are internal-to-QEMU
error numbers, handle them specially in target_strerror(), to avoid
confusing strace output like:
9521 rt_sigreturn(14,8,274886297808,8,0,268435456) = -1 errno=513 (Unknown error 513)
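As a rough illustration of the idea (not the actual target_strerror() implementation), a strerror-style helper can check for the internal values first and fall back to strerror() otherwise. The constant names, their values, and the message strings below are assumptions made for this sketch; 513 is chosen only because it matches the strace output quoted above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical internal error numbers; 513 mirrors the strace output above. */
#define SKETCH_ERESTARTSYS 512
#define SKETCH_ESIGRETURN  513

/*
 * Sketch of a strerror-style lookup that special-cases internal error
 * numbers. Returning 'const char *' lets the special cases hand back
 * string literals.
 */
static const char *sketch_strerror(int err)
{
    if (err == SKETCH_ERESTARTSYS) {
        return "To be restarted";
    }
    if (err == SKETCH_ESIGRETURN) {
        return "Successful exit from sigreturn";
    }
    return strerror(err);   /* ordinary errno values */
}
```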
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Peter Maydell [Mon, 6 Jun 2016 18:58:18 +0000 (19:58 +0100)]
linux-user: Make target_strerror() return 'const char *'
Make target_strerror() return 'const char *' rather than just 'char *';
this will allow us to return constant strings from it for some special
cases.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Peter Maydell [Mon, 6 Jun 2016 18:58:16 +0000 (19:58 +0100)]
linux-user: Correct signedness of target_flock l_start and l_len fields
The l_start and l_len fields in the various target_flock structures are
supposed to be '__kernel_off_t' or '__kernel_loff_t', which means they
should be signed, not unsigned. Correcting the structure definitions means
that __get_user() and __put_user() will correctly sign extend them if
the guest is using 32 bit offsets and the host is using 64 bit offsets.
This fixes failures in the LTP 'fcntl14' tests where it checks that
negative seek offsets work correctly.
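The effect of the field's signedness on widening can be seen in a small sketch. The helper names are hypothetical and this is plain C integer conversion, not the __get_user() machinery itself: a 32-bit value widened through an unsigned type is zero-extended, while the same bits held in a signed type are sign-extended.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a 32-bit guest offset stored in an unsigned field: zero-extends. */
static int64_t widen_unsigned(uint32_t guest_off)
{
    return (int64_t)guest_off;
}

/* Widen a 32-bit guest offset stored in a signed field: sign-extends. */
static int64_t widen_signed(int32_t guest_off)
{
    return (int64_t)guest_off;
}
```

With an unsigned field, a guest offset of -256 would reach a 64-bit host as 4294967040, which is why negative seek offsets need the signed definition.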
We reindent the structures to drop hard tabs since we're touching 40%
of the fields anyway.
RV: long long -> abi_llong as suggested by Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Kevin Wolf [Fri, 3 Jun 2016 11:59:41 +0000 (13:59 +0200)]
qemu-img bench: Add --flush-interval
This option allows flushing the image periodically during write tests.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Kevin Wolf [Mon, 13 Jul 2015 11:13:17 +0000 (13:13 +0200)]
qemu-img bench: Implement -S (step size)
With this new option, qemu-img bench can be told to advance the current
offset after each request by a different value than the buffer size.
This is useful for controlling the conditions for cluster allocation in
image formats (e.g. qcow2 cluster allocation with COW in front of the
request, or COW areas that aren't overwritten immediately).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Kevin Wolf [Fri, 10 Jul 2015 16:09:18 +0000 (18:09 +0200)]
qemu-img bench: Make start offset configurable
This patch adds an option to specify the offset of the first request
made by qemu-img bench. This allows benchmarking misaligned requests.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Kevin Wolf [Fri, 10 Jul 2015 16:09:18 +0000 (18:09 +0200)]
qemu-img bench: Sequential writes
This extends qemu-img bench with an option that makes it use sequential
writes instead of reads for the test run.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Kevin Wolf [Tue, 5 Aug 2014 12:17:13 +0000 (14:17 +0200)]
qemu-img bench
This adds a qemu-img command that allows doing some simple benchmarks
for the block layer without involving guest devices and a real VM.
For the start, this implements only a test of sequential reads.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Kevin Wolf [Tue, 7 Jun 2016 13:51:28 +0000 (15:51 +0200)]
block: Don't emulate natively supported pwritev flags
Drivers that implement .bdrv_co_pwritev() get the flags passed as an
argument to said function, but we also unconditionally emulate the flags
anyway. We shouldn't do that.
Fix this by clearing all flags that the driver supports natively after
it returns from .bdrv_co_pwritev().
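A minimal sketch of the described fix, under stated assumptions: the struct, the flag value, and the function names are all hypothetical, and the only emulated flag is a FUA-style flag that would otherwise be emulated with a flush. After the driver call returns, the natively supported bits are masked out so that only the remainder is emulated.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_REQ_FUA 0x1u   /* hypothetical "force unit access" flag */

struct sketch_driver {
    unsigned supported_write_flags;  /* flags handled natively */
    unsigned last_flags_seen;        /* records what the driver was passed */
};

/* Hypothetical driver callback: it receives the flags as an argument. */
static int sketch_driver_pwritev(struct sketch_driver *drv, unsigned flags)
{
    drv->last_flags_seen = flags;
    return 0;
}

/* Returns 0 on success; *flushed reports whether FUA had to be emulated. */
static int sketch_write(struct sketch_driver *drv, unsigned flags, bool *flushed)
{
    int ret = sketch_driver_pwritev(drv, flags);

    /* Drop whatever the driver already honoured natively. */
    flags &= ~drv->supported_write_flags;

    /* Emulate only what is left, e.g. FUA via an explicit flush. */
    *flushed = (ret == 0) && (flags & SKETCH_REQ_FUA);
    return ret;
}
```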
Fixes: 4df863f3 ('block: Make supported_write_flags a per-bds property')
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Colin Lord [Mon, 6 Jun 2016 18:15:22 +0000 (14:15 -0400)]
blockdev: clean up error handling in do_open_tray
Returns negative error codes and accompanying error messages in cases where
the device has no tray or the tray is locked and isn't forced open. This
extra information should result in better flexibility in functions that
call do_open_tray.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Colin Lord <clord@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kevin Wolf [Mon, 6 Jun 2016 10:53:22 +0000 (12:53 +0200)]
block: Fix bdrv_all_delete_snapshot() error handling
The code to exit the loop after bdrv_snapshot_delete_by_id_or_name()
returned failure was duplicated. The first copy of it was too early so
that the AioContext lock would not be freed. This patch removes it so
that only the second, correct copy remains.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Thu, 2 Jun 2016 15:58:15 +0000 (18:58 +0300)]
qcow2: avoid extra flushes in qcow2
The problem with excessive flushing was found by a couple of performance
tests:
- parallel directory tree creation (from 2 processes)
- 32 cached writes + fsync at the end in a loop
For the first one results improved from 2.6 loops/sec to 3.5 loops/sec.
Each loop creates 10^3 directories with 10 files in each.
For the second one results improved from ~600 fsync/sec to ~1100
fsync/sec. Though, it was run on SSD so it probably won't show such
performance gain on rotational media.
qcow2_cache_flush() calls bdrv_flush() unconditionally after writing
cache entries of a particular cache. This can lead to as many as
2 additional fdatasyncs inside bdrv_flush.
We can simply skip all fdatasync calls inside qcow2_co_flush_to_os
as bdrv_flush for sure will do the job. These flushes are necessary to
keep the right order of writes to the different caches. Though this is
not necessary in the current code base as this ordering is ensured through
the flush in qcow2_cache_flush_dependency().
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Pavel Borzenkov <pborzenkov@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fam Zheng [Fri, 3 Jun 2016 02:07:02 +0000 (10:07 +0800)]
raw-posix: Fetch max sectors for host block device
This is sometimes a useful value we should count in.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Peter Lieven [Mon, 30 May 2016 11:59:59 +0000 (13:59 +0200)]
block: assert that bs->request_alignment is a power of 2
At least bdrv_co_preadv/pwritev expect this.
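One common reason code relies on a power-of-2 alignment is that rounding can then be done with a bit mask. The sketch below illustrates that general technique with hypothetical helper names; it is not taken from the block layer and makes no claim about how bdrv_co_preadv/pwritev use the value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if x is a non-zero power of two (exactly one bit set). */
static bool sketch_is_power_of_2(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Round offset down to a multiple of align; only valid for powers of two. */
static uint64_t sketch_align_down(uint64_t offset, uint32_t align)
{
    assert(sketch_is_power_of_2(align));
    return offset & ~((uint64_t)align - 1);
}
```

For an alignment such as 768 the mask trick would silently give wrong results, so asserting the precondition up front turns a subtle bug into an immediate failure.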
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kevin Wolf [Fri, 27 May 2016 17:50:37 +0000 (19:50 +0200)]
migration/block: Convert saving to BlockBackend
This creates a new BlockBackend for copying data from an image to the
migration stream on the source host. All I/O for block migration goes
through BlockBackend now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Kevin Wolf [Wed, 25 May 2016 15:20:06 +0000 (17:20 +0200)]
migration/block: Convert load to BlockBackend
This converts the loading part of block migration to use BlockBackend
interfaces rather than accessing the BlockDriverState directly.
Note that this takes a lazy shortcut. We should really use a separate
BlockBackend that is configured for the migration rather than for the
guest (e.g. writethrough caching is unnecessary) and holds its own
reference to the BlockDriverState, but the impact isn't that big and we
didn't have a separate migration reference before either, so it must be
good enough, I guess...
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Eric Blake [Wed, 1 Jun 2016 21:10:13 +0000 (15:10 -0600)]
block: Kill bdrv_co_write_zeroes()
Now that all drivers have been converted to a byte interface,
we no longer need a sector interface.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Wed, 1 Jun 2016 21:10:12 +0000 (15:10 -0600)]
vmdk: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Wed, 1 Jun 2016 21:10:11 +0000 (15:10 -0600)]
raw_bsd: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Wed, 1 Jun 2016 21:10:10 +0000 (15:10 -0600)]
raw-posix: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.
Signed-off-by: Eric Blake <eblake@redhat.com>
[ kwolf: Fixed up trace_paio_submit_co() call for qiov == NULL ]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Wed, 1 Jun 2016 21:10:09 +0000 (15:10 -0600)]
qed: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.
Kill an abuse of the comma operator while at it (fortunately,
the semantics were still right). Also, the test for requests
not aligned to clusters should be applied always, not just
when a backing file is present.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Wed, 1 Jun 2016 21:10:08 +0000 (15:10 -0600)]
gluster: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Wed, 1 Jun 2016 21:10:07 +0000 (15:10 -0600)]
blkreplay: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Wed, 1 Jun 2016 21:10:06 +0000 (15:10 -0600)]
qcow2: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Wed, 1 Jun 2016 21:10:05 +0000 (15:10 -0600)]
iscsi: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.
As this is the first byte-based iscsi interface, convert
is_request_lun_aligned() into two versions, one for sectors
and one for bytes. Also, change from outright -EINVAL failure
on an unaligned request, to instead failing with -ENOTSUP to
trigger a read-modify-write fallback, particularly since the
block layer should be honoring bs->request_alignment to avoid
-EINVAL on read/write requests.
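The error-code choice can be sketched as follows. This is an illustration only: the function names are hypothetical, the alignment rule is simplified to "offset and length are multiples of the block size", and no real iSCSI command is issued.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical byte-based check: both ends must fall on a block boundary. */
static bool sketch_is_byte_request_aligned(int64_t offset, int64_t bytes,
                                           uint32_t block_size)
{
    assert(block_size > 0);
    return offset % block_size == 0 && bytes % block_size == 0;
}

/* Hypothetical write-zeroes entry point for a device with fixed-size blocks. */
static int sketch_write_zeroes(int64_t offset, int64_t bytes, uint32_t block_size)
{
    if (!sketch_is_byte_request_aligned(offset, bytes, block_size)) {
        return -ENOTSUP;    /* ask the caller to fall back, not to fail */
    }
    return 0;               /* a real driver would issue the command here */
}
```

Returning -ENOTSUP rather than -EINVAL signals "this path cannot handle the request" instead of "the request is invalid", which leaves the caller free to use a slower fallback.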
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Wed, 1 Jun 2016 21:10:04 +0000 (15:10 -0600)]
block: Switch bdrv_write_zeroes() to byte interface
Rename to bdrv_pwrite_zeroes() to let the compiler ensure we
cater to the updated semantics. Do the same for bdrv_co_write_zeroes().
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Wed, 1 Jun 2016 21:10:03 +0000 (15:10 -0600)]
block: Add .bdrv_co_pwrite_zeroes()
Update bdrv_co_do_write_zeroes() to be byte-based, and select
between the new byte-based bdrv_co_pwrite_zeroes() or the old
bdrv_co_write_zeroes(). The next patches will convert drivers,
then remove the old interface.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Wed, 1 Jun 2016 21:10:02 +0000 (15:10 -0600)]
block: Track write zero limits in bytes
Another step towards removing sector-based interfaces: convert
the maximum write and minimum alignment values from sectors to
bytes. Rename the variables to let the compiler check that all
users are converted to the new semantics.
The maximum remains an int as long as BDRV_REQUEST_MAX_SECTORS
is constrained by INT_MAX (this means that we can't even
support a 2G write_zeroes, but just under it) - changing
operation lengths to unsigned or to 64-bits is a much bigger
audit, and debatable if we even want to do it (since at the
core, a 32-bit platform will still have ssize_t as its
underlying limit on write()).
Meanwhile, alignment is changed to 'uint32_t', since it makes no
sense to have an alignment larger than the maximum write, and
less painful to use an unsigned type with well-defined behavior
in bit operations than to have to worry about what happens if
a driver mistakenly supplies a negative alignment.
Add an assert that no one was trying to use sectors to get a
write zeroes larger than 2G, and therefore that a later conversion
to bytes won't be impacted by keeping the limit at 32 bits.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Wed, 1 Jun 2016 21:10:01 +0000 (15:10 -0600)]
iscsi: Use block size as minimum zero/discard alignment
If hardware does not advertise a minimum zero/discard
alignment, we still want to guarantee that the block layer
will align requests to our blocks, rather than the arbitrary
512-byte BDRV sector size.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 26 May 2016 03:48:49 +0000 (21:48 -0600)]
qcow2: Catch more unaligned write_zero into zero cluster
is_zero_cluster() and is_zero_cluster_top_locked() are used only
by qcow2_co_write_zeroes(). The former is too broad (we don't
care if the sectors we are about to overwrite are non-zero, only
that all other sectors in the cluster are zero), so it needs to
be called up to twice but with smaller limits - rename it along
with adding the needed parameter. The latter can be inlined for
more compact code.
The testsuite change shows that we now have a sparser top file
when an unaligned write_zeroes overwrites the only portion of
the backing file with data.
Based on a patch proposal by Denis V. Lunev.
CC: Denis V. Lunev <den@openvz.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 26 May 2016 03:48:48 +0000 (21:48 -0600)]
qemu-iotests: Test one more spot for optimizing write_zeroes
Add another test to 154, showing that we currently allocate a
data cluster in the top layer if any sector of the backing file
was allocated. The next patch will optimize this case.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Denis V. Lunev [Thu, 26 May 2016 03:48:47 +0000 (21:48 -0600)]
qcow2: add tracepoints for qcow2_co_write_zeroes
This patch follows guidelines of all other tracepoints in qcow2, like ones
in qcow2_co_writev. I think that they should dump values in the same
quantities or be changed all together.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1463476543-3087-4-git-send-email-den@openvz.org>
[eblake: typo fix in commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Denis V. Lunev [Thu, 26 May 2016 03:48:46 +0000 (21:48 -0600)]
qcow2: simplify logic in qcow2_co_write_zeroes
Unaligned requests will occupy only one cluster. This is true since the
previous commit. Simplify the code taking this consideration into
account.
In other words, the caller is now buggy if it ever passes us an unaligned
request that crosses cluster boundaries (the only requests that can cross
boundaries will be aligned).
There are no other changes so far.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1463476543-3087-3-git-send-email-den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Denis V. Lunev [Thu, 26 May 2016 03:48:45 +0000 (21:48 -0600)]
block: split write_zeroes always
We should split requests even if they are less than write_zeroes_alignment.
For example we can have the following request:
offset 62k
size 4k
write_zeroes_alignment 64k
The original code sent 1 request covering 2 qcow2 clusters, and resulted
in both clusters being allocated. But by splitting the request, we can
cater to the case where one of the two clusters can be zeroed as a
whole, for only 1 cluster allocated after the operation.
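The arithmetic behind that example can be sketched with a hypothetical helper that returns the length of the next piece of a request. It is not the block layer's loop, and it assumes "k" means 1024 bytes. For offset 62k, size 4k and alignment 64k it yields a 2k head ending at the 64k boundary, followed by a 2k piece starting at 64k.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Length of the next piece of a request so that no piece crosses an
 * alignment boundary unless it starts on one. 'align' must be non-zero.
 */
static uint64_t sketch_next_chunk(uint64_t offset, uint64_t bytes, uint64_t align)
{
    assert(align > 0);
    uint64_t head = offset % align;

    if (head != 0) {
        /* Unaligned start: stop at the next boundary (or at the end). */
        uint64_t to_boundary = align - head;
        return bytes < to_boundary ? bytes : to_boundary;
    }
    if (bytes >= align) {
        /* Aligned start: take as many whole aligned units as fit. */
        return bytes - bytes % align;
    }
    return bytes;   /* aligned start, short tail */
}
```

Each piece then lies within a single cluster or covers whole clusters, so the driver can treat the two cases separately.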
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1463476543-3087-2-git-send-email-den@openvz.org>
[eblake: Avoid exceeding nb_sectors, hoist alignment checks out of
loop, and update testsuite to show that patch works]
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Paolo Bonzini [Mon, 6 Jun 2016 14:46:57 +0000 (16:46 +0200)]
tests/docker: build all targets in test-clang
Warnings specific to clang may affect devices that are not built by
x86_64-softmmu and aarch64-softmmu. Build all targets since that
is also what Peter does.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1465224417-141321-7-git-send-email-pbonzini@redhat.com
Signed-off-by: Fam Zheng <famz@redhat.com>
Paolo Bonzini [Mon, 6 Jun 2016 14:46:56 +0000 (16:46 +0200)]
tests/docker: support travis test with fedora image
Install sparse and PyYAML.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1465224417-141321-6-git-send-email-pbonzini@redhat.com
Signed-off-by: Fam Zheng <famz@redhat.com>
Paolo Bonzini [Mon, 6 Jun 2016 14:46:55 +0000 (16:46 +0200)]
tests/docker: remove unused feature "ccache"
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1465224417-141321-5-git-send-email-pbonzini@redhat.com
Signed-off-by: Fam Zheng <famz@redhat.com>
Paolo Bonzini [Mon, 6 Jun 2016 14:46:54 +0000 (16:46 +0200)]
tests/docker: fix test-mingw
Add flex and bison for use in test-mingw, because test-mingw
uses the in-tree libdtc.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1465224417-141321-4-git-send-email-pbonzini@redhat.com
Signed-off-by: Fam Zheng <famz@redhat.com>
Paolo Bonzini [Mon, 6 Jun 2016 14:46:53 +0000 (16:46 +0200)]
tests/docker: make test-full build all targets, not none
Fix common.rc to avoid passing an empty --target-list= option to configure.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1465224417-141321-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Fam Zheng <famz@redhat.com>
Paolo Bonzini [Mon, 6 Jun 2016 14:46:52 +0000 (16:46 +0200)]
tests/docker: fix make-archive-maybe
make-archive-maybe expects an archive path relative
to $1, but receives a path relative to the current directory. Redirect
the output outside the subshell to bypass the "cd $1".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1465224417-141321-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Fam Zheng <famz@redhat.com>
Peter Maydell [Mon, 6 Jun 2016 18:58:14 +0000 (19:58 +0100)]
linux-user: Use safe_syscall wrapper for ioctl
Use the safe_syscall wrapper to implement the ioctl syscall.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Peter Maydell [Mon, 6 Jun 2016 18:58:13 +0000 (19:58 +0100)]
linux-user: Use safe_syscall wrapper for accept and accept4 syscalls
Use the safe_syscall wrapper for the accept and accept4 syscalls.
accept4 has been in the kernel since 2.6.28 so we can assume it
is always present.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Peter Maydell [Mon, 6 Jun 2016 18:58:12 +0000 (19:58 +0100)]
linux-user: Use safe_syscall wrapper for semop
Use the safe_syscall wrapper for the semop syscall or IPC operation.
(We implement it via the semtimedop syscall to make it easier to
implement the guest semtimedop syscall later.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Peter Maydell [Mon, 6 Jun 2016 18:58:11 +0000 (19:58 +0100)]
linux-user: Use safe_syscall wrapper for epoll_wait syscalls
Use the safe_syscall wrapper for epoll_wait and epoll_pwait syscalls.
Since we now directly use the host epoll_pwait syscall for both
epoll_wait and epoll_pwait, we don't need the configure machinery
to check whether glibc supports epoll_pwait(). (The kernel has
supported the syscall since 2.6.19 so we can assume it's always there.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
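A rough illustration of the epoll_wait change above: with a NULL signal mask, epoll_pwait behaves like epoll_wait, so a single host call can serve both guest syscalls. This sketch uses the libc epoll_pwait() wrapper for brevity rather than the raw host syscall behind the safe_syscall wrapper that the commit describes; the function name epoll_wait_via_pwait is made up for this example.

```c
#include <stddef.h>
#include <sys/epoll.h>

/*
 * Hypothetical helper: service an epoll_wait-style request with
 * epoll_pwait. Passing NULL as the signal mask leaves the caller's
 * signal mask untouched, which matches epoll_wait semantics.
 */
static int epoll_wait_via_pwait(int epfd, struct epoll_event *events,
                                int maxevents, int timeout_ms)
{
    return epoll_pwait(epfd, events, maxevents, timeout_ms, NULL);
}
```

A guest epoll_pwait request would take the same path, only with a non-NULL mask in the last argument.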
Peter Maydell [Mon, 6 Jun 2016 18:58:10 +0000 (19:58 +0100)]
linux-user: Use safe_syscall wrapper for poll and ppoll syscalls
Use the safe_syscall wrapper for the poll and ppoll syscalls.
Since not all host architectures have a poll syscall, we
have to rewrite the TARGET_NR_poll handling to use ppoll instead
(we can assume every host has ppoll by now).
We take the opportunity to switch to the code structure
already used in the implementation of epoll_wait and epoll_pwait,
which uses a switch() to avoid interleaving #if and if (),
and to stop using a variable with a leading '_' which is in
the implementation's namespace.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
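One detail implied by handling poll through ppoll, as described above, is the timeout: poll takes milliseconds as an int, while ppoll takes a pointer to a struct timespec, with NULL meaning "wait forever". The following is a minimal sketch of that conversion only. The helper name is hypothetical and this is not the code from the commit.

```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: convert a poll()-style timeout in milliseconds
 * into the timespec pointer expected by ppoll(). A negative timeout
 * means "block indefinitely", which ppoll() expresses as NULL.
 */
static struct timespec *poll_timeout_to_timespec(int timeout_ms,
                                                 struct timespec *ts)
{
    if (timeout_ms < 0) {
        return NULL;
    }
    ts->tv_sec = timeout_ms / 1000;
    ts->tv_nsec = (long)(timeout_ms % 1000) * 1000000L;
    return ts;
}
```

The returned pointer can be passed straight through as the timeout argument of a ppoll call, with a NULL signal mask to keep plain poll behaviour.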
Peter Maydell [Mon, 6 Jun 2016 18:58:09 +0000 (19:58 +0100)]
linux-user: Use safe_syscall wrapper for sleep syscalls
Use the safe_syscall wrapper for the clock_nanosleep and nanosleep
syscalls.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Peter Maydell [Mon, 6 Jun 2016 18:58:08 +0000 (19:58 +0100)]
linux-user: Use safe_syscall wrapper for rt_sigtimedwait syscall
Use the safe_syscall wrapper for the rt_sigtimedwait syscall.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Peter Maydell [Mon, 6 Jun 2016 18:58:07 +0000 (19:58 +0100)]
linux-user: Use safe_syscall wrapper for flock
Use the safe_syscall wrapper for the flock syscall.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Peter Maydell [Mon, 6 Jun 2016 18:58:06 +0000 (19:58 +0100)]
linux-user: Use safe_syscall wrapper for mq_timedsend and mq_timedreceive
Use the safe_syscall wrapper for mq_timedsend and mq_timedreceive syscalls.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Peter Maydell [Mon, 6 Jun 2016 18:58:05 +0000 (19:58 +0100)]
linux-user: Use safe_syscall wrapper for msgsnd and msgrcv
Use the safe_syscall wrapper for msgsnd and msgrcv syscalls.
This is made slightly awkward by some host architectures providing
only a single 'ipc' syscall rather than separate syscalls per
operation; we provide safe_msgsnd() and safe_msgrcv() as wrappers
around safe_ipc() to handle this if needed.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
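The "separate syscall versus multiplexed ipc syscall" split mentioned above can be sketched as a compile-time choice. This example calls the plain libc syscall() function, not QEMU's safe_syscall machinery, and the names example_msgsnd and EXAMPLE_IPCOP_MSGSND are assumptions made for illustration. The operation number 11 is the value of MSGSND in the kernel's linux/ipc.h.

```c
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Operation selector for msgsnd on the multiplexed ipc syscall. */
#define EXAMPLE_IPCOP_MSGSND 11

/*
 * Hypothetical wrapper: use the dedicated msgsnd syscall where the
 * host provides one, otherwise route the request through ipc().
 * Returns -1 with errno set on failure, like syscall() itself.
 */
static int example_msgsnd(int msqid, const void *msgp, size_t msgsz,
                          int msgflg)
{
#ifdef SYS_msgsnd
    return (int)syscall(SYS_msgsnd, msqid, msgp, msgsz, msgflg);
#else
    /* ipc(call, first, second, third, ptr, fifth) */
    return (int)syscall(SYS_ipc, EXAMPLE_IPCOP_MSGSND, msqid, msgsz,
                        msgflg, msgp, 0);
#endif
}
```

Callers see one function either way, which is the point of hiding the host difference behind a wrapper.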
Peter Maydell [Mon, 6 Jun 2016 18:58:04 +0000 (19:58 +0100)]
linux-user: Use safe_syscall wrapper for send* and recv* syscalls
Use the safe_syscall wrapper for the send, sendto, sendmsg, recv,
recvfrom and recvmsg syscalls.
RV: adjusted to apply
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Peter Maydell [Mon, 6 Jun 2016 18:58:03 +0000 (19:58 +0100)]
linux-user: Use safe_syscall wrapper for connect syscall
Use the safe_syscall wrapper for the connect syscall.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Peter Maydell [Mon, 6 Jun 2016 18:58:02 +0000 (19:58 +0100)]
linux-user: Use safe_syscall wrapper for readv and writev syscalls
Use the safe_syscall wrapper for readv and writev syscalls.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Peter Maydell [Tue, 31 May 2016 14:45:11 +0000 (15:45 +0100)]
linux-user: Fix error conversion in 64-bit fadvise syscall
Fix a missing host-to-target errno conversion in the 64-bit
fadvise syscall emulation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Peter Maydell [Tue, 31 May 2016 14:45:10 +0000 (15:45 +0100)]
linux-user: Fix NR_fadvise64 and NR_fadvise64_64 for 32-bit guests
Fix errors in the implementation of NR_fadvise64 and NR_fadvise64_64
for 32-bit guests, which pass their off_t values in register pairs.
We can't use the 64-bit code path for this, so split out the 32-bit
cases. This lets us correctly handle both syscall flavours ("only
offset is 64-bit" and "both offset and length are 64-bit") and both
target flavours ("uses aligned register pairs" and "does not").
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
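The "aligned register pairs" distinction above can be illustrated with a small helper. On targets that align pairs, a 64-bit argument must start in an even-numbered argument register, so an odd position costs one padding register. This is a sketch under that assumption; the name regpair_start is hypothetical and the helper is not taken from the commit.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: given how many 32-bit argument registers are
 * already used (next_reg), return the index of the register where the
 * next 64-bit argument starts.
 */
static int regpair_start(int next_reg, bool aligned_pairs)
{
    if (aligned_pairs && (next_reg % 2) != 0) {
        return next_reg + 1; /* skip one register as padding */
    }
    return next_reg;
}
```

For a syscall whose first argument is a 32-bit fd followed by a 64-bit offset, regpair_start(1, true) yields 2 and regpair_start(1, false) yields 1, so every later argument shifts by one register on aligned-pair targets.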
Peter Maydell [Tue, 31 May 2016 14:45:09 +0000 (15:45 +0100)]
linux-user: Fix handling of arm_fadvise64_64 syscall
32-bit ARM has an odd variant of the fadvise syscall which has
rearranged arguments, which we try to implement. Unfortunately we got
the rearrangement wrong.
This is a six-argument syscall whose arguments are:
* fd
* advise parameter
* offset high half
* offset low half
* len high half
* len low half
Stop trying to share code with the standard fadvise syscalls,
and just implement the syscall with the correct argument order.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
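The six-argument layout listed above can be sketched as a decoder that turns the raw 32-bit argument registers into fd, advice, offset and length. The halves are taken in the order the commit message gives them (high, then low); guest endianness may affect which register holds which half, and this sketch ignores that. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Decoded form of the syscall arguments (hypothetical type). */
struct example_fadvise_args {
    int fd;
    int advice;
    uint64_t offset;
    uint64_t len;
};

/* Join two 32-bit halves into one 64-bit value. */
static uint64_t example_join_halves(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}

/*
 * Decode the six registers in the order listed in the commit message:
 * fd, advice, offset high, offset low, len high, len low.
 */
static struct example_fadvise_args
example_decode_arm_fadvise64_64(const uint32_t regs[6])
{
    struct example_fadvise_args a;

    a.fd = (int)regs[0];
    a.advice = (int)regs[1];
    a.offset = example_join_halves(regs[2], regs[3]);
    a.len = example_join_halves(regs[4], regs[5]);
    return a;
}
```

Because the advice parameter sits in the second position rather than the last, sharing a decoder with the standard fadvise argument order would need a shuffle step, which is the kind of rearrangement the commit says went wrong.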