Peter Maydell [Tue, 15 Oct 2019 09:55:38 +0000 (10:55 +0100)]
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Pull request
# gpg: Signature made Mon 14 Oct 2019 09:52:03 BST
# gpg: using RSA key
8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
test-bdrv-drain: fix iothread_join() hang
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Mon, 14 Oct 2019 16:12:19 +0000 (17:12 +0100)]
Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-
20191012' into staging
qemu-openbios queue
# gpg: Signature made Sat 12 Oct 2019 10:47:55 BST
# gpg: using RSA key
CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg: issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [full]
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C C9C4 5BC2 C56F AE0F 321F
* remotes/mcayland/tags/qemu-openbios-
20191012:
Update OpenBIOS images to
f28e16f9 built from submodule.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Mon, 14 Oct 2019 15:09:52 +0000 (16:09 +0100)]
Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-
20191011a' into staging
Migration pull 2019-10-11
Mostly cleanups and minor fixes
[Note I'm seeing a hang on the aarch64 hosted x86-64 tcg migration
test in xbzrle; but I'm seeing that on current head as well]
# gpg: Signature made Fri 11 Oct 2019 20:14:31 BST
# gpg: using RSA key
45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-
20191011a: (21 commits)
migration: Support gtree migration
migration/multifd: pages->used would be cleared when attach to multifd_send_state
migration/multifd: initialize packet->magic/version once at setup stage
migration/multifd: use pages->allocated instead of the static max
migration/multifd: fix a typo in comment of multifd_recv_unfill_packet()
migration/postcopy: check PostcopyState before setting to POSTCOPY_INCOMING_RUNNING
migration/postcopy: rename postcopy_ram_enable_notify to postcopy_ram_incoming_setup
migration/postcopy: postpone setting PostcopyState to END
migration/postcopy: mis->have_listen_thread check will never be touched
migration: report SaveStateEntry id and name on failure
migration: pass in_postcopy instead of check state again
migration/postcopy: fix typo in mark_postcopy_blocktime_begin's comment
migration/postcopy: map large zero page in postcopy_ram_incoming_setup()
migration/postcopy: allocate tmp_page in setup stage
migration: Don't try and recover return path in non-postcopy
rcu: Use automatic rc_read unlock in core memory/exec code
migration: Use automatic rcu_read unlock in rdma.c
migration: Use automatic rcu_read unlock in ram.c
migration: Fix missing rcu_read_unlock
rcu: Add automatically released rcu_read_lock variants
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Mon, 14 Oct 2019 14:09:08 +0000 (15:09 +0100)]
Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-
20191010.0' into staging
VFIO update 2019-10-10
- Fix MSI error path double free (Evgeny Yakovlev)
# gpg: Signature made Thu 10 Oct 2019 20:07:39 BST
# gpg: using RSA key
239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>" [full]
# gpg: aka "Alex Williamson <alex@shazbot.org>" [full]
# gpg: aka "Alex Williamson <alwillia@redhat.com>" [full]
# gpg: aka "Alex Williamson <alex.l.williamson@gmail.com>" [full]
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B 8A90 239B 9B6E 3BB0 8B22
* remotes/awilliam/tags/vfio-update-
20191010.0:
hw/vfio/pci: fix double free in vfio_msi_disable
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Mon, 14 Oct 2019 12:34:39 +0000 (13:34 +0100)]
Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2019-10-10' into staging
The most notable change is that we now detect cross-device setups in the
host since it may cause inode number collision and mayhem in the guest.
A new fsdev property is added for the user to choose the appropriate
policy to handle that: either remap all inode numbers or fail I/Os to
another host device or just print out a warning (default behaviour).
This is also my last PR as _active_ maintainer of 9pfs.
# gpg: Signature made Thu 10 Oct 2019 12:14:07 BST
# gpg: using RSA key
B4828BAF943140CEF2A3491071D4D5E5822F73D6
# gpg: Good signature from "Greg Kurz <groug@kaod.org>" [full]
# gpg: aka "Gregory Kurz <gregory.kurz@free.fr>" [full]
# gpg: aka "[jpeg image of size 3330]" [full]
# Primary key fingerprint: B482 8BAF 9431 40CE F2A3 4910 71D4 D5E5 822F 73D6
* remotes/gkurz/tags/9p-next-2019-10-10:
MAINTAINERS: Downgrade status of virtio-9p to "Odd Fixes"
9p: Use variable length suffixes for inode remapping
9p: stat_to_qid: implement slow path
9p: Added virtfs option 'multidevs=remap|forbid|warn'
9p: Treat multiple devices on one export as an error
fsdev: Add return value to fsdev_throttle_parse_opts()
9p: Simplify error path of v9fs_device_realize_common()
9p: unsigned type for type, version, path
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Mon, 14 Oct 2019 11:26:37 +0000 (12:26 +0100)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2019-10-10' into staging
Block patches:
- Parallelized request handling for qcow2
- Backup job refactoring to use a filter node instead of before-write
notifiers
- Add discard accounting information to file-posix nodes
- Allow trivial reopening of nbd nodes
- Some iotest fixes
# gpg: Signature made Thu 10 Oct 2019 12:40:34 BST
# gpg: using RSA key
91BEB60A30DB3E8857D11829F407DB0061D5CF40
# gpg: issuer "mreitz@redhat.com"
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full]
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* remotes/maxreitz/tags/pull-block-2019-10-10: (36 commits)
iotests/162: Fix for newer Linux 5.3+
tests: fix I/O test for hosts defaulting to LUKSv2
nbd: add empty .bdrv_reopen_prepare
block/backup: use backup-top instead of write notifiers
block: introduce backup-top filter driver
block/block-copy: split block_copy_set_callbacks function
block/backup: move write_flags calculation inside backup_job_create
block/backup: move in-flight requests handling from backup to block-copy
iotests: Use stat -c %b in 125
iotests: Disable 125 on broken XFS versions
iotests: Fix 125 for growth_mode = metadata
qapi: query-blockstat: add driver specific file-posix stats
file-posix: account discard operations
scsi: account unmap operations
scsi: move unmap error checking to the complete callback
scsi: store unmap offset and nb_sectors in request struct
ide: account UNMAP (TRIM) operations
block: add empty account cookie type
qapi: add unmap to BlockDeviceStats
qapi: group BlockDeviceStats fields
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Mon, 14 Oct 2019 09:42:35 +0000 (10:42 +0100)]
Merge remote-tracking branch 'remotes/davidhildenbrand/tags/s390x-tcg-2019-10-10' into staging
- MMU DAT translation rewrite and cleanup
- Implement more TCG CPU features related to the MMU (e.g., IEP)
- Add the current instruction length to unwind data and clean up
- Resolve one TODO for the MVCL instruction
# gpg: Signature made Thu 10 Oct 2019 12:25:06 BST
# gpg: using RSA key
1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A
# gpg: issuer "david@redhat.com"
# gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown]
# gpg: aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full]
# Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D FCCA 4DDE 10F7 00FF 835A
* remotes/davidhildenbrand/tags/s390x-tcg-2019-10-10: (31 commits)
s390x/tcg: MVCL: Exit to main loop if requested
target/s390x: Remove ILEN_UNWIND
target/s390x: Remove ilen argument from trigger_pgm_exception
target/s390x: Remove ilen argument from trigger_access_exception
target/s390x: Remove ILEN_AUTO
target/s390x: Rely on unwinding in s390_cpu_virt_mem_rw
target/s390x: Rely on unwinding in s390_cpu_tlb_fill
target/s390x: Simplify helper_lra
target/s390x: Remove fail variable from s390_cpu_tlb_fill
target/s390x: Return exception from translate_pages
target/s390x: Return exception from mmu_translate
target/s390x: Remove exc argument to mmu_translate_asce
target/s390x: Return exception from mmu_translate_real
target/s390x: Handle tec in s390_cpu_tlb_fill
target/s390x: Push trigger_pgm_exception lower in s390_cpu_tlb_fill
target/s390x: Use tcg_s390_program_interrupt in TCG helpers
target/s390x: Remove ilen parameter from s390_program_interrupt
target/s390x: Remove ilen parameter from tcg_s390_program_interrupt
target/s390x: Add ilen to unwind data
s390x/cpumodel: Add new TCG features to QEMU cpu model
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Stefan Hajnoczi [Thu, 3 Oct 2019 10:01:03 +0000 (11:01 +0100)]
test-bdrv-drain: fix iothread_join() hang
tests/test-bdrv-drain can hang in tests/iothread.c:iothread_run():
while (!atomic_read(&iothread->stopping)) {
aio_poll(iothread->ctx, true);
}
The iothread_join() function works as follows:
void iothread_join(IOThread *iothread)
{
iothread->stopping = true;
aio_notify(iothread->ctx);
qemu_thread_join(&iothread->thread);
If iothread_run() checks iothread->stopping before the iothread_join()
thread sets stopping to true, then aio_notify() may be optimized away
and iothread_run() hangs forever in aio_poll().
The correct way to change iothread->stopping is from a BH that executes
within iothread_run(). This ensures that iothread->stopping is checked
after we set it to true.
This was already fixed for ./iothread.c (note this is a different source
file!) by commit
2362a28ea11c145e1a13ae79342d76dc118a72a6 ("iothread:
fix iothread_stop() race condition"), but not for tests/iothread.c.
Fixes: 0c330a734b51c177ab8488932ac3b0c4d63a718a
("aio: introduce aio_co_schedule and aio_co_wake")
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <
20191003100103.331-1-stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Mark Cave-Ayland [Sat, 12 Oct 2019 09:17:11 +0000 (10:17 +0100)]
Update OpenBIOS images to
f28e16f9 built from submodule.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Eric Auger [Fri, 11 Oct 2019 12:17:24 +0000 (14:17 +0200)]
migration: Support gtree migration
Introduce support for GTree migration. A custom save/restore
is implemented. Each item is made of a key and a data.
If the key is a pointer to an object, 2 VMSDs are passed into
the GTree VMStateField.
When putting the items, the tree is traversed in sorted order by
g_tree_foreach.
On the get() path, gtrees must be allocated using the proper
key compare, key destroy and value destroy. This must be handled
beforehand, for example in a pre_load method.
Tests are added to test save/dump of structs containing gtrees
including the virtio-iommu domain/mappings scenario.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <
20191011121724.433-1-eric.auger@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
uintptr_t fixup for test on 32bit
Wei Yang [Fri, 11 Oct 2019 08:50:50 +0000 (16:50 +0800)]
migration/multifd: pages->used would be cleared when attach to multifd_send_state
When we found an available channel in multifd_send_pages(), its
pages->used is cleared and then attached to multifd_send_state.
It is not necessary to do this twice.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <
20191011085050.17622-5-richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Wei Yang [Fri, 11 Oct 2019 08:50:49 +0000 (16:50 +0800)]
migration/multifd: initialize packet->magic/version once at setup stage
MultiFDPacket_t's magic and version field never changes during
migration, so move these two fields in setup stage.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <
20191011085050.17622-4-richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Wei Yang [Fri, 11 Oct 2019 08:50:48 +0000 (16:50 +0800)]
migration/multifd: use pages->allocated instead of the static max
multifd_send_fill_packet() prepares meta data for following pages to
transfer. It would be more proper to fill pages->allocated instead of
static max value, especially we want to support flexible packet size.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <
20191011085050.17622-3-richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Wei Yang [Fri, 11 Oct 2019 08:50:47 +0000 (16:50 +0800)]
migration/multifd: fix a typo in comment of multifd_recv_unfill_packet()
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <
20191011085050.17622-2-richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Wei Yang [Thu, 10 Oct 2019 01:13:16 +0000 (09:13 +0800)]
migration/postcopy: check PostcopyState before setting to POSTCOPY_INCOMING_RUNNING
Currently, we set PostcopyState blindly to RUNNING, even we found the
previous state is not LISTENING. This will lead to a corner case.
First let's look at the code flow:
qemu_loadvm_state_main()
ret = loadvm_process_command()
loadvm_postcopy_handle_run()
return -1;
if (ret < 0) {
if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING)
...
}
>From above snippet, the corner case is loadvm_postcopy_handle_run()
always sets state to RUNNING. And then it checks the previous state. If
the previous state is not LISTENING, it will return -1. But at this
moment, PostcopyState is already been set to RUNNING.
Then ret is checked in qemu_loadvm_state_main(), when it is -1
PostcopyState is checked. Current logic would pause postcopy and retry
if PostcopyState is RUNNING. This is not what we expect, because
postcopy is not active yet.
This patch makes sure state is set to RUNNING only previous state is
LISTENING by checking the state first.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Suggested by: Peter Xu <peterx@redhat.com>
Message-Id: <
20191010011316.31363-3-richardw.yang@linux.intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Wei Yang [Thu, 10 Oct 2019 01:13:15 +0000 (09:13 +0800)]
migration/postcopy: rename postcopy_ram_enable_notify to postcopy_ram_incoming_setup
Function postcopy_ram_incoming_setup and postcopy_ram_incoming_cleanup
is a pair. Rename to make it clear for audience.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <
20191010011316.31363-2-richardw.yang@linux.intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Wei Yang [Sun, 6 Oct 2019 00:02:48 +0000 (08:02 +0800)]
migration/postcopy: postpone setting PostcopyState to END
There are two places to call function postcopy_ram_incoming_cleanup()
postcopy_ram_listen_thread on migration success
loadvm_postcopy_handle_listen one setup failure
On success, the vm will never accept another migration. On failure,
PostcopyState is transited from LISTENING to END and would be checked in
qemu_loadvm_state_main(). If PostcopyState is RUNNING, migration would
be paused and retried.
Currently PostcopyState is set to END in function
postcopy_ram_incoming_cleanup(). With above analysis, we can take this
step out and postpone this till the end of listen thread to indicate the
listen thread is done.
This is a preparation patch for later cleanup.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <
20191006000249.29926-3-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixed up in merge to the 1 parameter postcopy_state_set
Wei Yang [Sun, 6 Oct 2019 00:02:47 +0000 (08:02 +0800)]
migration/postcopy: mis->have_listen_thread check will never be touched
If mis->have_listen_thread is true, this means current PostcopyState
must be LISTENING or RUNNING. While the check at the beginning of the
function makes sure the state transaction happens when its previous
PostcopyState is ADVISE or DISCARD.
This means we would never touch this check.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <
20191006000249.29926-2-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Wei Yang [Sat, 5 Oct 2019 22:05:17 +0000 (06:05 +0800)]
migration: report SaveStateEntry id and name on failure
This provides helpful information on which entry failed.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <
20191005220517.24029-5-richardw.yang@linux.intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Wei Yang [Sat, 5 Oct 2019 22:05:16 +0000 (06:05 +0800)]
migration: pass in_postcopy instead of check state again
Not necessary to do the check again.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <
20191005220517.24029-4-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Wei Yang [Sat, 5 Oct 2019 22:05:15 +0000 (06:05 +0800)]
migration/postcopy: fix typo in mark_postcopy_blocktime_begin's comment
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <
20191005220517.24029-3-richardw.yang@linux.intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Wei Yang [Sat, 5 Oct 2019 13:50:21 +0000 (21:50 +0800)]
migration/postcopy: map large zero page in postcopy_ram_incoming_setup()
postcopy_ram_incoming_setup() and postcopy_ram_incoming_cleanup() are
counterpart. It is reasonable to map/unmap large zero page in these two
functions respectively.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <
20191005135021.21721-3-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Wei Yang [Sat, 5 Oct 2019 13:50:20 +0000 (21:50 +0800)]
migration/postcopy: allocate tmp_page in setup stage
During migration, a tmp page is allocated so that we could place a whole
host page during postcopy.
Currently the page is allocated during load stage, this is a little bit
late. And more important, if we failed to allocate it, the error is not
checked properly. Even it is NULL, we would still use it.
This patch moves the allocation to setup stage and if failed error
message would be printed and caller would notice it.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Dr. David Alan Gilbert [Mon, 7 Oct 2019 10:35:07 +0000 (11:35 +0100)]
migration: Don't try and recover return path in non-postcopy
In normal precopy we can't do reconnection recovery - but we also
don't need to, since you can just rerun migration.
At the moment if the 'return-path' capability is on, we use
the return path in precopy to give a positive 'OK' to the end
of migration; however if migration fails then we fall into
the postcopy recovery path and hang. This fixes it by only
running the return path in the postcopy case.
Reported-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Dr. David Alan Gilbert [Mon, 7 Oct 2019 14:36:41 +0000 (15:36 +0100)]
rcu: Use automatic rc_read unlock in core memory/exec code
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <
20191007143642.301445-6-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Dr. David Alan Gilbert [Mon, 7 Oct 2019 14:36:40 +0000 (15:36 +0100)]
migration: Use automatic rcu_read unlock in rdma.c
Use the automatic read unlocker in migration/rdma.c.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <
20191007143642.301445-5-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Dr. David Alan Gilbert [Mon, 7 Oct 2019 14:36:39 +0000 (15:36 +0100)]
migration: Use automatic rcu_read unlock in ram.c
Use the automatic read unlocker in migration/ram.c
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <
20191007143642.301445-4-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Dr. David Alan Gilbert [Mon, 7 Oct 2019 14:36:38 +0000 (15:36 +0100)]
migration: Fix missing rcu_read_unlock
Use the automatic rcu_read unlocker to fix a missing unlock.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <
20191007143642.301445-3-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Dr. David Alan Gilbert [Mon, 7 Oct 2019 14:36:37 +0000 (15:36 +0100)]
rcu: Add automatically released rcu_read_lock variants
RCU_READ_LOCK_GUARD() takes the rcu_read_lock and then uses glib's
g_auto infrastructure (and thus whatever the compiler's hooks are) to
release it on all exits of the block.
WITH_RCU_READ_LOCK_GUARD() is similar but is used as a wrapper for the
lock, i.e.:
WITH_RCU_READ_LOCK_GUARD() {
stuff under lock
}
Note the 'unused' attribute is needed to work around clang bug:
https://bugs.llvm.org/show_bug.cgi?id=43482
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <
20191007143642.301445-2-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Wei Yang [Wed, 17 Jul 2019 00:53:41 +0000 (08:53 +0800)]
migration: use migration_is_active to represent active state
Wrap the check into a function to make it easy to read.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <
20190717005341.14140-1-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Evgeny Yakovlev [Thu, 10 Oct 2019 17:07:28 +0000 (11:07 -0600)]
hw/vfio/pci: fix double free in vfio_msi_disable
The following guest behaviour patter leads to double free in VFIO PCI:
1. Guest enables MSI interrupts
vfio_msi_enable is called, but fails in vfio_enable_vectors.
In our case this was because VFIO GPU device was in D3 state.
Unhappy path in vfio_msi_enable will g_free(vdev->msi_vectors) but not
set this pointer to NULL
2. Guest still sees MSI an enabled after that because emulated config
write is done in vfio_pci_write_config unconditionally before calling
vfio_msi_enable
3. Guest disables MSI interrupts
vfio_msi_disable is called and tries to g_free(vdev->msi_vectors)
in vfio_msi_disable_common => double free
Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Greg Kurz [Thu, 10 Oct 2019 10:36:28 +0000 (12:36 +0200)]
MAINTAINERS: Downgrade status of virtio-9p to "Odd Fixes"
Latest submissions for 9pfs made me realize that I no longer have time
and motivation to actively support it. I'll stay around for odd fixes
though.
Signed-off-by: Greg Kurz <groug@kaod.org>
David Hildenbrand [Tue, 1 Oct 2019 18:03:54 +0000 (20:03 +0200)]
s390x/tcg: MVCL: Exit to main loop if requested
MVCL is interruptible and we should check for interrupts and process
them after writing back the variables to the registers. Let's check
for any exit requests and exit to the main loop. Introduce a new helper
function for that: cpu_loop_exit_requested().
When booting Fedora 30, I can see a handful of these exits and it seems
to work reliable. Also, Richard explained why this works correctly even
when MVCL is called via EXECUTE:
(1) TB with EXECUTE runs, at address Ae
- env->psw_addr stored with Ae.
- helper_ex() runs, memory address Am computed
from D2a(X2a,B2a) or from psw.addr+RI2.
- env->ex_value stored with memory value modified by R1a
(2) TB of executee runs,
- env->ex_value stored with 0.
- helper_mvcl() runs, using and updating R1b, R1b+1, R2b, R2b+1.
(3a) helper_mvcl() completes,
- TB of executee continues, psw.addr += ilen.
- Next instruction is the one following EXECUTE.
(3b) helper_mvcl() exits to main loop,
- cpu_loop_exit_restore() unwinds psw.addr = Ae.
- Next instruction is the EXECUTE itself...
- goto 1.
As the PoP mentiones that an interruptible instruction called via EXECUTE
should avoid modifying storage/registers that are used by EXECUTE itself,
it is fine to retrigger EXECUTE.
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Max Reitz [Wed, 2 Oct 2019 17:40:52 +0000 (19:40 +0200)]
iotests/162: Fix for newer Linux 5.3+
Linux 5.3 has made 0.0.0.0/8 a working IPv4 subnet. As such, "42" is
now a valid host, and the connection to it will (hopefully) time out
over a long period rather than quickly return with EINVAL.
So let us use a negative integer for testing that NBD will not crash
when it receives integer hosts. This way, the connection will again
fail quickly and reliably.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id:
20191002174052.5773-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Christian Schoenebeck [Mon, 7 Oct 2019 15:02:45 +0000 (17:02 +0200)]
9p: Use variable length suffixes for inode remapping
Use variable length suffixes for inode remapping instead of the fixed
16 bit size prefixes before. With this change the inode numbers on guest
will typically be much smaller (e.g. around >2^1 .. >2^7 instead of >2^48
with the previous fixed size inode remapping.
Additionally this solution is more efficient, since inode numbers in
practice can take almost their entire 64 bit range on guest as well, so
there is less likely a need for generating and tracking additional suffixes,
which might also be beneficial for nested virtualization where each level of
virtualization would shift up the inode bits and increase the chance of
expensive remapping actions.
The "Exponential Golomb" algorithm is used as basis for generating the
variable length suffixes. The algorithm has a parameter k which controls the
distribution of bits on increasing indeces (minimum bits at low index vs.
maximum bits at high index). With k=0 the generated suffixes look like:
Index Dec/Bin -> Generated Suffix Bin
1 [1] -> [1] (1 bits)
2 [10] -> [010] (3 bits)
3 [11] -> [110] (3 bits)
4 [100] -> [00100] (5 bits)
5 [101] -> [10100] (5 bits)
6 [110] -> [01100] (5 bits)
7 [111] -> [11100] (5 bits)
8 [1000] -> [
0001000] (7 bits)
9 [1001] -> [
1001000] (7 bits)
10 [1010] -> [
0101000] (7 bits)
11 [1011] -> [
1101000] (7 bits)
12 [1100] -> [
0011000] (7 bits)
...
65533 [
1111111111111101] -> [
1011111111111111000000000000000] (31 bits)
65534 [
1111111111111110] -> [
0111111111111111000000000000000] (31 bits)
65535 [
1111111111111111] -> [
1111111111111111000000000000000] (31 bits)
Hence minBits=1 maxBits=31
And with k=5 they would look like:
Index Dec/Bin -> Generated Suffix Bin
1 [1] -> [000001] (6 bits)
2 [10] -> [100001] (6 bits)
3 [11] -> [010001] (6 bits)
4 [100] -> [110001] (6 bits)
5 [101] -> [001001] (6 bits)
6 [110] -> [101001] (6 bits)
7 [111] -> [011001] (6 bits)
8 [1000] -> [111001] (6 bits)
9 [1001] -> [000101] (6 bits)
10 [1010] -> [100101] (6 bits)
11 [1011] -> [010101] (6 bits)
12 [1100] -> [110101] (6 bits)
...
65533 [
1111111111111101] -> [
0011100000000000100000000000] (28 bits)
65534 [
1111111111111110] -> [
1011100000000000100000000000] (28 bits)
65535 [
1111111111111111] -> [
0111100000000000100000000000] (28 bits)
Hence minBits=6 maxBits=28
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Antonios Motakis [Mon, 7 Oct 2019 15:02:45 +0000 (17:02 +0200)]
9p: stat_to_qid: implement slow path
stat_to_qid attempts via qid_path_prefixmap to map unique files (which are
identified by 64 bit inode nr and 32 bit device id) to a 64 QID path value.
However this implementation makes some assumptions about inode number
generation on the host.
If qid_path_prefixmap fails, we still have 48 bits available in the QID
path to fall back to a less memory efficient full mapping.
Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com>
[CS: - Rebased to https://github.com/gkurz/qemu/commits/9p-next
(SHA1
7fc4c49e91).
- Updated hash calls to new xxhash API.
- Removed unnecessary parantheses in qpf_lookup_func().
- Removed unnecessary g_malloc0() result checks.
- Log error message when running out of prefixes in
qid_path_fullmap().
- Log warning message about potential degraded performance in
qid_path_prefixmap().
- Wrapped qpf_table initialization to dedicated qpf_table_init()
function.
- Fixed typo in comment. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Antonios Motakis [Thu, 10 Oct 2019 09:36:05 +0000 (11:36 +0200)]
9p: Added virtfs option 'multidevs=remap|forbid|warn'
'warn' (default): Only log an error message (once) on host if more than one
device is shared by same export, except of that just ignore this config
error though. This is the default behaviour for not breaking existing
installations implying that they really know what they are doing.
'forbid': Like 'warn', but except of just logging an error this
also denies access of guest to additional devices.
'remap': Allows to share more than one device per export by remapping
inodes from host to guest appropriately. To support multiple devices on the
9p share, and avoid qid path collisions we take the device id as input to
generate a unique QID path. The lowest 48 bits of the path will be set
equal to the file inode, and the top bits will be uniquely assigned based
on the top 16 bits of the inode and the device id.
Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com>
[CS: - Rebased to https://github.com/gkurz/qemu/commits/9p-next
(SHA1
7fc4c49e91).
- Added virtfs option 'multidevs', original patch simply did the inode
remapping without being asked.
- Updated hash calls to new xxhash API.
- Updated docs for new option 'multidevs'.
- Fixed v9fs_do_readdir() not having remapped inodes.
- Log error message when running out of prefixes in
qid_path_prefixmap().
- Fixed definition of QPATH_INO_MASK.
- Wrapped qpp_table initialization to dedicated qpp_table_init()
function.
- Dropped unnecessary parantheses in qpp_lookup_func().
- Dropped unnecessary g_malloc0() result checks. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
[groug: - Moved "multidevs" parsing to the local backend.
- Added hint to invalid multidevs option error.
- Turn "remap" into "x-remap". ]
Signed-off-by: Greg Kurz <groug@kaod.org>
Antonios Motakis [Thu, 10 Oct 2019 09:36:05 +0000 (11:36 +0200)]
9p: Treat multiple devices on one export as an error
The QID path should uniquely identify a file. However, the
inode of a file is currently used as the QID path, which
on its own only uniquely identifies files within a device.
Here we track the device hosting the 9pfs share, in order
to prevent security issues with QID path collisions from
other devices.
We only print a warning for now but a subsequent patch will
allow users to have finer control over the desired behaviour.
Failing the I/O will be one the proposed behaviour, so we
also change stat_to_qid() to return an error here in order to
keep other patches simpler.
Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com>
[CS: - Assign dev_id to export root's device already in
v9fs_device_realize_common(), not postponed in
stat_to_qid().
- error_report_once() if more than one device was
shared by export.
- Return -ENODEV instead of -ENOSYS in stat_to_qid().
- Fixed typo in log comment. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
[groug, changed to warning, updated message and changelog]
Signed-off-by: Greg Kurz <groug@kaod.org>
Greg Kurz [Thu, 10 Oct 2019 09:36:05 +0000 (11:36 +0200)]
fsdev: Add return value to fsdev_throttle_parse_opts()
It is more convenient to use the return value of the function to notify
errors, rather than to be tied up setting up the &local_err boilerplate.
Signed-off-by: Greg Kurz <groug@kaod.org>
Greg Kurz [Thu, 10 Oct 2019 09:36:04 +0000 (11:36 +0200)]
9p: Simplify error path of v9fs_device_realize_common()
Make v9fs_device_unrealize_common() idempotent and use it for rollback,
in order to reduce code duplication.
Signed-off-by: Greg Kurz <groug@kaod.org>
Antonios Motakis [Thu, 10 Oct 2019 09:36:04 +0000 (11:36 +0200)]
9p: unsigned type for type, version, path
There is no need for signedness on these QID fields for 9p.
Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com>
[CS: - Also make QID type unsigned.
- Adjust donttouch_stat() to new types.
- Adjust trace-events to new types. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Daniel P. Berrangé [Fri, 27 Sep 2019 10:11:55 +0000 (11:11 +0100)]
tests: fix I/O test for hosts defaulting to LUKSv2
Some distros are now defaulting to LUKS version 2 which QEMU cannot
process. For our I/O test that validates interoperability between the
kernel/cryptsetup and QEMU, we need to explicitly ask for version 1
of the LUKS format.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id:
20190927101155.25896-1-berrange@redhat.com
Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Maxim Levitsky [Mon, 30 Sep 2019 21:38:20 +0000 (00:38 +0300)]
nbd: add empty .bdrv_reopen_prepare
Fixes commit job / qemu-img commit, when
commiting qcow2 file which is based on nbd export.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=
1718727
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-id:
20190930213820.29777-2-mlevitsk@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Tue, 1 Oct 2019 13:14:09 +0000 (16:14 +0300)]
block/backup: use backup-top instead of write notifiers
Drop write notifiers and use filter node instead.
= Changes =
1. Add filter-node-name argument for backup qmp api. We have to do it
in this commit, as 257 needs to be fixed.
2. There are no more write notifiers here, so is_write_notifier
parameter is dropped from block-copy paths.
3. To sync with in-flight requests at job finish we now have drained
removing of the filter, we don't need rw-lock.
4. Block-copy is now using BdrvChildren instead of BlockBackends
5. As backup-top owns these children, we also move block-copy state
into backup-top's ownership.
= Iotest changes =
56: op-blocker doesn't shoot now, as we set it on source, but then
check on filter, when trying to start second backup.
To keep the test we instead can catch another collision: both jobs will
get 'drive0' job-id, as job-id parameter is unspecified. To prevent
interleaving with file-posix locks (as they are dependent on config)
let's use another target for second backup.
Also, it's obvious now that we'd like to drop this op-blocker at all
and add a test-case for two backups from one node (to different
destinations) actually works. But not in these series.
141: Output changed: prepatch, "Node is in use" comes from bdrv_has_blk
check inside qmp_blockdev_del. But we've dropped block-copy blk
objects, so no more blk objects on source bs (job blk is on backup-top
filter bs). New message is from op-blocker, which is the next check in
qmp_blockdev_add.
257: The test wants to emulate guest write during backup. They should
go to filter node, not to original source node, of course. Therefore we
need to specify filter node name and use it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20191001131409.14202-6-vsementsov@virtuozzo.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Tue, 1 Oct 2019 13:14:08 +0000 (16:14 +0300)]
block: introduce backup-top filter driver
Backup-top filter caches write operations and does copy-before-write
operations.
The driver will be used in backup instead of write-notifiers.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20191001131409.14202-5-vsementsov@virtuozzo.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Tue, 1 Oct 2019 13:14:07 +0000 (16:14 +0300)]
block/block-copy: split block_copy_set_callbacks function
Split block_copy_set_callbacks out of block_copy_state_new. It's needed
for further commit: block-copy will use BdrvChildren of backup-top
filter, so it will be created from backup-top filter creation function.
But callbacks will still belong to backup job and will be set in
separate.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20191001131409.14202-4-vsementsov@virtuozzo.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Tue, 1 Oct 2019 13:14:06 +0000 (16:14 +0300)]
block/backup: move write_flags calculation inside backup_job_create
This is logic-less refactoring, which simplifies further patch, as
we'll need write_flags for backup-top filter creation and backup-top
should be created before block job creation.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20191001131409.14202-3-vsementsov@virtuozzo.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Tue, 1 Oct 2019 13:14:05 +0000 (16:14 +0300)]
block/backup: move in-flight requests handling from backup to block-copy
Move synchronization mechanism to block-copy, to be able to use one
block-copy instance from backup job and backup-top filter in parallel.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20191001131409.14202-2-vsementsov@virtuozzo.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Max Reitz [Wed, 25 Sep 2019 18:32:31 +0000 (20:32 +0200)]
iotests: Use stat -c %b in 125
125 should not use qemu-img to get the on-disk image size, because that
reports it in a human-readable format that is useless to us. Just use
stat instead (like we do to get the image file length).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190925183231.11196-4-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Max Reitz [Wed, 25 Sep 2019 18:32:30 +0000 (20:32 +0200)]
iotests: Disable 125 on broken XFS versions
And by that I mean all XFS versions, as far as I can tell. All details
are in the comment below.
We never noticed this problem because we only read the first number from
qemu-img info's "disk size" output -- and that is effectively useless,
because qemu-img prints a human-readable value (which generally includes
a decimal point). That will be fixed in the next patch.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190925183231.11196-3-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Max Reitz [Wed, 25 Sep 2019 18:32:29 +0000 (20:32 +0200)]
iotests: Fix 125 for growth_mode = metadata
If we use growth_mode = metadata, it is very much possible that the file
uses more disk space after we have written something to the added area.
We did indeed want to test for this case, but unfortunately we evidently
just copied the code from the "Test creation preallocation" section and
forgot to replace "$create_mode" by "$growth_mode".
We never noticed because we only read the first number from qemu-img
info's "disk size" output -- and that is effectively useless, because
qemu-img prints a human-readable value (which generally includes a
decimal point). That will be fixed in the patch after the next one.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190925183231.11196-2-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Anton Nefedov [Mon, 23 Sep 2019 12:17:37 +0000 (15:17 +0300)]
qapi: query-blockstat: add driver specific file-posix stats
A block driver can provide a callback to report driver-specific
statistics.
file-posix driver now reports discard statistics
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-id:
20190923121737.83281-10-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Anton Nefedov [Mon, 23 Sep 2019 12:17:36 +0000 (15:17 +0300)]
file-posix: account discard operations
This will help to identify how many of the user-issued discard operations
(accounted on a device level) have actually suceeded down on the host file
(even though the numbers will not be exactly the same if non-raw format
driver is used (e.g. qcow2 sending metadata discards)).
Note that these numbers will not include discards triggered by
write-zeroes + MAY_UNMAP calls.
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20190923121737.83281-9-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Anton Nefedov [Mon, 23 Sep 2019 12:17:35 +0000 (15:17 +0300)]
scsi: account unmap operations
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20190923121737.83281-8-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Anton Nefedov [Mon, 23 Sep 2019 12:17:34 +0000 (15:17 +0300)]
scsi: move unmap error checking to the complete callback
This will help to account the operation in the following commit.
The difference is that we don't call scsi_disk_req_check_error() before
the 1st discard iteration anymore. That function also checks if
the request is cancelled, however it shouldn't get canceled until it
yields in blk_aio() functions anyway.
Same approach is already used for emulate_write_same.
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id:
20190923121737.83281-7-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Anton Nefedov [Mon, 23 Sep 2019 12:17:33 +0000 (15:17 +0300)]
scsi: store unmap offset and nb_sectors in request struct
it allows to report it in the error handler
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Message-id:
20190923121737.83281-6-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Anton Nefedov [Mon, 23 Sep 2019 12:17:32 +0000 (15:17 +0300)]
ide: account UNMAP (TRIM) operations
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20190923121737.83281-5-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Anton Nefedov [Mon, 23 Sep 2019 12:17:31 +0000 (15:17 +0300)]
block: add empty account cookie type
Each block_acct_done/failed call is designed to correspond to a
previous block_acct_start call, which initializes the stats cookie.
However sometimes it is not the case, e.g. some error paths might
report the same cookie twice because it is hard to accurately track if
the cookie was reported yet or not.
This patch cleans the cookie after report.
(Note: block_acct_failed/done without a previous block_acct_start at
all should be avoided. Uninitialized cookie might hold a garbage value
and there is still "< BLOCK_MAX_IOTYPE" assertion for that)
It will be particularly useful in ide code where it's hard to
keep track whether the request done its accounting or not: in the
following patch of the series, trim requests will do the accounting
separately.
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20190923121737.83281-4-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Anton Nefedov [Mon, 23 Sep 2019 12:17:30 +0000 (15:17 +0300)]
qapi: add unmap to BlockDeviceStats
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id:
20190923121737.83281-3-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Anton Nefedov [Mon, 23 Sep 2019 12:17:29 +0000 (15:17 +0300)]
qapi: group BlockDeviceStats fields
Make the stat fields definition slightly more readable.
Also reorder total_time_ns stats read-write-flush as done elsewhere.
Cosmetic change only.
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20190923121737.83281-2-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2019 14:20:52 +0000 (17:20 +0300)]
iotests: 257: drop device_add
SCSI devices are unused in test, drop them.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190920142056.12778-12-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2019 14:20:51 +0000 (17:20 +0300)]
iotests: 257: drop unused Drive.device field
After previous commit Drive.device is actually unused. Drop it together
with .name property. While being here reuse .node in qmp commands
instead of writing 'drive0' twice.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190920142056.12778-11-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2019 14:20:50 +0000 (17:20 +0300)]
iotests: prepare 124 and 257 bitmap querying for backup-top filter
After backup-top filter appearing it's not possible to see dirty
bitmaps in top node, so use node-name instead.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190920142056.12778-10-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2019 14:20:49 +0000 (17:20 +0300)]
block: teach bdrv_debug_breakpoint skip filters with backing
Teach bdrv_debug_breakpoint and bdrv_debug_remove_breakpoint skip
filters with backing. This is needed to implement and use in backup job
it's own backup_top filter driver (like mirror already has one), and
without this improvement, breakpoint removal will fail at least in 55
iotest.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190920142056.12778-9-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2019 14:20:48 +0000 (17:20 +0300)]
block: move block_copy from block/backup.c to separate file
Split block_copy to separate file, to be cleanly shared with backup-top
filter driver in further commits.
It's a clean movement, the only change is drop "static" from interface
functions.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190920142056.12778-8-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2019 14:20:47 +0000 (17:20 +0300)]
block/backup: fix block-comment style
We need to fix comment style around block-copy functions before further
moving them to separate file to satisfy checkpatch. But do more: fix
all comments style. Also, seems like doubled first asterisk is not
forbidden, but drop it too for consistency.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190920142056.12778-7-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2019 14:20:46 +0000 (17:20 +0300)]
block/backup: introduce BlockCopyState
Split copying code part from backup to "block-copy", including separate
state structure and function renaming. This is needed to share it with
backup-top filter driver in further commits.
Notes:
1. As BlockCopyState keeps own BlockBackend objects, remaining
job->common.blk users only use it to get bs by blk_bs() call, so clear
job->commen.blk permissions set in block_job_create and add
job->source_bs to be used instead of blk_bs(job->common.blk), to keep
it more clear which bs we use when introduce backup-top filter in
further commit.
2. Rename s/initializing_bitmap/skip_unallocated/ to sound a bit better
as interface to BlockCopyState
3. Split is not very clean: there left some duplicated fields, backup
code uses some BlockCopyState fields directly, let's postpone it for
further improvements and keep this comment simpler for review.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20190920142056.12778-6-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2019 14:20:45 +0000 (17:20 +0300)]
block/backup: improve comment about image fleecing
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190920142056.12778-5-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2019 14:20:44 +0000 (17:20 +0300)]
block/backup: split shareable copying part from backup_do_cow
Split copying logic which will be shared with backup-top filter.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190920142056.12778-4-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2019 14:20:43 +0000 (17:20 +0300)]
block/backup: fix backup_cow_with_offload for last cluster
We shouldn't try to copy bytes beyond EOF. Fix it.
Fixes: 9ded4a0114968e
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id:
20190920142056.12778-3-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2019 14:20:42 +0000 (17:20 +0300)]
block/backup: fix max_transfer handling for copy_range
Of course, QEMU_ALIGN_UP is a typo, it should be QEMU_ALIGN_DOWN, as we
are trying to find aligned size which satisfy both source and target.
Also, don't ignore too small max_transfer. In this case seems safer to
disable copy_range.
Fixes: 9ded4a0114968e
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20190920142056.12778-2-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Mon, 16 Sep 2019 17:53:24 +0000 (20:53 +0300)]
block/qcow2: introduce parallel subrequest handling in read and write
It improves performance for fragmented qcow2 images.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20190916175324.18478-6-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Mon, 16 Sep 2019 17:53:23 +0000 (20:53 +0300)]
block/qcow2: refactor qcow2_co_pwritev_part
Similarly to previous commit, prepare for parallelizing write-loop
iterations.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190916175324.18478-5-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Mon, 16 Sep 2019 17:53:22 +0000 (20:53 +0300)]
block/qcow2: refactor qcow2_co_preadv_part
Further patch will run partial requests of iterations of
qcow2_co_preadv in parallel for performance reasons. To prepare for
this, separate part which may be parallelized into separate function
(qcow2_co_preadv_task).
While being here, also separate encrypted clusters reading to own
function, like it is done for compressed reading.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190916175324.18478-4-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Mon, 16 Sep 2019 17:53:21 +0000 (20:53 +0300)]
block: introduce aio task pool
Common interface for aio task loops. To be used for improving
performance of synchronous io loops in qcow2, block-stream,
copy-on-read, and may be other places.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id:
20190916175324.18478-3-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy [Mon, 16 Sep 2019 17:53:20 +0000 (20:53 +0300)]
qemu-iotests: ignore leaks on failure paths in 026
Upcoming asynchronous handling of sub-parts of qcow2 requests will
change number of leaked clusters and even make it racy. As a
preparation, ignore leaks on failure parts in 026.
It's not trivial to just grep or substitute qemu-img output for such
thing. Instead do better: 3 is a error code of qemu-img check, if only
leaks are found. Catch this case and print success output.
Suggested-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id:
20190916175324.18478-2-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:14 +0000 (10:16 -0700)]
target/s390x: Remove ILEN_UNWIND
This setting is no longer used.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-19-richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:13 +0000 (10:16 -0700)]
target/s390x: Remove ilen argument from trigger_pgm_exception
All but one caller passes ILEN_UNWIND, which is not stored.
For the one use case in s390_cpu_tlb_fill, set int_pgm_ilen
directly, simply to avoid the assert within do_program_interrupt.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-18-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:12 +0000 (10:16 -0700)]
target/s390x: Remove ilen argument from trigger_access_exception
The single caller passes ILEN_UNWIND; pass that along to
trigger_pgm_exception directly.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-17-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:11 +0000 (10:16 -0700)]
target/s390x: Remove ILEN_AUTO
This setting is no longer used.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-16-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:10 +0000 (10:16 -0700)]
target/s390x: Rely on unwinding in s390_cpu_virt_mem_rw
For TCG, we will always call s390_cpu_virt_mem_handle_exc,
which will go through the unwinder to set ILEN. For KVM,
we do not go through do_program_interrupt, so this argument
is unused.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-15-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:09 +0000 (10:16 -0700)]
target/s390x: Rely on unwinding in s390_cpu_tlb_fill
We currently set ilen to AUTO, then overwrite that during
unwinding, then overwrite that for the code access case.
This can be simplified to setting ilen to our arbitrary
value for the (undefined) code access case, then rely on
unwinding to overwrite that with the correct value for
the data access case.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-14-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:08 +0000 (10:16 -0700)]
target/s390x: Simplify helper_lra
We currently call trigger_pgm_exception to set cs->exception_index
and env->int_pgm_code and then read the values back and then
reset cs->exception_index so that the exception is not delivered.
Instead, use the exception type that we already have directly
without ever triggering an exception that must be suppressed.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-13-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:07 +0000 (10:16 -0700)]
target/s390x: Remove fail variable from s390_cpu_tlb_fill
Now that excp always contains a real exception number, we can
use that instead of a separate fail variable. This allows a
redundant test to be removed.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-12-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:06 +0000 (10:16 -0700)]
target/s390x: Return exception from translate_pages
Do not raise the exception directly within translate_pages,
but pass it back so that caller may do so.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-11-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:05 +0000 (10:16 -0700)]
target/s390x: Return exception from mmu_translate
Do not raise the exception directly within mmu_translate,
but pass it back so that caller may do so.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-10-richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:04 +0000 (10:16 -0700)]
target/s390x: Remove exc argument to mmu_translate_asce
Now that mmu_translate_asce returns the exception instead of
raising it, the argument is unused.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-9-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:03 +0000 (10:16 -0700)]
target/s390x: Return exception from mmu_translate_real
Do not raise the exception directly within mmu_translate_real,
but pass it back so that caller may do so.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-8-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:02 +0000 (10:16 -0700)]
target/s390x: Handle tec in s390_cpu_tlb_fill
As a step toward moving all excption handling out of mmu_translate,
copy handling of the LowCore tec value from trigger_access_exception
into s390_cpu_tlb_fill. So far this new plumbing isn't used.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-7-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:01 +0000 (10:16 -0700)]
target/s390x: Push trigger_pgm_exception lower in s390_cpu_tlb_fill
Delay triggering an exception until the end, after we have
determined ultimate success or failure, and also taken into
account whether this is a non-faulting probe.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-6-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:16:00 +0000 (10:16 -0700)]
target/s390x: Use tcg_s390_program_interrupt in TCG helpers
Replace all uses of s390_program_interrupt within files
that are marked CONFIG_TCG. These are necessarily tcg-only.
This lets each of these users benefit from the QEMU_NORETURN
attribute on tcg_s390_program_interrupt.
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-5-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:15:59 +0000 (10:15 -0700)]
target/s390x: Remove ilen parameter from s390_program_interrupt
This is no longer used, and many of the existing uses -- particularly
within hw/s390x -- seem questionable.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-4-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:15:58 +0000 (10:15 -0700)]
target/s390x: Remove ilen parameter from tcg_s390_program_interrupt
Since we begin the operation with an unwind, we have the proper
value of ilen immediately available.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20191001171614.8405-3-richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Richard Henderson [Tue, 1 Oct 2019 17:15:57 +0000 (10:15 -0700)]
target/s390x: Add ilen to unwind data
Use ILEN_UNWIND to signal that we have in fact that cpu_restore_state
will have been called by the time we arrive in do_program_interrupt.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <
20191001171614.8405-2-richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
David Hildenbrand [Mon, 5 Aug 2019 08:19:22 +0000 (10:19 +0200)]
s390x/cpumodel: Add new TCG features to QEMU cpu model
We now implement a bunch of new facilities we can properly indicate.
ESOP-1/ESOP-2 handling is discussed in the PoP Chafter 3-15
("Suppression on Protection"). The "Basic suppression-on-protection (SOP)
facility" is a core part of z/Architecture without a facility
indication. ESOP-2 is indicated by ESOP-1 + Side-effect facility
("ESOP-2"). Besides ESOP-2, the side-effect facility is only relevant for
the guarded-storage facility (we don't implement).
S390_ESOP:
- We indicate DAT exeptions by setting bit 61 of the TEID (TEC) to 1 and
bit 60 to zero. We don't trigger ALCP exceptions yet. Also, we set
bit 0-51 and bit 62/63 to the right values.
S390_ACCESS_EXCEPTION_FS_INDICATION:
- The TEID (TEC) properly indicates in bit 52/53 on any access if it was
a fetch or a store
S390_SIDE_EFFECT_ACCESS_ESOP2:
- We have no side-effect accesses (esp., we don't implement the
guarded-storage faciliy), we correctly set bit 64 of the TEID (TEC) to
0 (no side-effect).
- ESOP2: We properly set bit 56, 60, 61 in the TEID (TEC) to indicate the
type of protection. We don't trigger KCP/ALCP exceptions yet.
S390_INSTRUCTION_EXEC_PROT:
- The MMU properly detects and indicates the exception on instruction fetches
- Protected TLB entries will never get PAGE_EXEC set.
There is no need to fake the abscence of any of the facilities - without
the facilities, some bits of the TEID (TEC) are simply unpredictable.
As IEP was added with z14 and we currently implement a z13, add it to
the MAX model instead.
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
David Hildenbrand [Mon, 5 Aug 2019 08:13:52 +0000 (10:13 +0200)]
s390x/cpumodel: Prepare for changes of QEMU model
Setup the 4.1 compatibility model so we can add new features to the
LATEST model.
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
David Hildenbrand [Sun, 14 Jan 2018 23:29:22 +0000 (00:29 +0100)]
s390x/mmu: Implement Instruction-Execution-Protection Facility
IEP support in the mmu is fairly easy. Set the right permissions for TLB
entries and properly report an exception.
Make sure to handle EDAT-2 by setting bit 56/60/61 of the TEID (TEC) to
the right values.
Let's keep s390_cpu_get_phys_page_debug() working even if IEP is
active. Switch MMU_DATA_LOAD - this has no other effects any more as the
ASC to be used is now fully selected outside of mmu_translate().
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
David Hildenbrand [Mon, 15 Jan 2018 00:17:34 +0000 (01:17 +0100)]
s390x/mmu: Implement ESOP-2 and access-exception-fetch/store-indication facility
We already implement ESOP-1. For ESOP-2, we only have to indicate all
protection exceptions properly. Due to EDAT-1, we already indicate DAT
exceptions properly. We don't trigger KCP/ALCP/IEP exceptions yet.
So all we have to do is set the TEID (TEC) to the right values
(bit 56, 60, 61) in case of LAP.
We don't have any side-effects (e.g., no guarded-storage facility),
therefore, bit 64 of the TEID (TEC) is always 0.
We always have to indicate whether it is a fetch or a store for all access
exceptions. This is only missing for LAP exceptions.
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
David Hildenbrand [Sun, 14 Jan 2018 23:04:07 +0000 (00:04 +0100)]
s390x/mmu: Add EDAT2 translation support
This only adds basic support to the DAT translation, but no EDAT2 support
for TCG. E.g., the gdbstub under kvm uses this function, too, to
translate virtual addresses.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
David Hildenbrand [Wed, 25 Sep 2019 12:24:42 +0000 (14:24 +0200)]
s390x/mmu: Convert to non-recursive page table walk
A non-recursive implementation allows to make better use of the
branch predictor, avoids function calls, and makes the implementation of
new features only for a subset of region table levels easier.
We can now directly compare our implementation to the KVM gaccess
implementation in arch/s390/kvm/gaccess.c:guest_translate().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>