qemu.git
7 years agoMerge remote-tracking branch 'dgibson/tags/ppc-for-2.10-20170525' into staging
Stefan Hajnoczi [Tue, 30 May 2017 08:44:54 +0000 (09:44 +0100)]
Merge remote-tracking branch 'dgibson/tags/ppc-for-2.10-20170525' into staging

ppc patch queue 2017-05-25

Assorted accumulated patches.  These are nearly all bugfixes at one
level or another - some for longstanding problems, others for some
regressions caused by more recent cleanups.

This includes preliminary patches towards fixing migration for Radix
Page Table guests under POWER9 and also fixing some migration
regressions due to the re-organization of the interrupt controller
code.  Not all the pieces are there yet, so those still won't quite
work, but the preliminary changes make sense on their own.

# gpg: Signature made Thu 25 May 2017 04:50:00 AM BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* dgibson/tags/ppc-for-2.10-20170525:
  xics: add unrealize handler
  hw/ppc/spapr.c: recover pending LMB unplug info in spapr_lmb_release
  hw/ppc: migrating the DRC state of hotplugged devices
  hw/ppc: removing drc->detach_cb and drc->detach_cb_opaque
  hw/ppc/spapr.c: adding pending_dimm_unplugs to sPAPRMachineState
  spapr: add pre_plug function for memory
  pseries: Restore support for total vcpus not a multiple of threads-per-core for old machine types
  pseries: Split CAS PVR negotiation out into a separate function
  spapr: fix error reporting in xics_system_init()
  spapr_cpu_core: drop reference on ICP object during CPU realization
  hw/ppc/spapr_events.c: removing 'exception' from sPAPREventLogEntry
  spapr: ensure core_slot isn't NULL in spapr_core_unplug()
  xics_kvm: cache already enabled vCPU ids
  spapr: Consolidate HPT freeing code into a routine
  spapr-cpu-core: release ICP object when realization fails
  spapr: sanitize error handling in spapr_ics_create()
  ppc/xics: simplify prototype of xics_spapr_init()
  target/ppc: reset reservation in do_rfi()

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7 years agoMerge remote-tracking branch 'armbru/tags/pull-qapi-2017-05-23' into staging
Stefan Hajnoczi [Tue, 30 May 2017 08:33:36 +0000 (09:33 +0100)]
Merge remote-tracking branch 'armbru/tags/pull-qapi-2017-05-23' into staging

QAPI patches for 2017-05-23

# gpg: Signature made Tue 23 May 2017 12:33:32 PM BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* armbru/tags/pull-qapi-2017-05-23:
  qapi-schema: Remove obsolete note from ObjectTypeInfo
  block: Use QDict helpers for --force-share
  shutdown: Expose bool cause in SHUTDOWN and RESET events
  shutdown: Add source information to SHUTDOWN and RESET
  shutdown: Preserve shutdown cause through replay
  shutdown: Prepare for use of an enum in reset/shutdown_request
  shutdown: Simplify shutdown_signal
  sockets: Plug memory leak in socket_address_flatten()
  scripts/qmp/qom-set: fix the value argument passed to srv.command()

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7 years agoMerge remote-tracking branch 'ehabkost/tags/numa-pull-request' into staging
Stefan Hajnoczi [Tue, 30 May 2017 08:31:05 +0000 (09:31 +0100)]
Merge remote-tracking branch 'ehabkost/tags/numa-pull-request' into staging

Silence "make check" warnings on NUMA test

# gpg: Signature made Tue 23 May 2017 11:44:24 AM BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* ehabkost/tags/numa-pull-request:
  numa: Silence incomplete mapping warning under qtest

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7 years agoxics: add unrealize handler
Greg Kurz [Wed, 24 May 2017 17:40:43 +0000 (19:40 +0200)]
xics: add unrealize handler

Now that ICPState objects get finalized on CPU unplug, we should unregister
reset handlers as well to avoid a QEMU crash at machine reset time.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agohw/ppc/spapr.c: recover pending LMB unplug info in spapr_lmb_release
Daniel Henrique Barboza [Mon, 22 May 2017 19:35:50 +0000 (16:35 -0300)]
hw/ppc/spapr.c: recover pending LMB unplug info in spapr_lmb_release

When a LMB hot unplug starts, the current DRC LMB status is stored at
spapr->pending_dimm_unplugs QTAILQ. This queue isn't migrated, thus
if a migration occurs in the middle of a LMB unplug the
spapr_lmb_release callback will lost track of the LMB unplug progress.

This patch implements a new recover function spapr_recover_pending_dimm_state
that is used inside spapr_lmb_release to recover this DRC LMB release
status that is lost during the migration.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
[dwg: Minor stylistic changes, simplify error handling]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agohw/ppc: migrating the DRC state of hotplugged devices
Daniel Henrique Barboza [Mon, 22 May 2017 19:35:49 +0000 (16:35 -0300)]
hw/ppc: migrating the DRC state of hotplugged devices

In pseries, a firmware abstraction called Dynamic Reconfiguration
Connector (DRC) is used to assign a particular dynamic resource
to the guest and provide an interface to manage configuration/removal
of the resource associated with it. In other words, DRC is the
'plugged state' of a device.

Before this patch, DRC wasn't being migrated. This causes
post-migration problems due to DRC state mismatch between source and
target. The DRC state of a device X in the source might
change, while in the target the DRC state of X is still fresh. When
migrating the guest, X will not have the same hotplugged state as it
did in the source. This means that we can't hot unplug X in the
target after migration is completed because its DRC state is not consistent.
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1677552 is one
bug that is caused by this DRC state mismatch between source and
target.

To migrate the DRC state, we defined the VMStateDescription struct for
spapr_drc to enable the transmission of spapr_drc state in migration.
Not all the elements in the DRC state are migrated - only those
that can be modified by guest actions or device add/remove
operations:

- 'isolation_state', 'allocation_state' and 'indicator_state'
are involved in the DR state transition diagram from
PAPR+ 2.7, 13.4;

- 'configured', 'signalled', 'awaiting_release' and 'awaiting_allocation'
are needed in attaching and detaching devices;

- 'indicator_state' provides users with hardware state information.

These are the DRC elements that are migrated.

In this patch the DRC state is migrated for PCI, LMB and CPU
connector types. At this moment there is no support to migrate
DRC for the PHB (PCI Host Bridge) type.

In the 'realize' function the DRC is registered using vmstate_register,
similar to what hw/ppc/spapr_iommu.c does in 'spapr_tce_table_realize'.
This approach works because  DRCs are bus-less and do not sit
on a BusClass that implements bc->get_dev_path, so as a fallback the
VMSD gets identified via "spapr_drc"/get_index(drc).

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agohw/ppc: removing drc->detach_cb and drc->detach_cb_opaque
Daniel Henrique Barboza [Mon, 22 May 2017 19:35:48 +0000 (16:35 -0300)]
hw/ppc: removing drc->detach_cb and drc->detach_cb_opaque

The pointer drc->detach_cb is being used as a way of informing
the detach() function inside spapr_drc.c which cb to execute. This
information can also be retrieved simply by checking drc->type and
choosing the right callback based on it. In this context, detach_cb
is redundant information that must be managed.

After the previous spapr_lmb_release change, no detach_cb_opaques
are being used by any of the three callbacks functions. This is
yet another information that is now unused and, on top of that, can't
be migrated either.

This patch makes the following changes:

- removal of detach_cb_opaque. the 'opaque' argument was removed from
the callbacks and from the detach() function of sPAPRConnectorClass. The
attribute detach_cb_opaque of sPAPRConnector was removed.

- removal of detach_cb from the detach() call. The function pointer
detach_cb of sPAPRConnector was removed. detach() now uses a
switch(drc->type) to execute the apropriate callback. To achieve this,
spapr_core_release, spapr_lmb_release and spapr_phb_remove_pci_device_cb
callbacks were made public to be visible inside detach().

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agohw/ppc/spapr.c: adding pending_dimm_unplugs to sPAPRMachineState
David Gibson [Wed, 24 May 2017 07:01:48 +0000 (17:01 +1000)]
hw/ppc/spapr.c: adding pending_dimm_unplugs to sPAPRMachineState

The LMB DRC release callback, spapr_lmb_release(), uses an opaque
parameter, a sPAPRDIMMState struct that stores the current LMBs that
are allocated to a DIMM (nr_lmbs). After each call to this callback,
the nr_lmbs is decremented by one and, when it reaches zero, the callback
proceeds with the qdev calls to hot unplug the LMB.

Using drc->detach_cb_opaque is problematic because it can't be migrated in
the future DRC migration work. This patch makes the following changes to
eliminate the usage of this opaque callback inside spapr_lmb_release:

- sPAPRDIMMState was moved from spapr.c and added to spapr.h. A new
attribute called 'addr' was added to it. This is used as an unique
identifier to associate a sPAPRDIMMState to a PCDIMM element.

- sPAPRMachineState now hosts a new QTAILQ called 'pending_dimm_unplugs'.
This queue of sPAPRDIMMState elements will store the DIMM state of DIMMs
that are currently going under an unplug process.

- spapr_lmb_release() will now retrieve the nr_lmbs value by getting the
correspondent sPAPRDIMMState. A helper function called spapr_dimm_get_address
was created to fetch the address of a PCDIMM device inside spapr_lmb_release.
When nr_lmbs reaches zero and the callback proceeds with the qdev hot unplug
calls, the sPAPRDIMMState struct is removed from spapr->pending_dimm_unplugs.

After these changes, the opaque argument for spapr_lmb_release is now
unused and is passed as NULL inside spapr_del_lmbs. This and the other
opaque arguments can now be safely removed from the code.

As an additional cleanup made by this patch, the spapr_del_lmbs function
was merged with spapr_memory_unplug_request. The former was being called
only by the latter and both were small enough to fit one single function.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
[dwg: Minor stylistic cleanups]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agoMerge remote-tracking branch 'cohuck/tags/s390x-20170523' into staging
Stefan Hajnoczi [Wed, 24 May 2017 12:53:05 +0000 (13:53 +0100)]
Merge remote-tracking branch 'cohuck/tags/s390x-20170523' into staging

s390x updates:
- support for vfio-ccw to passthrough channel devices
- allow ccw bios to boot from scsi generic devices
- bugfix for initial reset

# gpg: Signature made Tue 23 May 2017 12:02:24 PM BST
# gpg:                using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg:                 aka "Cornelia Huck <cohuck@kernel.org>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg:                 aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* cohuck/tags/s390x-20170523: (21 commits)
  s390/kvm: do not reset riccb on initial cpu reset
  MAINTAINERS: Add vfio-ccw maintainer
  vfio/ccw: update sense data if a unit check is pending
  s390x/css: ccw translation infrastructure
  s390x/css: introduce and realize ccw-request callback
  vfio/ccw: get irqs info and set the eventfd fd
  vfio/ccw: get io region info
  vfio/ccw: vfio based subchannel passthrough driver
  s390x/css: device support for s390-ccw passthrough
  s390x/css: realize css_create_sch
  s390x/css: realize css_sch_build_schib
  s390x/css: add s390-squash-mcss machine option
  linux-headers: update
  pc-bios/s390-ccw.img: rebuild image
  pc-bios/s390-ccw: Build a reasonable max_sectors limit
  pc-bios/s390-ccw: Get Block Limits VPD device data
  pc-bios/s390-ccw: Get list of supported VPD pages
  pc-bios/s390-ccw: Refactor scsi_inquiry function
  pc-bios/s390-ccw: Break up virtio-scsi read into multiples
  pc-bios/s390-ccw: Move SCSI block factor to outer read
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7 years agospapr: add pre_plug function for memory
Laurent Vivier [Tue, 23 May 2017 11:18:09 +0000 (13:18 +0200)]
spapr: add pre_plug function for memory

This allows to manage errors before the memory
has started to be hotplugged. We already have
the function for the CPU cores.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
[dwg: Fixed a couple of style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agopseries: Restore support for total vcpus not a multiple of threads-per-core for old...
David Gibson [Tue, 23 May 2017 06:33:06 +0000 (16:33 +1000)]
pseries: Restore support for total vcpus not a multiple of threads-per-core for old machine types

As of pseries-2.7 and later, we require the total number of guest vcpus to
be a multiple of the threads-per-core.  pseries-2.6 and earlier machine
types, however, are supposed to allow this for the sake of migration from
old qemu versions which allowed this.

Unfortunately, 8149e29 "pseries: Enforce homogeneous threads-per-core"
broke this by not considering the old machine type case.  This fixes it by
only applying the check when the machine type supports hotpluggable cpus.
By not-entirely-coincidence, that corresponds to the same time when we
started enforcing total threads being a multiple of threads-per-core.

Fixes: 8149e2992f7811355cc34721b79d69d1a3a667dd
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
7 years agopseries: Split CAS PVR negotiation out into a separate function
David Gibson [Thu, 18 May 2017 04:47:44 +0000 (14:47 +1000)]
pseries: Split CAS PVR negotiation out into a separate function

Guests of the qemu machine type go through a feature negotiation process
known as "client architecture support" (CAS) during early boot.  This does
a number of things, one of which is finding a CPU compatibility mode which
can be supported by both guest and host.

In fact the CPU negotiation is probably the single most complex part of the
CAS process, so this splits it out into a helper function.  We've recently
made some mistakes in maintaining backward compatibility for old machine
types here.  Splitting this out will also make it easier to fix this.

This also adds a possibly useful error message if the negotiation fails
(i.e. if there isn't a CPU mode that's suitable for both guest and host).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
7 years agospapr: fix error reporting in xics_system_init()
Greg Kurz [Fri, 19 May 2017 10:32:12 +0000 (12:32 +0200)]
spapr: fix error reporting in xics_system_init()

If the user explicitely asked for kernel-irqchip support and "xics-kvm"
initialization fails, we shouldn't fallback to emulated "xics" as we
do now. It is also awkward to print an error message when we have an
errp pointer argument.

Let's use the errp argument to report the error and let the caller decide.
This simplifies the code as we don't need a local Error * here.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agospapr_cpu_core: drop reference on ICP object during CPU realization
Greg Kurz [Fri, 19 May 2017 10:32:04 +0000 (12:32 +0200)]
spapr_cpu_core: drop reference on ICP object during CPU realization

When a piece of code allocates an object, it implicitely gets a reference
on it. If it then makes that object a child property of another object, it
should drop its own reference at some point otherwise the child object can
never be finalized. The current code hence leaks one ICP object per CPU
when hot-removing a core.

Failing to add a newly allocated ICP object to the CPU is a bug. While here,
let's ensure QEMU aborts if this ever happens.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agohw/ppc/spapr_events.c: removing 'exception' from sPAPREventLogEntry
Daniel Henrique Barboza [Fri, 19 May 2017 14:27:49 +0000 (11:27 -0300)]
hw/ppc/spapr_events.c: removing 'exception' from sPAPREventLogEntry

Currenty we do not have any RTAS event that is reported by the
event-scan interface. The existing events, RTAS_LOG_TYPE_EPOW and
RTAS_LOG_TYPE_HOTPLUG, are being reported by the check-exception
interface and, as such, marked as 'exception=true'.

Commit 79853e18d9, 'spapr_events: event-scan RTAS interface', added
the event_scan interface because the guest kernel requires it to
initialize other required interfaces. It is acting since then as
a stub because no events that would be reported by it were added
since then. However, the existence of the 'exception' boolean adds
an unnecessary load in the future migration of the pending_events,
sPAPREventLogEntry QTAILQ that hosts the pending RTAS events.

To make the code cleaner and ease the future migration changes, this
patch makes the following changes:

- remove the 'exception' boolean that filter these events. There is
nothing to filter since all events are reported by check-exception;

- functions rtas_event_log_queue, rtas_event_log_dequeue and
rtas_event_log_contains don't receive the 'exception' boolean
as parameter;

- event_scan function was simplified. It was calling
'rtas_event_log_dequeue(mask, false)' that was always returning
'NULL' because we have no events that are created with
exception=false, thus in the end it would execute a jump to
'out_no_events' all the time. The function now assumes that
this will always be the case and all the remaining logic were
deleted.

In the future, when or if we add new RTAS events that should
be reported with the event_scan interface, we can refer to
the changes made in this patch to add the event_scan logic
back.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agospapr: ensure core_slot isn't NULL in spapr_core_unplug()
Greg Kurz [Thu, 18 May 2017 13:58:31 +0000 (15:58 +0200)]
spapr: ensure core_slot isn't NULL in spapr_core_unplug()

If we go that far on the path of hot-removing a core and we find out that
the core-id is invalid, then we have a serious bug.

Let's make it explicit with an assert() instead of dereferencing a NULL
pointer.

This fixes Coverity issue CID 1375404.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agoxics_kvm: cache already enabled vCPU ids
Greg Kurz [Wed, 17 May 2017 14:38:20 +0000 (16:38 +0200)]
xics_kvm: cache already enabled vCPU ids

Since commit a45863bda90d ("xics_kvm: Don't enable KVM_CAP_IRQ_XICS if
already enabled"), we were able to re-hotplug a vCPU that had been hot-
unplugged ealier, thanks to a boolean flag in ICPState that we set when
enabling KVM_CAP_IRQ_XICS.

This could work because the lifecycle of all ICPState objects was the
same as the machine. Commit 5bc8d26de20c ("spapr: allocate the ICPState
object from under sPAPRCPUCore") broke this assumption and now we always
pass a freshly allocated ICPState object (ie, with the flag unset) to
icp_kvm_cpu_setup().

This cause re-hotplug to fail with:

Unable to connect CPU8 to kernel XICS: Device or resource busy

Let's fix this by caching all the vCPU ids for which KVM_CAP_IRQ_XICS was
enabled. This also drops the now useless boolean flag from ICPState.

Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agospapr: Consolidate HPT freeing code into a routine
Bharata B Rao [Wed, 17 May 2017 03:49:20 +0000 (09:19 +0530)]
spapr: Consolidate HPT freeing code into a routine

Consolidate the code that frees HPT into a separate routine
spapr_free_hpt() as the same chunk of code is called from two places.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agospapr-cpu-core: release ICP object when realization fails
Greg Kurz [Mon, 15 May 2017 11:39:55 +0000 (13:39 +0200)]
spapr-cpu-core: release ICP object when realization fails

While here we introduce a single error path to avoid code duplication.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agospapr: sanitize error handling in spapr_ics_create()
Greg Kurz [Mon, 15 May 2017 11:39:45 +0000 (13:39 +0200)]
spapr: sanitize error handling in spapr_ics_create()

The spapr_ics_create() function handles errors in a rather convoluted
way, with two local Error * variables. Moreover, failing to parent the
ICS object to the machine should be considered as a bug but it is
currently ignored.

This patch addresses both issues.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agoppc/xics: simplify prototype of xics_spapr_init()
Greg Kurz [Mon, 15 May 2017 11:39:16 +0000 (13:39 +0200)]
ppc/xics: simplify prototype of xics_spapr_init()

This function only does hypercall and RTAS-call registration, and thus
never returns an error. This patch adapt the prototype to reflect that.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agotarget/ppc: reset reservation in do_rfi()
Nikunj A Dadhania [Mon, 15 May 2017 08:35:09 +0000 (14:05 +0530)]
target/ppc: reset reservation in do_rfi()

For transitioning back to userspace after the interrupt.

Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
7 years agoMerge remote-tracking branch 'jasowang/tags/net-pull-request' into staging
Stefan Hajnoczi [Tue, 23 May 2017 13:53:41 +0000 (14:53 +0100)]
Merge remote-tracking branch 'jasowang/tags/net-pull-request' into staging

# gpg: Signature made Tue 23 May 2017 03:27:37 AM BST
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* jasowang/tags/net-pull-request:
  e1000e: Fix ICR "Other" causes clear logic
  net/filter-rewriter: Remove unused option in filter-rewriter
  net/filter-mirror.c: Rename filter_mirror_send() and fix codestyle
  net/filter-mirror.c: Remove duplicate check code.
  hmp / net: Mark host_net_add/remove as deprecated
  COLO-compare: Improve tcp compare trace event readability
  virtio-net: fix wild pointer when remove virtio-net queues
  net/dump: Issue a warning for the deprecated "-net dump"
  net/tap: Replace tap-haiku.c and tap-aix.c by a generic tap-stub.c

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7 years agoqapi-schema: Remove obsolete note from ObjectTypeInfo
Eduardo Habkost [Tue, 16 May 2017 20:53:51 +0000 (17:53 -0300)]
qapi-schema: Remove obsolete note from ObjectTypeInfo

The "This command is experimental" note in ObjectTypeInfo is obsolete
since 2012.  Commit 5192082097549c5b3aa7c913c6853d97a68172cb removed the
warning from the qom-list-types command documentation, but we forgot to
remove the warning from ObjectTypeInfo.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170516205351.12101-1-ehabkost@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
7 years agoblock: Use QDict helpers for --force-share
Eric Blake [Mon, 15 May 2017 19:54:39 +0000 (14:54 -0500)]
block: Use QDict helpers for --force-share

Fam's addition of --force-share in commits 459571f7 and 335e9937
were developed prior to the addition of QDict scalar insertion
macros, but merged after the general cleanup in commit 46f5ac20.
Patch created mechanically by rerunning:

 spatch --sp-file scripts/coccinelle/qobject.cocci \
        --macro-file scripts/cocci-macro-file.h --dir . --in-place

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170515195439.17677-1-eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
7 years agoshutdown: Expose bool cause in SHUTDOWN and RESET events
Eric Blake [Mon, 15 May 2017 21:41:14 +0000 (16:41 -0500)]
shutdown: Expose bool cause in SHUTDOWN and RESET events

Libvirt would like to be able to distinguish between a SHUTDOWN
event triggered solely by guest request and one triggered by a
SIGTERM or other action on the host.  While qemu_kill_report() was
already able to give different output to stderr based on whether a
shutdown was triggered by a host signal (but NOT by a host UI event,
such as clicking the X on the window), that information was then
lost to management.  The previous patches improved things to use an
enum throughout all callsites, so now we have something ready to
expose through QMP.

Note that for now, the decision was to expose ONLY a boolean,
rather than promoting ShutdownCause to a QAPI enum; this is because
libvirt has not expressed an interest in anything finer-grained.
We can still add additional details, in a backwards-compatible
manner, if a need later arises (if the addition happens before 2.10,
we can replace the bool with an enum; otherwise, the enum will have
to be in addition to the bool); this patch merely adds a helper
shutdown_caused_by_guest() to map the internal enum into the
external boolean.

Update expected iotest outputs to match the new data (complete
coverage of the affected tests is obtained by -raw, -qcow2, and -nbd).

Here is output from 'virsh qemu-monitor-event --loop' with the
patch installed:

event SHUTDOWN at 1492639680.731251 for domain fedora_13: {"guest":true}
event STOP at 1492639680.732116 for domain fedora_13: <null>
event SHUTDOWN at 1492639680.732830 for domain fedora_13: {"guest":false}

Note that libvirt runs qemu with -no-shutdown: the first SHUTDOWN event
was triggered by an action I took directly in the guest (shutdown -h),
at which point qemu stops the vcpus and waits for libvirt to do any
final cleanups; the second SHUTDOWN event is the result of libvirt
sending SIGTERM now that it has completed cleanup.  Libvirt is already
smart enough to only feed the first qemu SHUTDOWN event to the end user
(remember, virsh qemu-monitor-event is a low-level debugging interface
that is explicitly unsupported by libvirt, so it sees things that normal
end users do not); changing qemu to emit SHUTDOWN only once is outside
the scope of this series.

See also https://bugzilla.redhat.com/1384007

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170515214114.15442-6-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
7 years agoshutdown: Add source information to SHUTDOWN and RESET
Eric Blake [Mon, 15 May 2017 21:41:13 +0000 (16:41 -0500)]
shutdown: Add source information to SHUTDOWN and RESET

Time to wire up all the call sites that request a shutdown or
reset to use the enum added in the previous patch.

It would have been less churn to keep the common case with no
arguments as meaning guest-triggered, and only modified the
host-triggered code paths, via a wrapper function, but then we'd
still have to audit that I didn't miss any host-triggered spots;
changing the signature forces us to double-check that I correctly
categorized all callers.

Since command line options can change whether a guest reset request
causes an actual reset vs. a shutdown, it's easy to also add the
information to reset requests.

Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> [ppc parts]
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> [SPARC part]
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> [s390x parts]
Message-Id: <20170515214114.15442-5-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
7 years agoshutdown: Preserve shutdown cause through replay
Eric Blake [Mon, 15 May 2017 21:41:12 +0000 (16:41 -0500)]
shutdown: Preserve shutdown cause through replay

With the recent addition of ShutdownCause, we want to be able to pass
a cause through any shutdown request, and then faithfully replay that
cause when later replaying the same sequence.  The easiest way is to
expand the reply event mechanism to track a series of values for
EVENT_SHUTDOWN, one corresponding to each value of ShutdownCause.

We are free to change the replay stream as needed, since there are
already no guarantees about being able to use a replay stream by
any other version of qemu than the one that generated it.

The cause is not actually fed back until the next patch changes the
signature for requesting a shutdown; a TODO marks that upcoming change.

Yes, this uses the gcc/clang extension of a ranged case label,
but this is not the first time we've used non-C99 constructs.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20170515214114.15442-4-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
7 years agoshutdown: Prepare for use of an enum in reset/shutdown_request
Eric Blake [Mon, 15 May 2017 21:41:11 +0000 (16:41 -0500)]
shutdown: Prepare for use of an enum in reset/shutdown_request

We want to track why a guest was shutdown; in particular, being able
to tell the difference between a guest request (such as ACPI request)
and host request (such as SIGINT) will prove useful to libvirt.
Since all requests eventually end up changing shutdown_requested in
vl.c, the logical change is to make that value track the reason,
rather than its current 0/1 contents.

Since command-line options control whether a reset request is turned
into a shutdown request instead, the same treatment is given to
reset_requested.

This patch adds an internal enum ShutdownCause that describes reasons
that a shutdown can be requested, and changes qemu_system_reset() to
pass the reason through, although for now nothing is actually changed
with regards to what gets reported.  The enum could be exported via
QAPI at a later date, if deemed necessary, but for now, there has not
been a request to expose that much detail to end clients.

For the most part, we turn 0 into SHUTDOWN_CAUSE_NONE, and 1 into
SHUTDOWN_CAUSE_HOST_ERROR; the only specific case where we have enough
information right now to use a different value is when we are reacting
to a host signal.  It will take a further patch to edit all call-sites
that can trigger a reset or shutdown request to properly pass in any
other reasons; this patch includes TODOs to point such places out.

qemu_system_reset() trades its 'bool report' parameter for a
'ShutdownCause reason', with all non-zero values having the same
effect; this lets us get rid of the weird #defines for VMRESET_*
as synonyms for bools.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170515214114.15442-3-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
7 years agoshutdown: Simplify shutdown_signal
Eric Blake [Mon, 15 May 2017 21:41:10 +0000 (16:41 -0500)]
shutdown: Simplify shutdown_signal

There is no signal 0 (kill(pid, 0) has special semantics to probe whether
a process is alive), rather than actually sending a signal 0).  So we
can use the simpler 0, instead of -1, for our sentinel of whether a
shutdown request due to a signal has happened.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-Id: <20170515214114.15442-2-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
7 years agosockets: Plug memory leak in socket_address_flatten()
Markus Armbruster [Mon, 15 May 2017 16:39:04 +0000 (18:39 +0200)]
sockets: Plug memory leak in socket_address_flatten()

socket_address_flatten() leaks a SocketAddress when its argument is
null.  Happens when opening a ChardevBackend of type 'udp' that is
configured without a local address.  Screwed up in commit bd269ebc due
to last minute semantic conflict resolution.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1494866344-11013-1-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
7 years agoscripts/qmp/qom-set: fix the value argument passed to srv.command()
Greg Kurz [Tue, 2 May 2017 14:41:43 +0000 (16:41 +0200)]
scripts/qmp/qom-set: fix the value argument passed to srv.command()

When invoking the script with -s, we end up passing a bogus value
to QEMU:

$ ./scripts/qmp/qom-set -s /var/tmp/qmp-sock-exp /machine.accel kvm
{}
$ ./scripts/qmp/qom-get -s /var/tmp/qmp-sock-exp /machine.accel
/var/tmp/qmp-sock-exp

This happens because sys.argv[2] isn't necessarily the command line
argument that holds the value. It is sys.argv[4] when -s was also
passed.

Actually, the code already has a variable to handle that. This patch
simply uses it.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <149373610338.5144.9635049015143453288.stgit@bahia.lan>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
7 years agoe1000e: Fix ICR "Other" causes clear logic
Sameeh Jubran [Mon, 22 May 2017 11:26:22 +0000 (14:26 +0300)]
e1000e: Fix ICR "Other" causes clear logic

This commit fixes a bug which causes the guest to hang. The bug was
observed upon a "receive overrun" (bit #6 of the ICR register)
interrupt which could be triggered post migration in a heavy traffic
environment. Even though the "receive overrun" bit (#6) is masked out
by the IMS register (refer to the log below) the driver still receives
an interrupt as the "receive overrun" bit (#6) causes the "Other" -
bit #24 of the ICR register - bit to be set as documented below. The
driver handles the interrupt and clears the "Other" bit (#24) but
doesn't clear the "receive overrun" bit (#6) which leads to an
infinite loop. Apparently the Windows driver expects that the "receive
overrun" bit and other ones - documented below - to be cleared when
the "Other" bit (#24) is cleared.

So to sum that up:
1. Bit #6 of the ICR register is set by heavy traffic
2. As a results of setting bit #6, bit #24 is set
3. The driver receives an interrupt for bit 24 (it doesn't receieve an
   interrupt for bit #6 as it is masked out by IMS)
4. The driver handles and clears the interrupt of bit #24
5. Bit #6 is still set.
6. 2 happens all over again

The Interrupt Cause Read - ICR register:

The ICR has the "Other" bit - bit #24 - that is set when one or more
of the following ICR register's bits are set:

LSC - bit #2, RXO - bit #6, MDAC - bit #9, SRPD - bit #16, ACK - bit
#17, MNG - bit #18

This bug can occur with any of these bits depending on the driver's
behaviour and the way it configures the device. However, trying to
reproduce it with any bit other than RX0 is challenging and came to
failure as the drivers don't implement most of these bits, trying to
reproduce it with LSC (Link Status Change - bit #2) bit didn't succeed
too as it seems that Windows handles this bit differently.

Log sample of the storm:

27563@1494850819.411877:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
27563@1494850819.411900:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.411915:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412380:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412395:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412436:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412441:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412998:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)

* This bug behaviour wasn't observed with the Linux driver.

This commit solves:
https://bugzilla.redhat.com/show_bug.cgi?id=1447935
https://bugzilla.redhat.com/show_bug.cgi?id=1449490

Cc: qemu-stable@nongnu.org
Signed-off-by: Sameeh Jubran <sjubran@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
7 years agonet/filter-rewriter: Remove unused option in filter-rewriter
Zhang Chen [Wed, 17 May 2017 02:09:40 +0000 (10:09 +0800)]
net/filter-rewriter: Remove unused option in filter-rewriter

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
7 years agonet/filter-mirror.c: Rename filter_mirror_send() and fix codestyle
Zhang Chen [Wed, 17 May 2017 02:09:39 +0000 (10:09 +0800)]
net/filter-mirror.c: Rename filter_mirror_send() and fix codestyle

Because filter_mirror_receive_iov() and filter_redirector_receive_iov()
both use the filter_mirror_send() to send packet, so I change
filter_mirror_send() to filter_send() that looks more common.
And fix some codestyle.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
7 years agonet/filter-mirror.c: Remove duplicate check code.
Zhang Chen [Wed, 17 May 2017 02:09:38 +0000 (10:09 +0800)]
net/filter-mirror.c: Remove duplicate check code.

The s->outdev have checked in filter_mirror_set_outdev().

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
7 years agohmp / net: Mark host_net_add/remove as deprecated
Thomas Huth [Mon, 15 May 2017 13:32:56 +0000 (15:32 +0200)]
hmp / net: Mark host_net_add/remove as deprecated

The netdev_add and netdev_del commands should be used nowadays instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
7 years agoCOLO-compare: Improve tcp compare trace event readability
Zhang Chen [Thu, 27 Apr 2017 03:46:45 +0000 (11:46 +0800)]
COLO-compare: Improve tcp compare trace event readability

Because of previous patch's trace arguments over the limit
of UST backend, so I rewrite the patch.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
7 years agovirtio-net: fix wild pointer when remove virtio-net queues
Yunjian Wang [Wed, 26 Apr 2017 06:45:56 +0000 (14:45 +0800)]
virtio-net: fix wild pointer when remove virtio-net queues

The tx_bh or tx_timer will free in virtio_net_del_queue() function, when
removing virtio-net queues if the guest doesn't support multiqueue. But
it might be still referenced by virtio_net_set_status(), which needs to
be set NULL. And also the tx_waiting needs to be set zero to prevent
virtio_net_set_status() accessing tx_bh or tx_timer.

Cc: qemu-stable@nongnu.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
7 years agonet/dump: Issue a warning for the deprecated "-net dump"
Thomas Huth [Tue, 25 Apr 2017 07:50:44 +0000 (09:50 +0200)]
net/dump: Issue a warning for the deprecated "-net dump"

Network dumping should be done with "-object filter-dump" nowadays.
Using "-net dump" via the VLAN mechanism is considered as deprecated
and might be removed in a future release. So warn the users now
to inform them to user the filter-dump method instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
7 years agonet/tap: Replace tap-haiku.c and tap-aix.c by a generic tap-stub.c
Thomas Huth [Mon, 3 Apr 2017 12:05:16 +0000 (14:05 +0200)]
net/tap: Replace tap-haiku.c and tap-aix.c by a generic tap-stub.c

The files tap-haiku.c and tap-aix.c are identical (except one line
of error message). We should avoid such code duplication, so replace
these by a generic tap-stub.c file instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
7 years agonuma: Silence incomplete mapping warning under qtest
Igor Mammedov [Thu, 18 May 2017 08:09:31 +0000 (10:09 +0200)]
numa: Silence incomplete mapping warning under qtest

Silence "make check" warnings triggered by the numa/mon/cpus/partial
test case.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1495094971-177754-4-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
7 years agoMerge remote-tracking branch 'mcayland/tags/qemu-openbios-signed' into staging
Stefan Hajnoczi [Mon, 22 May 2017 09:12:50 +0000 (10:12 +0100)]
Merge remote-tracking branch 'mcayland/tags/qemu-openbios-signed' into staging

Update OpenBIOS images

# gpg: Signature made Fri 19 May 2017 05:05:54 PM BST
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* mcayland/tags/qemu-openbios-signed:
  Update OpenBIOS images to 3ebaaa2 built from submodule.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7 years agoMerge remote-tracking branch 'kraxel/tags/pull-audio-20170519-1' into staging
Stefan Hajnoczi [Fri, 19 May 2017 15:54:10 +0000 (16:54 +0100)]
Merge remote-tracking branch 'kraxel/tags/pull-audio-20170519-1' into staging

audio: move & rename soundhw init code.

# gpg: Signature made Fri 19 May 2017 12:22:51 PM BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* kraxel/tags/pull-audio-20170519-1:
  audio: Rename hw/audio/audio.h to hw/audio/soundhw.h
  audio: Rename audio_init() to soundhw_init()
  audio: Move arch_init audio code to hw/audio/soundhw.c

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7 years agoUpdate OpenBIOS images to 3ebaaa2 built from submodule.
Mark Cave-Ayland [Fri, 19 May 2017 15:51:47 +0000 (16:51 +0100)]
Update OpenBIOS images to 3ebaaa2 built from submodule.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
7 years agoMerge remote-tracking branch 'kraxel/tags/pull-ui-20170519-1' into staging
Stefan Hajnoczi [Fri, 19 May 2017 15:44:18 +0000 (16:44 +0100)]
Merge remote-tracking branch 'kraxel/tags/pull-ui-20170519-1' into staging

ui: egl-headless requires dmabuf support

# gpg: Signature made Fri 19 May 2017 09:46:40 AM BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* kraxel/tags/pull-ui-20170519-1:
  ui: egl-headless requires dmabuf support

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7 years agoMerge remote-tracking branch 'quintela/tags/migration/20170518' into staging
Stefan Hajnoczi [Fri, 19 May 2017 15:36:51 +0000 (16:36 +0100)]
Merge remote-tracking branch 'quintela/tags/migration/20170518' into staging

migration/next for 20170518

# gpg: Signature made Thu 18 May 2017 06:23:26 PM BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* quintela/tags/migration/20170518:
  migration: Make savevm.c target independent
  exec: Create include for target_page_size()
  migration: migration.h was not needed
  migration: Remove vmstate.h from migration.h
  migration: Remove qemu-file.h from vmstate.h
  migration: Split vmstate-types.c from vmstate.c
  migration: Move qjson.h to migration/
  migration: Remove migration.h from colo.h
  migration: Export qemu-file-channel.c functions in its own file
  migration: Split migration/channel.c for channel operations
  migration: Create migration/xbzrle.h
  block migration: Allow compile time disable
  migration: Remove old MigrationParams
  migration: Remove use of old MigrationParams
  migration: Create block capability
  hmp: Use visitor api for hmp_migrate_set_parameter()
  postcopy: Require RAMBlocks that are whole pages
  migration: Fix non-multiple of page size migration

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7 years agos390/kvm: do not reset riccb on initial cpu reset
Christian Borntraeger [Fri, 12 May 2017 11:47:30 +0000 (13:47 +0200)]
s390/kvm: do not reset riccb on initial cpu reset

The riccb is kept unchanged during initial cpu reset. Move the data
structure to the other registers that are unchanged.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agoMAINTAINERS: Add vfio-ccw maintainer
Dong Jia Shi [Wed, 17 May 2017 00:48:13 +0000 (02:48 +0200)]
MAINTAINERS: Add vfio-ccw maintainer

Add Cornelia Huck as the vfio-ccw maintainer.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-14-bjsdjshi@linux.vnet.ibm.com>
[CH: add tree]
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agovfio/ccw: update sense data if a unit check is pending
Dong Jia Shi [Wed, 17 May 2017 00:48:12 +0000 (02:48 +0200)]
vfio/ccw: update sense data if a unit check is pending

Concurrent-sense data is currently not delivered. This patch stores
the concurrent-sense data to the subchannel if a unit check is pending
and the concurrent-sense bit is enabled. Then a TSCH can retreive the
right IRB data back to the guest.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-13-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agos390x/css: ccw translation infrastructure
Xiao Feng Ren [Wed, 17 May 2017 00:48:11 +0000 (02:48 +0200)]
s390x/css: ccw translation infrastructure

Implement a basic infrastructure of handling channel I/O instruction
interception for passed through subchannels:
1. Branch the code path of instruction interception handling by
   SubChannel type.
2. For a passed-through subchannel, issue the ORB to kernel to do ccw
   translation and perform an I/O operation.
3. Assign different condition code based on the I/O result, or
   trigger a program check.

Signed-off-by: Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-12-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agos390x/css: introduce and realize ccw-request callback
Xiao Feng Ren [Wed, 17 May 2017 00:48:10 +0000 (02:48 +0200)]
s390x/css: introduce and realize ccw-request callback

Introduce a new callback on subchannel to handle ccw-request.
Realize the callback in vfio-ccw device. Besides, resort to
the event notifier handler to handling the ccw-request results.
1. Pread the I/O results via MMIO region.
2. Update the scsw info to guest.
3. Inject an I/O interrupt to notify guest the I/O result.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-11-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agovfio/ccw: get irqs info and set the eventfd fd
Dong Jia Shi [Wed, 17 May 2017 00:48:09 +0000 (02:48 +0200)]
vfio/ccw: get irqs info and set the eventfd fd

vfio-ccw resorts to the eventfd mechanism to communicate with userspace.
We fetch the irqs info via the ioctl VFIO_DEVICE_GET_IRQ_INFO,
register a event notifier to get the eventfd fd which is sent
to kernel via the ioctl VFIO_DEVICE_SET_IRQS, then we can implement
read operation once kernel sends the signal.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-10-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agovfio/ccw: get io region info
Dong Jia Shi [Wed, 17 May 2017 00:48:08 +0000 (02:48 +0200)]
vfio/ccw: get io region info

vfio-ccw provides an MMIO region for I/O operations. We fetch its
information via ioctls here, then we can use it performing I/O
instructions and retrieving I/O results later on.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-9-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agovfio/ccw: vfio based subchannel passthrough driver
Xiao Feng Ren [Wed, 17 May 2017 00:48:07 +0000 (02:48 +0200)]
vfio/ccw: vfio based subchannel passthrough driver

We use the IOMMU_TYPE1 of VFIO to realize the subchannels
passthrough, implement a vfio based subchannels passthrough
driver called "vfio-ccw".

Support qemu parameters in the style of:
"-device vfio-ccw,sysfsdev=$mdev_file_path,devno=xx.x.xxxx'

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-8-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agos390x/css: device support for s390-ccw passthrough
Dong Jia Shi [Wed, 17 May 2017 00:48:06 +0000 (02:48 +0200)]
s390x/css: device support for s390-ccw passthrough

In order to support subchannels pass-through, we introduce a s390
subchannel device called "s390-ccw" to hold the real subchannel info.
The s390-ccw devices inherit from the abstract CcwDevice which connect
to the existing virtual-css-bus.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-7-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agos390x/css: realize css_create_sch
Dong Jia Shi [Wed, 17 May 2017 00:48:05 +0000 (02:48 +0200)]
s390x/css: realize css_create_sch

The S390 virtual css support already has a mechanism to create a
virtual subchannel and provide it to the guest. However, to
pass-through subchannels to a guest, we need to introduce a new
mechanism to create the subchannel according to the real device
information. Thus we reconstruct css_create_virtual_sch to a new
css_create_sch function to handle all these cases and do allocation
and initialization of the subchannel according to the device type
and machine configuration.

Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-6-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agos390x/css: realize css_sch_build_schib
Xiao Feng Ren [Wed, 17 May 2017 00:48:04 +0000 (02:48 +0200)]
s390x/css: realize css_sch_build_schib

The S390 virtual css support already has a mechanism to build a
virtual subchannel information block (schib) and provide virtual
subchannels to the guest. However, to pass-through subchannels to
a guest, we need to introduce a new mechanism to build its schib
according to the real device information. Thus we realize a new css
sch_build_schib function to extract the path_masks, chpids, chpid
type from sysfs. To reuse the existing code, we refactor
css_add_virtual_chpid to css_add_chpid.

Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-5-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agos390x/css: add s390-squash-mcss machine option
Xiao Feng Ren [Wed, 17 May 2017 00:48:03 +0000 (02:48 +0200)]
s390x/css: add s390-squash-mcss machine option

We want to support real (i.e. not virtual) channel devices
even for guests that do not support MCSS-E (where guests may
see devices from any channel subsystem image at once). As all
virtio-ccw devices are in css 0xfe (and show up in the default
css 0 for guests not activating MCSS-E), we need an option to
squash both the virtio subchannels and e.g. passed-through
subchannels from their real css (0-3, or 0 for hosts not
activating MCSS-E) into the default css. This will be
exploited in a later patch.

Signed-off-by: Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170517004813.58227-4-bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agolinux-headers: update
Cornelia Huck [Thu, 18 May 2017 11:47:27 +0000 (13:47 +0200)]
linux-headers: update

Update against Linux v4.12-rc1.

Also include the new vfio_ccw.h header.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agopc-bios/s390-ccw.img: rebuild image
Eric Farman [Wed, 10 May 2017 15:53:59 +0000 (17:53 +0200)]
pc-bios/s390-ccw.img: rebuild image

Contains the following commits:
- pc-bios/s390-ccw: Remove duplicate blk_factor adjustment
- pc-bios/s390-ccw: Move SCSI block factor to outer read
- pc-bios/s390-ccw: Break up virtio-scsi read into multiples
- pc-bios/s390-ccw: Refactor scsi_inquiry function
- pc-bios/s390-ccw: Get list of supported EVPD pages
- pc-bios/s390-ccw: Get Block Limits VPD device data
- pc-bios/s390-ccw: Build a reasonable max_sectors limit

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-9-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agopc-bios/s390-ccw: Build a reasonable max_sectors limit
Eric Farman [Wed, 10 May 2017 15:53:58 +0000 (17:53 +0200)]
pc-bios/s390-ccw: Build a reasonable max_sectors limit

Now that we've read all the possible limits that have been defined for
a virtio-scsi controller and the disk we're booting from, it's possible
that we are STILL going to exceed the limits of the host device.
For example, a "-device scsi-generic" device does not support the
Block Limits VPD page.

So, let's fallback to something that seems to work for most boot
configurations if larger values were specified (including if nothing
was explicitly specified, and we took default values).

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-8-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agopc-bios/s390-ccw: Get Block Limits VPD device data
Eric Farman [Wed, 10 May 2017 15:53:57 +0000 (17:53 +0200)]
pc-bios/s390-ccw: Get Block Limits VPD device data

The "Block Limits" Inquiry VPD page is optional for any SCSI device,
but if it's supported it provides a hint of the maximum I/O transfer
length for this particular device. If this page is supported by the
disk, let's issue that Inquiry and use the minimum of it and the
SCSI controller limit. That will cover this scenario:

  qemu-system-s390x ...
    -device virtio-scsi-ccw,id=scsi0,max_sectors=32768 ...
    -drive file=/dev/sda,if=none,id=drive0,format=raw ...
    -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,
            drive=drive0,id=disk0,max_io_size=1048576

controller: 32768 sectors x 512 bytes/sector = 16777216 bytes
      disk:                                     1048576 bytes

Now that we have a limit for a virtio-scsi disk, compare that with the
limit for the virtio-scsi controller when we actually build the I/O.
The minimum of these two limits should be the one we use.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-7-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agopc-bios/s390-ccw: Get list of supported VPD pages
Eric Farman [Wed, 10 May 2017 15:53:56 +0000 (17:53 +0200)]
pc-bios/s390-ccw: Get list of supported VPD pages

The "Supported Pages" Inquiry EVPD page is mandatory for all SCSI devices,
and is used as a gateway for what VPD pages the device actually supports.
Let's issue this Inquiry, and dump that list with the debug facility.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-6-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agopc-bios/s390-ccw: Refactor scsi_inquiry function
Eric Farman [Wed, 10 May 2017 15:53:55 +0000 (17:53 +0200)]
pc-bios/s390-ccw: Refactor scsi_inquiry function

If we want to issue any of the SCSI Inquiry EVPD pages,
which we do, we could use this function to issue both types
of commands with a little bit of refactoring.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-5-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agopc-bios/s390-ccw: Break up virtio-scsi read into multiples
Eric Farman [Wed, 10 May 2017 15:53:54 +0000 (17:53 +0200)]
pc-bios/s390-ccw: Break up virtio-scsi read into multiples

A virtio-scsi request that goes through the host sd driver and exceeds
the maximum transfer size is automatically broken up for us.  But the
equivalent request going to the sg driver presumes that any length
requirements have already been honored.

Let's use the max_sectors field on the virtio-scsi controller device,
and break up all requests (both sd and sg) to avoid this problem.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-4-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agopc-bios/s390-ccw: Move SCSI block factor to outer read
Eric Farman [Wed, 10 May 2017 15:53:53 +0000 (17:53 +0200)]
pc-bios/s390-ccw: Move SCSI block factor to outer read

Simple refactoring so that the blk_factor adjustment is
moved into virtio_scsi_read_many routine, in preparation
for another change.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170510155359.32727-3-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agopc-bios/s390-ccw: Remove duplicate blk_factor adjustment
Eric Farman [Wed, 10 May 2017 15:53:52 +0000 (17:53 +0200)]
pc-bios/s390-ccw: Remove duplicate blk_factor adjustment

When using virtio-scsi, we multiply the READ(10) data_size by
a block factor twice when building the I/O.  This is fine,
since it's only 1 for SCSI disks, but let's clean it up.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20170510155359.32727-2-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
7 years agoaudio: Rename hw/audio/audio.h to hw/audio/soundhw.h
Eduardo Habkost [Mon, 8 May 2017 20:57:35 +0000 (17:57 -0300)]
audio: Rename hw/audio/audio.h to hw/audio/soundhw.h

All the functions in hw/audio/audio.h are called "soundhw_*()"
and live in hw/audio/audiohw.c. Rename the header file for
consistency.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 20170508205735.23444-4-ehabkost@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
7 years agoaudio: Rename audio_init() to soundhw_init()
Eduardo Habkost [Mon, 8 May 2017 20:57:34 +0000 (17:57 -0300)]
audio: Rename audio_init() to soundhw_init()

To make it consistent with the remaining soundhw.c functions and
avoid confusion with the audio_init() function in audio/audio.c,
rename audio_init() to soundhw_init().

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-id: 20170508205735.23444-3-ehabkost@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
7 years agoaudio: Move arch_init audio code to hw/audio/soundhw.c
Eduardo Habkost [Mon, 8 May 2017 20:57:33 +0000 (17:57 -0300)]
audio: Move arch_init audio code to hw/audio/soundhw.c

There's no reason to keep the soundhw table in arch_init.c. Move
that code to a new hw/audio/soundhw.c file.

While moving the code, trivial coding style issues were fixed.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170508205735.23444-2-ehabkost@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
7 years agoui: egl-headless requires dmabuf support
Gerd Hoffmann [Wed, 17 May 2017 12:27:44 +0000 (14:27 +0200)]
ui: egl-headless requires dmabuf support

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170517122744.3541-1-kraxel@redhat.com

7 years agomigration: Make savevm.c target independent
Juan Quintela [Mon, 24 Apr 2017 19:03:48 +0000 (21:03 +0200)]
migration: Make savevm.c target independent

It only needed TARGET_PAGE_SIZE/BITS/BITS_MIN values, so just export
them from exec.h

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
7 years agoexec: Create include for target_page_size()
Juan Quintela [Mon, 24 Apr 2017 18:50:19 +0000 (20:50 +0200)]
exec: Create include for target_page_size()

That is the only function that we need from exec.c, and having to
include the whole sysemu.h for this.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---

/me leans to be less sloppy with copyright notices
thanks Dave

7 years agomigration: migration.h was not needed
Juan Quintela [Thu, 6 Apr 2017 10:12:21 +0000 (12:12 +0200)]
migration: migration.h was not needed

This files don't use any function from migration.h, so drop it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
7 years agomigration: Remove vmstate.h from migration.h
Juan Quintela [Mon, 17 Apr 2017 17:02:59 +0000 (19:02 +0200)]
migration: Remove vmstate.h from migration.h

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---

Minor rearrangements due to rebase

7 years agomigration: Remove qemu-file.h from vmstate.h
Juan Quintela [Mon, 17 Apr 2017 16:59:13 +0000 (18:59 +0200)]
migration: Remove qemu-file.h from vmstate.h

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
--

minor rearangements due to the rebase

7 years agomigration: Split vmstate-types.c from vmstate.c
Juan Quintela [Thu, 20 Apr 2017 11:41:20 +0000 (13:41 +0200)]
migration: Split vmstate-types.c from vmstate.c

Now one just has the interperter, and the other has the basic types.
Once there, add copyright boilerplate.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--

Use GPL v2 or later.  Detected by David.

7 years agomigration: Move qjson.h to migration/
Juan Quintela [Thu, 20 Apr 2017 11:10:28 +0000 (13:10 +0200)]
migration: Move qjson.h to migration/

It is only used for migration code.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
7 years agomigration: Remove migration.h from colo.h
Juan Quintela [Thu, 6 Apr 2017 10:03:13 +0000 (12:03 +0200)]
migration: Remove migration.h from colo.h

migration.h is not included in any includes now.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
7 years agomigration: Export qemu-file-channel.c functions in its own file
Juan Quintela [Mon, 17 Apr 2017 17:34:36 +0000 (19:34 +0200)]
migration: Export qemu-file-channel.c functions in its own file

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
7 years agomigration: Split migration/channel.c for channel operations
Juan Quintela [Mon, 17 Apr 2017 15:07:04 +0000 (17:07 +0200)]
migration: Split migration/channel.c for channel operations

Create an include for its exported functions.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
Add proper header

7 years agomigration: Create migration/xbzrle.h
Juan Quintela [Wed, 5 Apr 2017 19:47:50 +0000 (21:47 +0200)]
migration: Create migration/xbzrle.h

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
7 years agoblock migration: Allow compile time disable
Dr. David Alan Gilbert [Mon, 15 May 2017 14:05:29 +0000 (15:05 +0100)]
block migration: Allow compile time disable

Many users now prefer to use drive_mirror over NBD as an
alternative to the older migrate -b option; drive_mirror is
more complex to setup but gives you more options (e.g. only
migrating some of the disks if some of them are shared).

Allow the large chunk of block migration code to be compiled
out for those who don't use it.

Based on a downstream-patch we've had for a while by Jeff Cody.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
--

- When compiled out, allow seting block only with false value (eric)

7 years agomigration: Remove old MigrationParams
Juan Quintela [Wed, 5 Apr 2017 19:00:09 +0000 (21:00 +0200)]
migration: Remove old MigrationParams

Not used anymore after moving block migration to use capabilities.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
7 years agomigration: Remove use of old MigrationParams
Juan Quintela [Wed, 5 Apr 2017 18:45:22 +0000 (20:45 +0200)]
migration: Remove use of old MigrationParams

We have change in the previous patch to use migration capabilities for
it.  Notice that we continue using the old command line flags from
migrate command from the time being.  Remove the set_params method as
now it is empty.

For savevm, one can't do a:

savevm -b/-i foo

but now one can do:

migrate_set_capability block on
savevm foo

And we can't use block migration. We could disable block capability
unconditionally, but it would not be much better.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
- Maintain shared/enabled dependency (Xu suggestion)
- Now we maintain the dependency on the setter functions
- improve error messages

7 years agomigration: Create block capability
Juan Quintela [Wed, 5 Apr 2017 16:32:37 +0000 (18:32 +0200)]
migration: Create block capability

Create one capability for block migration and one parameter for
incremental block migration.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---

- address all Markus comments
- use Markus and Eric text descriptions
- change logic another time
- improve text messages

7 years agohmp: Use visitor api for hmp_migrate_set_parameter()
Juan Quintela [Tue, 16 May 2017 09:37:45 +0000 (11:37 +0200)]
hmp: Use visitor api for hmp_migrate_set_parameter()

We only use it for int64 at this point, I am not able to find a way to
parse an int with MiB units.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
7 years agopostcopy: Require RAMBlocks that are whole pages
Dr. David Alan Gilbert [Wed, 17 May 2017 16:58:10 +0000 (17:58 +0100)]
postcopy: Require RAMBlocks that are whole pages

It turns out that it's legal to create a VM with RAMBlocks that aren't
a multiple of the pagesize in use; e.g. a 1025M main memory using
2M host pages.  That breaks postcopy's atomic placement of pages,
so disallow it.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
7 years agomigration: Fix non-multiple of page size migration
Dr. David Alan Gilbert [Wed, 17 May 2017 16:58:09 +0000 (17:58 +0100)]
migration: Fix non-multiple of page size migration

Unfortunately it's legal to create a VM with a RAM size that's
not a multiple of the underlying host page or huge page size.
Recently I'd changed things to always send host sized pages,
and that breaks if we have say a 1025MB guest on 2MB hugepages.

Unfortunately we can't just make that illegal since it would break
migration from/to existing oddly configured VMs.

Symptom: qemu-system-x86_64: Illegal RAM offset 40100000
     as it transmits the fraction of the hugepage after the end
     of the RAMBlock (may also cause a crash on the source
     - possibly due to clearing bits after the bitmap)

Reported-by: Yumei Huang <yuhuang@redhat.com>
Red Hat bug: https://bugzilla.redhat.com/show_bug.cgi?id=1449037

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
7 years agoMerge remote-tracking branch 'dgilbert/tags/pull-hmp-20170517' into staging
Stefan Hajnoczi [Thu, 18 May 2017 09:16:38 +0000 (10:16 +0100)]
Merge remote-tracking branch 'dgilbert/tags/pull-hmp-20170517' into staging

HMP pull

# gpg: Signature made Wed 17 May 2017 07:03:39 PM BST
# gpg:                using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* dgilbert/tags/pull-hmp-20170517:
  ramblock: add new hmp command "info ramblock"
  utils: provide size_to_str()
  ramblock: add RAMBLOCK_FOREACH()

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7 years agoMerge remote-tracking branch 'quintela/tags/migration/20170517' into staging
Stefan Hajnoczi [Thu, 18 May 2017 09:05:50 +0000 (10:05 +0100)]
Merge remote-tracking branch 'quintela/tags/migration/20170517' into staging

migration/next for 20170517

# gpg: Signature made Wed 17 May 2017 11:46:36 AM BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* quintela/tags/migration/20170517:
  migration: Move check_migratable() into qdev.c
  migration: Move postcopy stuff to postcopy-ram.c
  migration: Move page_cache.c to migration/
  migration: Create migration/blocker.h
  ram: Rename RAM_SAVE_FLAG_COMPRESS to RAM_SAVE_FLAG_ZERO
  migration: Pass Error ** argument to {save,load}_vmstate
  migration: Fix regression with compression threads

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7 years agoMerge remote-tracking branch 'mst/tags/for_upstream' into staging
Stefan Hajnoczi [Thu, 18 May 2017 09:01:00 +0000 (10:01 +0100)]
Merge remote-tracking branch 'mst/tags/for_upstream' into staging

pci, virtio, vhost: fixes

A bunch of fixes that missed the release.
Most notably we are reverting shpc back to enabled by default state
as guests uses that as an indicator that hotplug is supported
(even though it's unused). Unfortunately we can't fix this
on the stable branch since that would break migration.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Wed 17 May 2017 10:42:06 PM BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* mst/tags/for_upstream:
  exec: abstract address_space_do_translate()
  pci: deassert intx when pci device unrealize
  virtio: allow broken device to notify guest
  Revert "hw/pci: disable pci-bridge's shpc by default"
  acpi-defs: clean up open brace usage
  ACPI: don't call acpi_pcihp_device_plug_cb on xen
  iommu: Don't crash if machine is not PC_MACHINE
  pc: add 2.10 machine type
  pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot
  libvhost-user: fix crash when rings aren't ready
  hw/virtio: fix vhost user fails to startup when MQ
  hw/arm/virt: generate 64-bit addressable ACPI objects
  hw/acpi-defs: replace leading X with x_ in FADT field names

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7 years agoexec: abstract address_space_do_translate()
Peter Xu [Wed, 17 May 2017 08:57:42 +0000 (16:57 +0800)]
exec: abstract address_space_do_translate()

This function is an abstraction helper for address_space_translate() and
address_space_get_iotlb_entry(). It does the lookup of address into
memory region section, then does proper IOMMU translation if necessary.
Refactor the two existing functions to use it.

This fixes vhost when IOMMU is disabled by guest.

Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 years agopci: deassert intx when pci device unrealize
Herongguang (Stephen) [Tue, 25 Apr 2017 02:29:54 +0000 (10:29 +0800)]
pci: deassert intx when pci device unrealize

If a pci device is not reset by VM (by writing into config space)
and unplugged by VM, after that when VM reboots, qemu may assert:
pcibus_reset: Assertion `bus->irq_count[i] == 0' failed

Cc: qemu-stable@nongnu.org
Signed-off-by: herongguang <herongguang.he@huawei.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 years agovirtio: allow broken device to notify guest
Greg Kurz [Wed, 17 May 2017 08:17:51 +0000 (10:17 +0200)]
virtio: allow broken device to notify guest

According to section 2.1.2 of the virtio-1 specification:

"The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state that
a reset is needed. If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET,
the device MUST send a device configuration change notification to the
driver."

Commit "f5ed36635d8f virtio: stop virtqueue processing if device is broken"
introduced a virtio_error() call that just does that:

- internally mark the device as broken
- set the DEVICE_NEEDS_RESET bit in the status
- send a configuration change notification

Unfortunately, virtio_notify_vector(), called by virtio_notify_config(),
returns right away when the device is marked as broken and the notification
isn't sent in this case.

The spec doesn't say whether a broken device can send notifications
in other situations or not. But since the driver isn't supposed to do
anything but to reset the device, it makes sense to keep the check in
virtio_notify_config().

Marking the device as broken AFTER the configuration change notification was
sent is enough to fix the issue.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
7 years agoRevert "hw/pci: disable pci-bridge's shpc by default"
Marcel Apfelbaum [Thu, 11 May 2017 10:25:29 +0000 (13:25 +0300)]
Revert "hw/pci: disable pci-bridge's shpc by default"

This reverts commit dc0ae767700c156894e36fab89a745a2dc4173de.

Disabling the shpc controller has an undesired side effect.
The PCI bridge remains with no attached devices at boot time,
and the guest operating systems do not allocate any resources
for it, leaving the bridge unusable. Note that the behaviour
is dictated by the pci bridge specification.

Revert the commit and leave the shpc controller even if is not
actually used by any architecture. Slot 0 remains unusable at boot time.

Keep shpc off for QEMU 2.9 machines.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
7 years agoramblock: add new hmp command "info ramblock"
Peter Xu [Fri, 12 May 2017 04:17:41 +0000 (12:17 +0800)]
ramblock: add new hmp command "info ramblock"

To dump information about ramblocks. It looks like:

(qemu) info ramblock
              Block Name    PSize              Offset               Used              Total
            /objects/mem    2 MiB  0x0000000000000000 0x0000000080000000 0x0000000080000000
                vga.vram    4 KiB  0x0000000080060000 0x0000000001000000 0x0000000001000000
    /rom@etc/acpi/tables    4 KiB  0x00000000810b0000 0x0000000000020000 0x0000000000200000
                 pc.bios    4 KiB  0x0000000080000000 0x0000000000040000 0x0000000000040000
  0000:00:03.0/e1000.rom    4 KiB  0x0000000081070000 0x0000000000040000 0x0000000000040000
                  pc.rom    4 KiB  0x0000000080040000 0x0000000000020000 0x0000000000020000
    0000:00:02.0/vga.rom    4 KiB  0x0000000081060000 0x0000000000010000 0x0000000000010000
   /rom@etc/table-loader    4 KiB  0x00000000812b0000 0x0000000000001000 0x0000000000001000
      /rom@etc/acpi/rsdp    4 KiB  0x00000000812b1000 0x0000000000001000 0x0000000000001000

Ramblock is something hidden internally in QEMU implementation, and this
command should only be used by mostly QEMU developers on RAM stuff. It
is not a command suitable for QMP interface. So only HMP interface is
provided for it.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494562661-9063-4-git-send-email-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
7 years agoutils: provide size_to_str()
Peter Xu [Fri, 12 May 2017 04:17:40 +0000 (12:17 +0800)]
utils: provide size_to_str()

Moving the algorithm from print_type_size() into size_to_str() so that
other component can also leverage it. With that, refactor
print_type_size().

The assert() in that logic is removed though, since even UINT64_MAX
would not overflow.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494562661-9063-3-git-send-email-peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
7 years agoramblock: add RAMBLOCK_FOREACH()
Peter Xu [Fri, 12 May 2017 04:17:39 +0000 (12:17 +0800)]
ramblock: add RAMBLOCK_FOREACH()

So that it can simplifies the iterators.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1494562661-9063-2-git-send-email-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>