Stefano Stabellini [Wed, 13 Jan 2016 14:59:09 +0000 (14:59 +0000)]
fix MSI injection on Xen
On Xen MSIs can be remapped into pirqs, which are a type of event
channels. It's mostly for the benefit of PCI passthrough devices, to
avoid the overhead of interacting with the emulated lapic.
However remapping interrupts and MSIs is also supported for emulated
devices, such as the e1000 and virtio-net.
When an interrupt or an MSI is remapped into a pirq, masking and
unmasking is done by masking and unmasking the event channel. The
masking bit on the PCI config space or MSI-X table should be ignored,
but it isn't at the moment.
As a consequence emulated devices which use MSI or MSI-X, such as
virtio-net, don't work properly (the guest doesn't receive any
notifications). The mechanism was working properly when xen_apic was
introduced, but I haven't narrowed down which commit in particular is
causing the regression.
Fix the issue by ignoring the masking bit for MSI and MSI-X which have
been remapped into pirqs.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jason Wang [Thu, 14 Jan 2016 05:47:24 +0000 (00:47 -0500)]
intel_iommu: large page support
Current intel_iommu only supports 4K page which may not be sufficient
to cover guest working set. This patch tries to enable 2M and 1G mapping
for intel_iommu. This is also useful for future device IOTLB
implementation to have a better hit rate.
Major work is adding a page mask field on IOTLB entry to make it
support large page. And also use the slpte level as key to do IOTLB
lookup. MAMV was increased to 18 to support direct invalidation for 1G
mapping.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
David Gibson [Thu, 21 Jan 2016 01:37:51 +0000 (12:37 +1100)]
dimm: Correct type of MemoryHotplugState->base
The 'base' field of MemoryHotplugState is ram_addr_t, which indicates that
it exists in the abstract address space of RAM regions.
However, the actual usage of this field indicates that it is a concrete
physical address (it's passed as an offset to memory_region_add_subgregion
for example).
So, correct its type to 'hwaddr'.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Laszlo Ersek [Mon, 18 Jan 2016 14:12:13 +0000 (15:12 +0100)]
pc: set the OEM fields in the RSDT and the FADT from the SLIC
The Microsoft spec about the SLIC and MSDM ACPI tables at
<http://go.microsoft.com/fwlink/p/?LinkId=234834> requires the OEM ID and
OEM Table ID fields to be consistent between the SLIC and the RSDT/XSDT.
That further affects the FADT, because a similar match between the FADT
and the RSDT/XSDT is required by the ACPI spec in general.
This patch wires up the previous three patches.
Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Igor Mammedov <imammedo@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Paolo Bonzini <pbonzini@redhat.com> (maintainer:X86)
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Aleksei Kovura <alex3kov@zoho.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: Steven Newbury <steve@snewbury.org.uk>
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=
1248758
LP: https://bugs.launchpad.net/qemu/+bug/
1533848
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Steven Newbury <steve@snewbury.org.uk>
Laszlo Ersek [Mon, 18 Jan 2016 14:12:12 +0000 (15:12 +0100)]
acpi: add function to extract oem_id and oem_table_id from the user's SLIC
The acpi_get_slic_oem() function stores pointers to these fields in the
(first) SLIC table that the user passes in with the -acpitable switch.
Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Igor Mammedov <imammedo@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Aleksei Kovura <alex3kov@zoho.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: Steven Newbury <steve@snewbury.org.uk>
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=
1248758
LP: https://bugs.launchpad.net/qemu/+bug/
1533848
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Steven Newbury <steve@snewbury.org.uk>
Laszlo Ersek [Mon, 18 Jan 2016 14:12:11 +0000 (15:12 +0100)]
acpi: expose oem_id and oem_table_id in build_rsdt()
Since build_rsdt() is implemented as common utility code (in
"hw/acpi/aml-build.c"), it should expose -- and forward -- the oem_id and
oem_table_id parameters between board code and the generic build_header()
function.
Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Igor Mammedov <imammedo@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Shannon Zhao <zhaoshenglong@huawei.com> (maintainer:ARM ACPI Subsystem)
Cc: Paolo Bonzini <pbonzini@redhat.com> (maintainer:X86)
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Aleksei Kovura <alex3kov@zoho.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: Steven Newbury <steve@snewbury.org.uk>
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=
1248758
LP: https://bugs.launchpad.net/qemu/+bug/
1533848
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Laszlo Ersek [Mon, 18 Jan 2016 14:12:10 +0000 (15:12 +0100)]
acpi: take oem_id in build_header(), optionally
This patch is the continuation of commit
8870ca0e94f2 ("acpi: support
specified oem table id for build_header"). It will allow us to control the
OEM ID field too in the SDT header.
Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Igor Mammedov <imammedo@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com> (maintainer:NVDIMM)
Cc: Shannon Zhao <zhaoshenglong@huawei.com> (maintainer:ARM ACPI Subsystem)
Cc: Paolo Bonzini <pbonzini@redhat.com> (maintainer:X86)
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Aleksei Kovura <alex3kov@zoho.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: Steven Newbury <steve@snewbury.org.uk>
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=
1248758
LP: https://bugs.launchpad.net/qemu/+bug/
1533848
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Eduardo Habkost [Fri, 11 Dec 2015 18:42:33 +0000 (16:42 -0200)]
pc: Eliminate PcGuestInfo struct
The struct is not used for anything, now.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eduardo Habkost [Fri, 11 Dec 2015 18:42:32 +0000 (16:42 -0200)]
pc: Move APIC and NUMA data from PcGuestInfo to PCMachineState
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Eduardo Habkost [Fri, 11 Dec 2015 18:42:31 +0000 (16:42 -0200)]
pc: Move PcGuestInfo.fw_cfg to PCMachineState
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Eduardo Habkost [Fri, 11 Dec 2015 18:42:30 +0000 (16:42 -0200)]
pc: Remove PcGuestInfo.isapc_ram_fw field
The code can use the PCMachineClass.pci_enabled field directly.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Eduardo Habkost [Fri, 11 Dec 2015 18:42:29 +0000 (16:42 -0200)]
pc: Remove RAM size fields from PcGuestInfo
The ACPI code can use the PCMachineState fields directly.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Eduardo Habkost [Fri, 11 Dec 2015 18:42:28 +0000 (16:42 -0200)]
pc: Remove compat fields from PcGuestInfo
Remove the fields: legacy_acpi_table_size, has_acpi_build,
has_reserved_memory, and rsdp_in_ram from PcGuestInfo, and let
the existing code use the PCMachineClass fields directly.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Eduardo Habkost [Fri, 11 Dec 2015 18:42:27 +0000 (16:42 -0200)]
acpi: Don't save PcGuestInfo on AcpiBuildState
We don't need to save the pointer on AcpiBuildState, as it is not
used anymore.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Eduardo Habkost [Fri, 11 Dec 2015 18:42:26 +0000 (16:42 -0200)]
acpi: Remove guest_info parameters from functions
We can use PC_MACHINE(qdev_get_machine())->acpi_guest_info to get
guest_info.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eduardo Habkost [Fri, 11 Dec 2015 18:42:25 +0000 (16:42 -0200)]
pc: Simplify xen_load_linux() signature
We can get the PcGuestInfo struct directly from PCMachineState,
and the return value is not needed at all.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Eduardo Habkost [Fri, 11 Dec 2015 18:42:24 +0000 (16:42 -0200)]
pc: Simplify pc_memory_init() signature
We can get the PcGuestInfo struct directly from PCMachineState,
and the return value is not needed at all.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Eduardo Habkost [Fri, 11 Dec 2015 18:42:23 +0000 (16:42 -0200)]
pc: Eliminate struct PcGuestInfoState
Instead of allocating a new struct just for PcGuestInfo and the
mchine_done Notifier, place them inside PCMachineState.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Eduardo Habkost [Fri, 11 Dec 2015 18:42:22 +0000 (16:42 -0200)]
pc: Move PcGuestInfo declaration to top of file
The struct will be used inside PCMachineState.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Cédric Le Goater [Mon, 25 Jan 2016 14:07:34 +0000 (15:07 +0100)]
ipmi: add ACPI power and GUID commands
>From the specs (20.8 Get Device GUID Command), the command needs to
return a GUID (Globally Unique ID), or UUID, that should never change
over the lifetime of the device. qemu_uuid looked like a good
candidate to start with but we could use a specific BMC property also
if needed.
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cédric Le Goater [Mon, 25 Jan 2016 14:07:33 +0000 (15:07 +0100)]
ipmi: add GET_SYS_RESTART_CAUSE chassis command
This is a simulator. Just return an unknown cause (0).
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cédric Le Goater [Mon, 25 Jan 2016 14:07:32 +0000 (15:07 +0100)]
ipmi: add get and set SENSOR_TYPE commands
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cédric Le Goater [Mon, 25 Jan 2016 14:07:31 +0000 (15:07 +0100)]
ipmi: introduce a struct ipmi_sdr_compact
Currently, sdr attributes are identified using byte offsets and this
can be a bit confusing.
This patch adds a struct ipmi_sdr_compact conforming to the IPMI specs
and replaces byte offsets with names. It also introduces and uses a
struct ipmi_sdr_header in sections of the code where no assumption is
made on the type of SDR. This leave rooms to potential usage of other
types in the future.
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cédric Le Goater [Mon, 25 Jan 2016 14:07:30 +0000 (15:07 +0100)]
ipmi: fix SDR length value
The IPMI BMC simulator populates the SDR table with a set of initial
SDRs. The length of each SDR is taken from the record itself (byte 4)
which does not include the size of the header. But, the full length
(header + data) is required by the sdr_add_entry() routine.
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cédric Le Goater [Mon, 25 Jan 2016 14:07:29 +0000 (15:07 +0100)]
ipmi: cleanup error_report messages
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cédric Le Goater [Mon, 25 Jan 2016 14:07:28 +0000 (15:07 +0100)]
ipmi: replace *_MAXCMD defines
ARRAY_SIZE() is simple to use and removes the need to pre-define
the size of the command arrays.
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cédric Le Goater [Mon, 25 Jan 2016 14:07:27 +0000 (15:07 +0100)]
ipmi: replace goto by a return statement
Each routine using the IPMI_ADD_RSP_DATA, IPMI_CHECK_CMD_LEN or
IPMI_CHECK_RESERVATION macros needs to define a goto label 'out' to
handle hidden errors. Using directly a return statement has the same
effect and it removes the fact that 'out' needs to be defined.
The code exits in ipmi_sim_handle_command() are a little different
from the rest and a "possible" error in the macro IPMI_ADD_RSP_DATA is
handled before making use of it. This might be a bit excessive as a
minimum response len is currently 300 bytes and the patch checks that
at least 3 are available.
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Marcel Apfelbaum [Mon, 18 Jan 2016 15:27:26 +0000 (17:27 +0200)]
hw/pci: ensure that only PCI/PCIe bridges can be attached to pxb/pxb-pcie devices
PCI devices can't be plugged directly into PCI extra root bridges
because their resources can't be computed by firmware before the ACPI
tables are loaded.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Bonzini [Thu, 4 Feb 2016 15:00:52 +0000 (16:00 +0100)]
vhost-user-test: use correct ROM to speed up and avoid spurious failures
The mechanism to get the option ROM for virtio-net does not block the
PCI ROM from being loaded. Therefore, in vhost-user-test there are
two entries in the boot menu for the virtio-net card: one as an
embedded option ROM, one from the ROM BAR.
The embedded option ROM in vhost-user-test is the non-EFI-enabled,
while the ROM BAR has an EFI-enabled ROM. The two are compiled with
slightly different parameters, where only the old BIOS-only one doesn't
have a timeout for the "Press Ctrl-B" banner. When using a new
machine type, therefore, the vhost-user-test has to wait for the
EFI-enabled ROM's banner to go away. There are several ways to fix
this:
1) fix the ROMs to have the same configuration
2) add ",romfile=" to the -device line
3) remove --option-rom and add the ROM file name to the -device line
4) use an old machine type
This patch chooses 3. In addition, the file name was wrong because
qtest runs QEMU relative to the top build directory, not to the
x86_64-softmmu/ subdirectory, which is fixed too.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Marcel Apfelbaum [Wed, 3 Feb 2016 11:56:10 +0000 (13:56 +0200)]
hw/pxb: add pxb devices to the bridge category
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Vincenzo Maffione [Sun, 31 Jan 2016 10:29:06 +0000 (11:29 +0100)]
virtio: combine write of an entry into used ring
Fill in an element of the used ring with a single combined access to the
guest physical memory, rather than using two separated accesses.
This reduces the overhead due to expensive address translation.
Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Message-Id: <
e4a89a767a4a92cbb6bcc551e151487eb36e1722.
1450218353.git.v.maffione@gmail.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Vincenzo Maffione [Sun, 31 Jan 2016 10:29:05 +0000 (11:29 +0100)]
virtio: read avail_idx from VQ only when necessary
The virtqueue_pop() implementation needs to check if the avail ring
contains some pending buffers. To perform this check, it is not
always necessary to fetch the avail_idx in the VQ memory, which is
expensive. This patch introduces a shadow variable tracking avail_idx
and modifies virtio_queue_empty() to access avail_idx in physical
memory only when necessary.
Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Message-Id: <
b617d6459902773d9f4ab843bfaca764f5af8eda.
1450218353.git.v.maffione@gmail.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Vincenzo Maffione [Sun, 31 Jan 2016 10:29:04 +0000 (11:29 +0100)]
virtio: cache used_idx in a VirtQueue field
Accessing used_idx in the VQ requires an expensive access to
guest physical memory. Before this patch, 3 accesses are normally
done for each pop/push/notify call. However, since the used_idx is
only written by us, we can track it in our internal data structure.
Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Message-Id: <
3d062ec54e9a7bf9fb325c1fd693564951f2b319.
1450218353.git.v.maffione@gmail.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Bonzini [Sun, 31 Jan 2016 10:29:03 +0000 (11:29 +0100)]
virtio: combine the read of a descriptor
Compared to vring, virtio has a performance penalty of 10%. Fix it
by combining all the reads for a descriptor in a single address_space_read
call. This also simplifies the code nicely.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Bonzini [Sun, 31 Jan 2016 10:29:02 +0000 (11:29 +0100)]
vring: slim down allocation of VirtQueueElements
Build the addresses and s/g lists on the stack, and then copy them
to a VirtQueueElement that is just as big as required to contain this
particular s/g list. The cost of the copy is minimal compared to that
of a large malloc.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Bonzini [Sun, 31 Jan 2016 10:29:01 +0000 (11:29 +0100)]
virtio: slim down allocation of VirtQueueElements
Build the addresses and s/g lists on the stack, and then copy them
to a VirtQueueElement that is just as big as required to contain this
particular s/g list. The cost of the copy is minimal compared to that
of a large malloc.
When virtqueue_map is used on the destination side of migration or on
loadvm, the iovecs have already been split at memory region boundary,
so we can just reuse the out_num/in_num we find in the file.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Bonzini [Sun, 31 Jan 2016 10:29:00 +0000 (11:29 +0100)]
virtio: introduce virtqueue_alloc_element
Allocate the arrays for in_addr/out_addr/in_sg/out_sg outside the
VirtQueueElement. For now, virtqueue_pop and vring_pop keep
allocating a very large VirtQueueElement.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Bonzini [Sun, 31 Jan 2016 10:28:59 +0000 (11:28 +0100)]
virtio: introduce qemu_get/put_virtqueue_element
Move allocation to virtio functions also when loading/saving a
VirtQueueElement. This will also let the load/save functions
keep backwards compatibility when the VirtQueueElement layout
is changed.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Bonzini [Thu, 4 Feb 2016 14:26:51 +0000 (16:26 +0200)]
virtio: move allocation to virtqueue_pop/vring_pop
The return code of virtqueue_pop/vring_pop is unused except to check for
errors or 0. We can thus easily move allocation inside the functions
and just return a pointer to the VirtQueueElement.
The advantage is that we will be able to allocate only the space that
is needed for the actual size of the s/g list instead of the full
VIRTQUEUE_MAX_SIZE items. Currently VirtQueueElement takes about 48K
of memory, and this kind of allocation puts a lot of stress on malloc.
By cutting the size by two or three orders of magnitude, malloc can
use much more efficient algorithms.
The patch is pretty large, but changes to each device are testable
more or less independently. Splitting it would mostly add churn.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Paolo Bonzini [Sun, 31 Jan 2016 10:28:57 +0000 (11:28 +0100)]
virtio: move VirtQueueElement at the beginning of the structs
The next patch will make virtqueue_pop/vring_pop allocate memory for
the VirtQueueElement. In some cases (blk, scsi, gpu) the device wants
to extend VirtQueueElement with device-specific fields and, until now,
the place of the VirtQueueElement within the containing struct didn't
matter. When allocating the entire block in virtqueue_pop/vring_pop,
however, the containing struct must basically be a "subclass" of
VirtQueueElement, with the VirtQueueElement as the first field. Make
that the case for blk and scsi; gpu is already doing it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Igor Mammedov [Fri, 22 Jan 2016 14:36:08 +0000 (15:36 +0100)]
tests: pc: acpi: add expected DSDT.bridge blobs and update DSDT blobs
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Fri, 22 Jan 2016 14:36:07 +0000 (15:36 +0100)]
tests: pc: acpi: drop not needed 'expected SSDT' blobs
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Fri, 22 Jan 2016 14:36:06 +0000 (15:36 +0100)]
pc: acpi: merge SSDT into DSDT
Since both tables are built dynamically now,
there is no point in keeping ASL in them in separate
tables.
So do the same as we do for ARM where we have only
DSDT table, i.e. move SSDT ASL into DSDT and
drop SSDT altogether.
This patch doesn't change moved SSDT ASL in any way,
but it opens a way to relatively independently simplify
generated ASL on per device/subsystem basis in
followup series.
It also simplifies bios-tables-test where expected
SSDT blobs could be dropped and only DSDT ones
have to be maintained.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Dr. David Alan Gilbert [Fri, 29 Jan 2016 13:18:56 +0000 (13:18 +0000)]
Fix virtio migration
I misunderstood the vmstate macro definition when I reworked the
virtio .get/.put.
The VMSTATE_STRUCT_VARRAY_KNOWN, was described as being for "a
variable length array (i.e. _type *_field) but we know the
length". However it actually specified operation for arrays embedded in
the struct (i.e. _type _field[]) since it lacked the VMS_POINTER
flag. This caused offset calculation to be completely off, examining and
potentially sending random data instead of the VirtQueue content.
Replace the otherwise unused VMSTATE_STRUCT_VARRAY_KNOWN with a
VMSTATE_STRUCT_VARRAY_POINTER_KNOWN that includes the VMS_POINTER flag
(so now actually doing what it advertises) and use it in the virtio
migration code.
Fixes and description as per Sascha's suggestions/debug.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Tested-By: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-By: Sascha Silbe <silbe@linux.vnet.ibm.com>
Fixes: 50e5ae4dc3e4f21e874512f9e87b93b5472d26e0
Fixes: 2cf0148674430b6693c60d42b7eef721bfa9509f
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Peter Maydell [Wed, 3 Feb 2016 19:00:33 +0000 (19:00 +0000)]
Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Wed 03 Feb 2016 15:47:34 GMT using RSA key ID
81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/tracing-pull-request:
log: add "-d trace:PATTERN"
trace: switch default backend to "log"
trace: convert stderr backend to log
log: move qemu-log.c into util/ directory
log: do not unnecessarily include qom/cpu.h
trace: add "-trace help"
trace: add "-trace enable=..."
trace: no need to call trace_backend_init in different branches now
trace: split trace_init_file out of trace_init_backends
trace: split trace_init_events out of trace_init_backends
trace: fix documentation
trace: track enabled events in a separate array
trace: count number of enabled events
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 3 Feb 2016 12:23:48 +0000 (12:23 +0000)]
Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-
20160203-1' into staging
virtio-gpu: bugfixes and spice support preparation
# gpg: Signature made Wed 03 Feb 2016 09:47:13 GMT using RSA key ID
D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
* remotes/kraxel/tags/pull-vga-
20160203-1:
virtio-gpu: block any rendering until client (ui) is done
virtio-gpu: add support to enable/disable command processing
virtio-gpu: maintain command queue
virtio-gpu: fix memory leak in error path
console: block rendering until client is done
zap qemu_egl_has_ext in include/ui/egl-helpers.h
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 3 Feb 2016 10:50:06 +0000 (10:50 +0000)]
Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2016-02-03' into staging
Monitor patches for 2016-02-03
# gpg: Signature made Wed 03 Feb 2016 09:13:48 GMT using RSA key ID
EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
* remotes/armbru/tags/pull-monitor-2016-02-03:
hmp: fix sendkey out of bounds write (CVE-2015-8619)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Paolo Bonzini [Thu, 7 Jan 2016 13:55:32 +0000 (16:55 +0300)]
log: add "-d trace:PATTERN"
This is a bit easier to use than "-trace" if you are also enabling
other kinds of logging. It is also more discoverable for experienced
QEMU users, and accessible from user-mode emulators.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id:
1452174932-28657-12-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini [Thu, 7 Jan 2016 13:55:31 +0000 (16:55 +0300)]
trace: switch default backend to "log"
This enables integration with other QEMU logging facilities.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id:
1452174932-28657-11-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini [Thu, 7 Jan 2016 13:55:30 +0000 (16:55 +0300)]
trace: convert stderr backend to log
[Also update .travis.yml --enable-trace-backends=stderr
--Stefan]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id:
1452174932-28657-10-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Gerd Hoffmann [Wed, 2 Dec 2015 07:17:24 +0000 (08:17 +0100)]
virtio-gpu: block any rendering until client (ui) is done
Wire up gl_block callback, so ui code can request to stop
virtio-gpu rendering.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Gerd Hoffmann [Tue, 1 Dec 2015 12:18:38 +0000 (13:18 +0100)]
virtio-gpu: add support to enable/disable command processing
So we can stop rendering for a while in case we have to.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Gerd Hoffmann [Tue, 1 Dec 2015 11:05:14 +0000 (12:05 +0100)]
virtio-gpu: maintain command queue
We'll go take out the commands we receive out of the virt queue and put
them into a linked list, to decouple virtio queue handling from actual
command processing.
Also move cmd processing to new virtio_gpu_handle_ctrl func, so we can
easily kick it from different places.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Gerd Hoffmann [Fri, 18 Dec 2015 10:55:01 +0000 (11:55 +0100)]
virtio-gpu: fix memory leak in error path
Found by Coverity Scan, buf not freed on error.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Gerd Hoffmann [Thu, 3 Dec 2015 11:34:25 +0000 (12:34 +0100)]
console: block rendering until client is done
Allow gl user interfaces to block display device gl rendering.
The ui code might want to do that in case it takes a little
longer to bring things to screen, for example because we'll
hand over a dma-buf to another process (spice will do that).
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Gerd Hoffmann [Mon, 12 Oct 2015 09:59:39 +0000 (11:59 +0200)]
zap qemu_egl_has_ext in include/ui/egl-helpers.h
Drop leftover prototype which sneaked in by mistake
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Denis V. Lunev [Thu, 7 Jan 2016 13:55:29 +0000 (16:55 +0300)]
log: move qemu-log.c into util/ directory
log will become common facility with tracepoints support in next step.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id:
1452174932-28657-9-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini [Thu, 7 Jan 2016 13:55:28 +0000 (16:55 +0300)]
log: do not unnecessarily include qom/cpu.h
Split the bits that require it to exec/log.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id:
1452174932-28657-8-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini [Thu, 7 Jan 2016 13:55:27 +0000 (16:55 +0300)]
trace: add "-trace help"
Print a list of trace points
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id:
1452174932-28657-7-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini [Thu, 7 Jan 2016 13:55:26 +0000 (16:55 +0300)]
trace: add "-trace enable=..."
Allow enabling events without going through a file, for example:
qemu-system-x86_64 -trace bdrv_aio_writev -trace bdrv_aio_readv
or with globbing too:
qemu-system-x86_64 -trace 'bdrv_aio_*'
if an appropriate backend is enabled (simple, stderr, ftrace).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id:
1452174932-28657-6-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Denis V. Lunev [Thu, 7 Jan 2016 13:55:25 +0000 (16:55 +0300)]
trace: no need to call trace_backend_init in different branches now
original idea to split calling locations was to spawn tracing thread
in the final child process according to
commit
8a745f2a9296ad2cf6bda33534ed298f2625a4ad
Author: Michael Mueller
Date: Mon Sep 23 16:36:54 2013 +0200
os_daemonize is now on top of both locations. Drop unneeded ifs.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id:
1452174932-28657-5-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini [Thu, 7 Jan 2016 13:55:24 +0000 (16:55 +0300)]
trace: split trace_init_file out of trace_init_backends
This is cleaner, and improves error reporting with -daemonize.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id:
1452174932-28657-4-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini [Thu, 7 Jan 2016 13:55:23 +0000 (16:55 +0300)]
trace: split trace_init_events out of trace_init_backends
This is cleaner and has two advantages. First, it improves error
reporting with -daemonize. Second, multiple "-trace events" options
now cumulate.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id:
1452174932-28657-3-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini [Thu, 7 Jan 2016 13:55:22 +0000 (16:55 +0300)]
trace: fix documentation
Mention the ftrace backend too.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id:
1452174932-28657-2-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini [Wed, 28 Oct 2015 06:06:27 +0000 (07:06 +0100)]
trace: track enabled events in a separate array
This is more cache friendly on the fast path, where we already have
the event id available.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini [Wed, 28 Oct 2015 06:06:26 +0000 (07:06 +0100)]
trace: count number of enabled events
This lets trace_event_get_state_dynamic quickly return false. Right
now there is hardly any benefit because there are also many assertions
and indirections, but the next patch will streamline all of this.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Wolfgang Bumiller [Wed, 13 Jan 2016 08:09:58 +0000 (09:09 +0100)]
hmp: fix sendkey out of bounds write (CVE-2015-8619)
When processing 'sendkey' command, hmp_sendkey routine null
terminates the 'keyname_buf' array. This results in an OOB
write issue, if 'keyname_len' was to fall outside of
'keyname_buf' array.
Since the keyname's length is known the keyname_buf can be
removed altogether by adding a length parameter to
index_from_key() and using it for the error output as well.
Reported-by: Ling Liu <liuling-it@360.cn>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Message-Id: <
20160113080958.GA18934@olga>
[Comparison with "<" dumbed down, test for junk after strtoul()
tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Peter Maydell [Tue, 2 Feb 2016 18:04:04 +0000 (18:04 +0000)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-for-peter-2016-02-02' into staging
Block patches
# gpg: Signature made Tue 02 Feb 2016 17:23:44 GMT using RSA key ID
E838ACAD
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
* remotes/maxreitz/tags/pull-block-for-peter-2016-02-02: (50 commits)
block: qemu-iotests - add test for snapshot, commit, snapshot bug
block: set device_list.tqe_prev to NULL on BDS removal
iotests: Add "qemu-img map" test for VMDK extents
qemu-img: Make MapEntry a QAPI struct
qemu-img: In "map", use the returned "file" from bdrv_get_block_status
block: Use returned *file in bdrv_co_get_block_status
vmdk: Return extent's file in bdrv_get_block_status
vmdk: Fix calculation of block status's offset
vpc: Assign bs->file->bs to file in vpc_co_get_block_status
vdi: Assign bs->file->bs to file in vdi_co_get_block_status
sheepdog: Assign bs to file in sd_co_get_block_status
qed: Assign bs->file->bs to file in bdrv_qed_co_get_block_status
parallels: Assign bs->file->bs to file in parallels_co_get_block_status
iscsi: Assign bs to file in iscsi_co_get_block_status
raw: Assign bs to file in raw_co_get_block_status
qcow2: Assign bs->file->bs to file in qcow2_co_get_block_status
qcow: Assign bs->file->bs to file in qcow_co_get_block_status
block: Add "file" output parameter to block status query functions
block: acquire in bdrv_query_image_info
iotests: Add test for block jobs and BDS ejection
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Jeff Cody [Tue, 2 Feb 2016 01:33:11 +0000 (20:33 -0500)]
block: qemu-iotests - add test for snapshot, commit, snapshot bug
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id:
2dbc05efba2f683cb3aaf71aaa9b776ebf7ec57c.
1454376655.git.jcody@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
[Moved test number from 143 to 144]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Jeff Cody [Tue, 2 Feb 2016 01:33:10 +0000 (20:33 -0500)]
block: set device_list.tqe_prev to NULL on BDS removal
This fixes a regression introduced with commit
3f09bfbc7. Multiple
bugs arise in conjunction with live snapshots and mirroring operations
(which include active layer commit).
After a live snapshot occurs, the active layer and the base layer both
have a non-NULL tqe_prev field in the device_list, although the base
node's tqe_prev field points to a NULL entry. This non-NULL tqe_prev
field occurs after the bdrv_append() in the external snapshot calls
change_parent_backing_link().
In change_parent_backing_link(), when the previous active layer is
removed from device_list, the device_list.tqe_prev pointer is not
set to NULL.
The operating scheme in the block layer is to indicate that a BDS belongs
in the bdrv_states device_list iff the device_list.tqe_prev pointer
is non-NULL.
This patch does two things:
1.) Introduces a new block layer helper bdrv_device_remove() to remove a
BDS from the device_list, and
2.) uses that new API, which also fixes the regression once used in
change_parent_backing_link().
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id:
0cd51e11c0666c04ddb7c05293fe94afeb551e89.
1454376655.git.jcody@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Peter Maydell [Tue, 2 Feb 2016 17:01:56 +0000 (17:01 +0000)]
Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-
20160202-1' into staging
usb: two ehci fixes.
# gpg: Signature made Tue 02 Feb 2016 13:12:00 GMT using RSA key ID
D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
* remotes/kraxel/tags/pull-usb-
20160202-1:
ehci: update irq on reset
usb: check page select value while processing iTD
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fam Zheng [Tue, 26 Jan 2016 03:59:03 +0000 (11:59 +0800)]
iotests: Add "qemu-img map" test for VMDK extents
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-17-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:59:02 +0000 (11:59 +0800)]
qemu-img: Make MapEntry a QAPI struct
The "flags" bit mask is expanded to two booleans, "data" and "zero";
"bs" is replaced with "filename" string.
Refactor the merge conditions in img_map() into entry_mergeable().
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-16-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:59:01 +0000 (11:59 +0800)]
qemu-img: In "map", use the returned "file" from bdrv_get_block_status
Now all drivers should return a correct "file", we can make use of it,
even with the recursion into backing chain above.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-15-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:59:00 +0000 (11:59 +0800)]
block: Use returned *file in bdrv_co_get_block_status
Now that all drivers return the right "file" pointer, we can use it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id:
1453780743-16806-14-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:58:59 +0000 (11:58 +0800)]
vmdk: Return extent's file in bdrv_get_block_status
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-13-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:58:58 +0000 (11:58 +0800)]
vmdk: Fix calculation of block status's offset
"offset" is the offset of cluster and sector_num doesn't necessarily
refer to the start of it, it should add index_in_cluster.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-12-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:58:57 +0000 (11:58 +0800)]
vpc: Assign bs->file->bs to file in vpc_co_get_block_status
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-11-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:58:56 +0000 (11:58 +0800)]
vdi: Assign bs->file->bs to file in vdi_co_get_block_status
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-10-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:58:55 +0000 (11:58 +0800)]
sheepdog: Assign bs to file in sd_co_get_block_status
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-9-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:58:54 +0000 (11:58 +0800)]
qed: Assign bs->file->bs to file in bdrv_qed_co_get_block_status
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-8-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:58:53 +0000 (11:58 +0800)]
parallels: Assign bs->file->bs to file in parallels_co_get_block_status
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-7-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:58:52 +0000 (11:58 +0800)]
iscsi: Assign bs to file in iscsi_co_get_block_status
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-6-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:58:51 +0000 (11:58 +0800)]
raw: Assign bs to file in raw_co_get_block_status
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-5-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:58:50 +0000 (11:58 +0800)]
qcow2: Assign bs->file->bs to file in qcow2_co_get_block_status
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-4-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:58:49 +0000 (11:58 +0800)]
qcow: Assign bs->file->bs to file in qcow_co_get_block_status
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-3-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Fam Zheng [Tue, 26 Jan 2016 03:58:48 +0000 (11:58 +0800)]
block: Add "file" output parameter to block status query functions
The added parameter can be used to return the BDS pointer which the
valid offset is referring to. Its value should be ignored unless
BDRV_BLOCK_OFFSET_VALID in ret is set.
Until block drivers fill in the right value, let's clear it explicitly
right before calling .bdrv_get_block_status.
The "bs->file" condition in bdrv_co_get_block_status is kept now to keep iotest
case 102 passing, and will be fixed once all drivers return the right file
pointer.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id:
1453780743-16806-2-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Paolo Bonzini [Wed, 23 Dec 2015 10:48:23 +0000 (11:48 +0100)]
block: acquire in bdrv_query_image_info
NFS calls aio_poll inside bdrv_get_allocated_size. This requires
acquiring the AioContext.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id:
1450867706-19860-1-git-send-email-pbonzini@redhat.com
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Max Reitz [Fri, 29 Jan 2016 15:36:16 +0000 (16:36 +0100)]
iotests: Add test for block jobs and BDS ejection
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Fri, 29 Jan 2016 15:36:15 +0000 (16:36 +0100)]
iotests: Add test for multiple BB on BDS tree
This adds a test for having multiple BlockBackends in one BDS tree. In
this case, there is one BB for the protocol BDS and one BB for the
format BDS in a simple two-BDS tree (with the protocol BDS and BB added
first).
When bdrv_close_all() is executed, no cached data from any BDS should be
lost; the protocol BDS may not be closed until the format BDS is closed.
Otherwise, metadata updates may be lost.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Fri, 29 Jan 2016 15:36:14 +0000 (16:36 +0100)]
block: Rewrite bdrv_close_all()
This patch rewrites bdrv_close_all(): Until now, all root BDSs have been
force-closed. This is bad because it can lead to cached data not being
flushed to disk.
Instead, try to make all reference holders relinquish their reference
voluntarily:
1. All BlockBackend users are handled by making all BBs simply eject
their BDS tree. Since a BDS can never be on top of a BB, this will
not cause any of the issues as seen with the force-closing of BDSs.
The references will be relinquished and any further access to the BB
will fail gracefully.
2. All BDSs which are owned by the monitor itself (because they do not
have a BB) are relinquished next.
3. Besides BBs and the monitor, block jobs and other BDSs are the only
things left that can hold a reference to BDSs. After every remaining
block job has been canceled, there should not be any BDSs left (and
the loop added here will always terminate (as long as NDEBUG is not
defined), because either all_bdrv_states will be empty or there will
not be any block job left to cancel, failing the assertion).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Fri, 29 Jan 2016 15:36:13 +0000 (16:36 +0100)]
block: Add blk_remove_all_bs()
When bdrv_close_all() is called, instead of force-closing all root
BlockDriverStates, it is better to just drop the reference from all
BlockBackends and let them be closed automatically. This prevents BDS
from getting closed that are still referenced by other BDS, which may
result in loss of cached data.
This patch adds a function for doing that, but does not yet incorporate
it in bdrv_close_all().
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Fri, 29 Jan 2016 15:36:12 +0000 (16:36 +0100)]
blockdev: Keep track of monitor-owned BDS
As a side effect, we can now make x-blockdev-del's check whether a BDS
is actually owned by the monitor explicit.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Fri, 29 Jan 2016 15:36:11 +0000 (16:36 +0100)]
block: Add list of all BlockDriverStates
We need this list so that bdrv_close_all() can keep track of which BDSs
are still open after having removed the BDSs from all of the BBs and
having released all monitor BDS references.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Fri, 29 Jan 2016 15:36:10 +0000 (16:36 +0100)]
block: Make bdrv_close() static
There are no users of bdrv_close() left, except for one of bdrv_open()'s
failure paths, bdrv_close_all() and bdrv_delete(), and that is good.
Make bdrv_close() static so nobody makes the mistake of directly using
bdrv_close() again.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Fri, 29 Jan 2016 15:36:09 +0000 (16:36 +0100)]
blockdev: Use blk_remove_bs() in do_drive_del()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Fri, 29 Jan 2016 15:36:08 +0000 (16:36 +0100)]
block: Use blk_remove_bs() in blk_delete()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Fri, 29 Jan 2016 15:36:07 +0000 (16:36 +0100)]
block: Remove BDS close notifier
It is unused now, so we can remove it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Fri, 29 Jan 2016 15:36:06 +0000 (16:36 +0100)]
nbd: Switch from close to eject notifier
The NBD code uses the BDS close notifier to determine when a medium is
ejected. However, now it should use the BB's BDS removal notifier for
that instead of the BDS's close notifier.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Fri, 29 Jan 2016 15:36:05 +0000 (16:36 +0100)]
virtio-scsi: Catch BDS-BB removal/insertion
Make use of the BDS-BB removal and insertion notifiers to remove or set
up, respectively, virtio-scsi's op blockers.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>