Si-Wei Liu [Sat, 7 May 2022 02:28:16 +0000 (19:28 -0700)]
vhost-vdpa: backend feature should set only once
The vhost_vdpa_one_time_request() branch in
vhost_vdpa_set_backend_cap() incorrectly sends down
ioctls on vhost_dev with non-zero index. This may
end up with multiple VHOST_SET_BACKEND_FEATURES
ioctl calls sent down on the vhost-vdpa fd that is
shared between all these vhost_dev's.
To fix it, send down ioctl only once via the first
vhost_dev with index 0. Toggle the polarity of the
vhost_vdpa_one_time_request() test should do the
trick.
Fixes: 4d191cfdc7de ("vhost-vdpa: classify one time request")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <
1651890498-24478-6-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Si-Wei Liu [Sat, 7 May 2022 02:28:15 +0000 (19:28 -0700)]
vhost-net: fix improper cleanup in vhost_net_start
vhost_net_start() missed a corresponding stop_one() upon error from
vhost_set_vring_enable(). While at it, make the error handling for
err_start more robust. No real issue was found due to this though.
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
1651890498-24478-5-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Si-Wei Liu [Sat, 7 May 2022 02:28:14 +0000 (19:28 -0700)]
vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa
... such that no memory leaks on dangling net clients in case of
error.
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
1651890498-24478-4-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Si-Wei Liu [Sat, 7 May 2022 02:28:13 +0000 (19:28 -0700)]
virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa
With MQ enabled vdpa device and non-MQ supporting guest e.g.
booting vdpa with mq=on over OVMF of single vqp, below assert
failure is seen:
../hw/virtio/vhost-vdpa.c:560: vhost_vdpa_get_vq_index: Assertion `idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs' failed.
0 0x00007f8ce3ff3387 in raise () at /lib64/libc.so.6
1 0x00007f8ce3ff4a78 in abort () at /lib64/libc.so.6
2 0x00007f8ce3fec1a6 in __assert_fail_base () at /lib64/libc.so.6
3 0x00007f8ce3fec252 in () at /lib64/libc.so.6
4 0x0000558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:563
5 0x0000558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:558
6 0x0000558f52d7329a in vhost_virtqueue_mask (hdev=0x558f55c01800, vdev=0x558f568f91f0, n=2, mask=<optimized out>) at ../hw/virtio/vhost.c:1557
7 0x0000558f52c6b89a in virtio_pci_set_guest_notifier (d=d@entry=0x558f568f0f60, n=n@entry=2, assign=assign@entry=true, with_irqfd=with_irqfd@entry=false)
at ../hw/virtio/virtio-pci.c:974
8 0x0000558f52c6c0d8 in virtio_pci_set_guest_notifiers (d=0x558f568f0f60, nvqs=3, assign=true) at ../hw/virtio/virtio-pci.c:1019
9 0x0000558f52bf091d in vhost_net_start (dev=dev@entry=0x558f568f91f0, ncs=0x558f56937cd0, data_queue_pairs=data_queue_pairs@entry=1, cvq=cvq@entry=1)
at ../hw/net/vhost_net.c:361
10 0x0000558f52d4e5e7 in virtio_net_set_status (status=<optimized out>, n=0x558f568f91f0) at ../hw/net/virtio-net.c:289
11 0x0000558f52d4e5e7 in virtio_net_set_status (vdev=0x558f568f91f0, status=15 '\017') at ../hw/net/virtio-net.c:370
12 0x0000558f52d6c4b2 in virtio_set_status (vdev=vdev@entry=0x558f568f91f0, val=val@entry=15 '\017') at ../hw/virtio/virtio.c:1945
13 0x0000558f52c69eff in virtio_pci_common_write (opaque=0x558f568f0f60, addr=<optimized out>, val=<optimized out>, size=<optimized out>) at ../hw/virtio/virtio-pci.c:1292
14 0x0000558f52d15d6e in memory_region_write_accessor (mr=0x558f568f19d0, addr=20, value=<optimized out>, size=1, shift=<optimized out>, mask=<optimized out>, attrs=...)
at ../softmmu/memory.c:492
15 0x0000558f52d127de in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7f8cdbffe748, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=0x558f52d15cf0 <memory_region_write_accessor>, mr=0x558f568f19d0, attrs=...) at ../softmmu/memory.c:554
16 0x0000558f52d157ef in memory_region_dispatch_write (mr=mr@entry=0x558f568f19d0, addr=20, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...)
at ../softmmu/memory.c:1504
17 0x0000558f52d078e7 in flatview_write_continue (fv=fv@entry=0x7f8accbc3b90, addr=addr@entry=
103079215124, attrs=..., ptr=ptr@entry=0x7f8ce6300028, len=len@entry=1, addr1=<optimized out>, l=<optimized out>, mr=0x558f568f19d0) at /home/opc/qemu-upstream/include/qemu/host-utils.h:165
18 0x0000558f52d07b06 in flatview_write (fv=0x7f8accbc3b90, addr=
103079215124, attrs=..., buf=0x7f8ce6300028, len=1) at ../softmmu/physmem.c:2822
19 0x0000558f52d0b36b in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>)
at ../softmmu/physmem.c:2914
20 0x0000558f52d0b3da in address_space_rw (as=<optimized out>, addr=<optimized out>, attrs=...,
attrs@entry=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>, is_write=<optimized out>) at ../softmmu/physmem.c:2924
21 0x0000558f52dced09 in kvm_cpu_exec (cpu=cpu@entry=0x558f55c2da60) at ../accel/kvm/kvm-all.c:2903
22 0x0000558f52dcfabd in kvm_vcpu_thread_fn (arg=arg@entry=0x558f55c2da60) at ../accel/kvm/kvm-accel-ops.c:49
23 0x0000558f52f9f04a in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:556
24 0x00007f8ce4392ea5 in start_thread () at /lib64/libpthread.so.0
25 0x00007f8ce40bb9fd in clone () at /lib64/libc.so.6
The cause for the assert failure is due to that the vhost_dev index
for the ctrl vq was not aligned with actual one in use by the guest.
Upon multiqueue feature negotiation in virtio_net_set_multiqueue(),
if guest doesn't support multiqueue, the guest vq layout would shrink
to a single queue pair, consisting of 3 vqs in total (rx, tx and ctrl).
This results in ctrl_vq taking a different vhost_dev group index than
the default. We can map vq to the correct vhost_dev group by checking
if MQ is supported by guest and successfully negotiated. Since the
MQ feature is only present along with CTRL_VQ, we ensure the index
2 is only meant for the control vq while MQ is not supported by guest.
Fixes: 22288fe ("virtio-net: vhost control virtqueue support")
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
1651890498-24478-3-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Si-Wei Liu [Sat, 7 May 2022 02:28:12 +0000 (19:28 -0700)]
virtio-net: setup vhost_dev and notifiers for cvq only when feature is negotiated
When the control virtqueue feature is absent or not negotiated,
vhost_net_start() still tries to set up vhost_dev and install
vhost notifiers for the control virtqueue, which results in
erroneous ioctl calls with incorrect queue index sending down
to driver. Do that only when needed.
Fixes: 22288fe ("virtio-net: vhost control virtqueue support")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
1651890498-24478-2-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Wei Huang [Fri, 22 Apr 2022 05:51:46 +0000 (00:51 -0500)]
hw/i386/amd_iommu: Fix IOMMU event log encoding errors
Coverity issues several UNINIT warnings against amd_iommu.c [1]. This
patch fixes them by clearing evt before encoding. On top of it, this
patch changes the event log size to 16 bytes per IOMMU specification,
and fixes the event log entry format in amdvi_encode_event().
[1] CID
1487116/
1487200/
1487190/
1487232/
1487115/
1487258
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Message-Id: <
20220422055146.
3312226-1-wei.huang2@amd.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xiaoyao Li [Thu, 10 Mar 2022 12:28:11 +0000 (20:28 +0800)]
hw/i386: Make pic a property of common x86 base machine type
Legacy PIC (8259) cannot be supported for TDX guests since TDX module
doesn't allow directly interrupt injection. Using posted interrupts
for the PIC is not a viable option as the guest BIOS/kernel will not
do EOI for PIC IRQs, i.e. will leave the vIRR bit set.
Make PIC the property of common x86 machine type. Hence all x86
machines, including microvm, can disable it.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-Id: <
20220310122811.807794-3-xiaoyao.li@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xiaoyao Li [Thu, 10 Mar 2022 12:28:10 +0000 (20:28 +0800)]
hw/i386: Make pit a property of common x86 base machine type
Both pc and microvm have pit property individually. Let's just make it
the property of common x86 base machine type.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-Id: <
20220310122811.807794-2-xiaoyao.li@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Francisco Iglesias [Mon, 11 Apr 2022 22:18:36 +0000 (00:18 +0200)]
include/hw/pci/pcie_host: Correct PCIE_MMCFG_SIZE_MAX
According to 7.2.2 in [1] bit 27 is the last bit that can be part of the
bus number, this makes the ECAM max size equal to '1 << 28'. This patch
restores back this value into the PCIE_MMCFG_SIZE_MAX define (which was
changed in commit
58d5b22bbd5 ("ppc4xx: Add device models found in PPC440
core SoCs")).
[1] PCI Express® Base Specification Revision 5.0 Version 1.0
Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-Id: <
20220411221836.17699-3-frasse.iglesias@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Francisco Iglesias [Mon, 11 Apr 2022 22:18:35 +0000 (00:18 +0200)]
include/hw/pci/pcie_host: Correct PCIE_MMCFG_BUS_MASK
According to [1] address bits 27 - 20 are mapped to the bus number (the
TLPs bus number field is 8 bits). Below is the formula taken from Table
7-1 in [1].
"
Memory Address | PCI Express Configuration Space
A[(20+n-1):20] | Bus Number, 1 ≤ n ≤ 8
"
[1] PCI Express® Base Specification Revision 5.0 Version 1.0
Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-Id: <
20220411221836.17699-2-frasse.iglesias@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Kevin Wolf [Thu, 7 Apr 2022 13:36:55 +0000 (15:36 +0200)]
docs/vhost-user: Clarifications for VHOST_USER_ADD/REM_MEM_REG
The specification for VHOST_USER_ADD/REM_MEM_REG messages is unclear
in several points, which has led to clients having incompatible
implementations. This changes the specification to be more explicit
about them:
* VHOST_USER_ADD_MEM_REG is not specified as receiving a file
descriptor, though it obviously does need to do so. All
implementations agree on this one, fix the specification.
* VHOST_USER_REM_MEM_REG is not specified as receiving a file
descriptor either, and it also has no reason to do so. rust-vmm does
not send file descriptors for removing a memory region (in agreement
with the specification), libvhost-user and QEMU do (which is a bug),
though libvhost-user doesn't actually make any use of it.
Change the specification so that for compatibility QEMU's behaviour
becomes legal, even if discouraged, but rust-vmm's behaviour becomes
the explicitly recommended mode of operation.
* VHOST_USER_ADD_MEM_REG doesn't have a documented return value, which
is the desired behaviour in the non-postcopy case. It also implemented
like this in QEMU and rust-vmm, though libvhost-user is buggy and
sometimes sends an unexpected reply. This will be fixed in a separate
patch.
However, in postcopy mode it does reply like VHOST_USER_SET_MEM_TABLE.
This behaviour is shared between libvhost-user and QEMU; rust-vmm
doesn't implement postcopy mode yet. Mention it explicitly in the
spec.
* The specification doesn't mention how VHOST_USER_REM_MEM_REG
identifies the memory region to be removed. Change it to describe the
existing behaviour of libvhost-user (guest address, user address and
size must match).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <
20220407133657.155281-2-kwolf@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Michael S. Tsirkin [Fri, 13 May 2022 10:56:23 +0000 (06:56 -0400)]
vhost-user: more master/slave things
we switched to front-end/back-end, but newer patches
reintroduced old language. Fix this up.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonah Palmer [Fri, 1 Apr 2022 13:23:19 +0000 (09:23 -0400)]
virtio: add vhost support for virtio devices
This patch adds a get_vhost() callback function for VirtIODevices that
returns the device's corresponding vhost_dev structure, if the vhost
device is running. This patch also adds a vhost_started flag for
VirtIODevices.
Previously, a VirtIODevice wouldn't be able to tell if its corresponding
vhost device was active or not.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <
1648819405-25696-3-git-send-email-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonah Palmer [Fri, 1 Apr 2022 13:23:18 +0000 (09:23 -0400)]
virtio: drop name parameter for virtio_init()
This patch drops the name parameter for the virtio_init function.
The pair between the numeric device ID and the string device ID
(name) of a virtio device already exists, but not in a way that
lets us map between them.
This patch lets us do this and removes the need for the name
parameter in the virtio_init function.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <
1648819405-25696-2-git-send-email-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Alex Bennée [Mon, 21 Mar 2022 15:30:37 +0000 (15:30 +0000)]
virtio/vhost-user: dynamically assign VhostUserHostNotifiers
At a couple of hundred bytes per notifier allocating one for every
potential queue is very wasteful as most devices only have a few
queues. Instead of having this handled statically dynamically assign
them and track in a GPtrArray.
[AJB: it's hard to trigger the vhost notifiers code, I assume as it
requires a KVM guest with appropriate backend]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220321153037.
3622127-14-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Alex Bennée [Mon, 21 Mar 2022 15:30:36 +0000 (15:30 +0000)]
hw/virtio/vhost-user: don't suppress F_CONFIG when supported
Previously we would silently suppress VHOST_USER_PROTOCOL_F_CONFIG
during the protocol negotiation if the QEMU stub hadn't implemented
the vhost_dev_config_notifier. However this isn't the only way we can
handle config messages, the existing vdc->get/set_config can do this
as well.
Lightly re-factor the code to check for both potential methods and
instead of silently squashing the feature error out. It is unlikely
that a vhost-user backend expecting to handle CONFIG messages will
behave correctly if they never get sent.
Fixes: 1c3e5a2617 ("vhost-user: back SET/GET_CONFIG requests with a protocol feature")
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220321153037.
3622127-13-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Alex Bennée [Mon, 21 Mar 2022 15:30:34 +0000 (15:30 +0000)]
include/hw: start documenting the vhost API
While trying to get my head around the nest of interactions for vhost
devices I though I could start by documenting the key API functions.
This patch documents the main API hooks for creating and starting a
vhost device as well as how the configuration changes are handled.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <
20220321153037.
3622127-11-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Alex Bennée [Mon, 21 Mar 2022 15:30:33 +0000 (15:30 +0000)]
docs/devel: start documenting writing VirtIO devices
While writing my own VirtIO devices I've gotten confused with how
things are structured and what sort of shared infrastructure there is.
If we can document how everything is supposed to work we can then
maybe start cleaning up inconsistencies in the code.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <
20220309164929.19395-1-alex.bennee@linaro.org>
Message-Id: <
20220321153037.
3622127-10-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Alex Bennée [Mon, 21 Mar 2022 15:30:32 +0000 (15:30 +0000)]
libvhost-user: expose vu_request_to_string
This is useful for more human readable debug messages in vhost-user
programs.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220321153037.
3622127-9-alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Alex Bennée [Mon, 21 Mar 2022 15:30:31 +0000 (15:30 +0000)]
vhost-user.rst: add clarifying language about protocol negotiation
Make the language about feature negotiation explicitly clear about the
handling of the VHOST_USER_F_PROTOCOL_FEATURES feature bit. Try and
avoid the sort of bug introduced in vhost.rs REPLY_ACK processing:
https://github.com/rust-vmm/vhost/pull/24
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Jiang Liu <gerry@linux.alibaba.com>
Message-Id: <
20210226111619.21178-1-alex.bennee@linaro.org>
Message-Id: <
20220321153037.
3622127-8-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Bonzini [Mon, 21 Mar 2022 15:30:30 +0000 (15:30 +0000)]
docs: vhost-user: replace master/slave with front-end/back-end
This matches the nomenclature that is generally used. Also commonly used
is client/server, but it is not as clear because sometimes the front-end
exposes a passive (server) socket that the back-end connects to.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <
20210226143413.188046-4-pbonzini@redhat.com>
Message-Id: <
20220321153037.
3622127-7-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Bonzini [Mon, 21 Mar 2022 15:30:29 +0000 (15:30 +0000)]
docs: vhost-user: rewrite section on ring state machine
This section is using the word "back-end" to refer to the
"slave's back-end", and talking about the "client" for
what the rest of the document calls the "slave".
Rework it to free the use of the term "back-end", which in
the next patch will replace "slave".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <
20210226143413.188046-3-pbonzini@redhat.com>
Message-Id: <
20220321153037.
3622127-6-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Paolo Bonzini [Mon, 21 Mar 2022 15:30:28 +0000 (15:30 +0000)]
docs: vhost-user: clean up request/reply description
It is not necessary to mention which side is sending/receiving
each payload; it is more interesting to say which is the request
and which is the reply. This also matches what vhost-user-gpu.rst
already does.
While at it, ensure that all messages list both the request and
the reply payload.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <
20210226143413.188046-2-pbonzini@redhat.com>
Message-Id: <
20220321153037.
3622127-5-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Alex Bennée [Mon, 21 Mar 2022 15:30:27 +0000 (15:30 +0000)]
hw/virtio: add vhost_user_[read|write] trace points
These are useful when trying to debug the initial vhost-user
negotiation, especially when it hard to get logging from the low level
library on the other side.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220321153037.
3622127-4-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Alex Bennée [Mon, 21 Mar 2022 15:30:26 +0000 (15:30 +0000)]
virtio-pci: add notification trace points
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <
20200925125147.26943-6-alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220321153037.
3622127-3-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Alex Bennée [Mon, 21 Mar 2022 15:30:25 +0000 (15:30 +0000)]
hw/virtio: move virtio-pci.h into shared include space
This allows other device classes that will be exposed via PCI to be
able to do so in the appropriate hw/ directory. I resisted the
temptation to re-order headers to be more aesthetically pleasing.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20200925125147.26943-4-alex.bennee@linaro.org>
Message-Id: <
20220321153037.
3622127-2-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ilya Maximets [Fri, 18 Mar 2022 14:04:40 +0000 (15:04 +0100)]
vhost_net: Print feature masks in hex
"0x200000000" is much more readable than "
8589934592".
The change saves one step (conversion) while debugging.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Message-Id: <
20220318140440.596019-1-i.maximets@ovn.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Jason Wang [Thu, 17 Mar 2022 08:05:22 +0000 (16:05 +0800)]
intel-iommu: update iq_dw during post load
We need to update iq_dw according to the DMA_IRQ_REG during post
load. Otherwise we may get wrong IOTLB invalidation descriptor after
migration.
Fixes: fb43cf739e ("intel_iommu: scalable mode emulation")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20220317080522.14621-2-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Jason Wang [Thu, 17 Mar 2022 08:05:21 +0000 (16:05 +0800)]
intel-iommu: update root_scalable before switching as during post_load
We need check whether passthrough is enabled during
vtd_switch_address_space() by checking the context entries. This
requires the root_scalable to be set correctly otherwise we may try to
check legacy rsvd bits instead of scalable ones.
Fixing this by updating root_scalable before switching the address
spaces during post_load.
Fixes: fb43cf739e ("intel_iommu: scalable mode emulation")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20220317080522.14621-1-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Jason Wang [Thu, 10 Feb 2022 09:28:15 +0000 (17:28 +0800)]
intel-iommu: block output address in interrupt address range
According to vtd spec v3.3 3.14:
"""
Software must not program paging-structure entries to remap any
address to the interrupt address range. Untranslated requests and
translation requests that result in an address in the interrupt range
will be blocked with condition code LGN.4 or SGN.8.
"""
This patch blocks the request that result in interrupt address range.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20220210092815.45174-2-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Jason Wang [Thu, 10 Feb 2022 09:28:14 +0000 (17:28 +0800)]
intel-iommu: remove VTD_FR_RESERVED_ERR
This fault reason is not used and is duplicated with SPT.2 condition
code. So let's remove it.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20220210092815.45174-1-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
David Woodhouse [Mon, 14 Mar 2022 14:25:44 +0000 (14:25 +0000)]
intel_iommu: Fix irqchip / X2APIC configuration checks
We don't need to check kvm_enable_x2apic(). It's perfectly OK to support
interrupt remapping even if we can't address CPUs above 254. Kind of
pointless, but still functional.
The check on kvm_enable_x2apic() needs to happen *anyway* in order to
allow CPUs above 254 even without an IOMMU, so allow that to happen
elsewhere.
However, we do require the *split* irqchip in order to rewrite I/OAPIC
destinations. So fix that check while we're here.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Acked-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <
20220314142544.150555-4-dwmw2@infradead.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
David Woodhouse [Mon, 14 Mar 2022 14:25:43 +0000 (14:25 +0000)]
intel_iommu: Only allow interrupt remapping to be enabled if it's supported
We should probably check if we were meant to be exposing IR, before
letting the guest turn the IRE bit on.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20220314142544.150555-3-dwmw2@infradead.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
David Woodhouse [Mon, 14 Mar 2022 14:25:42 +0000 (14:25 +0000)]
intel_iommu: Support IR-only mode without DMA translation
By setting none of the SAGAW bits we can indicate to a guest that DMA
translation isn't supported. Tested by booting Windows 10, as well as
Linux guests with the fix at https://git.kernel.org/torvalds/c/
c40aaaac10
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20220314142544.150555-2-dwmw2@infradead.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
David Woodhouse [Mon, 14 Mar 2022 14:25:41 +0000 (14:25 +0000)]
target/i386: Fix sanity check on max APIC ID / X2APIC enablement
The check on x86ms->apic_id_limit in pc_machine_done() had two problems.
Firstly, we need KVM to support the X2APIC API in order to allow IRQ
delivery to APICs >= 255. So we need to call/check kvm_enable_x2apic(),
which was done elsewhere in *some* cases but not all.
Secondly, microvm needs the same check. So move it from pc_machine_done()
to x86_cpus_init() where it will work for both.
The check in kvm_cpu_instance_init() is now redundant and can be dropped.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Acked-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <
20220314142544.150555-1-dwmw2@infradead.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Thu, 12 May 2022 17:57:47 +0000 (19:57 +0200)]
vhost: Fix element in vhost_svq_add failure
Coverity rightly reports that is not free in that case.
Fixes: Coverity CID 1487559
Fixes: 100890f7ca ("vhost: Shadow virtqueue buffers forwarding")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <
20220512175747.142058-7-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Philippe Mathieu-Daudé [Thu, 12 May 2022 17:57:46 +0000 (19:57 +0200)]
hw/virtio: Replace g_memdup() by g_memdup2()
Per https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538
The old API took the size of the memory to duplicate as a guint,
whereas most memory functions take memory sizes as a gsize. This
made it easy to accidentally pass a gsize to g_memdup(). For large
values, that would lead to a silent truncation of the size from 64
to 32 bits, and result in a heap area being returned which is
significantly smaller than what the caller expects. This can likely
be exploited in various modules to cause a heap buffer overflow.
Replace g_memdup() by the safer g_memdup2() wrapper.
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <
20220512175747.142058-6-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Thu, 12 May 2022 17:57:45 +0000 (19:57 +0200)]
vdpa: Fix index calculus at vhost_vdpa_svqs_start
With the introduction of MQ the index of the vq needs to be calculated
with the device model vq_index.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20220512175747.142058-5-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Thu, 12 May 2022 17:57:44 +0000 (19:57 +0200)]
vdpa: Fix bad index calculus at vhost_vdpa_get_vring_base
Fixes: 6d0b222666 ("vdpa: Adapt vhost_vdpa_get_vring_base to SVQ")
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <
20220512175747.142058-4-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Thu, 12 May 2022 17:57:43 +0000 (19:57 +0200)]
vhost: Fix device's used descriptor dequeue
Only the first one of them were properly enqueued back.
Fixes: 100890f7ca ("vhost: Shadow virtqueue buffers forwarding")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <
20220512175747.142058-3-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Thu, 12 May 2022 17:57:42 +0000 (19:57 +0200)]
vhost: Track descriptor chain in private at SVQ
The device could have access to modify them, and it definitely have
access when we implement packed vq. Harden SVQ maintaining a private
copy of the descriptor chain. Other fields like buffer addresses are
already maintained sepparatedly.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <
20220512175747.142058-2-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:41:07 +0000 (15:41 +0100)]
docs/cxl: Add initial Compute eXpress Link (CXL) documentation.
Provide an introduction to the main components of a CXL system,
with detailed explanation of memory interleaving, example command
lines and kernel configuration.
This was a challenging document to write due to the need to extract
only that subset of CXL information which is relevant to either
users of QEMU emulation of CXL or to those interested in the
implementation. Much of CXL is concerned with specific elements of
the protocol, management of memory pooling etc which is simply
not relevant to what is currently planned for CXL emulation
in QEMU. All comments welcome
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20220429144110.25167-43-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:41:04 +0000 (15:41 +0100)]
qtest/cxl: Add more complex test cases with CFMWs
Add CXL Fixed Memory Windows to the CXL tests.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20220429144110.25167-40-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:41:03 +0000 (15:41 +0100)]
tests/acpi: Add tables for CXL emulation.
Tables that differ from normal Q35 tables when running the CXL test.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20220429144110.25167-39-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:41:02 +0000 (15:41 +0100)]
qtests/bios-tables-test: Add a test for CXL emulation.
The DSDT includes several CXL specific elements and the CEDT
table is only present if we enable CXL.
The test exercises all current functionality with several
CFMWS, CHBS structures in CEDT and ACPI0016/ACPI00017 and _OSC
entries in DSDT.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20220429144110.25167-38-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:41:01 +0000 (15:41 +0100)]
tests/acpi: q35: Allow addition of a CXL test.
Add exceptions for the DSDT and the new CEDT tables
specific to a new CXL test in the following patch.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20220429144110.25167-37-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:41:00 +0000 (15:41 +0100)]
i386/pc: Enable CXL fixed memory windows
Add the CFMWs memory regions to the memorymap and adjust the
PCI window to avoid hitting the same memory.
Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Message-Id: <
20220429144110.25167-36-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:59 +0000 (15:40 +0100)]
hw/cxl/component Add a dumb HDM decoder handler
Add a trivial handler for now to cover the root bridge
where we could do some error checking in future.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20220429144110.25167-35-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:40:58 +0000 (15:40 +0100)]
cxl/cxl-host: Add memops for CFMWS region.
These memops perform interleave decoding, walking down the
CXL topology from CFMWS described host interleave
decoder via CXL host bridge HDM decoders, through the CXL
root ports and finally call CXL type 3 specific read and write
functions.
Note that, whilst functional the current implementation does
not support:
* switches
* multiple HDM decoders at a given level.
* unaligned accesses across the interleave boundaries
Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Message-Id: <
20220429144110.25167-34-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:40:57 +0000 (15:40 +0100)]
mem/cxl_type3: Add read and write functions for associated hostmem.
Once a read or write reaches a CXL type 3 device, the HDM decoders
on the device are used to establish the Device Physical Address
which should be accessed. These functions peform the required maths
and then use a device specific address space to access the
hostmem->mr to fullfil the actual operation. Note that failed writes
are silent, but failed reads return poison. Note this is based
loosely on:
https://lore.kernel.org/qemu-devel/
20200817161853.593247-6-f4bug@amsat.org/
[RFC PATCH 0/9] hw/misc: Add support for interleaved memory accesses
Only lightly tested so far. More complex test cases yet to be written.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20220429144110.25167-33-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:40:56 +0000 (15:40 +0100)]
CXL/cxl_component: Add cxl_get_hb_cstate()
Accessor to get hold of the cxl state for a CXL host bridge
without exposing the internals of the implementation.
Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-32-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:40:55 +0000 (15:40 +0100)]
pci/pcie_port: Add pci_find_port_by_pn()
Simple function to search a PCIBus to find a port by
it's port number.
CXL interleave decoding uses the port number as a target
so it is necessary to locate the port when doing interleave
decoding.
Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-31-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:40:54 +0000 (15:40 +0100)]
hw/pci-host/gpex-acpi: Add support for dsdt construction for pxb-cxl
This adds code to instantiate the slightly extended ACPI root port
description in DSDT as per the CXL 2.0 specification.
Basically a cut and paste job from the i386/pc code.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-30-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:53 +0000 (15:40 +0100)]
acpi/cxl: Introduce CFMWS structures in CEDT
The CEDT CXL Fixed Window Memory Window Structures (CFMWs)
define regions of the host phyiscal address map which
(via an impdef means) are configured such that they have
a particular interleave setup across one or more CXL Host Bridges.
Reported-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-29-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:40:52 +0000 (15:40 +0100)]
hw/cxl/host: Add support for CXL Fixed Memory Windows.
The concept of these is introduced in [1] in terms of the
description the CEDT ACPI table. The principal is more general.
Unlike once traffic hits the CXL root bridges, the host system
memory address routing is implementation defined and effectively
static once observable by standard / generic system software.
Each CXL Fixed Memory Windows (CFMW) is a region of PA space
which has fixed system dependent routing configured so that
accesses can be routed to the CXL devices below a set of target
root bridges. The accesses may be interleaved across multiple
root bridges.
For QEMU we could have fully specified these regions in terms
of a base PA + size, but as the absolute address does not matter
it is simpler to let individual platforms place the memory regions.
ExampleS:
-cxl-fixed-memory-window targets.0=cxl.0,size=128G
-cxl-fixed-memory-window targets.0=cxl.1,size=128G
-cxl-fixed-memory-window targets.0=cxl0,targets.1=cxl.1,size=256G,interleave-granularity=2k
Specifies
* 2x 128G regions not interleaved across root bridges, one for each of
the root bridges with ids cxl.0 and cxl.1
* 256G region interleaved across root bridges with ids cxl.0 and cxl.1
with a 2k interleave granularity.
When system software enumerates the devices below a given root bridge
it can then decide which CFMW to use. If non interleave is desired
(or possible) it can use the appropriate CFMW for the root bridge in
question. If there are suitable devices to interleave across the
two root bridges then it may use the 3rd CFMS.
A number of other designs were considered but the following constraints
made it hard to adapt existing QEMU approaches to this particular problem.
1) The size must be known before a specific architecture / board brings
up it's PA memory map. We need to set up an appropriate region.
2) Using links to the host bridges provides a clean command line interface
but these links cannot be established until command line devices have
been added.
Hence the two step process used here of first establishing the size,
interleave-ways and granularity + caching the ids of the host bridges
and then, once available finding the actual host bridges so they can
be used later to support interleave decoding.
[1] CXL 2.0 ECN: CEDT CFMWS & QTG DSM (computeexpresslink.org / specifications)
Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com> # QAPI Schema
Message-Id: <
20220429144110.25167-28-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:40:51 +0000 (15:40 +0100)]
hw/cxl/component: Add utils for interleave parameter encoding/decoding
Both registers and the CFMWS entries in CDAT use simple encodings
for the number of interleave ways and the interleave granularity.
Introduce simple conversion functions to/from the unencoded
number / size. So far the iw decode has not been needed so is
it not implemented.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-27-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:50 +0000 (15:40 +0100)]
acpi/cxl: Create the CEDT (9.14.1)
The CXL Early Discovery Table is defined in the CXL 2.0 specification as
a way for the OS to get CXL specific information from the system
firmware.
CXL 2.0 specification adds an _HID, ACPI0016, for CXL capable host
bridges, with a _CID of PNP0A08 (PCIe host bridge). CXL aware software
is able to use this initiate the proper _OSC method, and get the _UID
which is referenced by the CEDT. Therefore the existence of an ACPI0016
device allows a CXL aware driver perform the necessary actions. For a
CXL capable OS, this works. For a CXL unaware OS, this works.
CEDT awaremess requires more. The motivation for ACPI0017 is to provide
the possibility of having a Linux CXL module that can work on a legacy
Linux kernel. Linux core PCI/ACPI which won't be built as a module,
will see the _CID of PNP0A08 and bind a driver to it. If we later loaded
a driver for ACPI0016, Linux won't be able to bind it to the hardware
because it has already bound the PNP0A08 driver. The ACPI0017 device is
an opportunity to have an object to bind a driver will be used by a
Linux driver to walk the CXL topology and do everything that we would
have preferred to do with ACPI0016.
There is another motivation for an ACPI0017 device which isn't
implemented here. An operating system needs an attach point for a
non-volatile region provider that understands cross-hostbridge
interleaving. Since QEMU emulation doesn't support interleaving yet,
this is more important on the OS side, for now.
As of CXL 2.0 spec, only 1 sub structure is defined, the CXL Host Bridge
Structure (CHBS) which is primarily useful for telling the OS exactly
where the MMIO for the host bridge is.
Link: https://lore.kernel.org/linux-cxl/20210115034911.nkgpzc756d6qmjpl@intel.com/T/#t
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-26-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:49 +0000 (15:40 +0100)]
acpi/cxl: Add _OSC implementation (9.14.2)
CXL 2.0 specification adds 2 new dwords to the existing _OSC definition
from PCIe. The new dwords are accessed with a new uuid. This
implementation supports what is in the specification.
iasl -d decodes the result of this patch as:
Name (SUPP, Zero)
Name (CTRL, Zero)
Name (SUPC, Zero)
Name (CTRC, Zero)
Method (_OSC, 4, NotSerialized) // _OSC: Operating System Capabilities
{
CreateDWordField (Arg3, Zero, CDW1)
If (((Arg0 == ToUUID ("
33db4d5b-1ff7-401c-9657-
7441c03dd766") /* PCI Host Bridge Device */) || (Arg0 == ToUUID ("
68f2d50b-c469-4d8a-bd3d-
941a103fd3fc") /* Unknown UUID */)))
{
CreateDWordField (Arg3, 0x04, CDW2)
CreateDWordField (Arg3, 0x08, CDW3)
Local0 = CDW3 /* \_SB_.PC0C._OSC.CDW3 */
Local0 &= 0x1F
If ((Arg1 != One))
{
CDW1 |= 0x08
}
If ((CDW3 != Local0))
{
CDW1 |= 0x10
}
SUPP = CDW2 /* \_SB_.PC0C._OSC.CDW2 */
CTRL = CDW3 /* \_SB_.PC0C._OSC.CDW3 */
CDW3 = Local0
If ((Arg0 == ToUUID ("
68f2d50b-c469-4d8a-bd3d-
941a103fd3fc") /* Unknown UUID */))
{
CreateDWordField (Arg3, 0x0C, CDW4)
CreateDWordField (Arg3, 0x10, CDW5)
SUPC = CDW4 /* \_SB_.PC0C._OSC.CDW4 */
CTRC = CDW5 /* \_SB_.PC0C._OSC.CDW5 */
CDW5 |= One
}
Return (Arg3)
}
Else
{
CDW1 |= 0x04
Return (Arg3)
}
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20220429144110.25167-25-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:48 +0000 (15:40 +0100)]
hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
CXL host bridges themselves may have MMIO. Since host bridges don't have
a BAR they are treated as special for MMIO. This patch includes
i386/pc support.
Also hook up the device reset now that we have have the MMIO
space in which the results are visible.
Note that we duplicate the PCI express case for the aml_build but
the implementations will diverge when the CXL specific _OSC is
introduced.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-24-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:40:47 +0000 (15:40 +0100)]
qtests/cxl: Add initial root port and CXL type3 tests
At this stage we can boot configurations with host bridges,
root ports and type 3 memory devices, so add appropriate
tests.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-23-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:46 +0000 (15:40 +0100)]
hw/cxl/device: Implement get/set Label Storage Area (LSA)
Implement get and set handlers for the Label Storage Area
used to hold data describing persistent memory configuration
so that it can be ensured it is seen in the same configuration
after reboot.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20220429144110.25167-22-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:45 +0000 (15:40 +0100)]
hw/cxl/device: Plumb real Label Storage Area (LSA) sizing
This should introduce no change. Subsequent work will make use of this
new class member.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20220429144110.25167-21-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:44 +0000 (15:40 +0100)]
hw/cxl/device: Add some trivial commands
GET_FW_INFO and GET_PARTITION_INFO, for this emulation, is equivalent to
info already returned in the IDENTIFY command. To have a more robust
implementation, add those.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20220429144110.25167-20-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:43 +0000 (15:40 +0100)]
hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
A device's volatile and persistent memory are known Host Defined Memory
(HDM) regions. The mechanism by which the device is programmed to claim
the addresses associated with those regions is through dedicated logic
known as the HDM decoder. In order to allow the OS to properly program
the HDMs, the HDM decoders must be modeled.
There are two ways the HDM decoders can be implemented, the legacy
mechanism is through the PCIe DVSEC programming from CXL 1.1 (8.1.3.8),
and MMIO is found in 8.2.5.12 of the spec. For now, 8.1.3.8 is not
implemented.
Much of CXL device logic is implemented in cxl-utils. The HDM decoder
however is implemented directly by the device implementation.
Whilst the implementation currently does no validity checks on the
encoder set up, future work will add sanity checking specific to
the type of cxl component.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-19-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:42 +0000 (15:40 +0100)]
hw/cxl/device: Add a memory device (8.2.8.5)
A CXL memory device (AKA Type 3) is a CXL component that contains some
combination of volatile and persistent memory. It also implements the
previously defined mailbox interface as well as the memory device
firmware interface.
Although the memory device is configured like a normal PCIe device, the
memory traffic is on an entirely separate bus conceptually (using the
same physical wires as PCIe, but different protocol).
Once the CXL topology is fully configure and address decoders committed,
the guest physical address for the memory device is part of a larger
window which is owned by the platform. The creation of these windows
is later in this series.
The following example will create a 256M device in a 512M window:
-object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
-device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0"
Note: Dropped PCDIMM info interfaces for now. They can be added if
appropriate at a later date.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20220429144110.25167-18-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:41 +0000 (15:40 +0100)]
hw/cxl/rp: Add a root port
This adds just enough of a root port implementation to be able to
enumerate root ports (creating the required DVSEC entries). What's not
here yet is the MMIO nor the ability to write some of the DVSEC entries.
This can be added with the qemu commandline by adding a rootport to a
specific CXL host bridge. For example:
-device cxl-rp,id=rp0,bus="cxl.0",addr=0.0,chassis=4
Like the host bridge patch, the ACPI tables aren't generated at this
point and so system software cannot use it.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-17-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:40:40 +0000 (15:40 +0100)]
qtest/cxl: Introduce initial test for pxb-cxl only.
Initial test with just pxb-cxl. Other tests will be added
alongside functionality.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-16-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:39 +0000 (15:40 +0100)]
hw/pxb: Allow creation of a CXL PXB (host bridge)
This works like adding a typical pxb device, except the name is
'pxb-cxl' instead of 'pxb-pcie'. An example command line would be as
follows:
-device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1
A CXL PXB is backward compatible with PCIe. What this means in practice
is that an operating system that is unaware of CXL should still be able
to enumerate this topology as if it were PCIe.
One can create multiple CXL PXB host bridges, but a host bridge can only
be connected to the main root bus. Host bridges cannot appear elsewhere
in the topology.
Note that as of this patch, the ACPI tables needed for the host bridge
(specifically, an ACPI object in _SB named ACPI0016 and the CEDT) aren't
created. So while this patch internally creates it, it cannot be
properly used by an operating system or other system software.
Also necessary is to add an exception to scripts/device-crash-test
similar to that for exiting pxb as both must created on a PCIexpress
host bus.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan.Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-15-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:40:38 +0000 (15:40 +0100)]
cxl: Machine level control on whether CXL support is enabled
There are going to be some potential overheads to CXL enablement,
for example the host bridge region reserved in memory maps.
Add a machine level control so that CXL is disabled by default.
Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-14-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:37 +0000 (15:40 +0100)]
hw/pci/cxl: Create a CXL bus type
The easiest way to differentiate a CXL bus, and a PCIE bus is using a
flag. A CXL bus, in hardware, is backward compatible with PCIE, and
therefore the code tries pretty hard to keep them in sync as much as
possible.
The other way to implement this would be to try to cast the bus to the
correct type. This is less code and useful for debugging via simply
looking at the flags.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-13-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:36 +0000 (15:40 +0100)]
hw/pxb: Use a type for realizing expanders
This opens up the possibility for more types of expanders (other than
PCI and PCIe). We'll need this to create a CXL expander.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-12-Jonathan.Cameron@huawei.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:35 +0000 (15:40 +0100)]
hw/cxl/device: Add log commands (8.2.9.4) + CEL
CXL specification provides for the ability to obtain logs from the
device. Logs are either spec defined, like the "Command Effects Log"
(CEL), or vendor specific. UUIDs are defined for all log types.
The CEL is a mechanism to provide information to the host about which
commands are supported. It is useful both to determine which spec'd
optional commands are supported, as well as provide a list of vendor
specified commands that might be used. The CEL is already created as
part of mailbox initialization, but here it is now exported to hosts
that use these log commands.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-11-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:34 +0000 (15:40 +0100)]
hw/cxl/device: Timestamp implementation (8.2.9.3)
Errata F4 to CXL 2.0 clarified the meaning of the timer as the
sum of the value set with the timestamp set command and the number
of nano seconds since it was last set.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-10-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:33 +0000 (15:40 +0100)]
hw/cxl/device: Add cheap EVENTS implementation (8.2.9.1)
Using the previously implemented stubbed helpers, it is now possible to
easily add the missing, required commands to the implementation.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-9-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:32 +0000 (15:40 +0100)]
hw/cxl/device: Add memory device utilities
Memory devices implement extra capabilities on top of CXL devices. This
adds support for that.
A large part of memory devices is the mailbox/command interface. All of
the mailbox handling is done in the mailbox-utils library. Longer term,
new CXL devices that are being emulated may want to handle commands
differently, and therefore would need a mechanism to opt in/out of the
specific generic handlers. As such, this is considered sufficient for
now, but may need more depth in the future.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-8-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:31 +0000 (15:40 +0100)]
hw/cxl/device: Implement basic mailbox (8.2.8.4)
This is the beginning of implementing mailbox support for CXL 2.0
devices. The implementation recognizes when the doorbell is rung,
handles the command/payload, clears the doorbell while returning error
codes and data.
Generally the mailbox mechanism is designed to permit communication
between the host OS and the firmware running on the device. For our
purposes, we emulate both the firmware, implemented primarily in
cxl-mailbox-utils.c, and the hardware.
No commands are implemented yet.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:30 +0000 (15:40 +0100)]
hw/cxl/device: Implement the CAP array (8.2.8.1-2)
This implements all device MMIO up to the first capability. That
includes the CXL Device Capabilities Array Register, as well as all of
the CXL Device Capability Header Registers. The latter are filled in as
they are implemented in the following patches.
Endianness and alignment are managed by softmmu memory core.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:29 +0000 (15:40 +0100)]
hw/cxl/device: Introduce a CXL device (8.2.8)
A CXL device is a type of CXL component. Conceptually, a CXL device
would be a leaf node in a CXL topology. From an emulation perspective,
CXL devices are the most complex and so the actual implementation is
reserved for discrete commits.
This new device type is specifically catered towards the eventual
implementation of a Type3 CXL.mem device, 8.2.8.5 in the CXL 2.0
specification.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Message-Id: <
20220429144110.25167-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 29 Apr 2022 14:40:28 +0000 (15:40 +0100)]
MAINTAINERS: Add entry for Compute Express Link Emulation
The CXL emulation will be jointly maintained by Ben Widawsky
and Jonathan Cameron. Broken out as a separate patch
to improve visibility.
Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <
20220429144110.25167-4-Jonathan.Cameron@huawei.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:27 +0000 (15:40 +0100)]
hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
A CXL 2.0 component is any entity in the CXL topology. All components
have a analogous function in PCIe. Except for the CXL host bridge, all
have a PCIe config space that is accessible via the common PCIe
mechanisms. CXL components are enumerated via DVSEC fields in the
extended PCIe header space. CXL components will minimally implement some
subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL
2.0 specification. Two headers and a utility library are introduced to
support the minimum functionality needed to enumerate components.
The cxl_pci header manages bits associated with PCI, specifically the
DVSEC and related fields. The cxl_component.h variant has data
structures and APIs that are useful for drivers implementing any of the
CXL 2.0 components. The library takes care of making use of the DVSEC
bits and the CXL.[mem|cache] registers. Per spec, the registers are
little endian.
None of the mechanisms required to enumerate a CXL capable hostbridge
are introduced at this point.
Note that the CXL.mem and CXL.cache registers used are always 4B wide.
It's possible in the future that this constraint will not hold.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Message-Id: <
20220429144110.25167-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ben Widawsky [Fri, 29 Apr 2022 14:40:26 +0000 (15:40 +0100)]
hw/pci/cxl: Add a CXL component type (interface)
A CXL component is a hardware entity that implements CXL component
registers from the CXL 2.0 spec (8.2.3). Currently these represent 3
general types.
1. Host Bridge
2. Ports (root, upstream, downstream)
3. Devices (memory, other)
A CXL component can be conceptually thought of as a PCIe device with
extra functionality when enumerated and enabled. For this reason, CXL
does here, and will continue to add on to existing PCI code paths.
Host bridges will typically need to be handled specially and so they can
implement this newly introduced interface or not. All other components
should implement this interface. Implementing this interface allows the
core PCI code to treat these devices as special where appropriate.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Message-Id: <
20220429144110.25167-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jason Wang [Fri, 1 Apr 2022 02:28:24 +0000 (10:28 +0800)]
intel-iommu: correct the value used for error_setg_errno()
error_setg_errno() expects a normal errno value, not a negated
one, so we should use ENOTSUP instead of -ENOSUP.
Fixes: Coverity CID 1487174
Fixes: ("intel_iommu: support snoop control")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20220401022824.9337-1-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Halil Pasic [Mon, 7 Mar 2022 11:29:39 +0000 (12:29 +0100)]
virtio: fix feature negotiation for ACCESS_PLATFORM
Unlike most virtio features ACCESS_PLATFORM is considered mandatory by
QEMU, i.e. the driver must accept it if offered by the device. The
virtio specification says that the driver SHOULD accept the
ACCESS_PLATFORM feature if offered, and that the device MAY fail to
operate if ACCESS_PLATFORM was offered but not negotiated.
While a SHOULD ain't exactly a MUST, we are certainly allowed to fail
the device when the driver fences ACCESS_PLATFORM. With commit
2943b53f68 ("virtio: force VIRTIO_F_IOMMU_PLATFORM") we already made the
decision to do so whenever the get_dma_as() callback is implemented (by
the bus), which in practice means for the entirety of virtio-pci.
That means, if the device needs to translate I/O addresses, then
ACCESS_PLATFORM is mandatory. The aforementioned commit tells us in the
commit message that this is for security reasons. More precisely if we
were to allow a less then trusted driver (e.g. an user-space driver, or
a nested guest) to make the device bypass the IOMMU by not negotiating
ACCESS_PLATFORM, then the guest kernel would have no ability to
control/police (by programming the IOMMU) what pieces of guest memory
the driver may manipulate using the device. Which would break security
assumptions within the guest.
If ACCESS_PLATFORM is offered not because we want the device to utilize
an IOMMU and do address translation, but because the device does not
have access to the entire guest RAM, and needs the driver to grant
access to the bits it needs access to (e.g. confidential guest support),
we still require the guest to have the corresponding logic and to accept
ACCESS_PLATFORM. If the driver does not accept ACCESS_PLATFORM, then
things are bound to go wrong, and we may see failures much less graceful
than failing the device because the driver didn't negotiate
ACCESS_PLATFORM.
So let us make ACCESS_PLATFORM mandatory for the driver regardless
of whether the get_dma_as() callback is implemented or not.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Fixes: 2943b53f68 ("virtio: force VIRTIO_F_IOMMU_PLATFORM")
Message-Id: <
20220307112939.
2780117-1-pasic@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Richard Henderson [Thu, 12 May 2022 17:52:15 +0000 (10:52 -0700)]
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* small cleanups for pc-bios/optionrom Makefiles
* checkpatch: fix g_malloc check
* fix mremap() and RDMA detection
* confine igd-passthrough-isa-bridge to Xen-enabled builds
* cover PCI in arm-virt machine qtests
* add -M boot and -M mem compound properties
* bump SLIRP submodule
* support CFI with system libslirp (>= 4.7)
* clean up CoQueue wakeup functions
* fix vhost-vsock regression
* fix --disable-vnc compilation
* other minor bugfixes
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJ8/KMUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNTTAf9Et1C8iZn+OlZi99wMEeMy8a4mIE5
# CpkBpFphhkBvt3AH7XNsCyL4Gea4QgsI7nOIEVUwvW7gPf85PiBUX8mjrIVg3x1k
# bmMEwMKSTYPmDieAnYBP9zCqZQXNYP8L8WxVs2jFY2GXZ2ZogODYFbvCY4yEEB72
# UR6uIvQRdpiB6BEj8UZ+5i+sDtb0zxqrjzUz8T/PJC9/2JSNgi+sAWWQoQT3PPU7
# R7z2nmEa1VeVLPP6mUHvJKhBltVXF+LyIjQHvo+Tp9tSqp9JwXfFBNQ5W/MFes2D
# skF47N7PdgKRH9Dp4r0j+MqBwoAq86+ao+MKsbQ1Gb91HhoCWt/MrVrVyg==
# =1E6P
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 12 May 2022 05:25:07 AM PDT
# gpg: using RSA key
F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (27 commits)
vmxcap: add tertiary execution controls
vl: make machine type deprecation a warning
meson: link libpng independent of vnc
vhost-backend: do not depend on CONFIG_VHOST_VSOCK
coroutine-lock: qemu_co_queue_restart_all is a coroutine-only qemu_co_enter_all
coroutine-lock: introduce qemu_co_queue_enter_all
coroutine-lock: qemu_co_queue_next is a coroutine-only qemu_co_enter_next
net: slirp: allow CFI with libslirp >= 4.7
net: slirp: add support for CFI-friendly timer API
net: slirp: switch to slirp_new
net: slirp: introduce a wrapper struct for QemuTimer
slirp: bump submodule past 4.7 release
machine: move more memory validation to Machine object
machine: make memory-backend a link property
machine: add mem compound property
machine: add boot compound property
machine: use QAPI struct for boot configuration
tests/qtest/libqos: Add generic pci host bridge in arm-virt machine
tests/qtest/libqos: Skip hotplug tests if pci root bus is not hotpluggable
tests/qtest/libqos/pci: Introduce pio_limit
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Thu, 12 May 2022 15:37:28 +0000 (08:37 -0700)]
Merge tag 'for-upstream' of git://repo.or.cz/qemu/kevin into staging
Block layer patches
- coroutine: Fix crashes due to too large pool batch size
- fdc: Prevent end-of-track overrun
- nbd: MULTI_CONN for shared writable exports
- iotests test runner improvements
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmJ9KCkRHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9ZtSRAAmYDFBPqxfutpFXM7kIKwL6COXJC12MOx
# Tmu8cDiGB/jNChdi3kl6I5h5njzo3U0ZlL/Ign6EzHoeoXLAPSeUWmuRsARwsZ+A
# rL61gf6yrMjAo45FZuIS0GlMDk8BauRwPl9qPWeqQcrtOMYpxwZfyFGmcMpQgAOI
# MSC1I8p3FA7oJhGpKIHDPOjaZA97Lm2rLnDIwZ4f0YgssbybFBcFCXOQbhpsVhLy
# Tjp/L+qRUtna9xBsPHQvHZW0kITQbCQPdX+oVqqUmwzSvuHqfXKe1YppyPjBt/S0
# H7nxtx4HOgP0lP5Kea+wbIRAk9Da5uaOW8hlMWRLShEKv1iTUenQSKteBB6CD03t
# GD9ze1kGoR9b6szw795BXxZxcWii0cn359lIVHeKR/U8zDuz5w3zhyl0klK8xeJy
# nj+JErLwQ7BD8kNR+7WAfXTF3tk2dQao1AvsBjn087KjMiJ/Mg8HY4K2zrjBUrHL
# DLTyAIjzct3BWJDZ02fb5jb8pHmIP3JO6m9Zvjm7ibP65BqJOwIXUTFpbgnrOg45
# oFLDV4JgC4Hh4GEtdm+UhQE51A0VVW5pDaqWTdWkCcuk3QgxUdM3Wm3SW6pw1Gvb
# T0X0j5RgF/k3YrW576R/VIy6z4YPbzAtiG4O/zSlsujHoDcVNWnxApgSB/unaDh8
# LNkFPGEMeSs=
# =JmTm
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 12 May 2022 08:30:49 AM PDT
# gpg: using RSA key
DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
* tag 'for-upstream' of git://repo.or.cz/qemu/kevin:
qemu-iotests: inline common.config into common.rc
nbd/server: Allow MULTI_CONN for shared writable exports
qemu-nbd: Pass max connections to blockdev layer
tests/qtest/fdc-test: Add a regression test for CVE-2021-3507
hw/block/fdc: Prevent end-of-track overrun (CVE-2021-3507)
.gitlab-ci.d: export meson testlog.txt as an artifact
tests/qemu-iotests: print intent to run a test in TAP mode
iotests/testrunner: Flush after run_test()
coroutine: Revert to constant batch size
coroutine: Rename qemu_coroutine_inc/dec_pool_size()
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Paolo Bonzini [Thu, 5 May 2022 09:47:23 +0000 (11:47 +0200)]
qemu-iotests: inline common.config into common.rc
common.rc has some complicated logic to find the common.config that
dates back to xfstests and is completely unnecessary now. Just include
the contents of the file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <
20220505094723.732116-1-pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Paolo Bonzini [Wed, 11 May 2022 16:39:12 +0000 (18:39 +0200)]
vmxcap: add tertiary execution controls
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 11 May 2022 17:50:43 +0000 (13:50 -0400)]
vl: make machine type deprecation a warning
error_report should generally be followed by a failure; if we can proceed
anyway, that is just a warning and should be communicated properly to
the user with warn_report.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <
20220511175043.27327-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Eric Blake [Thu, 12 May 2022 00:49:24 +0000 (19:49 -0500)]
nbd/server: Allow MULTI_CONN for shared writable exports
According to the NBD spec, a server that advertises
NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will
not see any cache inconsistencies: when properly separated by a single
flush, actions performed by one client will be visible to another
client, regardless of which client did the flush.
We always satisfy these conditions in qemu - even when we support
multiple clients, ALL clients go through a single point of reference
into the block layer, with no local caching. The effect of one client
is instantly visible to the next client. Even if our backend were a
network device, we argue that any multi-path caching effects that
would cause inconsistencies in back-to-back actions not seeing the
effect of previous actions would be a bug in that backend, and not the
fault of caching in qemu. As such, it is safe to unconditionally
advertise CAN_MULTI_CONN for any qemu NBD server situation that
supports parallel clients.
Note, however, that we don't want to advertise CAN_MULTI_CONN when we
know that a second client cannot connect (for historical reasons,
qemu-nbd defaults to a single connection while nbd-server-add and QMP
commands default to unlimited connections; but we already have
existing means to let either style of NBD server creation alter those
defaults). This is visible by no longer advertising MULTI_CONN for
'qemu-nbd -r' without -e, as in the iotest nbd-qemu-allocation.
The harder part of this patch is setting up an iotest to demonstrate
behavior of multiple NBD clients to a single server. It might be
possible with parallel qemu-io processes, but I found it easier to do
in python with the help of libnbd, and help from Nir and Vladimir in
writing the test.
Signed-off-by: Eric Blake <eblake@redhat.com>
Suggested-by: Nir Soffer <nsoffer@redhat.com>
Suggested-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
Message-Id: <
20220512004924.417153-3-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eric Blake [Thu, 12 May 2022 00:49:23 +0000 (19:49 -0500)]
qemu-nbd: Pass max connections to blockdev layer
The next patch wants to adjust whether the NBD server code advertises
MULTI_CONN based on whether it is known if the server limits to
exactly one client. For a server started by QMP, this information is
obtained through nbd_server_start (which can support more than one
export); but for qemu-nbd (which supports exactly one export), it is
controlled only by the command-line option -e/--shared. Since we
already have a hook function used by qemu-nbd, it's easiest to just
alter its signature to fit our needs.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <
20220512004924.417153-2-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Philippe Mathieu-Daudé [Thu, 18 Nov 2021 11:57:33 +0000 (12:57 +0100)]
tests/qtest/fdc-test: Add a regression test for CVE-2021-3507
Add the reproducer from https://gitlab.com/qemu-project/qemu/-/issues/339
Without the previous commit, when running 'make check-qtest-i386'
with QEMU configured with '--enable-sanitizers' we get:
==
4028352==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x619000062a00 at pc 0x5626d03c491a bp 0x7ffdb4199410 sp 0x7ffdb4198bc0
READ of size 786432 at 0x619000062a00 thread T0
#0 0x5626d03c4919 in __asan_memcpy (qemu-system-i386+0x1e65919)
#1 0x5626d1c023cc in flatview_write_continue softmmu/physmem.c:2787:13
#2 0x5626d1bf0c0f in flatview_write softmmu/physmem.c:2822:14
#3 0x5626d1bf0798 in address_space_write softmmu/physmem.c:2914:18
#4 0x5626d1bf0f37 in address_space_rw softmmu/physmem.c:2924:16
#5 0x5626d1bf14c8 in cpu_physical_memory_rw softmmu/physmem.c:2933:5
#6 0x5626d0bd5649 in cpu_physical_memory_write include/exec/cpu-common.h:82:5
#7 0x5626d0bd0a07 in i8257_dma_write_memory hw/dma/i8257.c:452:9
#8 0x5626d09f825d in fdctrl_transfer_handler hw/block/fdc.c:1616:13
#9 0x5626d0a048b4 in fdctrl_start_transfer hw/block/fdc.c:1539:13
#10 0x5626d09f4c3e in fdctrl_write_data hw/block/fdc.c:2266:13
#11 0x5626d09f22f7 in fdctrl_write hw/block/fdc.c:829:9
#12 0x5626d1c20bc5 in portio_write softmmu/ioport.c:207:17
0x619000062a00 is located 0 bytes to the right of 512-byte region [0x619000062800,0x619000062a00)
allocated by thread T0 here:
#0 0x5626d03c66ec in posix_memalign (qemu-system-i386+0x1e676ec)
#1 0x5626d2b988d4 in qemu_try_memalign util/oslib-posix.c:210:11
#2 0x5626d2b98b0c in qemu_memalign util/oslib-posix.c:226:27
#3 0x5626d09fbaf0 in fdctrl_realize_common hw/block/fdc.c:2341:20
#4 0x5626d0a150ed in isabus_fdc_realize hw/block/fdc-isa.c:113:5
#5 0x5626d2367935 in device_set_realized hw/core/qdev.c:531:13
SUMMARY: AddressSanitizer: heap-buffer-overflow (qemu-system-i386+0x1e65919) in __asan_memcpy
Shadow bytes around the buggy address:
0x0c32800044f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c3280004500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c3280004510: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c3280004520: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c3280004530: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c3280004540:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c3280004550: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c3280004560: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c3280004570: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c3280004580: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c3280004590: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Heap left redzone: fa
Freed heap region: fd
==
4028352==ABORTING
[ kwolf: Added snapshot=on to prevent write file lock failure ]
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Philippe Mathieu-Daudé [Thu, 18 Nov 2021 11:57:32 +0000 (12:57 +0100)]
hw/block/fdc: Prevent end-of-track overrun (CVE-2021-3507)
Per the 82078 datasheet, if the end-of-track (EOT byte in
the FIFO) is more than the number of sectors per side, the
command is terminated unsuccessfully:
* 5.2.5 DATA TRANSFER TERMINATION
The 82078 supports terminal count explicitly through
the TC pin and implicitly through the underrun/over-
run and end-of-track (EOT) functions. For full sector
transfers, the EOT parameter can define the last
sector to be transferred in a single or multisector
transfer. If the last sector to be transferred is a par-
tial sector, the host can stop transferring the data in
mid-sector, and the 82078 will continue to complete
the sector as if a hardware TC was received. The
only difference between these implicit functions and
TC is that they return "abnormal termination" result
status. Such status indications can be ignored if they
were expected.
* 6.1.3 READ TRACK
This command terminates when the EOT specified
number of sectors have been read. If the 82078
does not find an I D Address Mark on the diskette
after the second· occurrence of a pulse on the
INDX# pin, then it sets the IC code in Status Regis-
ter 0 to "01" (Abnormal termination), sets the MA bit
in Status Register 1 to "1", and terminates the com-
mand.
* 6.1.6 VERIFY
Refer to Table 6-6 and Table 6-7 for information
concerning the values of MT and EC versus SC and
EOT value.
* Table 6·6. Result Phase Table
* Table 6-7. Verify Command Result Phase Table
Fix by aborting the transfer when EOT > # Sectors Per Side.
Cc: qemu-stable@nongnu.org
Cc: Hervé Poussineau <hpoussin@reactos.org>
Fixes: baca51faff0 ("floppy driver: disk geometry auto detect")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/339
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <
20211118115733.
4038610-2-philmd@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kshitij Suri [Tue, 10 May 2022 16:19:32 +0000 (16:19 +0000)]
meson: link libpng independent of vnc
Currently png support is dependent on vnc for linking object file to
libpng. This commit makes the parameter independent of vnc as it breaks
system emulator with --disable-vnc unless --disable-png is added.
Fixes: 9a0a119a38 ("Added parameter to take screenshot with screendump as PNG", 2022-04-27)
Signed-off-by: Kshitij Suri <kshitij.suri@nutanix.com>
Message-Id: <
20220510161932.228481-1-kshitij.suri@nutanix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 11 May 2022 07:40:35 +0000 (09:40 +0200)]
vhost-backend: do not depend on CONFIG_VHOST_VSOCK
The vsock callbacks .vhost_vsock_set_guest_cid and
.vhost_vsock_set_running are the only ones to be conditional
on #ifdef CONFIG_VHOST_VSOCK. This is different from any other
device-dependent callbacks like .vhost_scsi_set_endpoint, and it
also broke when CONFIG_VHOST_VSOCK was changed to a per-target
symbol.
It would be possible to also use the CONFIG_DEVICES include, but
really there is no reason for most virtio files to be per-target
so just remove the #ifdef to fix the issue.
Reported-by: Dov Murik <dovmurik@linux.ibm.com>
Fixes: 9972ae314f ("build: move vhost-vsock configuration to Kconfig")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 27 Apr 2022 13:08:30 +0000 (15:08 +0200)]
coroutine-lock: qemu_co_queue_restart_all is a coroutine-only qemu_co_enter_all
qemu_co_queue_restart_all is basically the same as qemu_co_enter_all
but without a QemuLockable argument. That's perfectly fine, but only as
long as the function is marked coroutine_fn. If used outside coroutine
context, qemu_co_queue_wait will attempt to take the lock and that
is just broken: if you are calling qemu_co_queue_restart_all outside
coroutine context, the lock is going to be a QemuMutex which cannot be
taken twice by the same thread.
The patch adds the marker to qemu_co_queue_restart_all and to its sole
non-coroutine_fn caller; it then reimplements the function in terms of
qemu_co_enter_all_impl, to remove duplicated code and to clarify that the
latter also works in coroutine context.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <
20220427130830.150180-4-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 27 Apr 2022 13:08:29 +0000 (15:08 +0200)]
coroutine-lock: introduce qemu_co_queue_enter_all
Because qemu_co_queue_restart_all does not release the lock, it should
be used only in coroutine context. Introduce a new function that,
like qemu_co_enter_next, does release the lock, and use it whenever
qemu_co_queue_restart_all was used outside coroutine context.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <
20220427130830.150180-3-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 27 Apr 2022 13:08:28 +0000 (15:08 +0200)]
coroutine-lock: qemu_co_queue_next is a coroutine-only qemu_co_enter_next
qemu_co_queue_next is basically the same as qemu_co_enter_next but
without a QemuLockable argument. That's perfectly fine, but only
as long as the function is marked coroutine_fn. If used outside
coroutine context, qemu_co_queue_wait will attempt to take the lock
and that is just broken: if you are calling qemu_co_queue_next outside
coroutine context, the lock is going to be a QemuMutex which cannot be
taken twice by the same thread.
The patch adds the marker and reimplements qemu_co_queue_next in terms of
qemu_co_enter_next_impl, to remove duplicated code and to clarify that the
latter also works in coroutine context.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <
20220427130830.150180-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Mon, 11 Apr 2022 07:41:27 +0000 (09:41 +0200)]
net: slirp: allow CFI with libslirp >= 4.7
slirp 4.7 introduces a new CFI-friendly timer callback that does
not pass function pointers within libslirp as callbacks for timers.
Check the version number and, if it is new enough, allow using CFI
even with a system libslirp.
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Marc-André Lureau <malureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Mon, 11 Apr 2022 07:39:16 +0000 (09:39 +0200)]
net: slirp: add support for CFI-friendly timer API
libslirp 4.7 introduces a CFI-friendly version of the .timer_new callback.
The new callback replaces the function pointer with an enum; invoking the
callback is done with a new function slirp_handle_timer.
Support the new API so that CFI can be made compatible with using a system
libslirp.
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Marc-André Lureau <malureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Mon, 11 Apr 2022 08:16:36 +0000 (10:16 +0200)]
net: slirp: switch to slirp_new
Replace slirp_init with slirp_new, so that a more recent cfg.version
can be specified. The function appeared in version 4.1.0.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>