Richard Henderson [Wed, 24 Jul 2024 02:58:46 +0000 (12:58 +1000)]
Merge tag 'pull-vfio-
20240723-1' of https://github.com/legoater/qemu into staging
vfio queue:
* IOMMUFD Dirty Tracking support
* Fix for a possible SEGV in IOMMU type1 container
* Dropped initialization of host IOMMU device with mdev devices
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmafyVUACgkQUaNDx8/7
# 7KGebRAAzEYxvstDxSPNF+1xx937TKbRpiKYtspTfEgu4Ht50MwO2ZqnVWzTBSwa
# qcjhDf2avMBpBvkp4O9fR7nXR0HRN2KvYrBSThZ3Qpqu4KjxCAGcHI5uYmgfizYh
# BBLrw3eWME5Ry220TinQF5KFl50vGq7Z/mku5N5Tgj2qfTfCXYK1Kc19SyAga49n
# LSokTIjZAGJa4vxrE7THawaEUjFRjfCJey64JUs/TPJaGr4R1snJcWgETww6juUE
# 9OSw/xl0AoQhaN/ZTRC1qCsBLUI2MVPsC+x+vqVK62HlTjCx+uDRVQ8KzfDzjCeH
# gaLkMjxJSuJZMpm4UU7DBzDGEGcEBCGeNyFt37BSqqPPpX55CcFhj++d8vqTiwpF
# YzmTNd/znxcZTw6OJN9sQZohh+NeS86CVZ3x31HD3dXifhRf17jbh7NoIyi+0ZCb
# N+mytOH5BXsD+ddwbk+yMaxXV43Fgz7ThG5tB1tjhhNtLZHDA5ezFvGZ5F/FJrqE
# xAbjOhz5MC+RcOVNSzQJCULNqFpfE6Gqeys6btEDm/ltf4LpAe6W1HYuv8BJc19T
# UsqGK2yKAuQX8GErYxJ1zqZCttVrgpsmXFYTC5iGbxC84mvsF0Iti96IdXz9gfzN
# Vlb2OxoefcOwVqIhbkvTZW0ZwYGGDDPAYhLMfr5lSuRqj123OOo=
# =cViP
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Jul 2024 01:16:37 AM AEST
# gpg: using RSA key
A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-vfio-
20240723-1' of https://github.com/legoater/qemu:
vfio/common: Allow disabling device dirty page tracking
vfio/migration: Don't block migration device dirty tracking is unsupported
vfio/iommufd: Implement VFIOIOMMUClass::query_dirty_bitmap support
vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support
vfio/iommufd: Probe and request hwpt dirty tracking capability
vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device()
vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps
vfio/{iommufd,container}: Remove caps::aw_bits
vfio/iommufd: Introduce auto domain creation
vfio/ccw: Don't initialize HOST_IOMMU_DEVICE with mdev
vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev
vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt()
backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities
vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev
vfio/pci: Extract mdev check into an helper
hw/vfio/container: Fix SIGSEV on vfio_container_instance_finalize()
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 24 Jul 2024 01:25:40 +0000 (11:25 +1000)]
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* target/i386/kvm: support for reading RAPL MSRs using a helper program
* hpet: emulation improvements
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaelL4UHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMXoQf+K77lNlHLETSgeeP3dr7yZPOmXjjN
# qFY/18jiyLw7MK1rZC09fF+n9SoaTH8JDKupt0z9M1R10HKHLIO04f8zDE+dOxaE
# Rou3yKnlTgFPGSoPPFr1n1JJfxtYlLZRoUzaAcHUaa4W7JR/OHJX90n1Rb9MXeDk
# jV6P0v1FWtIDdM6ERm9qBGoQdYhj6Ra2T4/NZKJFXwIhKEkxgu4yO7WXv8l0dxQz
# jE4fKotqAvrkYW1EsiVZm30lw/19duhvGiYeQXoYhk8KKXXjAbJMblLITSNWsCio
# 3l6Uud/lOxekkJDAq5nH3H9hCBm0WwvwL+0vRf3Mkr+/xRGvrhtmUdp8NQ==
# =00mB
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 03:19:58 AM AEST
# gpg: using RSA key
F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
hpet: avoid timer storms on periodic timers
hpet: store full 64-bit target value of the counter
hpet: accept 64-bit reads and writes
hpet: place read-only bits directly in "new_val"
hpet: remove unnecessary variable "index"
hpet: ignore high bits of comparator in 32-bit mode
hpet: fix and cleanup persistence of interrupt status
Add support for RAPL MSRs in KVM/Qemu
tools: build qemu-vmsr-helper
qio: add support for SO_PEERCRED for socket channel
target/i386: do not crash if microvm guest uses SGX CPUID leaves
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 23 Jul 2024 23:32:04 +0000 (09:32 +1000)]
Merge tag 'for_upstream' of https://git./virt/kvm/mst/qemu into staging
virtio,pci,pc: features,fixes
pci: Initial support for SPDM Responders
cxl: Add support for scan media, feature commands, device patrol scrub
control, DDR5 ECS control, firmware updates
virtio: in-order support
virtio-net: support for SR-IOV emulation (note: known issues on s390,
might get reverted if not fixed)
smbios: memory device size is now configurable per Machine
cpu: architecture agnostic code to support vCPU Hotplug
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmae9l8PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp8fYH/impBH9nViO/WK48io4mLSkl0EUL8Y/xrMvH
# zKFCKaXq8D96VTt1Z4EGKYgwG0voBKZaCEKYU/0ARGnSlSwxINQ8ROCnBWMfn2sx
# yQt08EXVMznNLtXjc6U5zCoCi6SaV85GH40No3MUFXBQt29ZSlFqO/fuHGZHYBwS
# wuVKvTjjNF4EsGt3rS4Qsv6BwZWMM+dE6yXpKWk68kR8IGp+6QGxkMbWt9uEX2Md
# VuemKVnFYw0XGCGy5K+ZkvoA2DGpEw0QxVSOMs8CI55Oc9SkTKz5fUSzXXGo1if+
# M1CTjOPJu6pMym6gy6XpFa8/QioDA/jE2vBQvfJ64TwhJDV159s=
# =k8e9
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 10:16:31 AM AEST
# gpg: using RSA key
5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (61 commits)
hw/nvme: Add SPDM over DOE support
backends: Initial support for SPDM socket support
hw/pci: Add all Data Object Types defined in PCIe r6.0
tests/acpi: Add expected ACPI AML files for RISC-V
tests/qtest/bios-tables-test.c: Enable basic testing for RISC-V
tests/acpi: Add empty ACPI data files for RISC-V
tests/qtest/bios-tables-test.c: Remove the fall back path
tests/acpi: update expected DSDT blob for aarch64 and microvm
acpi/gpex: Create PCI link devices outside PCI root bridge
tests/acpi: Allow DSDT acpi table changes for aarch64
hw/riscv/virt-acpi-build.c: Update the HID of RISC-V UART
hw/riscv/virt-acpi-build.c: Add namespace devices for PLIC and APLIC
virtio-iommu: Add trace point on virtio_iommu_detach_endpoint_from_domain
hw/vfio/common: Add vfio_listener_region_del_iommu trace event
virtio-iommu: Remove the end point on detach
virtio-iommu: Free [host_]resv_ranges on unset_iommu_devices
virtio-iommu: Remove probe_done
Revert "virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged"
gdbstub: Add helper function to unregister GDB register space
physmem: Add helper function to destroy CPU AddressSpace
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Joao Martins [Mon, 22 Jul 2024 21:13:26 +0000 (22:13 +0100)]
vfio/common: Allow disabling device dirty page tracking
The property 'x-pre-copy-dirty-page-tracking' allows disabling the whole
tracking of VF pre-copy phase of dirty page tracking, though it means
that it will only be used at the start of the switchover phase.
Add an option that disables the VF dirty page tracking, and fall
back into container-based dirty page tracking. This also allows to
use IOMMU dirty tracking even on VFs with their own dirty
tracker scheme.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Joao Martins [Mon, 22 Jul 2024 21:13:25 +0000 (22:13 +0100)]
vfio/migration: Don't block migration device dirty tracking is unsupported
By default VFIO migration is set to auto, which will support live
migration if the migration capability is set *and* also dirty page
tracking is supported.
For testing purposes one can force enable without dirty page tracking
via enable-migration=on, but that option is generally left for testing
purposes.
So starting with IOMMU dirty tracking it can use to accommodate the lack of
VF dirty page tracking allowing us to minimize the VF requirements for
migration and thus enabling migration by default for those too.
While at it change the error messages to mention IOMMU dirty tracking as
well.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
[ clg: - spelling in commit log ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Joao Martins [Mon, 22 Jul 2024 21:13:24 +0000 (22:13 +0100)]
vfio/iommufd: Implement VFIOIOMMUClass::query_dirty_bitmap support
ioctl(iommufd, IOMMU_HWPT_GET_DIRTY_BITMAP, arg) is the UAPI
that fetches the bitmap that tells what was dirty in an IOVA
range.
A single bitmap is allocated and used across all the hwpts
sharing an IOAS which is then used in log_sync() to set Qemu
global bitmaps.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Joao Martins [Mon, 22 Jul 2024 21:13:23 +0000 (22:13 +0100)]
vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support
ioctl(iommufd, IOMMU_HWPT_SET_DIRTY_TRACKING, arg) is the UAPI that
enables or disables dirty page tracking. The ioctl is used if the hwpt
has been created with dirty tracking supported domain (stored in
hwpt::flags) and it is called on the whole list of iommu domains.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Joao Martins [Mon, 22 Jul 2024 21:13:22 +0000 (22:13 +0100)]
vfio/iommufd: Probe and request hwpt dirty tracking capability
In preparation to using the dirty tracking UAPI, probe whether the IOMMU
supports dirty tracking. This is done via the data stored in
hiod::caps::hw_caps initialized from GET_HW_INFO.
Qemu doesn't know if VF dirty tracking is supported when allocating
hardware pagetable in iommufd_cdev_autodomains_get(). This is because
VFIODevice migration state hasn't been initialized *yet* hence it can't pick
between VF dirty tracking vs IOMMU dirty tracking. So, if IOMMU supports
dirty tracking it always creates HWPTs with IOMMU_HWPT_ALLOC_DIRTY_TRACKING
even if later on VFIOMigration decides to use VF dirty tracking instead.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[ clg: - Fixed vbasedev->iommu_dirty_tracking assignment in
iommufd_cdev_autodomains_get()
- Added warning for heterogeneous dirty page tracking support
in iommufd_cdev_autodomains_get() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Joao Martins [Mon, 22 Jul 2024 21:13:21 +0000 (22:13 +0100)]
vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device()
Move the HostIOMMUDevice::realize() to be invoked during the attach of the device
before we allocate IOMMUFD hardware pagetable objects (HWPT). This allows the use
of the hw_caps obtained by IOMMU_GET_HW_INFO that essentially tell if the IOMMU
behind the device supports dirty tracking.
Note: The HostIOMMUDevice data from legacy backend is static and doesn't
need any information from the (type1-iommu) backend to be initialized.
In contrast however, the IOMMUFD HostIOMMUDevice data requires the
iommufd FD to be connected and having a devid to be able to successfully
GET_HW_INFO. This means vfio_device_hiod_realize() is called in
different places within the backend .attach_device() implementation.
Suggested-by: Cédric Le Goater <clg@redhat.cm>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
[ clg: Fixed error handling in iommufd_cdev_attach() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Joao Martins [Mon, 22 Jul 2024 21:13:20 +0000 (22:13 +0100)]
vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps
Store the value of @caps returned by iommufd_backend_get_device_info()
in a new field HostIOMMUDeviceCaps::hw_caps. Right now the only value is
whether device IOMMU supports dirty tracking (IOMMU_HW_CAP_DIRTY_TRACKING).
This is in preparation for HostIOMMUDevice::realize() being called early
during attach_device().
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Joao Martins [Mon, 22 Jul 2024 21:13:19 +0000 (22:13 +0100)]
vfio/{iommufd,container}: Remove caps::aw_bits
Remove caps::aw_bits which requires the bcontainer::iova_ranges being
initialized after device is actually attached. Instead defer that to
.get_cap() and call vfio_device_get_aw_bits() directly.
This is in preparation for HostIOMMUDevice::realize() being called early
during attach_device().
Suggested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Joao Martins [Mon, 22 Jul 2024 21:13:18 +0000 (22:13 +0100)]
vfio/iommufd: Introduce auto domain creation
There's generally two modes of operation for IOMMUFD:
1) The simple user API which intends to perform relatively simple things
with IOMMUs e.g. DPDK. The process generally creates an IOAS and attaches
to VFIO and mainly performs IOAS_MAP and UNMAP.
2) The native IOMMUFD API where you have fine grained control of the
IOMMU domain and model it accordingly. This is where most new feature
are being steered to.
For dirty tracking 2) is required, as it needs to ensure that
the stage-2/parent IOMMU domain will only attach devices
that support dirty tracking (so far it is all homogeneous in x86, likely
not the case for smmuv3). Such invariant on dirty tracking provides a
useful guarantee to VMMs that will refuse incompatible device
attachments for IOMMU domains.
Dirty tracking insurance is enforced via HWPT_ALLOC, which is
responsible for creating an IOMMU domain. This is contrast to the
'simple API' where the IOMMU domain is created by IOMMUFD automatically
when it attaches to VFIO (usually referred as autodomains) but it has
the needed handling for mdevs.
To support dirty tracking with the advanced IOMMUFD API, it needs
similar logic, where IOMMU domains are created and devices attached to
compatible domains. Essentially mimicking kernel
iommufd_device_auto_get_domain(). With mdevs given there's no IOMMU domain
it falls back to IOAS attach.
The auto domain logic allows different IOMMU domains to be created when
DMA dirty tracking is not desired (and VF can provide it), and others where
it is. Here it is not used in this way given how VFIODevice migration
state is initialized after the device attachment. But such mixed mode of
IOMMU dirty tracking + device dirty tracking is an improvement that can
be added on. Keep the 'all of nothing' of type1 approach that we have
been using so far between container vs device dirty tracking.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
[ clg: Added ERRP_GUARD() in iommufd_cdev_autodomains_get() ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Zhenzhong Duan [Mon, 22 Jul 2024 07:07:13 +0000 (15:07 +0800)]
vfio/ccw: Don't initialize HOST_IOMMU_DEVICE with mdev
mdevs aren't "physical" devices and when asking for backing IOMMU info,
it fails the entire provisioning of the guest. Fix that by setting
vbasedev->mdev true so skipping HostIOMMUDevice initialization in the
presence of mdevs.
Fixes: 930589520128 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Zhenzhong Duan [Mon, 22 Jul 2024 07:07:12 +0000 (15:07 +0800)]
vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev
mdevs aren't "physical" devices and when asking for backing IOMMU info,
it fails the entire provisioning of the guest. Fix that by setting
vbasedev->mdev true so skipping HostIOMMUDevice initialization in the
presence of mdevs.
Fixes: 930589520128 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Joao Martins [Fri, 19 Jul 2024 12:04:52 +0000 (13:04 +0100)]
vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt()
In preparation to implement auto domains have the attach function
return the errno it got during domain attach instead of a bool.
-EINVAL is tracked to track domain incompatibilities, and decide whether
to create a new IOMMU domain.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Joao Martins [Fri, 19 Jul 2024 12:04:51 +0000 (13:04 +0100)]
backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities
The helper will be able to fetch vendor agnostic IOMMU capabilities
supported both by hardware and software. Right now it is only iommu dirty
tracking.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Joao Martins [Fri, 19 Jul 2024 12:04:50 +0000 (13:04 +0100)]
vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev
mdevs aren't "physical" devices and when asking for backing IOMMU info, it
fails the entire provisioning of the guest. Fix that by skipping
HostIOMMUDevice initialization in the presence of mdevs, and skip setting
an iommu device when it is known to be an mdev.
Cc: Zhenzhong Duan <zhenzhong.duan@intel.com>
Fixes: 930589520128 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Joao Martins [Fri, 19 Jul 2024 12:04:49 +0000 (13:04 +0100)]
vfio/pci: Extract mdev check into an helper
In preparation to skip initialization of the HostIOMMUDevice for mdev,
extract the checks that validate if a device is an mdev into helpers.
A vfio_device_is_mdev() is created, and subsystems consult VFIODevice::mdev
to check if it's mdev or not.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Eric Auger [Fri, 19 Jul 2024 16:50:11 +0000 (18:50 +0200)]
hw/vfio/container: Fix SIGSEV on vfio_container_instance_finalize()
In vfio_connect_container's error path, the base container is
removed twice form the VFIOAddressSpace QLIST: first on the
listener_release_exit label and second, on free_container_exit
label, through object_unref(container), which calls
vfio_container_instance_finalize().
Let's remove the first instance.
Fixes: 938026053f4 ("vfio/container: Switch to QOM")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Richard Henderson [Tue, 23 Jul 2024 15:00:22 +0000 (01:00 +1000)]
Merge tag 'qga-pull-2024-07-23' of https://github.com/kostyanf14/qemu into staging
qga-pull-2024-07-23
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEwsLBCepDxjwUI+uE711egWG6hOcFAmafUs0ACgkQ711egWG6
# hOffwQ/+PMFMOq3jwV11Na0GnrFHT0SLlcxNWYGQjE0Q/nwuYWMTKdo2iB9rVC7T
# qxaT6PLtTZPgRsJudJ5kkvLFw88Nr6BuWl31tCVeALUO7C0oTg/oRDfYVeH4/jfG
# PS5TiM6ie27SvI5lhGZhd9sRAy8N6NGgT6Fh+pS2tVVfftcfVYKVmnzgtvk314A+
# MpeW8ukVruSW+9G+suXaE750g/drZJAoepC5pW1HXdHE+IuzXNdMWZqwMqBZSM5T
# X8VcLvMjFrFrfLOP2el6mloriw67aJyKe9Uwsp548HdXfZKrLCmaR7cZK5zKVQDK
# Rzolyuw19wNNi0TZAwmP+MBioDiIHcM4nNhVDCHIVCbXzQHa4BhAr/cr8uucyfM5
# hdCWmaTl4Tksk4q4ooHurDWshV26QNRbLRD1Vx1Rhrwz42MmU2VG13PsSWqLj00I
# fj1LzhQOmr26cewgayIL7ODwHDXiwKi+6lKS1OyTjXXubucScgxSyTNC785T6Rvk
# T58KAnBRD3vDhE7Dn/4KdRClRFY+7R2/jcHdFnA4vfvOVV8ZXp/m0O0wfLEikH6/
# dGDDVBLNG5gqV477++0wdqkYFq6MmON3PH/EA6rgZYc4At5kS+HFNASBvnFRYMGf
# dgtyj8jV5uoffqYOqyXxClP6eTgV1EZ0/wKZ8uJipivB7azjnkE=
# =xzjT
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 04:50:53 PM AEST
# gpg: using RSA key
C2C2C109EA43C63C1423EB84EF5D5E8161BA84E7
# gpg: Good signature from "Kostiantyn Kostiuk (Upstream PR sign) <kkostiuk@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: C2C2 C109 EA43 C63C 1423 EB84 EF5D 5E81 61BA 84E7
* tag 'qga-pull-2024-07-23' of https://github.com/kostyanf14/qemu: (25 commits)
qga/linux: Add new api 'guest-network-get-route'
guest-agent: document allow-rpcs in config file section
qga/commands-posix: Make ga_wait_child() return boolean
qga: centralize logic for disabling/enabling commands
qga: allow configuration file path via the cli
qga: remove pointless 'blockrpcs_key' variable
qga: move declare of QGAConfig struct to top of file
qga: don't disable fsfreeze commands if vss_init fails
qga: conditionalize schema for commands not supported on other UNIX
qga: conditionalize schema for commands requiring utmpx
qga: conditionalize schema for commands requiring libudev
qga: conditionalize schema for commands requiring fstrim
qga: conditionalize schema for commands requiring fsfreeze
qga: conditionalize schema for commands only supported on Windows
qga: conditionalize schema for commands requiring linux/win32
qga: conditionalize schema for commands requiring getifaddrs
qga: conditionalize schema for commands unsupported on non-Linux POSIX
qga: conditionalize schema for commands unsupported on Windows
qga: move CONFIG_FSFREEZE/TRIM to be meson defined options
qga: move linux memory block command impls to commands-linux.c
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Dehan Meng [Thu, 13 Jun 2024 09:28:02 +0000 (17:28 +0800)]
qga/linux: Add new api 'guest-network-get-route'
The Route information of the Linux VM needs to be used
by administrators and users when debugging network problems
and troubleshooting.
Signed-off-by: Dehan Meng <demeng@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Message-ID: <
20240613092802.346246-2-demeng@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Richard Henderson [Tue, 23 Jul 2024 05:23:05 +0000 (15:23 +1000)]
Merge tag 'ui-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging
UI-related for 9.1
# -----BEGIN PGP SIGNATURE-----
#
# iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmaeu44cHG1hcmNhbmRy
# ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5ZvNEACES6y1D4rzBtZBV/FY
# OvWHzM/2Uycma3CO2pTl8DzwucgUuVxVjrAppi+iIXza+qEHlN0e9tbmR8u3ypdV
# tu0ijRm1MWeV9EHw8fQxSIci9cgoPzJzfvrmGD9rPEJTPh44yifL3CiE97y/5SJx
# FkrmYoDeuLQ4WAgZqIhkFOZ3eX+bQ+sI49ZVm0vSIeZ2wYuWlw7JwMKq2Xb4fCsZ
# 7wJZcL7gNGHk3rsH2Sfukv5LRw64+eDwpQMkXS2scYp64xwhdd5bAqKchicBA0zh
# jBw+KszCpAW7XunQtXjiiQZco9x6auu2c+4erDyNcTfqBtSRNjArMauL2/609EVv
# 7xsLmwZvXgrbO7fRCGCnC4M5NCuisDbMeON+7tKdS8kfEMgFX0FNfM1Jp9z4Rh7T
# I/vy8mLlBIy4BNZA7jV1jyIJZeVYBYGc+ieBEeE1sK7L5RIxeoOwP1S20Xu9A9bO
# VFBohKcMt5x0HlUg0oSH8OJLbpQ8vDQDkIcDMIOQCqj+PX0erc2u9oHQ7xB1k3BB
# os83zWDTLJTJ+ZdoI2tp9FHQj56wdGJxDQNrRjFOP5KL1AoHGz+Y5fF7BvGB3jnK
# JsPV2OSkEs6Q/be6pLTiVEoUUEpqy40Kh/7NlzdbM+oHX5h0TlcIqJ16I2QsfM/N
# sRXAmzqCe00STyhxopR1BMZnjg==
# =aCj6
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 06:05:34 AM AEST
# gpg: using RSA key
87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg: issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
* tag 'ui-pull-request' of https://gitlab.com/marcandre.lureau/qemu:
chardev/char-win-stdio.c: restore old console mode
ui/vdagent: send caps on fe_open
ui/vdagent: notify clipboard peers of serial reset
ui/vdagent: improve vdagent_fe_open() trace
ui: add more tracing for dbus
Cursor: 8 -> 1 bit alpha downsampling improvement
virtio-gpu-gl: declare dependency on ui-opengl
vnc: increase max display size
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 23 Jul 2024 05:19:39 +0000 (15:19 +1000)]
Merge tag 'pull-tcg-
20240723' of https://gitlab.com/rth7680/qemu into staging
accel/tcg: Export set/clear_helper_retaddr
target/arm: Use set_helper_retaddr for dc_zva, sve and sme
target/ppc: Tidy dcbz helpers
target/ppc: Use set_helper_retaddr for dcbz
target/s390x: Use set_helper_retaddr in mem_helper.c
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmafJKIdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+FBAf7Bup+karxeGHZx2rN
# cPeF248bcCWTxBWHK7dsYze4KqzsrlNIJlPeOKErU2bbbRDZGhOp1/N95WVz+P8V
# 6Ny63WTsAYkaFWKxE6Jf0FWJlGw92btk75pTV2x/TNZixg7jg0vzVaYkk0lTYc5T
# m5e4WycYEbzYm0uodxI09i+wFvpd+7WCnl6xWtlJPWZENukvJ36Ss43egFMDtuMk
# vTJuBkS9wpwZ9MSi6EY6M+Raieg8bfaotInZeDvE/yRPNi7CwrA7Dgyc1y626uBA
# joGkYRLzhRgvT19kB3bvFZi1AXa0Pxr+j0xJqwspP239Gq5qezlS5Bv/DrHdmGHA
# jaqSwg==
# =XgUE
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 01:33:54 PM AEST
# gpg: using RSA key
7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-tcg-
20240723' of https://gitlab.com/rth7680/qemu:
target/riscv: Simplify probing in vext_ldff
target/s390x: Use set/clear_helper_retaddr in mem_helper.c
target/s390x: Use user_or_likely in access_memmove
target/s390x: Use user_or_likely in do_access_memset
target/ppc: Improve helper_dcbz for user-only
target/ppc: Merge helper_{dcbz,dcbzep}
target/ppc: Split out helper_dbczl for 970
target/ppc: Hoist dcbz_size out of dcbz_common
target/ppc/mem_helper.c: Remove a conditional from dcbz_common()
target/arm: Use set/clear_helper_retaddr in SVE and SME helpers
target/arm: Use set/clear_helper_retaddr in helper-a64.c
accel/tcg: Move {set,clear}_helper_retaddr to cpu_ldst.h
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 23 Jul 2024 03:55:45 +0000 (13:55 +1000)]
Merge tag 'nvme-next-pull-request' of https://gitlab.com/birkelund/qemu into staging
hw/nvme patches
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEUigzqnXi3OaiR2bATeGvMW1PDekFAmaeiz4ACgkQTeGvMW1P
# Dem5DggAkudAwZYUlKLz/FuxmOJsZ/CKL7iIu6wE3P93WTTbi4m2AL5lMFz1bOUH
# 33LtjHz51bDvOsnhAwLs2TwjfhICiMJCOXEmxF9zJnO4Yo8ih9UbeE7sEukpxsVr
# FJlAg5OXhdIHuo48ow7hu7BqMs58jnXhVA6zSvLU5rbKTSdG/369jyQKy5aoFPN0
# Rk+S6hqDmVMiN7u6E+QqPyB2tSbmNKkhPICu3O9fbHmaOoMFmrcvyxkd1wJ9JxwF
# 8MWbuEZlIpLIIL/mCN4wzDw8VKlJ26sBJJC1b+NHmWIWmPkqMeXwcmQtWhUqsrcs
# xAGUcjgJuJ3Fu6Xzt+09Y+FXO8v0oQ==
# =vCDb
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 02:39:26 AM AEST
# gpg: using RSA key
522833AA75E2DCE6A24766C04DE1AF316D4F0DE9
# gpg: Good signature from "Klaus Jensen <its@irrelevant.dk>" [unknown]
# gpg: aka "Klaus Jensen <k.jensen@samsung.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DDCA 4D9C 9EF9 31CC 3468 4272 63D5 6FC5 E55D A838
# Subkey fingerprint: 5228 33AA 75E2 DCE6 A247 66C0 4DE1 AF31 6D4F 0DE9
* tag 'nvme-next-pull-request' of https://gitlab.com/birkelund/qemu:
hw/nvme: remove useless type cast
hw/nvme: actually implement abort
hw/nvme: add cross namespace copy support
hw/nvme: fix memory leak in nvme_dsm
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 23 Jul 2024 02:15:11 +0000 (12:15 +1000)]
Merge tag 'pull-maintainer-9.1-rc0-230724-1' of https://gitlab.com/stsquad/qemu into staging
Maintainer updates for testing, gdbstub, semihosting, plugins
- bump python in *BSD images via libvirt-ci
- remove old unused Leon3 Avocado test
- re-factor gdb command extension
- add stoptrigger plugin to contrib
- ensure plugin mem callbacks properly sized
- reduce check-tcg noise of inline plugin test
- fix register dumping in execlog plugin
- restrict semihosting to TCG builds
- fix regex in MTE test
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmae5OcACgkQ+9DbCVqe
# KkR8cgf/eM2Sm7EG7zIQ8SbY53DS07ls6uT7Mfn4374GEmj4Cy1I+WNoLGM5vq1r
# qWAC9q2LgJVMQoWJA6Fi3SCKiylBp3/jIdJ7CWN5qj/NmePHSV3EisQXf2qOWWL9
# qOX2hJI7IIYNI2v3IvCzN/fB8F8U60iXERFHRypBH2p6Mz+EGMC3CEhesOEUta6o
# 2IMkRW8MoDv9x4B+FnNYav6CfqZjhRenu1CGgVGvWYRds2QDVNB/14kOunmBuwSs
# gPb7AhhnpobDYVxMarlJNPMbOdFjtDkYCajCNW7ffLcl+OjhoVR6cJcFpbOMv4kZ
# 8Nok8aDjUDWwUbmU0rBynca+1k8OTg==
# =TjRc
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 09:01:59 AM AEST
# gpg: using RSA key
6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
* tag 'pull-maintainer-9.1-rc0-230724-1' of https://gitlab.com/stsquad/qemu:
tests/tcg/aarch64: Fix test-mte.py
semihosting: Restrict to TCG
target/xtensa: Restrict semihosting to TCG
target/riscv: Restrict semihosting to TCG
target/mips: Restrict semihosting to TCG
target/m68k: Restrict semihosting to TCG
target/mips: Add semihosting stub
target/m68k: Add semihosting stub
semihosting: Include missing 'gdbstub/syscalls.h' header
plugins/execlog.c: correct dump of registers values
tests/plugins: use qemu_plugin_outs for inline stats
plugins: fix mem callback array size
plugins/stoptrigger: TCG plugin to stop execution under conditions
gdbstub: Re-factor gdb command extensions
tests/avocado: Remove non-working sparc leon3 test
testing: bump to latest libvirt-ci
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 10 Jul 2024 03:10:55 +0000 (20:10 -0700)]
target/riscv: Simplify probing in vext_ldff
The current pairing of tlb_vaddr_to_host with extra is either
inefficient (user-only, with page_check_range) or incorrect
(system, with probe_pages).
For proper non-fault behaviour, use probe_access_flags with
its nonfault parameter set to true.
Reviewed-by: Max Chou <max.chou@sifive.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 10 Jul 2024 01:40:58 +0000 (18:40 -0700)]
target/s390x: Use set/clear_helper_retaddr in mem_helper.c
Avoid a race condition with munmap in another thread.
For access_memset and access_memmove, manage the value
within the helper. For uses of access_{get,set}_byte,
manage the value across the for loops.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 10 Jul 2024 01:59:13 +0000 (18:59 -0700)]
target/s390x: Use user_or_likely in access_memmove
Invert the conditional, indent the block, and use the macro
that expands to true for user-only.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 10 Jul 2024 01:05:59 +0000 (18:05 -0700)]
target/s390x: Use user_or_likely in do_access_memset
Eliminate the ifdef by using a predicate that is
always true with CONFIG_USER_ONLY.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 9 Jul 2024 23:56:48 +0000 (16:56 -0700)]
target/ppc: Improve helper_dcbz for user-only
Mark the reserve_addr check unlikely. Use tlb_vaddr_to_host
instead of probe_write, relying on the memset itself to test
for page writability. Use set/clear_helper_retaddr so that
we can properly unwind on segfault.
With this, a trivial loop around guest memset will no longer
spend nearly 25% of runtime within page_get_flags.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 2 Jul 2024 03:10:53 +0000 (20:10 -0700)]
target/ppc: Merge helper_{dcbz,dcbzep}
Merge the two and pass the mmu_idx directly from translation.
Swap the argument order in dcbz_common to avoid extra swaps.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 2 Jul 2024 02:46:15 +0000 (19:46 -0700)]
target/ppc: Split out helper_dbczl for 970
We can determine at translation time whether the insn is or
is not dbczl. We must retain a runtime check against the
HID5 register, but we can move that to a separate function
that never affects other ppc models.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 2 Jul 2024 01:17:50 +0000 (18:17 -0700)]
target/ppc: Hoist dcbz_size out of dcbz_common
The 970 logic does not apply to dcbzep, which is an e500 insn.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
BALATON Zoltan [Sat, 22 Jun 2024 20:48:33 +0000 (22:48 +0200)]
target/ppc/mem_helper.c: Remove a conditional from dcbz_common()
Instead of passing a bool and select a value within dcbz_common() let
the callers pass in the right value to avoid this conditional
statement. On PPC dcbz is often used to zero memory and some code uses
it a lot. This change improves the run time of a test case that copies
memory with a dcbz call in every iteration from 6.23 to 5.83 seconds.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <
20240622204833.
5F7C74E6000@zero.eik.bme.hu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Richard Henderson [Tue, 9 Jul 2024 22:02:07 +0000 (15:02 -0700)]
target/arm: Use set/clear_helper_retaddr in SVE and SME helpers
Avoid a race condition with munmap in another thread.
Use around blocks that exclusively use "host_fn".
Keep the blocks as small as possible, but without setting
and clearing for every operation on one page.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 9 Jul 2024 21:08:24 +0000 (14:08 -0700)]
target/arm: Use set/clear_helper_retaddr in helper-a64.c
Use these in helper_dc_dva and the FEAT_MOPS routines to
avoid a race condition with munmap in another thread.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 9 Jul 2024 19:52:40 +0000 (12:52 -0700)]
accel/tcg: Move {set,clear}_helper_retaddr to cpu_ldst.h
Use of these in helpers goes hand-in-hand with tlb_vaddr_to_host
and other probing functions.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Wilfred Mallawa [Wed, 3 Jul 2024 09:20:27 +0000 (19:20 +1000)]
hw/nvme: Add SPDM over DOE support
Setup Data Object Exchange (DOE) as an extended capability for the NVME
controller and connect SPDM to it (CMA) to it.
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Klaus Jensen <k.jensen@samsung.com>
Message-Id: <
20240703092027.644758-4-alistair.francis@wdc.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Huai-Cheng Kuo [Wed, 3 Jul 2024 09:20:26 +0000 (19:20 +1000)]
backends: Initial support for SPDM socket support
SPDM enables authentication, attestation and key exchange to assist in
providing infrastructure security enablement. It's a standard published
by the DMTF [1].
SPDM supports multiple transports, including PCIe DOE and MCTP.
This patch adds support to QEMU to connect to an external SPDM
instance.
SPDM support can be added to any QEMU device by exposing a
TCP socket to a SPDM server. The server can then implement the SPDM
decoding/encoding support, generally using libspdm [2].
This is similar to how the current TPM implementation works and means
that the heavy lifting of setting up certificate chains, capabilities,
measurements and complex crypto can be done outside QEMU by a well
supported and tested library.
1: https://www.dmtf.org/standards/SPDM
2: https://github.com/DMTF/libspdm
Signed-off-by: Huai-Cheng Kuo <hchkuo@avery-design.com.tw>
Signed-off-by: Chris Browy <cbrowy@avery-design.com>
Co-developed-by: Jonathan Cameron <Jonathan.cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
[ Changes by WM
- Bug fixes from testing
]
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
[ Changes by AF:
- Convert to be more QEMU-ified
- Move to backends as it isn't PCIe specific
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <
20240703092027.644758-3-alistair.francis@wdc.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Alistair Francis [Wed, 3 Jul 2024 09:20:25 +0000 (19:20 +1000)]
hw/pci: Add all Data Object Types defined in PCIe r6.0
Add all of the defined protocols/features from the PCIe-SIG r6.0
"Table 6-32 PCI-SIG defined Data Object Types (Vendor ID = 0001h)"
table.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Message-Id: <
20240703092027.644758-2-alistair.francis@wdc.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Sunil V L [Tue, 16 Jul 2024 14:43:06 +0000 (20:13 +0530)]
tests/acpi: Add expected ACPI AML files for RISC-V
As per the step 5 in the process documented in bios-tables-test.c,
generate the expected ACPI AML data files for RISC-V using the
rebuild-expected-aml.sh script and update the
bios-tables-test-allowed-diff.h.
These are all new files being added for the first time. Hence, iASL diff
output is not added.
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716144306.
2432257-10-sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Sunil V L [Tue, 16 Jul 2024 14:43:05 +0000 (20:13 +0530)]
tests/qtest/bios-tables-test.c: Enable basic testing for RISC-V
Add basic ACPI table test case for RISC-V.
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716144306.
2432257-9-sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Sunil V L [Tue, 16 Jul 2024 14:43:04 +0000 (20:13 +0530)]
tests/acpi: Add empty ACPI data files for RISC-V
As per process documented (steps 1-3) in bios-tables-test.c, add empty
AML data files for RISC-V ACPI tables and add the entries in
bios-tables-test-allowed-diff.h.
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716144306.
2432257-8-sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Sunil V L [Tue, 16 Jul 2024 14:43:03 +0000 (20:13 +0530)]
tests/qtest/bios-tables-test.c: Remove the fall back path
The expected ACPI AML files are moved now under ${arch}/{machine} path.
Hence, there is no need to search in old path which didn't have ${arch}.
Remove the code which searches for the expected AML files under old path
as well.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716144306.
2432257-7-sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Sunil V L [Tue, 16 Jul 2024 14:43:02 +0000 (20:13 +0530)]
tests/acpi: update expected DSDT blob for aarch64 and microvm
After PCI link devices are moved out of the scope of PCI root complex,
the DSDT files of machines which use GPEX, will change. So, update the
expected AML files with these changes for these machines.
Mainly, there are 2 changes.
1) Since the link devices are created now directly under _SB for all PCI
root bridges in the system, they should have unique names. So, instead
of GSIx, named those devices as LXXY where L means link, XX will have
PCI bus number and Y will have the INTx number (ex: L000 or L001). The
_PRT entries will also be updated to reflect this name change.
2) PCI link devices are moved from the scope of each PCI root bridge to
directly under _SB.
Below is the sample iASL difference for one such link device.
Scope (\_SB)
{
Name (_HID, "LNRO0005") // _HID: Hardware ID
Name (_UID, 0x1F) // _UID: Unique ID
Name (_CCA, One) // _CCA: Cache Coherency Attribute
Name (_CRS, ResourceTemplate () // _CRS: Current Resource Settings
{
Memory32Fixed (ReadWrite,
0x0A003E00, // Address Base
0x00000200, // Address Length
)
Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, )
{
0x0000004F,
}
})
+ Device (L000)
+ {
+ Name (_HID, "PNP0C0F" /* PCI Interrupt Link Device */)
+ Name (_UID, Zero) // _UID: Unique ID
+ Name (_PRS, ResourceTemplate ()
+ {
+ Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, )
+ {
+ 0x00000023,
+ }
+ })
+ Name (_CRS, ResourceTemplate ()
+ {
+ Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, )
+ {
+ 0x00000023,
+ }
+ })
+ Method (_SRS, 1, NotSerialized) // _SRS: Set Resource Settings
+ {
+ }
+ }
+
Device (PCI0)
{
Name (_HID, "PNP0A08" /* PCI Express Bus */) // _HID: Hardware ID
Name (_CID, "PNP0A03" /* PCI Bus */) // _CID: Compatible ID
Name (_SEG, Zero) // _SEG: PCI Segment
Name (_BBN, Zero) // _BBN: BIOS Bus Number
Name (_UID, Zero) // _UID: Unique ID
Name (_STR, Unicode ("PCIe 0 Device")) // _STR: Description String
Name (_CCA, One) // _CCA: Cache Coherency Attribute
Name (_PRT, Package (0x80) // _PRT: PCI Routing Table
{
Package (0x04)
{
0xFFFF,
Zero,
- GSI0,
+ L000,
Zero
},
.....
})
Device (GSI0)
{
Name (_HID, "PNP0C0F" /* PCI Interrupt Link Device */)
Name (_UID, Zero) // _UID: Unique ID
Name (_PRS, ResourceTemplate ()
{
Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, )
{
0x00000023,
}
})
Name (_CRS, ResourceTemplate ()
{
Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, ,, )
{
0x00000023,
}
})
Method (_SRS, 1, NotSerialized) // _SRS: Set Resource Settings
{
}
}
}
}
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Message-Id: <
20240716144306.
2432257-6-sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Sunil V L [Tue, 16 Jul 2024 14:43:01 +0000 (20:13 +0530)]
acpi/gpex: Create PCI link devices outside PCI root bridge
Currently, PCI link devices (PNP0C0F) are always created within the
scope of the PCI root bridge. However, RISC-V needs these link devices
to be created outside to ensure the probing order in the OS. This
matches the example given in the ACPI specification [1] as well. Hence,
create these link devices directly under _SB instead of under the PCI
root bridge.
To keep these link device names unique for multiple PCI bridges, change
the device name from GSIx to LXXY format where XX is the PCI bus number
and Y is the INTx.
GPEX is currently used by riscv, aarch64/virt and x86/microvm machines.
So, this change will alter the DSDT for those systems.
[1] - ACPI 5.1: 6.2.13.1 Example: Using _PRT to Describe PCI IRQ Routing
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716144306.
2432257-5-sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Sunil V L [Tue, 16 Jul 2024 14:43:00 +0000 (20:13 +0530)]
tests/acpi: Allow DSDT acpi table changes for aarch64
so that CI tests don't fail when those ACPI tables are updated in the
next patch. This is as per the documentation in bios-tables-tests.c.
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716144306.
2432257-4-sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Sunil V L [Tue, 16 Jul 2024 14:42:59 +0000 (20:12 +0530)]
hw/riscv/virt-acpi-build.c: Update the HID of RISC-V UART
The requirement ACPI_060 in the RISC-V BRS specification [1], requires
NS16550 compatible UART to have the HID RSCV0003. So, update the HID for
the UART.
[1] - https://github.com/riscv-non-isa/riscv-brs/releases/download/v0.0.2/riscv-brs-spec.pdf
(Chapter 6)
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716144306.
2432257-3-sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Sunil V L [Tue, 16 Jul 2024 14:42:58 +0000 (20:12 +0530)]
hw/riscv/virt-acpi-build.c: Add namespace devices for PLIC and APLIC
As per the requirement ACPI_080 in the RISC-V Boot and Runtime Services
(BRS) specification [1], PLIC and APLIC should be in namespace as well.
So, add them using the defined HID.
[1] - https://github.com/riscv-non-isa/riscv-brs/releases/download/v0.0.2/riscv-brs-spec.pdf
(Chapter 6)
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716144306.
2432257-2-sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric Auger [Tue, 16 Jul 2024 09:45:08 +0000 (11:45 +0200)]
virtio-iommu: Add trace point on virtio_iommu_detach_endpoint_from_domain
Add a trace point on virtio_iommu_detach_endpoint_from_domain().
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <
20240716094619.
1713905-7-eric.auger@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric Auger [Tue, 16 Jul 2024 09:45:07 +0000 (11:45 +0200)]
hw/vfio/common: Add vfio_listener_region_del_iommu trace event
Trace when VFIO gets notified about the deletion of an IOMMU MR.
Also trace the name of the region in the add_iommu trace message.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <
20240716094619.
1713905-6-eric.auger@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric Auger [Tue, 16 Jul 2024 09:45:06 +0000 (11:45 +0200)]
virtio-iommu: Remove the end point on detach
We currently miss the removal of the endpoint in case of detach.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <
20240716094619.
1713905-5-eric.auger@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric Auger [Tue, 16 Jul 2024 09:45:05 +0000 (11:45 +0200)]
virtio-iommu: Free [host_]resv_ranges on unset_iommu_devices
We are currently missing the deallocation of the [host_]resv_regions
in case of hot unplug. Also to make things more simple let's rule
out the case where multiple HostIOMMUDevices would be aliased and
attached to the same IOMMUDevice. This allows to remove the handling
of conflicting Host reserved regions. Anyway this is not properly
supported at guest kernel level. On hotunplug the reserved regions
are reset to the ones set by virtio-iommu property.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <
20240716094619.
1713905-4-eric.auger@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric Auger [Tue, 16 Jul 2024 09:45:04 +0000 (11:45 +0200)]
virtio-iommu: Remove probe_done
Now we have switched to PCIIOMMUOps to convey host IOMMU information,
the host reserved regions are transmitted when the PCIe topology is
built. This happens way before the virtio-iommu driver calls the probe
request. So let's remove the probe_done flag that allowed to check
the probe was not done before the IOMMU MR got enabled. Besides this
probe_done flag had a flaw wrt migration since it was not saved/restored.
The only case at risk is if 2 devices were plugged to a
PCIe to PCI bridge and thus aliased. First of all we
discovered in the past this case was not properly supported for
neither SMMU nor virtio-iommu on guest kernel side: see
[RFC] virtio-iommu: Take into account possible aliasing in virtio_iommu_mr()
https://lore.kernel.org/all/
20230116124709.793084-1-eric.auger@redhat.com/
If this were supported by the guest kernel, it is unclear what the call
sequence would be from a virtio-iommu driver point of view.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <
20240716094619.
1713905-3-eric.auger@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric Auger [Tue, 16 Jul 2024 09:45:03 +0000 (11:45 +0200)]
Revert "virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged"
This reverts commit
1b889d6e39c32d709f1114699a014b381bcf1cb1.
There are different problems with that tentative fix:
- Some resources are left dangling (resv_regions,
host_resv_ranges) and memory subregions are left attached to
the root MR although freed as embedded in the sdev IOMMUDevice.
Finally the sdev->as is not destroyed and associated listeners
are left.
- Even when fixing the above we observe a memory corruption
associated with the deallocation of the IOMMUDevice. This can
be observed when a VFIO device is hotplugged, hot-unplugged
and a system reset is issued. At this stage we have not been
able to identify the root cause (IOMMU MR or as structs beeing
overwritten and used later on?).
- Another issue is HostIOMMUDevice are indexed by non aliased
BDF whereas the IOMMUDevice is indexed by aliased BDF - yes the
current naming is really misleading -. Given the state of the
code I don't think the virtio-iommu device works in non
singleton group case though.
So let's revert the patch for now. This means the IOMMU MR/as survive
the hotunplug. This is what is done in the intel_iommu for instance.
It does not sound very logical to keep those but currently there is
no symetric function to pci_device_iommu_address_space().
probe_done issue will be handled in a subsequent patch. Also
resv_regions and host_resv_regions will be deallocated separately.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <
20240716094619.
1713905-2-eric.auger@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Salil Mehta [Tue, 16 Jul 2024 11:15:02 +0000 (12:15 +0100)]
gdbstub: Add helper function to unregister GDB register space
Add common function to help unregister the GDB register space. This shall be
done in context to the CPU unrealization.
Note: These are common functions exported to arch specific code. For example,
for ARM this code is being referred in associated arch specific patch-set:
Link: https://lore.kernel.org/qemu-devel/20230926103654.34424-1-salil.mehta@huawei.com/
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716111502.202344-8-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Salil Mehta [Tue, 16 Jul 2024 11:15:01 +0000 (12:15 +0100)]
physmem: Add helper function to destroy CPU AddressSpace
Virtual CPU Hot-unplug leads to unrealization of a CPU object. This also
involves destruction of the CPU AddressSpace. Add common function to help
destroy the CPU AddressSpace.
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716111502.202344-7-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Salil Mehta [Tue, 16 Jul 2024 11:15:00 +0000 (12:15 +0100)]
hw/acpi: Update CPUs AML with cpu-(ctrl)dev change
CPUs Control device(\\_SB.PCI0) register interface for the x86 arch is IO port
based and existing CPUs AML code assumes _CRS objects would evaluate to a system
resource which describes IO Port address. But on ARM arch CPUs control
device(\\_SB.PRES) register interface is memory-mapped hence _CRS object should
evaluate to system resource which describes memory-mapped base address. Update
build CPUs AML function to accept both IO/MEMORY region spaces and accordingly
update the _CRS object.
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716111502.202344-6-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Salil Mehta [Tue, 16 Jul 2024 11:14:59 +0000 (12:14 +0100)]
hw/acpi: Update GED _EVT method AML with CPU scan
OSPM evaluates _EVT method to map the event. The CPU hotplug event eventually
results in start of the CPU scan. Scan figures out the CPU and the kind of
event(plug/unplug) and notifies it back to the guest. Update the GED AML _EVT
method with the call to method \\_SB.CPUS.CSCN (via \\_SB.GED.CSCN)
Architecture specific code [1] might initialize its CPUs AML code by calling
common function build_cpus_aml() like below for ARM:
build_cpus_aml(scope, ms, opts, xx_madt_cpu_entry, memmap[VIRT_CPUHP_ACPI].base,
"\\_SB", "\\_SB.GED.CSCN", AML_SYSTEM_MEMORY);
[1] https://lore.kernel.org/qemu-devel/
20240613233639.202896-13-salil.mehta@huawei.com/
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716111502.202344-5-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Salil Mehta [Tue, 16 Jul 2024 11:14:58 +0000 (12:14 +0100)]
hw/acpi: Update ACPI GED framework to support vCPU Hotplug
ACPI GED (as described in the ACPI 6.4 spec) uses an interrupt listed in the
_CRS object of GED to intimate OSPM about an event. Later then demultiplexes the
notified event by evaluating ACPI _EVT method to know the type of event. Use
ACPI GED to also notify the guest kernel about any CPU hot(un)plug events.
Note, GED interface is used by many hotplug events like memory hotplug, NVDIMM
hotplug and non-hotplug events like system power down event. Each of these can
be selected using a bit in the 32 bit GED IO interface. A bit has been reserved
for the CPU hotplug event.
ACPI CPU hotplug related initialization should only happen if ACPI_CPU_HOTPLUG
support has been enabled for particular architecture. Add cpu_hotplug_hw_init()
stub to avoid compilation break.
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <
20240716111502.202344-4-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Salil Mehta [Tue, 16 Jul 2024 11:14:57 +0000 (12:14 +0100)]
hw/acpi: Move CPU ctrl-dev MMIO region len macro to common header file
CPU ctrl-dev MMIO region length could be used in ACPI GED and various other
architecture specific places. Move ACPI_CPU_HOTPLUG_REG_LEN macro to more
appropriate common header file.
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716111502.202344-3-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Salil Mehta [Tue, 16 Jul 2024 11:14:56 +0000 (12:14 +0100)]
accel/kvm: Extract common KVM vCPU {creation,parking} code
KVM vCPU creation is done once during the vCPU realization when Qemu vCPU thread
is spawned. This is common to all the architectures as of now.
Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the
corresponding KVM vCPU object in the Host KVM is not destroyed as KVM doesn't
support vCPU removal. Therefore, its representative KVM vCPU object/context in
Qemu is parked.
Refactor architecture common logic so that some APIs could be reused by vCPU
Hotplug code of some architectures likes ARM, Loongson etc. Update new/old APIs
with trace events. New APIs qemu_{create,park,unpark}_vcpu() can be externally
called. No functional change is intended here.
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240716111502.202344-2-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Mon, 15 Jul 2024 12:24:17 +0000 (14:24 +0200)]
smbios: make memory device size configurable per Machine
Currently QEMU describes initial[1] RAM* in SMBIOS as a series of
virtual DIMMs (capped at 16Gb max) using type 17 structure entries.
Which is fine for the most cases. However when starting guest
with terabytes of RAM this leads to too many memory device
structures, which eventually upsets linux kernel as it reserves
only 64K for these entries and when that border is crossed out
it runs out of reserved memory.
Instead of partitioning initial RAM on 16Gb DIMMs, use maximum
possible chunk size that SMBIOS spec allows[2]. Which lets
encode RAM in lower 31 bits of 32bit field (which amounts upto
2047Tb per DIMM).
As result initial RAM will generate only one type 17 structure
until host/guest reach ability to use more RAM in the future.
Compat changes:
We can't unconditionally change chunk size as it will break
QEMU<->guest ABI (and migration). Thus introduce a new machine
class field that would let older versioned machines to use
legacy 16Gb chunks, while new(er) machine type[s] use maximum
possible chunk size.
PS:
While it might seem to be risky to rise max entry size this large
(much beyond of what current physical RAM modules support),
I'd not expect it causing much issues, modulo uncovering bugs
in software running within guest. And those should be fixed
on guest side to handle SMBIOS spec properly, especially if
guest is expected to support so huge RAM configs.
In worst case, QEMU can reduce chunk size later if we would
care enough about introducing a workaround for some 'unfixable'
guest OS, either by fixing up the next machine type or
giving users a CLI option to customize it.
1) Initial RAM - is RAM configured with help '-m SIZE' CLI option/
implicitly defined by machine. It doesn't include memory
configured with help of '-device' option[s] (pcdimm,nvdimm,...)
2) SMBIOS 3.1.0 7.18.5 Memory Device — Extended Size
PS:
* tested on 8Tb host with RHEL6 guest, which seems to parse
type 17 SMBIOS table entries correctly (according to 'dmidecode').
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <
20240715122417.
4059293-1-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Akihiko Odaki [Mon, 15 Jul 2024 05:19:14 +0000 (14:19 +0900)]
docs: Document composable SR-IOV device
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <
20240715-sriov-v5-8-
3f5539093ffc@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Akihiko Odaki [Mon, 15 Jul 2024 05:19:13 +0000 (14:19 +0900)]
virtio-net: Implement SR-IOV VF
A virtio-net device can be added as a SR-IOV VF to another virtio-pci
device that will be the PF.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <
20240715-sriov-v5-7-
3f5539093ffc@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Akihiko Odaki [Mon, 15 Jul 2024 05:19:12 +0000 (14:19 +0900)]
virtio-pci: Implement SR-IOV PF
Allow user to attach SR-IOV VF to a virtio-pci PF.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <
20240715-sriov-v5-6-
3f5539093ffc@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Akihiko Odaki [Mon, 15 Jul 2024 05:19:11 +0000 (14:19 +0900)]
pcie_sriov: Allow user to create SR-IOV device
A user can create a SR-IOV device by specifying the PF with the
sriov-pf property of the VFs. The VFs must be added before the PF.
A user-creatable VF must have PCIDeviceClass::sriov_vf_user_creatable
set. Such a VF cannot refer to the PF because it is created before the
PF.
A PF that user-creatable VFs can be attached calls
pcie_sriov_pf_init_from_user_created_vfs() during realization and
pcie_sriov_pf_exit() when exiting.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <
20240715-sriov-v5-5-
3f5539093ffc@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Akihiko Odaki [Mon, 15 Jul 2024 05:19:10 +0000 (14:19 +0900)]
pcie_sriov: Check PCI Express for SR-IOV PF
SR-IOV requires PCI Express.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <
20240715-sriov-v5-4-
3f5539093ffc@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Akihiko Odaki [Mon, 15 Jul 2024 05:19:09 +0000 (14:19 +0900)]
pcie_sriov: Ensure PF and VF are mutually exclusive
A device cannot be a SR-IOV PF and a VF at the same time.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <
20240715-sriov-v5-3-
3f5539093ffc@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Akihiko Odaki [Mon, 15 Jul 2024 05:19:08 +0000 (14:19 +0900)]
hw/pci: Fix SR-IOV VF number calculation
pci_config_get_bar_addr() had a division by vf_stride. vf_stride needs
to be non-zero when there are multiple VFs, but the specification does
not prohibit to make it zero when there is only one VF.
Do not perform the division for the first VF to avoid division by zero.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <
20240715-sriov-v5-2-
3f5539093ffc@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Richard Henderson [Mon, 22 Jul 2024 22:31:21 +0000 (08:31 +1000)]
Merge tag 'pull-request-2024-07-22' of https://gitlab.com/thuth/qemu into staging
* Minor clean-ups and fixes for the qtests and Avocado tests
* Fix crash that happens when introspecting scsi-block on older machine types
* s390x: filter deprecated properties based on model expansion type
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmaeSUMRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVdQw/8DvGymXKwpS0F2aSHg3AZvjSCpkv3Y+fK
# myQrzh30cv9Vhe/Y9do47HpfJ6Ug9SK6xG64K2o+BIW+G3+ZUwSHk24PoiALsrJf
# 9qqya1upBJkEC5B4PhqRPS3GlbvBnKKEk8W6BMpUa2BToFV9MsG256cBVhUrRpGc
# 6u80DgTNxCI1czsNkWVGJAt1oVLYYJIjz7UZ4VbZCH48o6r0iSUV6C01wccOFmNy
# IXbspyyUftWFh9lO0i8PiYlXG2YEAmFry3gqD5vc+6BsFT4lMeoRFFxbVCddGKFc
# iNwlH4ayjeISlEJeClImIdbHyZ+sDhPyy5x4cpQqmZudEPn+GVnZ0arm7OvXW/k8
# Yog4n7/cUz7GHnWbqYIFZMS1g1wmqm/9VPsVTzXAlTva4dTTs2p0tKAADHIAtPCI
# jxSPpbuCuukDzUZGsNZyRGbex6g4B0tP4TMHRFxo5LVy9dKn2BLOHBWuzPevD9OO
# FphZHUuGngcPi4GSFmlv7aCS0pqyWsCO+5EqoYUgO8yadyfiXN9pwjB6OnBZux0U
# kbJOkkBJwEalhsiHmPFMnS8rkWa4Ye4ZJjj8XHRiecxSZOcNOcxyE+l2x8CV2aFB
# UBR83nm86vXXpu86Yod3E+txDEUzKN5+B8X0q7Se0YvsWbB+1Dq/Co0Bdh/Wp70E
# EPk5eqaSp8k=
# =zB5F
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 22 Jul 2024 09:57:55 PM AEST
# gpg: using RSA key
27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
* tag 'pull-request-2024-07-22' of https://gitlab.com/thuth/qemu:
target/s390x: filter deprecated properties based on model expansion type
tests: increase timeout per instance of bios-tables-test
qtest/fuzz: make range overlap check more readable
hw: Fix crash that happens when introspecting scsi-block on older machine types
tests/avocado/machine_aspeed.py: Increase timeout for TPM test
tests/avocado: Remove the remainders of the virtiofs_submounts test
tests/avocado/mem-addr-space-check: Remove unused "import signal"
tests/avocado: Move LinuxTest related code into a separate file
tests/avocado: Allow overwriting AVOCADO_SHOW env variable
tests/avocado/boot_xen.py: use class attribute
tests/avocado/boot_xen.py: unify tags
tests/avocado/boot_xen.py: merge base classes
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
songziming [Mon, 22 Jul 2024 09:52:55 +0000 (17:52 +0800)]
chardev/char-win-stdio.c: restore old console mode
If I use `-serial stdio` on Windows, after QEMU exits, the terminal
could not handle arrow keys and tab any more. Because stdio backend
on Windows sets console mode to virtual terminal input when starts,
but does not restore the old mode when finalize.
This small patch saves the old console mode and set it back.
Signed-off-by: Ziming Song <s.ziming@hotmail.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <ME3P282MB25488BE7C39BF0C35CD0DA5D8CA82@ME3P282MB2548.AUSP282.PROD.OUTLOOK.COM>
Paolo Bonzini [Tue, 16 Jul 2024 10:35:39 +0000 (12:35 +0200)]
hpet: avoid timer storms on periodic timers
If the period is set to a value that is too low, there could be no
time left to run the rest of QEMU. Do not trigger interrupts faster
than 1 MHz.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 16 Jul 2024 10:26:36 +0000 (12:26 +0200)]
hpet: store full 64-bit target value of the counter
Store the full 64-bit value at which the timer should fire.
This makes it possible to skip the imprecise hpet_calculate_diff()
step, and to remove the clamping of the period to 31 or 63 bits.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 10 Jul 2024 07:58:01 +0000 (09:58 +0200)]
hpet: accept 64-bit reads and writes
Declare the MemoryRegionOps so that 64-bit reads and writes to the HPET
are received directly. This makes it possible to unify the code to
process low and high parts: for 32-bit reads, extract the desired word;
for 32-bit writes, just merge the desired part into the old value and
proceed as with a 64-bit write.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 10 Jul 2024 08:55:13 +0000 (10:55 +0200)]
hpet: place read-only bits directly in "new_val"
The variable "val" is used for two different purposes. As an intermediate
value when writing configuration registers, and to store the cleared bits
when writing ISR.
Use "new_val" for the former, and rename the variable so that it is clearer
for the latter case.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 10 Jul 2024 08:31:48 +0000 (10:31 +0200)]
hpet: remove unnecessary variable "index"
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 10 Jul 2024 08:53:05 +0000 (10:53 +0200)]
hpet: ignore high bits of comparator in 32-bit mode
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 16 Jul 2024 09:36:44 +0000 (11:36 +0200)]
hpet: fix and cleanup persistence of interrupt status
There are several bugs in the handling of the ISR register:
- switching level->edge was not lowering the interrupt and
clearing ISR
- switching on the enable bit was not raising a level-triggered
interrupt if the timer had fired
- the timer must be kept running even if not enabled, in
order to set the ISR flag, so writes to HPET_TN_CFG must
not call hpet_del_timer()
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Anthony Harivel [Wed, 22 May 2024 15:34:52 +0000 (17:34 +0200)]
Add support for RAPL MSRs in KVM/Qemu
Starting with the "Sandy Bridge" generation, Intel CPUs provide a RAPL
interface (Running Average Power Limit) for advertising the accumulated
energy consumption of various power domains (e.g. CPU packages, DRAM,
etc.).
The consumption is reported via MSRs (model specific registers) like
MSR_PKG_ENERGY_STATUS for the CPU package power domain. These MSRs are
64 bits registers that represent the accumulated energy consumption in
micro Joules. They are updated by microcode every ~1ms.
For now, KVM always returns 0 when the guest requests the value of
these MSRs. Use the KVM MSR filtering mechanism to allow QEMU handle
these MSRs dynamically in userspace.
To limit the amount of system calls for every MSR call, create a new
thread in QEMU that updates the "virtual" MSR values asynchronously.
Each vCPU has its own vMSR to reflect the independence of vCPUs. The
thread updates the vMSR values with the ratio of energy consumed of
the whole physical CPU package the vCPU thread runs on and the
thread's utime and stime values.
All other non-vCPU threads are also taken into account. Their energy
consumption is evenly distributed among all vCPUs threads running on
the same physical CPU package.
To overcome the problem that reading the RAPL MSR requires priviliged
access, a socket communication between QEMU and the qemu-vmsr-helper is
mandatory. You can specified the socket path in the parameter.
This feature is activated with -accel kvm,rapl=true,path=/path/sock.sock
Actual limitation:
- Works only on Intel host CPU because AMD CPUs are using different MSR
adresses.
- Only the Package Power-Plane (MSR_PKG_ENERGY_STATUS) is reported at
the moment.
Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240522153453.1230389-4-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Yao Xingtao [Mon, 22 Jul 2024 09:17:28 +0000 (05:17 -0400)]
hw/nvme: remove useless type cast
The type of req->cmd is NvmeCmd, cast the pointer of this type to
NvmeCmd* is useless.
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Ayush Mishra [Tue, 2 Jul 2024 08:02:32 +0000 (13:32 +0530)]
hw/nvme: actually implement abort
Abort was not implemented previously, but we can implement it for AERs
and asynchrnously for I/O.
Signed-off-by: Ayush Mishra <ayush.m55@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Arun Kumar [Mon, 1 Jul 2024 07:08:55 +0000 (12:38 +0530)]
hw/nvme: add cross namespace copy support
Extend copy command to copy user data across different namespaces via
support for specifying a namespace for each source range
Signed-off-by: Arun Kumar <arun.kka@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Collin Walling [Fri, 19 Jul 2024 18:17:41 +0000 (14:17 -0400)]
target/s390x: filter deprecated properties based on model expansion type
Currently, there is no way to execute the query-cpu-model-expansion
command to retrieve a comprehenisve list of deprecated properties, as
the result is dependent per-model. To enable this, the expansion output
is modified as such:
When reporting a "full" CPU model, show the *entire* list of deprecated
properties regardless if they are supported on the model. A full
expansion outputs all known CPU model properties anyway, so it makes
sense to report all deprecated properties here too.
This allows management apps to query a single model (e.g. host) to
acquire the full list of deprecated properties.
Additionally, when reporting a "static" CPU model, the command will
only show deprecated properties that are a subset of the model's
*enabled* properties. This is more accurate than how the query was
handled before, which blindly reported deprecated properties that
were never otherwise introduced for certain models.
Acked-by: David Hildenbrand <david@redhat.com>
Suggested-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Message-ID: <
20240719181741.35146-1-walling@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Igor Mammedov [Tue, 16 Jul 2024 12:59:30 +0000 (14:59 +0200)]
tests: increase timeout per instance of bios-tables-test
CI often fails 'cross-i686-tci' job due to runner slowness
Log shows that test almost complete, with a few remaining
when bios-tables-test timeout hits:
19/270 qemu:qtest+qtest-aarch64 / qtest-aarch64/bios-tables-test
TIMEOUT 610.02s killed by signal 15 SIGTERM
...
stderr:
TAP parsing error: Too few tests run (expected 8, got 7)
At the same time overall job running time is only ~30 out of 1hr allowed.
Increase bios-tables-test instance timeout on 5min as a fix
for slow CI runners.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-ID: <
20240716125930.620861-1-imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Yao Xingtao [Mon, 22 Jul 2024 04:07:36 +0000 (00:07 -0400)]
qtest/fuzz: make range overlap check more readable
use ranges_overlap() instead of open-coding the overlap check to improve
the readability of the code.
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Message-ID: <
20240722040742.11513-8-yaoxt.fnst@fujitsu.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Thomas Huth [Wed, 3 Jul 2024 09:09:04 +0000 (11:09 +0200)]
hw: Fix crash that happens when introspecting scsi-block on older machine types
"make check SPEED=slow" is currently failing the device-introspect-test on
older machine types since introspecting "scsi-block" is causing an abort:
$ ./qemu-system-x86_64 -M pc-q35-8.0 -monitor stdio
QEMU 9.0.50 monitor - type 'help' for more information
(qemu) device_add scsi-block,help
Unexpected error in object_property_find_err() at
../../devel/qemu/qom/object.c:1357:
can't apply global scsi-disk-base.migrate-emulated-scsi-request=false:
Property 'scsi-block.migrate-emulated-scsi-request' not found
Aborted (core dumped)
The problem is that the compat code tries to change the
"migrate-emulated-scsi-request" property for all devices that are
derived from "scsi-block", but the property has only been added
to "scsi-hd" and "scsi-cd" via the DEFINE_SCSI_DISK_PROPERTIES macro.
Thus let's fix the problem by only changing the property on the devices
that really have this property.
Fixes: b4912afa5f ("scsi-disk: Fix crash for VM configured with USB CDROM after live migration")
Message-ID: <
20240703090904.909720-1-thuth@redhat.com>
Acked-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Cédric Le Goater [Mon, 22 Jul 2024 08:55:47 +0000 (10:55 +0200)]
tests/avocado/machine_aspeed.py: Increase timeout for TPM test
On some runners, test_arm_ast2600_evb_buildroot_tpm can take longer
than 90s to complete. Increase timeout for these.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <
20240722085547.90650-1-clg@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Thomas Huth [Thu, 18 Jul 2024 17:31:25 +0000 (19:31 +0200)]
tests/avocado: Remove the remainders of the virtiofs_submounts test
The virtiofs_submounts test has been removed in commit
5da7701e2a
("virtiofsd: Remove test"), so we don't need this files anymore.
Message-ID: <
20240718173125.489901-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Thomas Huth [Fri, 19 Jul 2024 09:54:08 +0000 (11:54 +0200)]
tests/avocado/mem-addr-space-check: Remove unused "import signal"
The "signal" module is not used here, so we can remove this import
statement.
Message-ID: <
20240719095408.33298-1-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Thomas Huth [Fri, 19 Jul 2024 09:50:31 +0000 (11:50 +0200)]
tests/avocado: Move LinuxTest related code into a separate file
Only some few tests are using the LinuxTest class. Move the related
code into a separate file so that this does not pollute the main
namespace.
Message-ID: <
20240719095031.32814-1-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Philippe Mathieu-Daudé [Fri, 19 Jul 2024 18:02:11 +0000 (20:02 +0200)]
tests/avocado: Allow overwriting AVOCADO_SHOW env variable
The 'app' level logging is useful, but sometimes we want
more, for example QEMU leverages the 'console' logging.
Allow overwriting AVOCADO_SHOW from environment, i.e.:
$ make check-avocado AVOCADO_SHOW='app,console'
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <
20240719180211.48073-1-philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Anthony Harivel [Wed, 22 May 2024 15:34:51 +0000 (17:34 +0200)]
tools: build qemu-vmsr-helper
Introduce a privileged helper to access RAPL MSR.
The privileged helper tool, qemu-vmsr-helper, is designed to provide
virtual machines with the ability to read specific RAPL (Running Average
Power Limit) MSRs without requiring CAP_SYS_RAWIO privileges or relying
on external, out-of-tree patches.
The helper tool leverages Unix permissions and SO_PEERCRED socket
options to enforce access control, ensuring that only processes
explicitly requesting read access via readmsr() from a valid Thread ID
can access these MSRs.
The list of RAPL MSRs that are allowed to be read by the helper tool is
defined in rapl-msr-index.h. This list corresponds to the RAPL MSRs that
will be supported in the next commit titled "Add support for RAPL MSRs
in KVM/QEMU."
The tool is intentionally designed to run on the Linux x86 platform.
This initial implementation is tailored for Intel CPUs but can be
extended to support AMD CPUs in the future.
Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240522153453.1230389-3-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Anthony Harivel [Wed, 22 May 2024 15:34:50 +0000 (17:34 +0200)]
qio: add support for SO_PEERCRED for socket channel
The function qio_channel_get_peercred() returns a pointer to the
credentials of the peer process connected to this socket.
This credentials structure is defined in <sys/socket.h> as follows:
struct ucred {
pid_t pid; /* Process ID of the sending process */
uid_t uid; /* User ID of the sending process */
gid_t gid; /* Group ID of the sending process */
};
The use of this function is possible only for connected AF_UNIX stream
sockets and for AF_UNIX stream and datagram socket pairs.
On platform other than Linux, the function return 0.
Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240522153453.1230389-2-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 16 Jul 2024 16:53:11 +0000 (18:53 +0200)]
target/i386: do not crash if microvm guest uses SGX CPUID leaves
sgx_epc_get_section assumes a PC platform is in use:
bool sgx_epc_get_section(int section_nr, uint64_t *addr, uint64_t *size)
{
PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
However, sgx_epc_get_section is called by CPUID regardless of whether
SGX state has been initialized or which platform is in use. Check
whether the machine has the right QOM class and if not behave as if
there are no EPC sections.
Fixes: 1dec2e1f19f ("i386: Update SGX CPUID info according to hardware/KVM/user input", 2021-09-30)
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2142
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zheyu Ma [Tue, 2 Jul 2024 23:13:03 +0000 (01:13 +0200)]
hw/nvme: fix memory leak in nvme_dsm
The allocated memory to hold LBA ranges leaks in the nvme_dsm function. This
happens because the allocated memory for iocb->range is not freed in all
error handling paths.
Fix this by adding a free to ensure that the allocated memory is properly freed.
ASAN log:
==
3075137==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 480 byte(s) in 6 object(s) allocated from:
#0 0x55f1f8a0eddd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3
#1 0x7f531e0f6738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738)
#2 0x55f1faf1f091 in blk_aio_get block/block-backend.c:2583:12
#3 0x55f1f945c74b in nvme_dsm hw/nvme/ctrl.c:2609:30
#4 0x55f1f945831b in nvme_io_cmd hw/nvme/ctrl.c:4470:16
#5 0x55f1f94561b7 in nvme_process_sq hw/nvme/ctrl.c:7039:29
Cc: qemu-stable@nongnu.org
Fixes: d7d1474fd85d ("hw/nvme: reimplement dsm to allow cancellation")
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Marc-André Lureau [Wed, 17 Jul 2024 17:15:40 +0000 (21:15 +0400)]
ui/vdagent: send caps on fe_open
The spice-vdagentd doesn't send capabilities again on host/client
disconnect (but when the session agent connects and sends a
GUEST_XORG_RESOLUTION message)
When the dbus client disconnects, vdagent_disconnect() is called to
reset the agent state. Capabilities must be negotiated again on
reconnection.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <
20240717171541.201525-5-marcandre.lureau@redhat.com>
Marc-André Lureau [Wed, 17 Jul 2024 17:15:39 +0000 (21:15 +0400)]
ui/vdagent: notify clipboard peers of serial reset
Since we reset the serial counters, peers should also be reset to be sync.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <
20240717171541.201525-4-marcandre.lureau@redhat.com>
Marc-André Lureau [Wed, 17 Jul 2024 17:15:38 +0000 (21:15 +0400)]
ui/vdagent: improve vdagent_fe_open() trace
Place the trace when the function enters, with arg value.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <
20240717171541.201525-3-marcandre.lureau@redhat.com>
Marc-André Lureau [Wed, 17 Jul 2024 17:15:37 +0000 (21:15 +0400)]
ui: add more tracing for dbus
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <
20240717171541.201525-2-marcandre.lureau@redhat.com>