Philippe Mathieu-Daudé [Wed, 8 Jan 2025 12:10:53 +0000 (12:10 +0000)]
dockerfiles: Remove 'MAINTAINER' entry in debian-tricore-cross.docker
AMSAT closed its email service [*] so my personal email
address is now defunct. Remove it to avoid bouncing emails.
[*] https://web.archive.org/web/20240617194936/https://forum.amsat-dl.org/index.php?thread/4581-amsat-mail-alias-service-to-end-august-1-2024/
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250102152513.61065-1-philmd@linaro.org>
[AJB: update URL to web.archive.org]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-32-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:52 +0000 (12:10 +0000)]
pc-bios: ensure keymaps dependencies set vnc tests
I was seeing failures on vnc-display-test on FreeBSD:
make vm-build-freebsd V=1 TARGET_LIST=aarch64-softmmu BUILD_TARGET=check-qtest QEMU_LOCAL=1 DEBUG=1
Leads to:
qemu-system-aarch64: -vnc none: could not read keymap file: 'en-us'
Broken pipe
../src/tests/qtest/libqtest.c:196: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)
which, as far as I could tell, was because we don't populate
$BLD/pc-bios/keymaps (although scripts/symlink-install-tree.py
attempts to symlink qemu-bundle/usr/local/share/qemu/keymaps/ to that
dir).
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-31-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:51 +0000 (12:10 +0000)]
tests/vm: allow interactive login as root
This is useful when debugging and you want to add packages to an
image.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-30-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:50 +0000 (12:10 +0000)]
tests/vm: partially un-tabify help output
While the make syntax itself uses tabs, having a mixture of tabs and
spaces in the vm-help output makes no sense and confuses how things line
up between terminal and editor. Fix that.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-29-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:49 +0000 (12:10 +0000)]
tests/vm: fix build_path based path
We no longer need to go into the per-arch build directories to find
the build directory's binary. Let's call it directly.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-28-alex.bennee@linaro.org>
Daniel P. Berrangé [Wed, 8 Jan 2025 12:10:48 +0000 (12:10 +0000)]
tests/lcitool: remove temp workaround for debian mips64el
The workaround applied in
    commit c60473d29254b79d9437eface8b342e84663ba66
    Author: Alex Bennée <alex.bennee@linaro.org>
    Date: Wed Oct 2 10:03:33 2024 +0200
        testing: bump mips64el cross to bookworm and fix package list
is no longer required since the affected builds are now fixed.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20241217133525.3836570-1-berrange@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-27-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:47 +0000 (12:10 +0000)]
tests/docker: move riscv64 cross container from sid to trixie
Although riscv64 isn't going to be a release architecture for trixie,
the packages are still built while it is in testing. Moving from sid will
also avoid some of the volatility we get from tracking the bleeding
edge.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-26-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:46 +0000 (12:10 +0000)]
tests/lcitool: bump to latest version of libvirt-ci
We will shortly need this to build our riscv64 cross container.
However, to keep the delta down, just do the bump first. As ccache4 is
now preferred for FreeBSD to get the latest version, there is a small
update in the FreeBSD metadata.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-25-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:45 +0000 (12:10 +0000)]
tests/functional: extend test_aarch64_virt with vulkan test
Now that we have virtio-gpu Vulkan support, let's add a test for it.
Currently this is using images built by buildroot:
https://lists.buildroot.org/pipermail/buildroot/2024-December/768196.html
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-24-alex.bennee@linaro.org>
Xiaoyao Li [Thu, 19 Dec 2024 11:01:25 +0000 (06:01 -0500)]
i386/cpu: Set and track CPUID_EXT3_CMP_LEG in env->features[FEAT_8000_0001_ECX]
The correct usage is to track and maintain features in env->features[]
instead of manually setting them in cpu_x86_cpuid().
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-11-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 19 Dec 2024 11:01:24 +0000 (06:01 -0500)]
i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid()
Currently CPUID_HT is evaluated in cpu_x86_cpuid() each time. This is not
the correct way to maintain and evaluate a feature bit. The
expected practice is that features are tracked in env->features[] and
cpu_x86_cpuid() should be the consumer of env->features[].
Track CPUID_HT in env->features[FEAT_1_EDX] instead and evaluate it in
cpu's realizefn().
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-10-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 19 Dec 2024 11:01:23 +0000 (06:01 -0500)]
cpu: Remove nr_cores from struct CPUState
There is no user of it now, remove it.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-9-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 19 Dec 2024 11:01:22 +0000 (06:01 -0500)]
i386/cpu: Hoist check of CPUID_EXT3_TOPOEXT against threads_per_core
The check now uses env->topo_info.threads_per_core and no longer depends
on qemu_init_vcpu(). Put it together with the other feature checks
before qemu_init_vcpu().
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-8-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 19 Dec 2024 11:01:21 +0000 (06:01 -0500)]
i386/cpu: Track a X86CPUTopoInfo directly in CPUX86State
The names nr_modules and nr_dies are ambiguous and misleading.
Their purpose is to record and form the topology information. So
just maintain an X86CPUTopoInfo member in CPUX86State instead. Then
nr_modules and nr_dies can be dropped.
As a benefit, x86 can switch to using the information in
CPUX86State::topo_info and get rid of nr_cores and nr_threads in
CPUState. This helps remove the dependency on qemu_init_vcpu(), so that
x86 can get and use topology info earlier in x86_cpu_realizefn(); drop
the comment that highlighted the dependency.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-7-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 19 Dec 2024 11:01:20 +0000 (06:01 -0500)]
i386/topology: Introduce helpers for various topology info of different level
Introduce various helpers for getting topology info with different
semantics. Using the helpers is more self-explanatory.
Besides, the semantics of each helper will stay unchanged even when a new
topology level is added in the future. At that time, only the
implementation of the helper needs updating, without affecting the callers.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-6-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 19 Dec 2024 11:01:19 +0000 (06:01 -0500)]
i386/topology: Update the comment of x86_apicid_from_topo_ids()
Update the comment of x86_apicid_from_topo_ids() to match the current
implementation.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-5-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 19 Dec 2024 11:01:18 +0000 (06:01 -0500)]
i386/cpu: Drop cores_per_pkg in cpu_x86_cpuid()
Local variable cores_per_pkg is only used to calculate threads_per_pkg.
No need for it. Drop it and open-code it instead.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-4-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 19 Dec 2024 11:01:17 +0000 (06:01 -0500)]
i386/cpu: Drop the variable smp_cores and smp_threads in x86_cpu_pre_plug()
There is no need to define smp_cores and smp_threads; using ms->smp.cores
and ms->smp.threads directly is straightforward. It's also consistent with
the other checks of socket/die/module.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-3-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Xiaoyao Li [Thu, 19 Dec 2024 11:01:16 +0000 (06:01 -0500)]
i386/cpu: Extract a common function to setup value of MSR_CORE_THREAD_COUNT
There is duplicated code to set up the value of MSR_CORE_THREAD_COUNT.
Extract a common function for it.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-2-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 24 Dec 2024 15:59:12 +0000 (16:59 +0100)]
target/i386/kvm: Replace ARRAY_SIZE(msr_handlers) with KVM_MSR_FILTER_MAX_RANGES
kvm_install_msr_filters() uses KVM_MSR_FILTER_MAX_RANGES as the bound
when traversing msr_handlers[], while other places still compute the
size by ARRAY_SIZE(msr_handlers).
In fact, msr_handlers[] is an array with the fixed size
KVM_MSR_FILTER_MAX_RANGES, and this has to be true because
kvm_install_msr_filters copies from one array to the other.
For code consistency, assert that they match and use
ARRAY_SIZE(msr_handlers) everywhere.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zhao Liu [Wed, 6 Nov 2024 03:07:27 +0000 (11:07 +0800)]
target/i386/kvm: Clean up error handling in kvm_arch_init()
Currently, kvm_arch_init() has the following incorrect error handling
cases:
* The failure of kvm_get_supported_feature_msrs() is not handled.
* There is no return when kvm_vm_enable_disable_exits() fails.
* MSR filter related cases call exit() directly instead of returning
to kvm_init(). (The caller of kvm_arch_init() - kvm_init() - needs to
know if kvm_arch_init() fails in order to perform cleanup).
Fix the above cases.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-11-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zhao Liu [Wed, 6 Nov 2024 03:07:26 +0000 (11:07 +0800)]
target/i386/kvm: Return -1 when kvm_msr_energy_thread_init() fails
It is common practice to return a negative value (like -1) to indicate
an error, and other functions in kvm_arch_init() follow this style.
To avoid confusion (failure being indicated by inconsistent return
values within the same function), return -1 when
kvm_msr_energy_thread_init() fails.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-10-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zhao Liu [Wed, 6 Nov 2024 03:07:25 +0000 (11:07 +0800)]
target/i386/kvm: Clean up return values of MSR filter related functions
Before commit 0cc42e63bb54 ("kvm/i386: refactor kvm_arch_init and split
it into smaller functions"), error_report() attempted to print the error
code from kvm_filter_msr(). However, printing the error code did not work
because kvm_filter_msr() returns bool instead of int.
0cc42e63bb54 fixed the error by removing the error code printing, but this
lost useful error messages. Bring it back by making kvm_filter_msr()
return int.
This also makes the function call chain processing clearer, allowing for
better handling of error result propagation from kvm_filter_msr() to
kvm_arch_init(), preparing for the subsequent cleanup work of error
handling in kvm_arch_init().
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-9-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zhao Liu [Wed, 6 Nov 2024 03:07:24 +0000 (11:07 +0800)]
target/i386/confidential-guest: Fix comment of x86_confidential_guest_kvm_type()
Update the comment to match the X86ConfidentialGuestClass
implementation.
Reported-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-8-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zhao Liu [Wed, 6 Nov 2024 03:07:23 +0000 (11:07 +0800)]
target/i386/kvm: Drop workaround for KVM_X86_DISABLE_EXITS_HTL typo
The KVM_X86_DISABLE_EXITS_HTL typo has been fixed in commit 77d361b13c19
("linux-headers: Update to kernel mainline commit b357bf602").
Drop the related workaround.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-7-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zhao Liu [Wed, 6 Nov 2024 03:07:21 +0000 (11:07 +0800)]
target/i386/kvm: Only save/load kvmclock MSRs when kvmclock enabled
MSR_KVM_SYSTEM_TIME and MSR_KVM_WALL_CLOCK are tied to the (old)
kvmclock feature (KVM_FEATURE_CLOCKSOURCE).
So, save/load them only when kvmclock (KVM_FEATURE_CLOCKSOURCE) is
enabled.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-5-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zhao Liu [Wed, 6 Nov 2024 03:07:20 +0000 (11:07 +0800)]
target/i386/kvm: Remove local MSR_KVM_WALL_CLOCK and MSR_KVM_SYSTEM_TIME definitions
These two MSRs are already defined in kvm_para.h
(standard-headers/asm-x86/kvm_para.h).
Remove the QEMU local definitions to avoid duplication.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-4-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zhao Liu [Wed, 6 Nov 2024 03:07:19 +0000 (11:07 +0800)]
target/i386/kvm: Add feature bit definitions for KVM CPUID
Add feature definitions for KVM_CPUID_FEATURES in CPUID
(CPUID[4000_0001].EAX and CPUID[4000_0001].EDX), to get rid of lots of
offset calculations.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zhao Liu [Wed, 6 Nov 2024 03:07:18 +0000 (11:07 +0800)]
i386/cpu: Mark avx10_version filtered when prefix is NULL
In x86_cpu_filter_features(), if host doesn't support AVX10, the
configured avx10_version should be marked as filtered regardless of
whether prefix is NULL or not.
Check prefix before warn_report() instead of checking for
have_filtered_features.
Cc: qemu-stable@nongnu.org
Fixes: commit bccfb846fd52 ("target/i386: add AVX10 feature and AVX10 version property")
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Thu, 21 Nov 2024 12:01:45 +0000 (13:01 +0100)]
target/i386: use shr to load high-byte registers into T0/T1
Using a sextract or extract operation is only necessary if a
sign or zero extended value is needed. If not, a shift is
enough.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
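The shift-versus-extract point in the commit above can be illustrated with a small sketch. This is not the TCG code; the function names are hypothetical and a 32-bit register value stands in for the guest register. It only shows that a consumer which looks at the low 8 bits gets the same answer from a plain shift as from a zero-extending extract.

```rust
/// Zero-extending extract of bits 8..16 (the "AH"-style byte of a register).
fn extract_high_byte(reg: u32) -> u32 {
    (reg >> 8) & 0xff
}

/// Plain shift: bits above the low byte are left in place.
fn shift_high_byte(reg: u32) -> u32 {
    reg >> 8
}

/// Hypothetical consumer that only looks at the low 8 bits,
/// for example an 8-bit store or an 8-bit ALU operation.
fn store_byte(value: u32) -> u8 {
    value as u8
}
```

The two helpers return different 32-bit values, but `store_byte()` cannot tell them apart, which is the case where the cheaper shift is sufficient.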
Paolo Bonzini [Thu, 17 Oct 2024 10:10:39 +0000 (12:10 +0200)]
target/i386: improve code generation for BT
Because BT does not write back to the source operand, it can modify it to
ensure that one of the operands of TSTNE is a constant (after either gen_BT
or the optimizer's constant propagation). This produces better and more
optimizable TCG ops. For example, the sequence
    movl $0x60013f, %ebx
    btl %ecx, %ebx
becomes just
    and_i32 tmp1,ecx,$0x1f dead: 1 2 pref=0xffff
    shr_i32 tmp0,$0x60013f,tmp1 dead: 1 2 pref=0xffff
    and_i32 tmp16,tmp0,$0x1 dead: 1 pref=0xbf80
On s390x, it can use four instructions to isolate bit 0 of 0x60013f >> (ecx & 31):
    nilf %r12, 0x1f
    lgfi %r11, 0x60013f
    srlk %r12, %r11, 0(%r12)
    nilf %r12, 1
Previously, it used five instructions to build 1 << (ecx & 31) and compute
TSTEQ, and also needed two more to construct the result of setcond:
    nilf %r12, 0x1f
    lghi %r11, 1
    sllk %r12, %r11, 0(%r12)
    lgfi %r9, 0x60013f
    nrk %r0, %r12, %r9
    lghi %r12, 0
    locghilh %r12, 1
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
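The two code shapes described in the BT commit above compute the same bit. A minimal sketch of that equivalence follows, assuming a 32-bit operand and hypothetical function names; it models only the arithmetic, not TCG or the s390x backend.

```rust
/// Old shape: build the mask 1 << (index & 31) and test it against the value.
fn bt_mask(value: u32, index: u32) -> bool {
    let mask = 1u32 << (index & 31);
    (value & mask) != 0
}

/// New shape: shift the value right by (index & 31) and isolate bit 0.
/// When `value` is a constant, the shift has one constant operand.
fn bt_shift(value: u32, index: u32) -> bool {
    ((value >> (index & 31)) & 1) != 0
}
```

Both functions mask the index to five bits first, matching the `ecx & 31` in the listings above.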
Paolo Bonzini [Thu, 19 Dec 2024 10:24:13 +0000 (11:24 +0100)]
make-release: only leave tarball of wrap-file subprojects
The QEMU source archive includes the sources downloaded from crates.io
in both tarball form (in subprojects/packagecache) and expanded/patched
form (in the subprojects directory). The former is the more authoritative
form, as it has a hash that can be verified in the wrap file and checked
against the download URL, so keep that one only. This also works with
--disable-download; when building QEMU for the first time from the
tarball, Meson will print something like
Using proc-macro2-1-rs source from cache.
for each subproject, and then go on to extract the tarball and apply the
overlay or the patches in subprojects/packagefiles.
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2719
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 7 Jan 2025 10:42:49 +0000 (11:42 +0100)]
qom: remove unused field
The "concrete_class" field of InterfaceClass is only ever written, and as far
as I can tell is not particularly useful when debugging either; remove it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 20 Dec 2024 12:10:03 +0000 (13:10 +0100)]
rust: hide warnings for subprojects
This matches cargo's own usage of "--cap-lints allow" when building
dependencies. The dummy changes to the .wrap files help Meson notice
that the subproject is out of date.
Also remove an unnecessary subprojects/unicode-ident-1-rs/meson.build file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 29 Nov 2024 09:46:44 +0000 (10:46 +0100)]
rust: qdev: expose inherited methods to subclasses of SysBusDevice
The ObjectDeref trait now provides all the magic that is required to fake
inheritance. Replace the "impl SysBusDevice" block of qemu_api::sysbus
with a trait, so that sysbus_init_irq() can be invoked as "self.init_irq()"
without any intermediate upcast.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 4 Dec 2024 07:58:46 +0000 (08:58 +0100)]
rust: qemu-api-macros: add automatic TryFrom/TryInto derivation
This is going to be fairly common. Using a custom procedural macro
provides better error messages and automatically finds the right
type.
Note that this is different from the same-named macro in the
derive_more crate. That one provides conversion from e.g. tuples
to enums with tuple variants, not from integers to enums.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
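For readers unfamiliar with integer-to-enum conversion, the following sketch shows the kind of impl that such a derive macro would save writing by hand. It is an assumption about the general shape, not the output of the qemu-api-macros crate; the `Mode` enum and its variants are hypothetical.

```rust
use std::convert::TryFrom;

/// Hypothetical register field with three valid encodings.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum Mode {
    Off = 0,
    Rx = 1,
    Tx = 2,
}

impl TryFrom<u8> for Mode {
    /// On failure, hand back the integer that did not match any variant.
    type Error = u8;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        // Constants let the match arms name the discriminants directly.
        const OFF: u8 = Mode::Off as u8;
        const RX: u8 = Mode::Rx as u8;
        const TX: u8 = Mode::Tx as u8;
        match value {
            OFF => Ok(Mode::Off),
            RX => Ok(Mode::Rx),
            TX => Ok(Mode::Tx),
            other => Err(other),
        }
    }
}
```

Writing this for every enum is repetitive and easy to get out of sync with the variant list, which is the usual motivation for generating it.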
Paolo Bonzini [Wed, 4 Dec 2024 07:57:27 +0000 (08:57 +0100)]
rust: qemu-api-macros: extend error reporting facility to parse errors
Generalize the CompileError tuple to an enum that can be either an error
message or a parse error from syn.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 29 Nov 2024 07:48:07 +0000 (08:48 +0100)]
rust: qom: make INSTANCE_POST_INIT take a shared reference
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Mon, 2 Dec 2024 12:16:19 +0000 (13:16 +0100)]
rust: pl011: only leave embedded object initialization in instance_init
Leave IRQ and MMIO initialization to instance_post_init. In Rust the
two callbacks are more distinct, because only instance_post_init has a
fully initialized object available.
While at it, add a wrapper for sysbus_init_mmio so that accesses to
the SysBusDevice correctly use shared references.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 29 Nov 2024 10:38:59 +0000 (11:38 +0100)]
rust: qom: move device_id to PL011 class side
There is no need to monkeypatch DeviceId::Luminary into the already-initialized
PL011State. Instead, now that we can define a class hierarchy, we can define
PL011Class and make device_id a field in there.
There is also no need anymore to have "Arm" as zero, so change DeviceId into a
wrapper for the array; all it does is provide an Index<hwaddr> implementation
because arrays can only be indexed by usize.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 11 Dec 2024 09:33:31 +0000 (10:33 +0100)]
rust: qom: automatically use Drop trait to implement instance_finalize
Replace the customizable INSTANCE_FINALIZE with a generic function
that drops the Rust object.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 11 Dec 2024 10:48:44 +0000 (11:48 +0100)]
rust: macros: check that the first field of a #[derive(Object)] struct is a ParentField
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Thu, 24 Oct 2024 09:57:02 +0000 (11:57 +0200)]
rust: macros: check that #[derive(Object)] requires #[repr(C)]
Convert derive_object to the same pattern of first making a
Result<proc_macro2::TokenStream, CompileError>, and then doing
.unwrap_or_else(Into::into) to support checking the validity of
the input. Add is_c_repr to check that all QOM structs include
a #[repr(C)] attribute.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
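The pattern named in the commit above (build a Result, then call .unwrap_or_else(Into::into)) can be sketched without the proc-macro machinery. Everything below is a stand-in: `Tokens` replaces proc_macro2::TokenStream with a plain string, attributes are plain string slices, and `derive_object_or_error` is a hypothetical helper name. Only `is_c_repr`, `derive_object` and `CompileError` are names taken from the commit message, and their bodies here are assumptions.

```rust
/// Stand-in for proc_macro2::TokenStream (assumption: just a String).
#[derive(Debug, PartialEq)]
struct Tokens(String);

/// Stand-in error type carrying a message.
#[derive(Debug)]
struct CompileError(String);

/// An error becomes tokens that would make the compiler report it.
impl From<CompileError> for Tokens {
    fn from(err: CompileError) -> Self {
        Tokens(format!("compile_error!({:?});", err.0))
    }
}

/// Simplified check: is "repr(C)" among the attributes?
fn is_c_repr(attrs: &[&str]) -> bool {
    attrs.iter().any(|a| *a == "repr(C)")
}

/// Fallible part: validate the input, then produce the expansion.
fn derive_object_or_error(name: &str, attrs: &[&str]) -> Result<Tokens, CompileError> {
    if !is_c_repr(attrs) {
        return Err(CompileError(format!("#[repr(C)] required for {}", name)));
    }
    Ok(Tokens(format!("impl ObjectType for {} {{}}", name)))
}

/// Entry point: either branch ends up as tokens.
fn derive_object(name: &str, attrs: &[&str]) -> Tokens {
    derive_object_or_error(name, attrs).unwrap_or_else(Into::into)
}
```

The benefit of this split is that validation code can use `?` and early returns, while the entry point keeps the infallible signature a derive macro needs.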
Paolo Bonzini [Wed, 11 Dec 2024 11:18:06 +0000 (12:18 +0100)]
rust: add a utility module for compile-time type checks
It is relatively common in the low-level qemu_api code to assert that
a field of a struct has a specific type; for example, it can be used
to ensure that the fields match what the qemu_api and C code expects
for safety.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
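One way to assert a field's type at compile time is sketched below. This is an illustration under assumptions, not the contents of the new utility module: the `EqType` trait, the `assert_field_type!` macro, and the `Header` and `Device` structs are all hypothetical names chosen for the example.

```rust
/// `T: EqType<Itself = U>` holds only when T and U are the same type.
trait EqType {
    type Itself;
}
impl<T> EqType for T {
    type Itself = T;
}

/// Never called at run time; it only has to type-check.
#[allow(dead_code)]
fn types_must_be_equal<T, U>(_: &T)
where
    T: EqType<Itself = U>,
{
}

/// Fails to compile unless field `$field` of `$t` has exactly type `$ft`.
macro_rules! assert_field_type {
    ($t:ty, $field:ident, $ft:ty) => {
        const _: () = {
            #[allow(dead_code)]
            fn check(v: &$t) {
                types_must_be_equal::<_, $ft>(&v.$field);
            }
        };
    };
}

#[repr(C)]
struct Header {
    id: u32,
}

#[repr(C)]
struct Device {
    parent: Header,
    irq_count: u8,
}

assert_field_type!(Device, parent, Header);
assert_field_type!(Device, irq_count, u8);
```

Changing the last line to claim `u16` would be rejected by the compiler, so the check costs nothing at run time.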
Paolo Bonzini [Wed, 11 Dec 2024 10:38:20 +0000 (11:38 +0100)]
rust: qom: add ParentField
Add a type that, together with the C function object_deinit, ensures the
correct drop order for QOM objects relative to their superclasses.
Right now it is not possible to implement the Drop trait for QOM classes
that are defined in Rust, as the drop() function would not be called when
the object goes away; instead what is called is ObjectImpl::INSTANCE_FINALIZE.
It would be nice for INSTANCE_FINALIZE to just drop the object, but this has
a problem: suppose you have
    pub struct MySuperclass {
        parent: DeviceState,
        field: Box<MyData>,
        ...
    }
    impl Drop for MySuperclass {
        ...
    }
    pub struct MySubclass {
        parent: MySuperclass,
        ...
    }
and an instance_finalize implementation that is like
    unsafe extern "C" fn drop_object<T: ObjectImpl>(obj: *mut Object) {
        unsafe { std::ptr::drop_in_place(obj.cast::<T>()) }
    }
When instance_finalize is called for MySubclass, it will walk the struct's
list of fields and call the drop method for MySuperclass. Then, object_deinit
recurses to the superclass and calls the same drop method again. This
will cause double-freeing of the Box<MyData>.
What's happening here is that QOM wants to control the drop order of
MySuperclass and MySubclass's fields. To do so, the parent field must
be marked ManuallyDrop<>, which is quite ugly. Instead, add a wrapper
type ParentField<> that is specific to QOM. This hides the implementation
detail of *what* is special about the ParentField, and will also be easy
to check in the #[derive(Object)] macro.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
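The drop-order idea in the commit above can be modelled in a few lines. This sketch is an assumption about the mechanism (a wrapper around ManuallyDrop), not the qemu_api implementation; `Superclass`, `Subclass`, `finalize` and the two counting helpers are hypothetical, and `finalize` merely imitates QOM finalizing the subclass first and the superclass afterwards.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;
use std::rc::Rc;

/// Superclass whose destructor bumps a shared counter.
struct Superclass {
    drops: Rc<Cell<u32>>,
}

impl Drop for Superclass {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Hypothetical stand-in for ParentField<>: dropping it does NOT drop T.
struct ParentField<T>(ManuallyDrop<T>);

struct Subclass {
    parent: ParentField<Superclass>,
    name: String,
}

/// Imitates the finalization walk: subclass fields, then the superclass.
fn finalize(mut obj: Subclass) {
    // SAFETY: the parent is taken exactly once and never used through `obj` again.
    let parent = unsafe { ManuallyDrop::take(&mut obj.parent.0) };
    drop(obj); // drops `name`; the wrapper keeps the parent untouched
    drop(parent); // the superclass destructor runs here, once
}

fn make_object(drops: &Rc<Cell<u32>>) -> Subclass {
    Subclass {
        parent: ParentField(ManuallyDrop::new(Superclass { drops: Rc::clone(drops) })),
        name: String::from("dev"),
    }
}

/// Number of superclass drops after the two-step finalization.
fn drops_after_finalize() -> u32 {
    let drops = Rc::new(Cell::new(0));
    finalize(make_object(&drops));
    drops.get()
}

/// Number of superclass drops when only the subclass is dropped.
fn drops_without_parent_step() -> u32 {
    let drops = Rc::new(Cell::new(0));
    drop(make_object(&drops));
    drops.get()
}
```

With the wrapper, dropping the subclass alone never runs the superclass destructor, so the separate superclass step runs it exactly once instead of a second time.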
Paolo Bonzini [Tue, 7 Jan 2025 15:52:25 +0000 (16:52 +0100)]
rust: add --check-cfg test to rustc arguments
rustc will check that every reachable #[cfg] matches a list of
the expected config names and values. Recent versions of rustc are
also complaining about #[cfg(test)], even if it is basically a standard
part of the language. So, always allow it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Stefan Hajnoczi [Fri, 10 Jan 2025 18:39:19 +0000 (13:39 -0500)]
Merge tag 'migration-20250110-pull-request' of https://gitlab.com/farosas/qemu into staging
Migration pull request
- compression:
Shameer's fix for CONFIG_UADK build
Yuan Liu fixes for zero-page, QPL, qatzip
- multifd sync cleanups, prereq. for VFIO and postcopy work
- fixes for 9.2 regressions:
multifd with pre-9.0 -> post-9.1 migrations (#2720)
s390x migration (#2704)
- fix for assertions during paused migrations; rework of
late-block-activate logic (#2395, #686)
- fixes for compressed arrays creation and parsing, mostly affecting
s390x
# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmeBDgkQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnSlUEACl31wY+77JxWnBva/eDDwnJ9HiCrqsoqaZ
# YIJJXNlk4lYJWNdZRt6p27exzWrQwm+kWKPECeCakgCMlfhnKCvejGq7iV/fJY4o
# D8hjE3t1htQ8mfblY1+bqzg3Rml59KwXxiqAwvlljbNWdkXruv026dq9vgJMzFhi
# ia043fOO1tYULIoawgmwmLEHnztht0v+ZTZ1v5KQbrH655tpxls/8kHc6v5PXEpA
# 3PSmCrCQh1dPtkYRjuJ9yHyfU+/T8tYwIjrU6VR1wQW7MBNkjtqNudaqAFiuyuqn
# P8gh4rAQrMhA9y+aq6xSoJP8XGkuOHxLQtlNutlmtbcQyZ7JqgLmK9ZLdoPf21sK
# //erV63NoyaciYB9Nk3NXflwroc6zyvo8A584kGNPwBznZOJLESP4SPvVm/nlE29
# vbyq8AWHRjFiqqf6P0ttQLAFkusZJzM1Y9UakF51hyVBX70yfqLG20XXZtIq/aZA
# GbBB2Fo0MIlbmWaur3vLsSzn7B8d++Gl9TTGcK/eIXJ1ANCuCxGv9fbXJQlP5F4I
# 3OAoSmAVJ2eqw4v0+2WMiEa8yUA5drNnDSI3VRkG+0K9jRfHKXki466/QQdGrNw7
# 8GuuzLBNai3gEKbavDU0Be73r982KjXeYXj7RuAkQfm0d4H7tiwtg91Cd1dPKfzh
# mhpmOFJDCg==
# =joNM
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 10 Jan 2025 07:09:45 EST
# gpg: using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg: issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg: aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3 64CF C798 DC74 1BEC 319D
* tag 'migration-20250110-pull-request' of https://gitlab.com/farosas/qemu: (25 commits)
multifd: bugfix for incorrect migration data with qatzip compression
multifd: bugfix for incorrect migration data with QPL compression
multifd: bugfix for migration using compression methods
s390x: Fix CSS migration
migration: Fix arrays of pointers in JSON writer
migration: Dump correct JSON format for nullptr replacement
migration: Rename vmstate_info_nullptr
migration: Fix parsing of s390 stream
migration: Remove unused argument in vmsd_desc_field_end
migration: Add more error handling to analyze-migration.py
migration/block: Rewrite disk activation
migration/block: Fix possible race with block_inactive
migration/block: Apply late-block-active behavior to postcopy
migration/block: Make late-block-active the default
qmp/cont: Only activate disks if migration completed
migration: Add helper to get target runstate
migration/multifd: Fix compat with QEMU < 9.0
migration/multifd: Document the reason to sync for save_setup()
migration/multifd: Cleanup src flushes on condition check
migration/multifd: Remove sync processing on postcopy
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Fri, 10 Jan 2025 15:30:50 +0000 (10:30 -0500)]
Merge tag 'qom-qdev-20250109' of https://github.com/philmd/qemu into staging
QOM & QDev patches
- Remove DeviceState::opts (Akihiko)
- Replace container_get by machine/object_get_container (Peter)
- Remove InterfaceInfo::concrete_class field (Paolo)
- Reduce machine_containers[] scope (Philippe)
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmeABNgACgkQ4+MsLN6t
# wN4XtQ/+NyXEK9vjq+yXnk7LRxTDQBrXxNc71gLqNA8rGwXTuELIXOthNW+UM2a9
# CdnVbrIX/FRfQLXTHx0C2ENteafrR1oXDQmEOz1UeYgaCWJsNdVe3r1MYUdHcwVM
# 90JcSbYhrvxFE/p/6WhTjjv2DXn4E8witsPwRc8EBi5bHeFz6cNPzhdF59A3ljZF
# 0zr1MLHJHhwR6OoBbm9HM8x8i4Zw4LoKEjo8cCgcBfPQIMKf0HQ4XsinIDwn0VXN
# S3jIysNyGHlptHOiJuErILZtzrm4F2lGwYan89jxuElfWjC7SVB2z4CQkQtPceIJ
# HRBrE7VPwJ566OAThoSwPG3jXT1yCDOYmNCX1kJOMo9rYh3MwG0VrbMr5iwfYk8Z
# wO+8IyMAx7m8FibdsoMmxtI1PYTf0JQaCB6MSwdoAMMQVp1FDWBun2g+swLjQgO4
# 15iSB+PMIZe7Ywd0b63VZrUMHKwMxd9RFYEbbsdA8DRI50W3HMQPZAJiGXt7RxJ9
# p9qxqg0WGpVjgTnInt/KH4axiWPD5cru+THVYk6dvOdtTM5wj2jEswWy2vQ6LkEF
# MgxaUXfja8E20AXvdr6uXKwcKOIJ9+TaU5AhUmjpvacjJhy5eQdoFt9OnIMQt25U
# KTtapCVsong5JzYZWhITNCMf5w2YGCJGJJekxdrqBvFk+FkMR38=
# =+TLu
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 09 Jan 2025 12:18:16 EST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* tag 'qom-qdev-20250109' of https://github.com/philmd/qemu:
system: Inline machine_containers[] in qemu_create_machine_containers()
qom: remove unused InterfaceInfo::concrete_class field
qom: Remove container_get()
qom: Use object_get_container()
qom: Add object_get_container()
qdev: Use machine_get_container()
qdev: Add machine_get_container()
qdev: Make qdev_get_machine() not use container_get()
qdev: Implement qdev_create_fake_machine() for user emulation
qdev: Remove opts member
hw/pci: Use -1 as the default value for rombar
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Alex Bennée [Wed, 8 Jan 2025 12:10:44 +0000 (12:10 +0000)]
tests/functional: bail aarch64_virt tests early if missing TCG
The set_machine and require_accelerator steps can bail early so move
those to the front of the test functions. While we are at it also
clean up some long lines when adding the vm arguments.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-23-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:43 +0000 (12:10 +0000)]
tests/functional: remove unused kernel_command_line
The Alpine test boots from the CDROM so we don't --append a command
line. Drop the unused code.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-22-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:42 +0000 (12:10 +0000)]
tests/functional: update tuxruntest to use uncompress utility
Use the utility functions to reduce code duplication.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-21-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:41 +0000 (12:10 +0000)]
tests/functional: add zstd support to uncompress utility
Rather than using the Python library (which has a different API
anyway) let's just call the binary. zstdtools is already in our
qemu.yml so all test containers should have it around. Tests should
still use @skipIfMissingCommands('zstd') to gracefully handle when
only minimal dependencies have been installed.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-20-alex.bennee@linaro.org>
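A minimal sketch of the approach described above, assuming a hypothetical
helper name zstd_uncompress (the real utility's name and signature are not
shown in this log). It shells out to the zstd binary instead of using a
Python binding. The runner parameter is an assumption added only so the
sketch can be exercised on a host without zstd installed.

```python
import shutil
import subprocess


def zstd_uncompress(src, dst, runner=subprocess.run):
    """Decompress src into dst by invoking the zstd binary.

    Hypothetical helper for illustration; not the function from the
    QEMU tree.
    """
    # -d: decompress, -f: overwrite dst, -q: quiet, -o: output file
    cmd = ["zstd", "-d", "-f", "-q", "-o", str(dst), str(src)]
    runner(cmd, check=True)
    return cmd


def have_zstd():
    # Same intent as @skipIfMissingCommands('zstd'): lets a caller skip
    # gracefully instead of failing when the binary is absent.
    return shutil.which("zstd") is not None
```

A test would typically check have_zstd() (or use the skip decorator) before
calling zstd_uncompress() on a downloaded asset.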
Alex Bennée [Wed, 8 Jan 2025 12:10:40 +0000 (12:10 +0000)]
tests/functional: remove hacky sleep from the tests
We have proper detection of prompts now so we don't need to guess with
sleep() sprinkled through the test. The extra step of calling halt is
just to flush the final bits of the log (although the last line is
still missed).
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-19-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:39 +0000 (12:10 +0000)]
system/qtest: properly feedback results of clock_[step|set]
Time will not advance if the system is paused or there are no timer
events set for the future. In the absence of pending timer events,
advancing time would make no difference to the system state. Attempting
to do so would be a bug and the test or device under test would need
fixing.
Tighten up the result reporting to `FAIL` if time was not advanced.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2687
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-18-alex.bennee@linaro.org>
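The rule above can be illustrated with a toy model. This is not QEMU code:
the ToyClock class, its fields and the exact reply strings are assumptions
made for the sketch. It only shows the reporting rule, namely that a step
request which leaves the clock unchanged is answered with FAIL.

```python
class ToyClock:
    """Toy stand-in for a qtest-driven virtual clock (illustrative only)."""

    def __init__(self, paused=False):
        self.now_ns = 0
        self.paused = paused
        self.deadlines = []  # absolute expiry times (ns) of pending timers

    def step(self, ns):
        before = self.now_ns
        future = [d for d in self.deadlines if d > self.now_ns]
        # Time only moves when the system runs and a timer is pending.
        if not self.paused and future:
            self.now_ns += ns
            # Timers whose deadline has passed are considered fired.
            self.deadlines = [d for d in self.deadlines if d > self.now_ns]
        if self.now_ns == before:
            return "FAIL"
        return f"OK {self.now_ns}"
```

In this model a test that steps the clock with no timer armed gets FAIL
back, which points at a bug in the test or in the device under test.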
Alex Bennée [Wed, 8 Jan 2025 12:10:38 +0000 (12:10 +0000)]
tests/qtest: remove clock_steps from virtio tests
In the qtest environment time will not step forward if the system is
paused (timers disabled) or we have no timer events to fire. As a
result VirtIO events are responded to directly and we don't need to
step time forward.
We still do timeout processing to handle the fact the target QEMU may
not be ready to respond right away. This will usually be due to a slow
CI system or if QEMU is running under something like rr.
Future qtest patches will assert that time actually changes when a
step is requested.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-17-alex.bennee@linaro.org>
Pierrick Bouvier [Wed, 8 Jan 2025 12:10:37 +0000 (12:10 +0000)]
tests/functional/aarch64: add tests for FEAT_RME
This boots an OP-TEE environment and launches a nested guest VM inside it
using the Realms feature. We do it for the virt and sbsa-ref platforms.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20241220165212.3653495-1-pierrick.bouvier@linaro.org>
[AJB: tweak ordering of setup, strip changelog from commit]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250108121054.1126164-16-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:36 +0000 (12:10 +0000)]
tests/functional: update the x86_64 tuxrun tests
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-15-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:35 +0000 (12:10 +0000)]
tests/functional: update the sparc64 tuxrun tests
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-14-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:34 +0000 (12:10 +0000)]
tests/functional: update the s390x tuxrun tests
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-13-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:33 +0000 (12:10 +0000)]
tests/functional: update the riscv64 tuxrun tests
Now there are new up to date images available we should update to them.
Note we re-use the riscv32 kernel and rootfs for test_riscv64_rv32.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-12-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:32 +0000 (12:10 +0000)]
tests/functional: update the riscv32 tuxrun tests
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-11-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:31 +0000 (12:10 +0000)]
tests/functional: update the ppc64 tuxrun tests
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-10-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:30 +0000 (12:10 +0000)]
tests/functional: update the ppc32 tuxrun tests
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-9-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:29 +0000 (12:10 +0000)]
tests/functional: update the mips64el tuxrun tests
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-8-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:28 +0000 (12:10 +0000)]
tests/functional: update the mips64 tuxrun tests
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-7-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:27 +0000 (12:10 +0000)]
tests/functional: update the mips32el tuxrun tests
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-6-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:26 +0000 (12:10 +0000)]
tests/functional: update the mips32 tuxrun tests
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-5-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:25 +0000 (12:10 +0000)]
tests/functional: add a m68k tuxrun tests
We didn't have this before and as it exercises the m68k virt platform
it seems worth adding. We don't wait for the shutdown because QEMU
will auto-exit on the shutdown.
Cc: Laurent Vivier <laurent@vivier.eu>
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-4-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:24 +0000 (12:10 +0000)]
tests/functional: update the i386 tuxrun tests
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-3-alex.bennee@linaro.org>
Alex Bennée [Wed, 8 Jan 2025 12:10:23 +0000 (12:10 +0000)]
tests/functional: update the arm tuxrun tests
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-2-alex.bennee@linaro.org>
Yuan Liu [Wed, 18 Dec 2024 09:14:13 +0000 (17:14 +0800)]
multifd: bugfix for incorrect migration data with qatzip compression
When qatzip compression is enabled on the migration channel and the same
dirty page changes from a normal page to a zero page in the iterative
memory copy, the dirty page will not be updated to a zero page again
on the target side, resulting in incorrect memory data on the source
and target sides.
The root cause is that the target side does not record the normal pages
to the receivedmap.
The solution is to add ramblock_recv_bitmap_set_offset in target side
to record the normal pages.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Jason Zeng <jason.zeng@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241218091413.140396-4-yuan1.liu@intel.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Yuan Liu [Wed, 18 Dec 2024 09:14:12 +0000 (17:14 +0800)]
multifd: bugfix for incorrect migration data with QPL compression
When QPL compression is enabled on the migration channel and the same
dirty page changes from a normal page to a zero page in the iterative
memory copy, the dirty page will not be updated to a zero page again
on the target side, resulting in incorrect memory data on the source
and target sides.
The root cause is that the target side does not record the normal pages
to the receivedmap.
The solution is to add ramblock_recv_bitmap_set_offset in target side
to record the normal pages.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Jason Zeng <jason.zeng@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241218091413.140396-3-yuan1.liu@intel.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Yuan Liu [Wed, 18 Dec 2024 09:14:11 +0000 (17:14 +0800)]
multifd: bugfix for migration using compression methods
When compression is enabled on the migration channel and
the pages processed are all zero pages, these pages will
not be sent and updated on the target side, resulting in
incorrect memory data on the source and target sides.
The root cause is that all compression methods call
multifd_send_prepare_common to determine whether to compress
dirty pages, but multifd_send_prepare_common does not update
the IOV of MultiFDPacket_t when all dirty pages are zero pages.
The solution is to always update the IOV of MultiFDPacket_t
regardless of whether the dirty pages are all zero pages.
Fixes: 303e6f54f9 ("migration/multifd: Implement zero page transmission on the multifd thread.")
Cc: qemu-stable@nongnu.org #9.0+
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Jason Zeng <jason.zeng@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241218091413.140396-2-yuan1.liu@intel.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Thu, 9 Jan 2025 18:52:49 +0000 (15:52 -0300)]
s390x: Fix CSS migration
Commit a55ae46683 ("s390: move css_migration_enabled from machine to
css.c") disabled CSS migration globally instead of doing it
per-instance.
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: qemu-stable@nongnu.org #9.1
Fixes: a55ae46683 ("s390: move css_migration_enabled from machine to css.c")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2704
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250109185249.23952-8-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Thu, 9 Jan 2025 18:52:48 +0000 (15:52 -0300)]
migration: Fix arrays of pointers in JSON writer
Currently, if an array of pointers contains a NULL pointer, that
pointer will be encoded as '0' in the stream. Since the JSON writer
doesn't define a "pointer" type, that '0' will now be a uint8, which
is different from the original type being pointed to, e.g. struct.
(we're further calling uint8 "nullptr", but that's irrelevant to the
issue)
That mixed-type array shouldn't be compressed, otherwise data is lost
as the code currently makes the whole array have the type of the first
element:
css = {NULL, NULL, ..., 0x5555568a7940, NULL};
{"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css",
"version": 1, "fields": [
...,
{"name": "css", "array_len": 256, "type": "nullptr", "size": 1},
...,
]}
In the above, the valid pointer at position 254 got lost among the
compressed array of nullptr.
While we could disable the array compression when a NULL pointer is
found, the JSON part of the stream still counts toward downtime, so we
should avoid writing unnecessary bytes to it.
Keep the array compression in place, but if NULL and non-NULL pointers
are mixed, break the array into several type-contiguous pieces:
css = {NULL, NULL, ..., 0x5555568a7940, NULL};
{"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css",
"version": 1, "fields": [
...,
{"name": "css", "array_len": 254, "type": "nullptr", "size": 1},
{"name": "css", "type": "struct", "struct": {"vmsd_name": "s390_css_img", ... }, "size": 768},
{"name": "css", "type": "nullptr", "size": 1},
...,
]}
Now each type-discontiguous region will become a new JSON entry. The
reader should interpret this as a concatenation of values, all part of
the same field.
Parsing the JSON with analyze-migration.py now shows the proper data
being pointed to at the places where the pointer is valid and
"nullptr" where there's NULL:
"s390_css (14)": {
...
"css": [
"nullptr",
"nullptr",
...
"nullptr",
{
"chpids": [
{
"in_use": "0x00",
"type": "0x00",
"is_virtual": "0x00"
},
...
]
},
"nullptr",
}
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20250109185249.23952-7-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
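The splitting rule described above can be sketched as follows. This is an
illustration in Python, not the QEMU JSON writer (which is C code): the
function name describe_pointer_array and its parameters are hypothetical,
and the entry layout simply mimics the example in the commit message.

```python
from itertools import groupby


def describe_pointer_array(name, pointers, struct_desc, struct_size):
    """Split an array of pointers into type-contiguous JSON entries.

    None stands for a NULL pointer. Each run of NULL or non-NULL
    elements becomes one entry, so compression is kept within a run.
    """
    entries = []
    for is_null, run in groupby(pointers, key=lambda p: p is None):
        count = len(list(run))
        entry = {"name": name}
        if count > 1:
            # Compress a run of same-typed elements into one entry.
            entry["array_len"] = count
        if is_null:
            entry.update({"type": "nullptr", "size": 1})
        else:
            entry.update({"type": "struct", "struct": struct_desc,
                          "size": struct_size})
        entries.append(entry)
    return entries
```

For an array of 254 NULL pointers, one valid pointer and a trailing NULL,
this yields three entries sharing the same name, which a reader
concatenates back into a single field.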
Peter Xu [Thu, 9 Jan 2025 18:52:47 +0000 (15:52 -0300)]
migration: Dump correct JSON format for nullptr replacement
QEMU plays a trick with null pointers inside an array of pointers in a VMSD
field. See 07d4e69147 ("migration/vmstate: fix array of ptr with
nullptrs") for more details on why. The idea makes sense in general, but
it may have overlooked the JSON writer, which could write nothing in a
"struct" in the JSON hints section.
We hit some analyze-migration.py issues on s390 recently, showing that some
of the struct fields contain nothing, like:
{"name": "css", "array_len": 256, "type": "struct", "struct": {}, "size": 1}
As described in detail by Fabiano:
https://lore.kernel.org/r/87pll37cin.fsf@suse.de
It could be that we hit some null pointers there, and the JSON description
was dropped for them.
To fix it, instead of hacking around only at the VMStateInfo level, do that
at the VMStateField level, so that the JSON writer can also be involved. In
this case, the JSON writer will replace the pointer array (which used to be a
"struct") with the real representation of the nullptr field.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20250109185249.23952-6-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Thu, 9 Jan 2025 18:52:46 +0000 (15:52 -0300)]
migration: Rename vmstate_info_nullptr
Rename vmstate_info_nullptr from "uint64_t" to "nullptr". This vmstate
actually reads and writes just a byte, so the proper name would be
uint8. However, since this is a marker for a NULL pointer, it's
convenient to have a more explicit name that can be identified by the
consumers of the JSON part of the stream.
Change the name to "nullptr" and add support for it in the
analyze-migration.py script. Arbitrarily use the name of the type as
the value of the field to avoid the script showing 0x30 or '0', which
could be confusing for readers.
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20250109185249.23952-5-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Thu, 9 Jan 2025 18:52:45 +0000 (15:52 -0300)]
migration: Fix parsing of s390 stream
The parsing for the S390StorageAttributes section is currently leaving
an unconsumed token that is later interpreted by the generic code as
QEMU_VM_EOF, cutting the parsing short.
The migration will issue a STATTR_FLAG_DONE between iterations, which
the script consumes correctly, but there's a final STATTR_FLAG_EOS at
.save_complete that the script is ignoring. Since the EOS flag is a
u64 0x1ULL and the stream is big endian, on little endian hosts a byte
read from it will be 0x0, the same as QEMU_VM_EOF.
Fixes: 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x")
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20250109185249.23952-4-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
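The byte-level confusion described above can be reproduced with a few lines
of Python. The two constant values come from the commit text; the variable
names are chosen for this sketch only.

```python
import struct

QEMU_VM_EOF = 0x00      # section-type byte, value as stated above
STATTR_FLAG_EOS = 0x01  # u64 flag, value as stated above

# The flag as it appears in the stream: 8 bytes, big endian.
stream = struct.pack(">Q", STATTR_FLAG_EOS)

# A parser that leaves the flag unconsumed and then reads one byte as a
# section type sees 0x00, which is indistinguishable from QEMU_VM_EOF.
first_byte = stream[0]

# Consuming all 8 bytes recovers the real flag value instead.
(flag,) = struct.unpack(">Q", stream)
```

Here first_byte equals QEMU_VM_EOF while flag equals STATTR_FLAG_EOS, which
is why the script has to consume the full u64 at .save_complete.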
Fabiano Rosas [Thu, 9 Jan 2025 18:52:44 +0000 (15:52 -0300)]
migration: Remove unused argument in vmsd_desc_field_end
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20250109185249.23952-3-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Thu, 9 Jan 2025 18:52:43 +0000 (15:52 -0300)]
migration: Add more error handling to analyze-migration.py
The analyze-migration script was seen failing on s390x in mysterious
ways. It seems we're reaching the VMSDFieldStruct constructor without
any fields, which would indicate an empty .subsection entry, a
VMSTATE_STRUCT with no fields or a vmsd with no fields. We don't have
any of those, at least not without the unmigratable flag set, so this
should never happen.
Add some debug statements so that we can see what's going on the next
time the issue happens.
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20250109185249.23952-2-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Peter Xu [Fri, 6 Dec 2024 23:08:38 +0000 (18:08 -0500)]
migration/block: Rewrite disk activation
This patch proposes a flag to maintain disk activation status globally. It
mostly rewrites disk activation mgmt for QEMU, including COLO and QMP
command xen_save_devices_state.
Background
==========
We have two problems with disk activation, one resolved, one not.
Problem 1: disk activation recovery (for switchover interruptions)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When migration is either cancelled or fails during switchover, especially
after the disks have been inactivated, QEMU needs to remember to re-activate
the disks before the VM starts.
It used to be done separately in two paths: one in qmp_migrate_cancel(),
the other one in the failure path of migration_completion().
It used to be fixed in different commits, all over the place in QEMU. These
are the relevant changes I saw; I'm not sure if it's a complete list:
- In 2016, commit fe904ea824 ("migration: regain control of images when
migration fails to complete")
- In 2017, commit 1d2acc3162 ("migration: re-active images while migration
been canceled after inactive them")
- In 2023, commit 6dab4c93ec ("migration: Attempt disk reactivation in
more failure scenarios")
Now since we have a slightly better picture maybe we can unify the
reactivation in a single path.
One side benefit of doing so is that we can move the disk operation outside
the QMP command "migrate_cancel". It's possible that in the future we may
want to make "migrate_cancel" OOB-compatible, which requires that the command
not need the BQL in the first place. This change already achieves that and
makes the migrate_cancel command lightweight.
Problem 2: disk invalidation on top of invalidated disks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is an unresolved bug for current QEMU. Link in "Resolves:" at the
end. It turns out besides the src switchover phase (problem 1 above), QEMU
also needs to remember block activation on destination.
Consider two consecutive migrations, where the VM was always paused.
In that scenario, the disks are not activated even after migration
completes in the 1st round. When the 2nd round starts, if QEMU doesn't
know the status of the disks, it needs to try to inactivate the disks again.
Here the issue is that the block layer API bdrv_inactivate_all() will crash
QEMU if invoked on already inactive disks for the 2nd migration. For
details, see the bug link at the end.
Implementation
==============
This patch proposes to maintain disk activation with a global flag, so we
know:
- If we used to inactivate disks for migration, but migration got
cancelled, or failed, QEMU will know it should reactivate the disks.
- On incoming side, if the disks are never activated but then another
migration is triggered, QEMU should be able to tell that inactivate is
not needed for the 2nd migration.
We used to have disk_inactive, but it only solves the 1st issue, not the
2nd. Also, it's done in completely separate paths so it's extremely hard
to follow either how the flag changes, or the duration that the flag is
valid, and when we will reactivate the disks.
Convert the existing disk_inactive flag into that global flag (also invert
its naming), and maintain the disk activation status for the whole
lifecycle of qemu. That includes the incoming QEMU.
Put both of the error cases of source migration (failure, cancelled)
together into migration_iteration_finish(), which will be invoked for
either scenario. So from that part QEMU should behave the same as
before. However, with such global maintenance of disk activation status, we
not only clean up quite a few temporary paths that tried to maintain the
disk activation status (e.g. in postcopy code), but also fix the
crash for problem 2 in one shot.
For freshly started QEMU, the flag is initialized to TRUE showing that the
QEMU owns the disks by default.
For incoming migrated QEMU, the flag will be initialized to FALSE once and
for all showing that the dest QEMU doesn't own the disks until switchover.
That is guaranteed by the "once" variable.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2395
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-7-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
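The idea of a single activation flag can be sketched with a toy model. This
is Python, not the QEMU implementation: ToyBlockLayer and ActivationTracker
are hypothetical names, and the RuntimeError only stands in for the crash
that bdrv_inactivate_all() is described as causing on already inactive
disks.

```python
class ToyBlockLayer:
    """Stand-in for the block layer; inactivating twice is fatal."""

    def __init__(self, active=True):
        self.active = active

    def inactivate_all(self):
        if not self.active:
            raise RuntimeError("inactivating already inactive disks")
        self.active = False

    def activate_all(self):
        self.active = True  # harmless when the disks are already active


class ActivationTracker:
    """One flag that records whether this QEMU owns the disks."""

    def __init__(self, block, incoming=False):
        self.block = block
        # A fresh QEMU owns the disks; an incoming one does not
        # until switchover.
        self.block_active = not incoming

    def inactivate(self):
        if not self.block_active:
            return  # nothing to do, avoids the double inactivation
        self.block.inactivate_all()
        self.block_active = False

    def activate(self):
        if self.block_active:
            return
        self.block.activate_all()
        self.block_active = True
```

In this model both failure paths on the source just call activate(), and a
second migration from a never-started destination calls inactivate()
without touching the block layer.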
Peter Xu [Fri, 6 Dec 2024 23:08:37 +0000 (18:08 -0500)]
migration/block: Fix possible race with block_inactive
Src QEMU sets block_inactive=true very early, before the invalidation takes
place. It means that if something goes wrong after setting the flag but
before reaching qemu_savevm_state_complete_precopy_non_iterable(), where the
invalidation work is done, the block_inactive flag becomes inconsistent.
For example, think about when qemu_savevm_state_complete_precopy_iterable()
can fail: it will have block_inactive set to true even if all block drives
are active.
Fix that by only updating the flag after the invalidation is done.
No Fixes for any commit, because it's not an issue if bdrv_activate_all()
is re-entrant upon all-active disks - a false positive block_inactive can
bring nothing more than "trying to activate the blocks but they're already
active". However let's still do it right to avoid the flag being
inconsistent with reality.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-6-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Peter Xu [Fri, 6 Dec 2024 23:08:36 +0000 (18:08 -0500)]
migration/block: Apply late-block-active behavior to postcopy
Postcopy never cared about late-block-active. However there's no mention
in the capability that it doesn't apply to postcopy.
Considering that we _assumed_ late activation is always good, do that too
for postcopy unconditionally, just like precopy. After this patch, we
should have unified the behavior across all.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-5-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Peter Xu [Fri, 6 Dec 2024 23:08:35 +0000 (18:08 -0500)]
migration/block: Make late-block-active the default
Migration capability 'late-block-active' controls when the block drives
will be activated. If enabled, block drives will only be activated once the
VM starts: right away if the src runstate was "live" (RUNNING or SUSPENDED),
otherwise postponed until qmp_cont().
Let's do this unconditionally. There's no harm in delaying activation of
block drives. Meanwhile there's no ABI breakage if dest does it, because
src QEMU has nothing to do with it.
IIUC we could have avoided introducing this cap in the first place, but
it's still not too late to just always do it. The cap is now prone to
removal, but that will be for later patches.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-4-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Peter Xu [Fri, 6 Dec 2024 23:08:34 +0000 (18:08 -0500)]
qmp/cont: Only activate disks if migration completed
As the comment says, the activation of disks is for the case where
migration has completed, rather than when QEMU is still in the middle of
migration (RUN_STATE_INMIGRATE).
Move the code over to reflect what the comment is describing.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-3-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Peter Xu [Fri, 6 Dec 2024 23:08:33 +0000 (18:08 -0500)]
migration: Add helper to get target runstate
In 99% of cases, after QEMU migrates to the dest host, it tries to detect the
target VM runstate using global_state_get_runstate().
There's one outlier so far which is Xen that won't send global state.
That's the major reason why global_state_received() check was always there
together with global_state_get_runstate().
However it's utterly confusing why global_state_received() has anything to
do with "let's start VM or not".
Provide a helper to explain it, so that we have a unified entry for getting
the target dest QEMU runstate after migration.
Suggested-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241206230838.1111496-2-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Fri, 13 Dec 2024 16:01:19 +0000 (13:01 -0300)]
migration/multifd: Fix compat with QEMU < 9.0
Commit f5f48a7891 ("migration/multifd: Separate SYNC request with
normal jobs") changed the multifd source side to stop sending data
along with the MULTIFD_FLAG_SYNC, effectively introducing the concept
of a SYNC-only packet. Relying on that, commit d7e58f412c
("migration/multifd: Don't send ram data during SYNC") later came
along and skipped reading data from SYNC packets.
In a versions timeline like this:
8.2  f5f48a7  9.0  9.1  d7e58f41  9.2
The issue arises that QEMUs < 9.0 still send data along with SYNC, but
QEMUs > 9.1 don't gather that data anymore. This leads to various
kinds of migration failures due to desync/missing data.
Stop checking for a SYNC packet on the destination and unconditionally
unfill the packet.
From now on:
old -> new:
the source sends data + sync, destination reads normally
new -> new:
source sends only sync, destination reads zeros
new -> old:
source sends only sync, destination reads zeros
CC: qemu-stable@nongnu.org
Fixes: d7e58f412c ("migration/multifd: Don't send ram data during SYNC")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2720
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241213160120.23880-2-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
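The receive-side rule above can be illustrated with a toy model. It is a
sketch in Python, not the multifd code: the packet is modelled as a dict,
recv_packet is a hypothetical name, and the flag value is chosen for
illustration only.

```python
MULTIFD_FLAG_SYNC = 1 << 0  # value chosen for illustration


def recv_packet(packet, guest_pages):
    """Always unfill the packet, then report whether a sync was requested.

    packet: {"flags": int, "pages": {page_index: bytes}} where "pages"
    may be missing for a SYNC-only packet.
    """
    # Never skip the payload because of SYNC: an old source may still
    # send page data in the same packet.
    for index, data in packet.get("pages", {}).items():
        guest_pages[index] = data
    return bool(packet["flags"] & MULTIFD_FLAG_SYNC)
```

A packet from an old source (data plus SYNC) updates guest_pages and
requests a sync, while a SYNC-only packet from a new source simply carries
zero pages.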
Peter Xu [Fri, 6 Dec 2024 22:47:55 +0000 (17:47 -0500)]
migration/multifd: Document the reason to sync for save_setup()
It's not straightforward to see why src QEMU needs to sync multifd during
the setup() phase. After all, there's no page queued at that point.
For old QEMUs, there's a solid reason: EOS requires it to work. The reason
is far from obvious on new QEMUs, which do not treat the EOS message as a
sync request.
One will figure that out only when this is conditionally removed. In fact,
the author did try it out. Logically we could still avoid doing this on
new machine types, however that needs a separate compat field, which would
be overkill for some trivial overhead in the setup() phase.
Let's instead document it completely, so that nobody else tries this
again and repeats the debugging, or is confused about why this ever
existed.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206224755.1108686-8-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Peter Xu [Fri, 6 Dec 2024 22:47:54 +0000 (17:47 -0500)]
migration/multifd: Cleanup src flushes on condition check
The src flush condition check is overcomplicated, and it will get further
out of control once postcopy is involved.
In general, we have two modes to do the sync: legacy or modern ways.
Legacy uses per-section flush, modern uses per-round flush.
Mapped-ram always uses the modern, which is per-round.
Introduce two helpers, which can greatly simplify the code, and hopefully
make it readable again.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206224755.1108686-7-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
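The two modes above can be written down as a pair of predicates. This is a
sketch in Python under stated assumptions: the helper names and the three
boolean parameters are hypothetical and only encode what the message says
(legacy is per-section, modern is per-round, mapped-ram is always
per-round).

```python
def sync_per_round(multifd, mapped_ram, legacy_flush):
    """True when multifd should flush once per iteration round (modern)."""
    if not multifd:
        return False
    if mapped_ram:
        return True  # mapped-ram always uses the modern way
    return not legacy_flush


def sync_per_section(multifd, mapped_ram, legacy_flush):
    """True when multifd should flush after each section (legacy)."""
    if not multifd:
        return False
    return not sync_per_round(multifd, mapped_ram, legacy_flush)
```

With multifd enabled exactly one of the two predicates is true, so call
sites can test a single helper instead of repeating the combined
condition.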
Peter Xu [Fri, 6 Dec 2024 22:47:53 +0000 (17:47 -0500)]
migration/multifd: Remove sync processing on postcopy
Multifd has never worked with postcopy, at least so far.
Remove the sync processing there, because it's confusing, and they should
never appear. Now if RAM_SAVE_FLAG_MULTIFD_FLUSH is observed, we fail hard
instead of trying to invoke multifd code.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241206224755.1108686-6-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Peter Xu [Fri, 6 Dec 2024 22:47:52 +0000 (17:47 -0500)]
migration/multifd: Unify RAM_SAVE_FLAG_MULTIFD_FLUSH messages
RAM_SAVE_FLAG_MULTIFD_FLUSH message should always be correlated to a sync
request on src. Unify such message into one place, and conditionally send
the message only if necessary.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241206224755.1108686-5-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Peter Xu [Fri, 6 Dec 2024 22:47:51 +0000 (17:47 -0500)]
migration/ram: Move RAM_SAVE_FLAG* into ram.h
Firstly, we're going to use the multifd flag soon in multifd code, so ram.c
isn't gonna work.
Secondly, we have a separate RDMA flag dangling around, which is definitely
not obvious. There's one comment that helps, but not too much.
Put all RAM save flags together, so nothing will get overlooked.
Add a section explaining why we can't use bits over 0x200.
Remove RAM_SAVE_FLAG_FULL as it's already not used in QEMU, as the comment
explained.
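The commit message does not spell out why bits over 0x200 are off limits.
One plausible reading, assumed here and not confirmed by the text above,
is that the flags share a word with a page-aligned offset, so only the bits
below the smallest supported page size are free. The sketch below shows
that packing with an assumed 1 KiB (0x400) minimum page size; every
DEMO_* name is hypothetical and is not a QEMU definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration; not the QEMU definitions. */
#define DEMO_MIN_PAGE_SIZE 0x400u /* assumed smallest page size: 1 KiB */
#define DEMO_FLAG_MASK     (DEMO_MIN_PAGE_SIZE - 1)
#define DEMO_FLAG_ZERO     0x002u
#define DEMO_FLAG_HIGHEST  0x200u /* highest single bit below 0x400 */

/* Pack flags into the low bits of a page-aligned offset. */
static uint64_t demo_pack(uint64_t offset, uint64_t flags)
{
    /* The offset must be page-aligned, leaving its low bits clear. */
    assert((offset & DEMO_FLAG_MASK) == 0);
    /* The flags must fit entirely below the page size. */
    assert((flags & ~(uint64_t)DEMO_FLAG_MASK) == 0);
    return offset | flags;
}

/* Recover the offset by clearing the flag bits. */
static uint64_t demo_offset(uint64_t word)
{
    return word & ~(uint64_t)DEMO_FLAG_MASK;
}

/* Recover the flags by keeping only the low bits. */
static uint64_t demo_flags(uint64_t word)
{
    return word & DEMO_FLAG_MASK;
}
```

Under that assumption, a flag at 0x400 or above would collide with a valid
offset bit, which is why 0x200 would be the last usable flag.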
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241206224755.1108686-4-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Peter Xu [Fri, 6 Dec 2024 22:47:50 +0000 (17:47 -0500)]
migration/multifd: Allow to sync with sender threads only
Teach multifd_send_sync_main() to sync with threads only.
We already have such a request: when mapped-ram is enabled with multifd.
In that case, no SYNC messages are pushed to the stream when multifd syncs
the sender threads, because there are no destination threads waiting for
them. The whole point of the sync is to make sure all threads have finished
their jobs.
So fundamentally we have a request to do the sync in different ways:
- Either to sync the threads only,
- Or to sync the threads but also with the destination side.
Mapped-ram does this already, thanks to the use_packet check in the sync
handler of the sender thread. It works.
However, it may stop working once e.g. VFIO starts to reuse multifd
channels to push device states. In that case VFIO has a similar need for a
"thread-only sync", but we can't check a flag because such a sync request
can still come from RAM, which needs the on-wire notifications.
Pave the way for that by allowing callers of multifd_send_sync_main() to
specify what kind of sync they need. We can use it for mapped-ram already.
No functional change intended.
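To make the two kinds of sync concrete, here is a minimal sketch of a
caller-selected sync mode. The enum, struct and function names are
hypothetical and do not reflect the actual signature of
multifd_send_sync_main(); counters stand in for waiting on a sender thread
and for pushing a SYNC message to the stream.

```c
#include <assert.h>

/* Hypothetical enum; the names are illustrative, not taken from QEMU. */
typedef enum {
    DEMO_SYNC_THREADS_ONLY,    /* wait for sender threads, nothing on the wire */
    DEMO_SYNC_THREADS_AND_DST  /* also notify the destination side */
} DemoSyncKind;

typedef struct {
    int threads_synced;
    int wire_messages;
} DemoSyncStats;

/* Sync every sender thread; the caller decides whether the wire is used. */
static void demo_send_sync(DemoSyncKind kind, int nthreads,
                           DemoSyncStats *stats)
{
    for (int i = 0; i < nthreads; i++) {
        /* Stand-in for waiting until sender thread i has finished. */
        stats->threads_synced++;
        if (kind == DEMO_SYNC_THREADS_AND_DST) {
            /* Stand-in for pushing a SYNC message on channel i. */
            stats->wire_messages++;
        }
    }
}
```

Passing the mode explicitly, rather than deriving it from a global
property, is what would let two users with different needs share the same
channels.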
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206224755.1108686-3-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Peter Xu [Fri, 6 Dec 2024 22:47:49 +0000 (17:47 -0500)]
migration/multifd: Further remove the SYNC on complete
Commit 637280aeb2 ("migration/multifd: Avoid the final FLUSH in
complete()") stopped sending the RAM_SAVE_FLAG_MULTIFD_FLUSH flag at
ram_save_complete(), because the sync on the destination side is not
needed due to the last iteration of find_dirty_block() having already
done it.
However, that commit overlooked that multifd_ram_flush_and_sync() on the
source side is also not needed at ram_save_complete(), for the same
reason.
Moreover, removing the RAM_SAVE_FLAG_MULTIFD_FLUSH but keeping the
multifd_ram_flush_and_sync() means that currently the recv threads will
hang when receiving the MULTIFD_FLAG_SYNC message, waiting for the
destination sync which only happens when RAM_SAVE_FLAG_MULTIFD_FLUSH is
received.
Luckily, multifd still works fine because the recv-side cleanup code
(mostly multifd_recv_sync_main()) is smart enough to kick out any recv
threads that are stuck at SYNC. And since this is the completion phase of
migration, nothing else will be sent after the SYNCs.
This needs to be fixed because in the future VFIO will have data to push
after ram_save_complete() and we don't want the recv thread to be stuck
in the MULTIFD_FLAG_SYNC message.
Remove the unnecessary (and buggy) invocation of
multifd_ram_flush_and_sync().
For very old binaries (multifd_flush_after_each_section==true), the
flush_and_sync is still needed because each EOS received on destination
will enforce all-channel sync once.
Stable branches do not need this patch, as I can't think of any real bug
it would cause there. No Fixes tag is attached, to make clear that no
backport is needed.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241206224755.1108686-2-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Shameer Kolothum [Tue, 3 Dec 2024 12:49:43 +0000 (12:49 +0000)]
migration/multifd: Fix compile error caused by page_size usage
From commit 90fa121c6c07 ("migration/multifd: Inline page_size and
page_count") onwards, page_size is no longer part of MultiFD*Params; an
inline constant is used instead.
However, it missed updating an old usage, causing a compile error.
Fixes: 90fa121c6c07 ("migration/multifd: Inline page_size and page_count")
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241203124943.52572-1-shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Philippe Mathieu-Daudé [Thu, 2 Jan 2025 13:53:15 +0000 (14:53 +0100)]
system: Inline machine_containers[] in qemu_create_machine_containers()
Only qemu_create_machine_containers() uses the
machine_containers[] array; restrict its scope
to this single user.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250102211800.79235-9-philmd@linaro.org>
Paolo Bonzini [Tue, 7 Jan 2025 11:13:08 +0000 (12:13 +0100)]
qom: remove unused InterfaceInfo::concrete_class field
The "concrete_class" field of InterfaceClass is only ever written, and as far
as I can tell is not particularly useful when debugging either; remove it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20250107111308.21886-1-pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Peter Xu [Thu, 21 Nov 2024 19:22:02 +0000 (14:22 -0500)]
qom: Remove container_get()
Now that container_get() has no users, remove it.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241121192202.4155849-14-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Peter Xu [Thu, 21 Nov 2024 19:22:01 +0000 (14:22 -0500)]
qom: Use object_get_container()
Use object_get_container() whenever applicable across the tree.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241121192202.4155849-13-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Peter Xu [Thu, 21 Nov 2024 19:22:00 +0000 (14:22 -0500)]
qom: Add object_get_container()
Add a helper to fetch a root container (under object_get_root()).
Sanity-check the type of the object.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-ID: <20241121192202.4155849-12-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>