qemu.git
3 months agoMAINTAINERS: Add myself as HPPA maintainer
Helge Deller [Thu, 30 Jan 2025 12:36:24 +0000 (13:36 +0100)]
MAINTAINERS: Add myself as HPPA maintainer

Since I contribute quite some code to hppa, I'd like to step up and
become the secondary maintainer for HPPA beside Richard.
Additionally change status of hppa machines to maintained as I will
take care of them.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
3 months agohw/i386/pc: Remove unused pc_compat_2_3 declarations
Philippe Mathieu-Daudé [Wed, 15 Jan 2025 23:22:27 +0000 (00:22 +0100)]
hw/i386/pc: Remove unused pc_compat_2_3 declarations

We removed the implementations in commit 46a2bd52571
("hw/i386/pc: Remove deprecated pc-i440fx-2.3 machine")
but forgot to remove the declarations. Do it now.

Fixes: 46a2bd52571 ("hw/i386/pc: Remove deprecated pc-i440fx-2.3 machine")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
3 months agolicenses: Remove SPDX tags not being license identifier for Linaro
Philippe Mathieu-Daudé [Thu, 2 Jan 2025 16:05:10 +0000 (17:05 +0100)]
licenses: Remove SPDX tags not being license identifier for Linaro

Per [*]:

  "we're only interested in adopting SPDX for recording the
  licensing info, [not] any other SPDX metadata."

Replace the 'SPDX-FileCopyrightText' and 'SPDX-FileContributor'
tags added by Linaro by 'Copyright (c)' and 'Authors' words
respectively.

[*] https://lore.kernel.org/qemu-devel/20241007154548.1144961-4-berrange@redhat.com/

Inspired-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
3 months agotests/functional/test_mips_malta: Fix comment about endianness of the test
Thomas Huth [Mon, 27 Jan 2025 18:41:10 +0000 (19:41 +0100)]
tests/functional/test_mips_malta: Fix comment about endianness of the test

This test is for the big endian MIPS target, not for the little endian
target.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Fixes: 79cb4a14cb6 ("tests/functional: Convert mips32eb 4Kc Malta tests")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
3 months agogdbstub/user-target: fix gdbserver int format (%d -> %x)
Dominik 'Disconnect3d' Czarnota [Mon, 20 Jan 2025 22:28:58 +0000 (23:28 +0100)]
gdbstub/user-target: fix gdbserver int format (%d -> %x)

This commit fixes an incorrect format string for formatting integers
provided to GDB when debugging a target run in QEMU user mode.

The correct format is hexadecimal for both success and errno values,
some of which can be seen here [0].

[0] https://github.com/bminor/binutils-gdb/blob/e65a355022d0dc6b5707310876a72b5693ec0aa5/gdbserver/hostio.cc#L196-L213

Signed-off-by: Dominik 'Disconnect3d' Czarnota <dominik.b.czarnota@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Fixes: e282010b2e1e ("gdbstub: Add support for info proc mappings")
Cc: qemu-stable@nongnu.org
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
3 months agovvfat: create_long_filename: fix out-of-bounds array access
Michael Tokarev [Sun, 19 Jan 2025 09:35:47 +0000 (12:35 +0300)]
vvfat: create_long_filename: fix out-of-bounds array access

create_long_filename() intentionally uses direntry_t->name[8+3] array
as a larger array.  This works, but makes static code analysis tools
unhappy.  The problem here is that a directory entry holding long file
name is significantly different from regular directory entry, and the
name is split into several parts within the entry, not just in regular
8+3 name field.

Treat the entry as array of bytes instead.  This fixes the OOB access
from the compiler/tools PoV, but does not change the resulting code
in any way.

Keep the existing code style.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
3 months agotests/functional/test_mips_malta: Convert the mips big endian replay tests
Thomas Huth [Tue, 28 Jan 2025 15:28:39 +0000 (16:28 +0100)]
tests/functional/test_mips_malta: Convert the mips big endian replay tests

Move the mips big endian replay tests from tests/avocado/replay_kernel.py
to the functional framework. Since the functional tests should be run per
target, we cannot stick all replay tests in one file. Thus let's add
these tests to a separate file now.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250128152839.184599-6-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agotests/functional/test_mips64el_malta: Convert the mips64el replay tests
Thomas Huth [Tue, 28 Jan 2025 15:28:38 +0000 (16:28 +0100)]
tests/functional/test_mips64el_malta: Convert the mips64el replay tests

Move the mips64el replay tests from tests/avocado/replay_kernel.py to
the functional framework. Since the functional tests should be run per
target, we cannot stick all replay tests in one file. Thus let's add
these tests to a separate file there now.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250128152839.184599-5-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agotests/functional/test_mipsel_malta: Convert the mipsel replay tests
Thomas Huth [Tue, 28 Jan 2025 15:28:37 +0000 (16:28 +0100)]
tests/functional/test_mipsel_malta: Convert the mipsel replay tests

Move the mipsel replay tests from tests/avocado/replay_kernel.py to
the functional framework. Since the functional tests should be run per
target, we cannot stick all replay tests in one file. Thus let's add
these tests to a new, separate file there instead.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250128152839.184599-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agotests/functional: Add the ReplayKernelBase class
Thomas Huth [Tue, 28 Jan 2025 15:28:36 +0000 (16:28 +0100)]
tests/functional: Add the ReplayKernelBase class

Copy the ReplayKernelBase class from the avocado tests. We are going
to need it to convert the related replay tests in the following patches.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250128152839.184599-3-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agotests/functional: Add a decorator for skipping long running tests
Thomas Huth [Tue, 28 Jan 2025 15:28:35 +0000 (16:28 +0100)]
tests/functional: Add a decorator for skipping long running tests

Some tests have a very long runtime and might run into timeout issues
e.g. when QEMU has been compiled with --enable-debug. Add a decorator
for marking them more easily. Rename the corresponding environment
variable to be more in sync with the other QEMU_TEST_ALLOW_* switches
that we already have, and add a paragraph about it in the documentation.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250128152839.184599-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agonet/dump: Correctly compute Ethernet packet offset
Laurent Vivier [Fri, 17 Jan 2025 11:17:09 +0000 (12:17 +0100)]
net/dump: Correctly compute Ethernet packet offset

When a packet is sent with QEMU_NET_PACKET_FLAG_RAW by QEMU it
never includes virtio-net header even if qemu_get_vnet_hdr_len()
is not 0, and filter-dump is not managing this case.

The only user of QEMU_NET_PACKET_FLAG_RAW is announce_self,
we can show the problem using it and tcpddump:

- QEMU parameters:

  .. -monitor stdio \
     -netdev bridge,id=netdev0,br=virbr0 \
     -device virtio-net,mac=9a:2b:2c:2d:2e:2f,netdev=netdev0 \
     -object filter-dump,netdev=netdev0,file=log.pcap,id=pcap0

- HMP command:

  (qemu) announce_self

- TCP dump:

  $ tcpdump -nxr log.pcap

  without the fix:

    08:00:06:04:00:03 > 2e:2f:80:35:00:01, ethertype Unknown (0x9a2b), length 50:
         0x0000:  2c2d 2e2f 0000 0000 9a2b 2c2d 2e2f 0000
         0x0010:  0000 0000 0000 0000 0000 0000 0000 0000
         0x0020:  0000 0000

  with the fix:

    ARP, Reverse Request who-is 9a:2b:2c:2d:2e:2f tell 9a:2b:2c:2d:2e:2f, length 46
         0x0000:  0001 0800 0604 0003 9a2b 2c2d 2e2f 0000
         0x0010:  0000 9a2b 2c2d 2e2f 0000 0000 0000 0000
         0x0020:  0000 0000 0000 0000 0000 0000 0000

Fixes: 481c52320a26 ("net: Strip virtio-net header when dumping")
Cc: akihiko.odaki@daynix.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
3 months agonet: Fix announce_self
Laurent Vivier [Fri, 17 Jan 2025 11:17:08 +0000 (12:17 +0100)]
net: Fix announce_self

b9ad513e1876 ("net: Remove receive_raw()") adds an iovec entry
in qemu_deliver_packet_iov() to add the virtio-net header
in the data when QEMU_NET_PACKET_FLAG_RAW is set but forgets
to increase the number of iovec entries in the array, so
receive_iov() will only send the first entry (the virtio-net
entry, full of 0) and no data. The packet will be discarded.

The only user of QEMU_NET_PACKET_FLAG_RAW is announce_self.

We can see the problem with tcpdump:

- QEMU parameters:

  .. -monitor stdio \
     -netdev bridge,id=netdev0,br=virbr0 \
     -device virtio-net,mac=9a:2b:2c:2d:2e:2f,netdev=netdev0 \

- HMP command:

  (qemu) announce_self

- TCP dump:

  $ sudo tcpdump -nxi virbr0

  without the fix:

    <nothing>

  with the fix:

   ARP, Reverse Request who-is 9a:2b:2c:2d:2e:2f tell 9a:2b:2c:2d:2e:2f, length 46
        0x0000:  0001 0800 0604 0003 9a2b 2c2d 2e2f 0000
        0x0010:  0000 9a2b 2c2d 2e2f 0000 0000 0000 0000
        0x0020:  0000 0000 0000 0000 0000 0000 0000

Reported-by: Xiaohui Li <xiaohli@redhat.com>
Bug: https://issues.redhat.com/browse/RHEL-73891
Fixes: b9ad513e1876 ("net: Remove receive_raw()")
Cc: akihiko.odaki@daynix.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
3 months agotests/functional: Extend PPC 40p test with Linux boot
Cédric Le Goater [Wed, 29 Jan 2025 10:48:44 +0000 (11:48 +0100)]
tests/functional: Extend PPC 40p test with Linux boot

Fetch the cdrom image for the IBM 6015 PReP PowerPC machine hosted on
the Juneau Linux Users Group site, boot and check Linux version.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20250129104844.1322100-1-clg@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agos390x/s390-virtio-ccw: Support plugging PCI-based virtio memory devices
David Hildenbrand [Tue, 28 Jan 2025 18:57:05 +0000 (19:57 +0100)]
s390x/s390-virtio-ccw: Support plugging PCI-based virtio memory devices

Let's just wire it up, unlocking virtio-mem-pci support on s390x.

While at it, drop the "return;" in s390_machine_device_unplug_request(),
to make it look like the other handlers.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-ID: <20250128185705.1609038-3-david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agovirtio-mem-pci: Allow setting nvectors, so we can use MSI-X
David Hildenbrand [Tue, 28 Jan 2025 18:57:04 +0000 (19:57 +0100)]
virtio-mem-pci: Allow setting nvectors, so we can use MSI-X

Let's do it similar as virtio-balloon-pci. With this change, we can
use virtio-mem-pci on s390x, although plugging will still fail until
properly wired up in the machine.

No need to worry about transitional/non_transitional devices, because they
don't exist for virtio-mem.

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250128185705.1609038-2-david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agovirtio-balloon-pci: Allow setting nvectors, so we can use MSI-X
Reza Arbab [Wed, 15 Jan 2025 16:14:25 +0000 (10:14 -0600)]
virtio-balloon-pci: Allow setting nvectors, so we can use MSI-X

Most virtio-pci devices allow MSI-X. Add it to virtio-balloon-pci, but
only enable it in new machine types, so we don't break migration of
existing machine types between different qemu versions.

This copies what was done for virtio-rng-pci in:
9ea02e8f1306 ("virtio-rng-pci: Allow setting nvectors, so we can use MSI-X")
bad9c5a5166f ("virtio-rng-pci: fix migration compat for vectors")
62bdb8871512 ("virtio-rng-pci: fix transitional migration compat for vectors")

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Message-ID: <20250115161425.246348-1-arbab@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agohw/s390x/s390-virtio-ccw: Fix a record/replay deadlock
Ilya Leoshkevich [Fri, 24 Jan 2025 11:25:48 +0000 (12:25 +0100)]
hw/s390x/s390-virtio-ccw: Fix a record/replay deadlock

Booting an s390x VM in record/replay mode hangs due to a deadlock
between rr_cpu_thread_fn() and s390_machine_reset(). The former needs
the record/replay mutex held by the latter, and the latter waits until
the former completes its run_on_cpu() request.

Fix by temporarily dropping the record/replay mutex, like it's done in
pause_all_vcpus().

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20250124112625.23050-1-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agotests/tcg/s390x: Test modifying code using the MVC instruction
Ilya Leoshkevich [Tue, 28 Jan 2025 00:12:43 +0000 (01:12 +0100)]
tests/tcg/s390x: Test modifying code using the MVC instruction

Add a small test to prevent regressions.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250128001338.11474-2-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agotarget/s390x: Fix MVC not always invalidating translation blocks
Ilya Leoshkevich [Tue, 28 Jan 2025 00:12:42 +0000 (01:12 +0100)]
target/s390x: Fix MVC not always invalidating translation blocks

Node.js crashes in qemu-system-s390x with random SIGSEGVs / SIGILLs.

The v8 JIT used by Node.js can garbage collect and overwrite unused
code. Overwriting is performed by WritableJitAllocation::CopyCode(),
which ultimately calls memcpy(). For certain sizes, memcpy() uses the
MVC instruction.

QEMU implements MVC and other similar instructions using helpers. While
TCG store ops invalidate affected translation blocks automatically,
helpers must do this manually by calling probe_access_flags(). The MVC
helper does this using the access_prepare() -> access_prepare_nf() ->
s390_probe_access() -> probe_access_flags() call chain.

At the last step of this chain, the store size is replaced with 0. This
causes the probe_access_flags() -> notdirty_write() ->
tb_invalidate_phys_range_fast() chain to miss some translation blocks.

When this happens, QEMU executes a mix of old and new code. This
quickly leads to either a SIGSEGV or a SIGILL in case the old code
ends in the middle of a new instruction.

Fix by passing the true size.

Reported-by: Berthold Gunreben <azouhr@opensuse.org>
Cc: Sarah Kriesch <ada.lovelace@gmx.de>
Cc: qemu-stable@nongnu.org
Closes: https://bugzilla.opensuse.org/show_bug.cgi?id=1235709
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Fixes: e2faabee78ff ("accel/tcg: Forward probe size on to notdirty_write")
Message-ID: <20250128001338.11474-1-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agotarget/s390x: Fix PPNO execution with icount
Ilya Leoshkevich [Thu, 23 Jan 2025 12:37:53 +0000 (13:37 +0100)]
target/s390x: Fix PPNO execution with icount

Executing PERFORM RANDOM NUMBER OPERATION makes QEMU exit with "Bad
icount read" when using record/replay. This is caused by
icount_get_raw_locked() if the current instruction is not the last one
in the respective translation block.

For the x86_64's rdrand this is resolved by calling
translator_io_start(). On s390x one uses IF_IO in order to make this
call happen automatically.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250123123808.194405-1-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agotests/functional/test_mips_malta: Fix comment about endianness of the test
Thomas Huth [Mon, 27 Jan 2025 18:41:10 +0000 (19:41 +0100)]
tests/functional/test_mips_malta: Fix comment about endianness of the test

This test is for the big endian MIPS target, not for the little endian
target.

Fixes: 79cb4a14cb6 ("tests/functional: Convert mips32eb 4Kc Malta tests")
Message-ID: <20250127184112.108122-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agotests/functional: Add a ppc64 mac99 test
Cédric Le Goater [Tue, 28 Jan 2025 21:21:45 +0000 (22:21 +0100)]
tests/functional: Add a ppc64 mac99 test

The test sequence boots from disk a mac99 machine in 64-bit mode, in
which case the CPU is a PPC 970.

The buildroot rootfs is built with config :

BR2_powerpc64=y
BR2_powerpc_970=y

and the kernel with the g5 deconfig.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20250128212145.1186617-1-clg@redhat.com>
[thuth: Adjusted the comment about '-nographic]
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agotests/functional: Fix the aarch64_tcg_plugins test
Thomas Huth [Thu, 23 Jan 2025 08:36:25 +0000 (09:36 +0100)]
tests/functional: Fix the aarch64_tcg_plugins test

Unfortunately, this test had not been added to meson.build, so we did
not notice a regression: Looking for 'Kernel panic - not syncing: VFS:'
as the indication for the final boot state of the kernel was a bad
idea since 'Kernel panic - not syncing' is the default failure
message of the LinuxKernelTest class, and since we're now reading
the console input byte by byte instead of linewise (see commit
cdad03b74f75), the failure now triggers before we fully read the
success string. Let's fix this by simply looking for the previous
line in the console output instead.

Also, replace the call to cancel() - this was only available in the
Avocado framework. In the functional framework, we must use skipTest()
instead. While we're at it, also fix the TODO here by looking for the
exact error and only skip the test if the plugins are not available.

Fixes: 3abc545e66 ("tests/functional: Convert the tcg_plugins test")
Fixes: cdad03b74f ("tests/functional: rewrite console handling to be bytewise")
Message-ID: <20250123083625.1498495-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agotests/functional: Convert the migration avocado test
Thomas Huth [Fri, 3 Jan 2025 07:43:08 +0000 (08:43 +0100)]
tests/functional: Convert the migration avocado test

Now that we've got a find_free_port() function in the functional
test framework, we can convert the migration test, too.
While the original avocado test was only meant to run on aarch64,
ppc64 and x86, we can turn this into a more generic test by now
and run it on all architectures that have a machine which ships
with a working firmware. To avoid overlapping with the migration
qtest, we now also test migration on machines that are not covered
by the migration qtest yet.

Acked-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20250103074308.463860-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agotests/functional: Fix broken decorators with lamda functions
Thomas Huth [Wed, 22 Jan 2025 13:43:14 +0000 (14:43 +0100)]
tests/functional: Fix broken decorators with lamda functions

The decorators that use a lambda function are currently broken
and do not properly skip the test if the condition is not met.
Using "return skipUnless(lambda: ...)" does not work as expected.
To fix it, rewrite the decorators without lambda, it's simpler
that way anyway.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250122134315.1448794-3-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agotests/functional/qemu_test/decorators: Fix bad check for imports
Thomas Huth [Wed, 22 Jan 2025 13:43:13 +0000 (14:43 +0100)]
tests/functional/qemu_test/decorators: Fix bad check for imports

skipIfMissingImports should use importlib.import_module() for checking
whether a module with the name stored in the "impname" variable is
available or not, otherwise the code tries to import a module with
the name "impname" instead.
(This bug hasn't been noticed before since there is another issue
with this decorator that will be fixed by the next patch)

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250122134315.1448794-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
3 months agomigration: refactor ram_save_target_page functions
Prasad Pandit [Mon, 27 Jan 2025 12:08:21 +0000 (17:38 +0530)]
migration: refactor ram_save_target_page functions

Refactor ram_save_target_page legacy and multifd
functions into one. Other than simplifying it,
it frees 'migration_ops' object from usage, so it
is expunged.

Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20250127120823.144949-3-ppandit@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Trivial cleanup on JSON writer of vmstate_save()
Peter Xu [Tue, 14 Jan 2025 23:07:46 +0000 (18:07 -0500)]
migration: Trivial cleanup on JSON writer of vmstate_save()

Two small cleanups in the same section of vmstate_save():

  - Check vmdesc before the "mixed null/non-null data in array" logic, to
  be crystal clear that it's only about the JSON writer, not the vmstate on
  its own in the migration stream.

  - Since we have is_null variable now, use that to replace a check.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-17-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Merge precopy/postcopy on switchover start
Peter Xu [Tue, 14 Jan 2025 23:07:45 +0000 (18:07 -0500)]
migration: Merge precopy/postcopy on switchover start

Now after all the cleanups, finally we can merge the switchover startup
phase into one single function for precopy/postcopy.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-16-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Always set DEVICE state
Peter Xu [Tue, 14 Jan 2025 23:07:44 +0000 (18:07 -0500)]
migration: Always set DEVICE state

DEVICE state was introduced back in 2017:

https://lore.kernel.org/qemu-devel/20171020090556.18631-1-dgilbert@redhat.com/

Quote from Dave's cover letter, when the pre-switchover phase was enabled,
the state transition looks like this:

  The precopy flow is:
  active->pre-switchover->device->completed

  The postcopy flow is:
  active->pre-switchover->postcopy-active->completed

To supplement above, when the cap is not enabled:

  The precopy flow is:
  active->completed

  The postcopy flow is:
  active->postcopy-active->completed

It works for us, though we have some code just to special case these state
transitions, so the DEVICE state currently is special only to precopy, and
only conditionally.

I had a quick discussion with Libvirt developers, it turns out that this
may not be necessary. IOW, it seems okay we can have DEVICE state to be
generic, so that we don't have over-complicated state machines.  It not
only helps align all the migration state machine, help cleanup the code
path especially on pre-switchover handling (see the patch itself), another
side benefit is we can unconditionally have a specific state to mark the
switchover phase, which might be helpful for debugging too.

This patch makes the DEVICE state to be present always, marking that source
QEMU is switching over.  Then the state machine will be always as simple
as:

  active-> [pre-switchover->] -> device -> [postcopy-active->] -> complete

After the change, no matter whether pre-switchover or postcopy is enabled
or not, we always have DEVICE state showing the switchover phase.  When
pre-switchover enabled, we'll have an extra stage before that.  When
postcopy is enabled, we'll have an extra stage after that.

A few qtests need touch up in QEMU tree for this change:

  - A few iotest outputs (194, 203, 234, 262, 280)
  - Teach libqos's migrate() on "device" state

Cc: Jiri Denemark <jdenemar@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-15-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Cleanup qemu_savevm_state_complete_precopy()
Peter Xu [Tue, 14 Jan 2025 23:07:43 +0000 (18:07 -0500)]
migration: Cleanup qemu_savevm_state_complete_precopy()

Now qemu_savevm_state_complete_precopy() is never used in postcopy, clean
it up as in_postcopy==false now unconditionally.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-14-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Unwrap qemu_savevm_state_complete_precopy() in postcopy
Peter Xu [Tue, 14 Jan 2025 23:07:42 +0000 (18:07 -0500)]
migration: Unwrap qemu_savevm_state_complete_precopy() in postcopy

Postcopy invokes qemu_savevm_state_complete_precopy() twice for a long
time, and that caused way too much confusions.  Let's clean this up and
make postcopy easier to read.

It's actually fairly straightforward: postcopy starts with saving
non-postcopiable iterables, then later it saves again with non-iterable
only.  Move these two calls out makes everything much easier to follow.
Otherwise it's very unclear what qemu_savevm_state_complete_precopy() did
in either of the calls.

No functional change intended.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-13-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Notify COMPLETE once for postcopy
Peter Xu [Tue, 14 Jan 2025 23:07:41 +0000 (18:07 -0500)]
migration: Notify COMPLETE once for postcopy

Postcopy invokes qemu_savevm_state_complete_precopy() twice, that means
it'll invoke COMPLETE notify twice.. also twice the tracepoints that
marking precopy complete.

Move that notification (along with the tracepoint) out to the caller, so
that postcopy will only notify once right at the start of switchover phase
from precopy.  When at it, rename it to suite the file now it locates.

For precopy, there should have no functional change except the tracepoint
has a name change.

For the other two users of qemu_savevm_state_complete_precopy(), namely:
qemu_savevm_state() and qemu_savevm_live_state(): the notifier shouldn't
matter because they're not precopy at all.  Now in these two contexts (aka,
"savevm", and "colo") sometimes the precopy notifiers will still be
invoked, but that's outside the scope of this patch.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-12-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Take BQL slightly longer in postcopy_start()
Peter Xu [Tue, 14 Jan 2025 23:07:40 +0000 (18:07 -0500)]
migration: Take BQL slightly longer in postcopy_start()

This paves way for some follow up patch to modify migration states at the
end of postcopy_start(), which should better be with the BQL so that
there's no way of concurrent cancellation.

So we'll do something slightly more with BQL but they're really trivial,
hopefully nothing will really chance with this.

A side benefit is we can drop another explicit lock() in failure path.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-11-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Drop cached migration state in migration_maybe_pause()
Peter Xu [Tue, 14 Jan 2025 23:07:39 +0000 (18:07 -0500)]
migration: Drop cached migration state in migration_maybe_pause()

I can't see why we must cache the state now after we avoided possible
CANCEL race: that's the only thing I can think of that can modify the
migration state concurrently with the migration thread itself.  Make all
the state updates to happen always, then we don't need to cache the state
anymore.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-10-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Adjust locking in migration_maybe_pause()
Peter Xu [Tue, 14 Jan 2025 23:07:38 +0000 (18:07 -0500)]
migration: Adjust locking in migration_maybe_pause()

In migration_maybe_pause() QEMU may yield BQL before waiting for a
semaphore.  However it yields the BQL too early, which logically gives it
chance for the main thread to quickly take the BQL and modify the state to
CANCELLING.

To avoid such race condition from happening at all, always update the
migration states within the BQL.  It'll make sure no concurrent
cancellation can ever happen.

With that, IIUC there's chance we can remove the extra parameter in
migration_maybe_pause() to update active state, but that'll be done
separately later.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-9-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Adjust postcopy bandwidth during switchover
Peter Xu [Tue, 14 Jan 2025 23:07:37 +0000 (18:07 -0500)]
migration: Adjust postcopy bandwidth during switchover

Precopy uses unlimited bandwidth always during switchover, it makes sense
because this is so critical and no one would like to throttle bandwidth
during the VM blackout.

OTOH, postcopy surprisingly didn't do that.  There's one line that in the
middle of the postcopy switchover it tries to switch to postcopy's
specified max-postcopy-bandwidth, but even so it's somewhere in the middle
which is strange.

This patch brings the two modes to always use unlimited bandwidth for
switchover, meanwhile only apply the postcopy max bandwidth after the
switchover is completed.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-8-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Synchronize all CPU states only for non-iterable dump
Peter Xu [Tue, 14 Jan 2025 23:07:36 +0000 (18:07 -0500)]
migration: Synchronize all CPU states only for non-iterable dump

Do one shot cpu sync at qemu_savevm_state_complete_precopy_non_iterable(),
instead of coding it separately in two places.

Note that in the context of qemu_savevm_state_complete_precopy(), this
patch is also an optimization for postcopy path, in that we can avoid sync
cpu twice during switchover: before this patch, postcopy_start() invokes
twice on qemu_savevm_state_complete_precopy(), each of them will try to
sync CPU info.  In reality, only one of them would be enough.

For background snapshot, there's no intended functional change.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-7-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Drop inactivate_disk param in qemu_savevm_state_complete*
Peter Xu [Tue, 14 Jan 2025 23:07:35 +0000 (18:07 -0500)]
migration: Drop inactivate_disk param in qemu_savevm_state_complete*

This parameter is only used by one caller, which is the genuine precopy
complete path (migration_completion_precopy).

The parameter was introduced in a1fbe750fd ("migration: Fix race of image
locking between src and dst") to make sure the inactivate will happen
before EOF to make sure dest will always be able to activate the disk
properly.  However there's no limitation on how early we inactivate the
disk.  For precopy completion path, we can always do that as long as VM is
stopped.

Move the disk inactivate there, then we can remove this inactivate_disk
parameter in the whole call stack, because all the rest users pass in false
always.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-6-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Avoid two src-downtime-end tracepoints for postcopy
Peter Xu [Tue, 14 Jan 2025 23:07:34 +0000 (18:07 -0500)]
migration: Avoid two src-downtime-end tracepoints for postcopy

Postcopy can trigger this tracepoint twice, while only the 1st one is
valid.  Avoid triggering the 2nd tracepoint just like what we do with
recording the total downtime.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-5-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Optimize postcopy on downtime by avoiding JSON writer
Peter Xu [Tue, 14 Jan 2025 23:07:33 +0000 (18:07 -0500)]
migration: Optimize postcopy on downtime by avoiding JSON writer

postcopy_start() is the entry function that postcopy is destined to start.
It also means QEMU source will not dump VM description, aka, the JSON
writer is garbage now.

We can leave that to be cleaned up when migration completes, however when
with the JSON writer object being present, vmstate_save() will still try to
construct the JSON objects for the VM descriptions, even though it'll never
be used later if it's postcopy.

To save those cycles, release the JSON writer earlier for postcopy. Then
vmstate_save() later will be smart enough to skip the JSON object
constructions completely.  It can logically reduce downtime because all
such JSON constructions happen during postcopy blackout.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-4-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Do not construct JSON description if suppressed
Peter Xu [Tue, 14 Jan 2025 23:07:32 +0000 (18:07 -0500)]
migration: Do not construct JSON description if suppressed

QEMU machine has a property "suppress-vmdesc". When it is enabled, QEMU
will stop attaching JSON VM description at the end of the precopy migration
stream (postcopy is never affected because postcopy never attach that).

However even if it's suppressed by the user, the source QEMU will still
construct the JSON descriptions, which is a complete waste of CPU and
memory resources.

To avoid it, only create the JSON writer object if suppress-vmdesc is not
specified.

Luckily, vmstate_save() already supports vmdesc==NULL, so only a few spots
that are left to be prepared that vmdesc can be NULL now.

When at it, move the init / destroy of the JSON writer object to start /
end of the migration - the JSON writer object is a sub-struct of migration
state, and that looks like the only object that was dynamically allocated /
destroyed within migration process.  Make it the same as the rest objects
that migration uses.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-3-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: Remove postcopy implications in should_send_vmdesc()
Peter Xu [Tue, 14 Jan 2025 23:07:31 +0000 (18:07 -0500)]
migration: Remove postcopy implications in should_send_vmdesc()

should_send_vmdesc() has a hack inside (which was not reflected in the
function name) in that it tries to detect global postcopy state and that
will affect the value to be returned.

It's easier to keep the helper simple by only check the suppress-vmdesc
property.  Then:

  - On the sender side of its usage, there's already in_postcopy variable
    that we can use: postcopy doesn't send vmdesc at all, so directly skip
    everything for postcopy.

  - On the recv side, when reaching vmdesc processing it must be precopy
    code already, hence that hack check never used to work anyway.

No functional change intended, except a trivial side effect that QEMU
source will start to avoid running some JSON helper in postcopy path, but
that would only reduce the postcopy blackout window a bit, rather than any
other bad side effect.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-2-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: cpr-transfer documentation
Steve Sistare [Wed, 15 Jan 2025 19:00:50 +0000 (11:00 -0800)]
migration: cpr-transfer documentation

Add documentation for the cpr-transfer migration mode.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-25-git-send-email-steven.sistare@oracle.com
[add -machine memory-backend=ram0]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration-test: cpr-transfer
Steve Sistare [Wed, 15 Jan 2025 19:00:49 +0000 (11:00 -0800)]
migration-test: cpr-transfer

Add a migration test for cpr-transfer mode.  Defer the connection to the
target monitor, else the test hangs because in cpr-transfer mode QEMU does
not listen for monitor connections until we send the migrate command to
source QEMU.

To test -incoming defer, send a migrate incoming command to the target,
after sending the migrate command to the source, as required by
cpr-transfer mode.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-24-git-send-email-steven.sistare@oracle.com
[only allocate in_channels when needed]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agotests/qtest: assert qmp connected
Steve Sistare [Wed, 15 Jan 2025 19:00:48 +0000 (11:00 -0800)]
tests/qtest: assert qmp connected

Assert that qmp_fd is valid when we communicate with the monitor.

Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1736967650-129648-23-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agotests/qtest: enhance migration channels
Steve Sistare [Wed, 15 Jan 2025 19:00:47 +0000 (11:00 -0800)]
tests/qtest: enhance migration channels

Change the migrate_qmp and migrate_qmp_fail channels argument to a QObject
type so the caller can manipulate the object before passing it to the
helper.  Define migrate_str_to_channel to aid such manipulation.
Add a channels argument to migrate_incoming_qmp.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-22-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration-test: defer connection
Steve Sistare [Wed, 15 Jan 2025 19:00:46 +0000 (11:00 -0800)]
migration-test: defer connection

Add an option to defer connection to the target monitor, needed by the
cpr-transfer test.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/1736967650-129648-21-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agotests/qtest: defer connection
Steve Sistare [Wed, 15 Jan 2025 19:00:45 +0000 (11:00 -0800)]
tests/qtest: defer connection

Add an option to defer making the connecting to the monitor and qtest
sockets when calling qtest_init_with_env.  The client makes the connection
later by calling qtest_connect and qtest_qmp_handshake.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-20-git-send-email-steven.sistare@oracle.com
[plumb capabilities list into qtest_qmp_handshake]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agotests/qtest: optimize migrate_set_ports
Steve Sistare [Wed, 15 Jan 2025 19:00:44 +0000 (11:00 -0800)]
tests/qtest: optimize migrate_set_ports

Do not query connection parameters if all port numbers are known.  This is
more efficient, and also solves a problem for the cpr-transfer test.
At the point where cpr-transfer calls migrate_qmp and migrate_set_ports,
the monitor is not connected and queries are not allowed.  Port=0 is
never used for cpr-transfer.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-19-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration-test: memory_backend
Steve Sistare [Wed, 15 Jan 2025 19:00:43 +0000 (11:00 -0800)]
migration-test: memory_backend

Allow each migration test to define its own memory backend, replacing
the standard "-m <size>" specification.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/1736967650-129648-18-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: cpr-transfer mode
Steve Sistare [Wed, 15 Jan 2025 19:00:42 +0000 (11:00 -0800)]
migration: cpr-transfer mode

Add the cpr-transfer migration mode, which allows the user to transfer
a guest to a new QEMU instance on the same host with minimal guest pause
time, by preserving guest RAM in place, albeit with new virtual addresses
in new QEMU, and by preserving device file descriptors.  Pages that were
locked in memory for DMA in old QEMU remain locked in new QEMU, because the
descriptor of the device that locked them remains open.

cpr-transfer preserves memory and devices descriptors by sending them to
new QEMU over a unix domain socket using SCM_RIGHTS.  Such CPR state cannot
be sent over the normal migration channel, because devices and backends
are created prior to reading the channel, so this mode sends CPR state
over a second "cpr" migration channel.  New QEMU reads the cpr channel
prior to creating devices or backends.  The user specifies the cpr channel
in the channel arguments on the outgoing side, and in a second -incoming
command-line parameter on the incoming side.

The user must start old QEMU with the the '-machine aux-ram-share=on' option,
which allows anonymous memory to be transferred in place to the new process
by transferring a memory descriptor for each ram block.  Memory-backend
objects must have the share=on attribute, but memory-backend-epc is not
supported.

The user starts new QEMU on the same host as old QEMU, with command-line
arguments to create the same machine, plus the -incoming option for the
main migration channel, like normal live migration.  In addition, the user
adds a second -incoming option with channel type "cpr".  This CPR channel
must support file descriptor transfer with SCM_RIGHTS, i.e. it must be a
UNIX domain socket.

To initiate CPR, the user issues a migrate command to old QEMU, adding
a second migration channel of type "cpr" in the channels argument.
Old QEMU stops the VM, saves state to the migration channels, and enters
the postmigrate state.  New QEMU mmap's memory descriptors, and execution
resumes.

The implementation splits qmp_migrate into start and finish functions.
Start sends CPR state to new QEMU, which responds by closing the CPR
channel.  Old QEMU detects the HUP then calls finish, which connects the
main migration channel.

In summary, the usage is:

  qemu-system-$arch -machine aux-ram-share=on ...

  start new QEMU with "-incoming <main-uri> -incoming <cpr-channel>"

  Issue commands to old QEMU:
    migrate_set_parameter mode cpr-transfer

    {"execute": "migrate", ...
        {"channel-type": "main"...}, {"channel-type": "cpr"...} ... }

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-17-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agoMerge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
Stefan Hajnoczi [Wed, 29 Jan 2025 14:51:03 +0000 (09:51 -0500)]
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* target/i386: optimize string instructions
* target/i386: new Sierra Forest and Clearwater Forest models
* rust: type-safe vmstate implementation
* rust: use interior mutability for PL011
* rust: clean ups
* memtxattrs: remove usage of bitfields from MEMTXATTRS_UNSPECIFIED
* gitlab-ci: enable Rust backtraces

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeZ6VYUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMjbQgApuooMOp0z/8Ky4/ux8M8/vrlcNCH
# V1Pm6WzrjEzd9TIMLGr6npOyLOkWI31Aa4o/TuW09SeKE3dpCf/7LYA5VDEtkH79
# F57MgnSj56sMNgu+QZ/SiGvkKJXl+3091jIianrrI0dtX8hPonm6bt55woDvQt3z
# p94+4zzv5G0nc+ncITCDho8sn5itdZWVOjf9n6VCOumMjF4nRSoMkJKYIvjNht6n
# GtjMhYA70tzjkIi4bPyYkhFpMNlAqEDIp2TvPzp6klG5QoUErHIzdzoRTAtE4Dpb
# 7240r6jarQX41TBXGOFq0NrxES1cm5zO/6159D24qZGHGm2hG4nDx+t2jw==
# =ZKFy
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 29 Jan 2025 03:39:50 EST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (49 commits)
  gitlab-ci: include full Rust backtraces in test runs
  rust: qemu-api: add sub-subclass to the integration tests
  rust/zeroable: Implement Zeroable with const_zero macro
  rust: qdev: make reset take a shared reference
  rust: pl011: drop use of ControlFlow
  rust: pl011: pull device-specific code out of MemoryRegionOps callbacks
  rust: pl011: remove duplicate definitions
  rust: pl011: wrap registers with BqlRefCell
  rust: pl011: extract PL011Registers
  rust: pl011: pull interrupt updates out of read/write ops
  rust: pl011: extract CharBackend receive logic into a separate function
  rust: pl011: extract conversion to RegisterOffset
  rust: pl011: hide unnecessarily "pub" items from outside pl011::device
  rust: pl011: remove unnecessary "extern crate"
  rust: prefer NonNull::new to assertions
  rust: vmstate: make order of parameters consistent in vmstate_clock
  rust: vmstate: remove translation of C vmstate macros
  rust: pl011: switch vmstate to new-style macros
  rust: qemu_api: add vmstate_struct
  rust: vmstate: add public utility macros to implement VMState
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 months agoMerge tag 'pull-target-arm-20250128-1' of https://git.linaro.org/people/pmaydell...
Stefan Hajnoczi [Wed, 29 Jan 2025 14:50:39 +0000 (09:50 -0500)]
Merge tag 'pull-target-arm-20250128-1' of https://git.linaro.org/people/pmaydell/qemu-arm into staging

target-arm queue:
 * hw/arm: Remove various uses of first_cpu global
 * hw/char/imx_serial: Fix reset value of UFCR register
 * hw/char/imx_serial: Update all state before restarting ageing timer
 * hw/pci-host/designware: Expose MSI IRQ
 * hw/arm/stellaris: refactoring, cleanup
 * hw/arm/stellaris: map both I2C controllers
 * tests/functional: Add a test for the arm microbit machine
 * target/arm: arm_reset_sve_state() should set FPSR, not FPCR
 * target/arm: refactorings preparatory to FEAT_AFP implementation
 * fpu: Rename float_flag_input_denormal to float_flag_input_denormal_flushed
 * fpu: Rename float_flag_output_denormal to float_flag_output_denormal_flushed
 * hw/usb/canokey: Fix buffer overflow for OUT packet

# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmeZOi0ZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3sUeEACwDhM4ldn/gVZgVN7nf42a
# /CLD/qJx1vqi5bAB5zkY1bSCR9hS2IkhTBoQQH9Ng6ztG1IRpT/tKXDJAemWty70
# XgExdl4yjdwXMQK4JKU9qSfaBTuX7Z8Hz+nA1AnblO/4H+XpVNVJzp8Ee/uWTyEd
# BKPBpwqbIXNwUWEqkzDok074Q05rHlhsJD2DsoJTcmtpROhLHLATwQDZGGFuf56H
# LVcdx6GRP+/mWEGWLtj19mvaR/2cn4rQf+I1MACZ81nRjQCHbCohNAMr2wFsKg1+
# 2jYk9uHdFoambJ5+mFuC55Efk+QJaP4vDR0Gf3jLloFr+rS/5h3HiUuD8dUWOwFd
# mPWXsjwYzqBW2knt1nfq1ByzYWZ8rVQEn5G53dX/eoNXuDGsonZxPnevgmv5kIUc
# /W618Jez1nu9RDtNKccobHEtTGlGInJxJ7YzkU7Q6FO80IAqSdV7t9v7uPLJwcnz
# nQz+wVzb4oOmwMzn3BpKY7N/S7IZOSy3ASNHj8o4yCHMJT8Ki0/N4bl0k0DLxJ0T
# RiNCsV9c7MJfo9a+pbOnu0Lc3SjjropdvHYU+bB7R0mgd8ysN+Tou0dpa+i7tUTu
# DHWqs2/+UApHKBiC+DSynPjjRR2aT/5lYFncGaiEVoEQttPLka3SAzgHPVQZs1zD
# bxZkEAFktAFGIjU70fYNkg==
# =H4p7
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 28 Jan 2025 15:12:29 EST
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg:                 aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* tag 'pull-target-arm-20250128-1' of https://git.linaro.org/people/pmaydell/qemu-arm: (36 commits)
  hw/usb/canokey: Fix buffer overflow for OUT packet
  target/arm: Use FPST_A64_F16 for halfprec-to-other conversions
  target/arm: Remove redundant advsimd float16 helpers
  fpu: Fix a comment in softfloat-types.h
  fpu: Rename float_flag_output_denormal to float_flag_output_denormal_flushed
  fpu: Rename float_flag_input_denormal to float_flag_input_denormal_flushed
  target/arm: Remove now-unused vfp.fp_status_f16 and FPST_FPCR_F16
  target/arm: Use FPST_A64_F16 in A64 decoder
  target/arm: Use FPST_A32_F16 in A32 decoder
  target/arm: Use fp_status_f16_a64 in AArch64-only helpers
  target/arm: Use fp_status_f16_a32 in AArch32-only helpers
  target/arm: Define new fp_status_f16_a32 and fp_status_f16_a64
  target/arm: Remove now-unused vfp.fp_status and FPST_FPCR
  target/arm: Use FPST_A64 in A64 decoder
  target/arm: Use FPST_A32 in A32 decoder
  target/arm: Use fp_status_a32 in vfp_cmp helpers
  target/arm: Use fp_status_a32 in vjvct helper
  target/arm: Use fp_status_a64 or fp_status_a32 in is_ebf()
  target/arm: Use vfp.fp_status_a64 in A64-only helper functions
  target/arm: Define new fp_status_a32 and fp_status_a64
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3 months agomigration: cpr-transfer save and load
Steve Sistare [Wed, 15 Jan 2025 19:00:41 +0000 (11:00 -0800)]
migration: cpr-transfer save and load

Add functions to create a QEMUFile based on a unix URI, for saving or
loading, for use by cpr-transfer mode to preserve CPR state.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-16-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: VMSTATE_FD
Steve Sistare [Wed, 15 Jan 2025 19:00:40 +0000 (11:00 -0800)]
migration: VMSTATE_FD

Define VMSTATE_FD for declaring a file descriptor field in a
VMStateDescription.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-15-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: SCM_RIGHTS for QEMUFile
Steve Sistare [Wed, 15 Jan 2025 19:00:39 +0000 (11:00 -0800)]
migration: SCM_RIGHTS for QEMUFile

Define functions to put/get file descriptors to/from a QEMUFile, for qio
channels that support SCM_RIGHTS.  Maintain ordering such that
  put(A), put(fd), put(B)
followed by
  get(A), get(fd), get(B)
always succeeds.  Other get orderings may succeed but are not guaranteed.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-14-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: incoming channel
Steve Sistare [Wed, 15 Jan 2025 19:00:38 +0000 (11:00 -0800)]
migration: incoming channel

Extend the -incoming option to allow an @MigrationChannel to be specified.
This allows channels other than 'main' to be described on the command
line, which will be needed for CPR.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-13-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: enhance migrate_uri_parse
Steve Sistare [Wed, 15 Jan 2025 19:00:37 +0000 (11:00 -0800)]
migration: enhance migrate_uri_parse

Export migrate_uri_parse for use outside migration internals, and define
a method migrate_is_uri that indicates when migrate_uri_parse should
be used.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-12-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agohostmem-shm: preserve for cpr
Steve Sistare [Wed, 15 Jan 2025 19:00:36 +0000 (11:00 -0800)]
hostmem-shm: preserve for cpr

Preserve memory-backend-shm memory objects during cpr-transfer.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-11-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agohostmem-memfd: preserve for cpr
Steve Sistare [Wed, 15 Jan 2025 19:00:35 +0000 (11:00 -0800)]
hostmem-memfd: preserve for cpr

Preserve memory-backend-memfd memory objects during cpr-transfer.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-10-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agophysmem: preserve ram blocks for cpr
Steve Sistare [Wed, 15 Jan 2025 19:00:34 +0000 (11:00 -0800)]
physmem: preserve ram blocks for cpr

Save the memfd for ramblocks in CPR state, along with a name that
uniquely identifies it.  The block's idstr is not yet set, so it
cannot be used for this purpose.  Find the saved memfd in new QEMU when
creating a block.  If size of a resizable block is larger in new QEMU,
extend it via the file_ram_alloc truncate parameter, and the extra space
will be usable after a guest reset.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-9-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: cpr-state
Steve Sistare [Wed, 15 Jan 2025 19:00:33 +0000 (11:00 -0800)]
migration: cpr-state

CPR must save state that is needed after QEMU is restarted, when devices
are realized.  Thus the extra state cannot be saved in the migration
channel, as objects must already exist before that channel can be loaded.
Instead, define auxilliary state structures and vmstate descriptions, not
associated with any registered object, and serialize the aux state to a
cpr-specific channel in cpr_state_save.  Deserialize in cpr_state_load
after QEMU restarts, before devices are realized.

Provide accessors for clients to register file descriptors for saving.
The mechanism for passing the fd's to the new process will be specific
to each migration mode, and added in subsequent patches.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-8-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomachine: aux-ram-share option
Steve Sistare [Wed, 15 Jan 2025 19:00:32 +0000 (11:00 -0800)]
machine: aux-ram-share option

Allocate auxilliary guest RAM as an anonymous file that is shareable
with an external process.  This option applies to memory allocated as
a side effect of creating various devices. It does not apply to
memory-backend-objects, whether explicitly specified on the command
line, or implicitly created by the -m command line option.

This option is intended to support new migration modes, in which the
memory region can be transferred in place to a new QEMU process, by sending
the memfd file descriptor to the process.  Memory contents are preserved,
and if the mode also transfers device descriptors, then pages that are
locked in memory for DMA remain locked.  This behavior is a pre-requisite
for supporting vfio, vdpa, and iommufd devices with the new modes.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-7-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomemory: add RAM_PRIVATE
Steve Sistare [Wed, 15 Jan 2025 19:00:31 +0000 (11:00 -0800)]
memory: add RAM_PRIVATE

Define the RAM_PRIVATE flag.

In RAMBlock creation functions, if MAP_SHARED is 0 in the flags parameter,
in a subsequent patch the implementation may still create a shared mapping
if other conditions require it.  Callers who specifically want a private
mapping, eg for objects specified by the user, must pass RAM_PRIVATE.

After RAMBlock creation, MAP_SHARED in the block's flags indicates whether
the block is shared or private, and MAP_PRIVATE is omitted.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-6-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agophysmem: fd-based shared memory
Steve Sistare [Wed, 15 Jan 2025 19:00:30 +0000 (11:00 -0800)]
physmem: fd-based shared memory

Create MAP_SHARED RAMBlocks by mmap'ing a file descriptor rather than using
MAP_ANON, so the memory can be accessed in another process by passing and
mmap'ing the fd.  This will allow CPR to support memory-backend-ram and
memory-backend-shm objects, provided the user creates them with share=on.

Use memfd_create if available because it has no constraints.  If not, use
POSIX shm_open.  However, allocation on the opened fd may fail if the shm
mount size is too small, even if the system has free memory, so for backwards
compatibility fall back to qemu_anon_ram_alloc/MAP_ANON on failure.

For backwards compatibility on Windows, always use MAP_ANON.  share=on has
no purpose there, but the syntax is accepted, and must continue to work.

Lastly, quietly fall back to MAP_ANON if the system does not support
qemu_ram_alloc_from_fd.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-5-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agophysmem: qemu_ram_alloc_from_fd extensions
Steve Sistare [Wed, 15 Jan 2025 19:00:29 +0000 (11:00 -0800)]
physmem: qemu_ram_alloc_from_fd extensions

Extend qemu_ram_alloc_from_fd to support resizable ram, and define
qemu_ram_resize_cb to clean up the API.

Add a grow parameter to extend the file if necessary.  However, if
grow is false, a zero-sized file is always extended.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1736967650-129648-4-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agophysmem: fix qemu_ram_alloc_from_fd size calculation
Steve Sistare [Wed, 15 Jan 2025 19:00:28 +0000 (11:00 -0800)]
physmem: fix qemu_ram_alloc_from_fd size calculation

qemu_ram_alloc_from_fd allocates space if file_size == 0.  If non-zero,
it uses the existing space and verifies it is large enough, but the
verification was broken when the offset parameter was introduced.  As
a result, a file smaller than offset passes the verification and causes
errors later.  Fix that, and update the error message to include offset.

Peter provides this concise reproducer:

  $ touch ramfile
  $ truncate -s 64M ramfile
  $ ./qemu-system-x86_64 -object memory-backend-file,mem-path=./ramfile,offset=128M,size=128M,id=mem1,prealloc=on
  qemu-system-x86_64: qemu_prealloc_mem: preallocating memory failed: Bad address

With the fix, the error message is:
  qemu-system-x86_64: mem1 backing store size 0x4000000 is too small for 'size' option 0x8000000 plus 'offset' option 0x8000000

Cc: qemu-stable@nongnu.org
Fixes: 4b870dc4d0c0 ("hostmem-file: add offset option")
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-3-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agobackends/hostmem-shm: factor out allocation of "anonymous shared memory with an fd"
Steve Sistare [Wed, 15 Jan 2025 19:00:27 +0000 (11:00 -0800)]
backends/hostmem-shm: factor out allocation of "anonymous shared memory with an fd"

Let's factor it out so we can reuse it.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-2-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agomigration: fix -Werror=maybe-uninitialized
Marc-André Lureau [Tue, 14 Jan 2025 10:48:11 +0000 (14:48 +0400)]
migration: fix -Werror=maybe-uninitialized

../migration/savevm.c: In function ‘qemu_savevm_state_complete_precopy_non_iterable’:
../migration/savevm.c:1560:20: error: ‘ret’ may be used uninitialized [-Werror=maybe-uninitialized]
 1560 |             return ret;
      |                    ^~~

Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250114104811.2612846-1-marcandre.lureau@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
3 months agogitlab-ci: include full Rust backtraces in test runs
Paolo Bonzini [Tue, 28 Jan 2025 16:06:11 +0000 (17:06 +0100)]
gitlab-ci: include full Rust backtraces in test runs

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 months agohw/usb/canokey: Fix buffer overflow for OUT packet
Hongren Zheng [Mon, 13 Jan 2025 09:38:56 +0000 (17:38 +0800)]
hw/usb/canokey: Fix buffer overflow for OUT packet

When USBPacket in OUT direction has larger payload
than the ep_out_buffer (of size 512), a buffer overflow
would occur.

It could be fixed by limiting the size of usb_packet_copy
to be at most buffer size. Further optimization gets rid
of the ep_out_buffer and directly uses ep_out as the target
buffer.

This is reported by a security researcher who artificially
constructed an OUT packet of size 2047. The report has gone
through the QEMU security process, and as this device is for
testing purpose and no deployment of it in virtualization
environment is observed, it is triaged not to be a security bug.

Cc: qemu-stable@nongnu.org
Fixes: d7d34918551dc48 ("hw/usb: Add CanoKey Implementation")
Reported-by: Juan Jose Lopez Jaimez <thatjiaozi@gmail.com>
Signed-off-by: Hongren Zheng <i@zenithal.me>
Message-id: Z4TfMOrZz6IQYl_h@Sun
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 months agotarget/arm: Use FPST_A64_F16 for halfprec-to-other conversions
Peter Maydell [Fri, 24 Jan 2025 16:27:46 +0000 (16:27 +0000)]
target/arm: Use FPST_A64_F16 for halfprec-to-other conversions

We should be using the F16-specific float_status for conversions from
half-precision, because halfprec inputs never set Input Denormal.

Without FEAT_AHP, using the wrong fpst here had no effect, because
the only difference between the A64_F16 and A64 fpst is its handling
of flush-to-zero on input and output, and the helper functions
vfp_fcvt_f16_to_* and vfp_fcvt_*_to_f16 all explicitly squash the
relevant flushing flags, and flush_inputs_to_zero was the only way
that IDC could be set.

With FEAT_AHP, the FPCR.AH=1 behaviour sets IDC for
input_denormal_used, which we will only ignore in
vfp_get_fpsr_from_host() for the A64_F16 fpst; so it matters that we
use that one for f16 inputs (and the normal one for single/double to
f16 conversions).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-27-peter.maydell@linaro.org

3 months agotarget/arm: Remove redundant advsimd float16 helpers
Peter Maydell [Fri, 24 Jan 2025 16:27:45 +0000 (16:27 +0000)]
target/arm: Remove redundant advsimd float16 helpers

The advsimd_addh etc helpers defined in helper-a64.c are identical to
the vfp_addh etc helpers defined in helper-vfp.c: both take two
float16 inputs (in a uint32_t type) plus a float_status* and are
simple wrappers around the softfloat float16_* functions.

(The duplication seems to be a historical accident: we added the
advsimd helpers in 2018 as part of the A64 implementation, and at
that time there was no f16 emulation in A32.  Then later we added the
A32 f16 handling by extending the existing VFP helper macros to
generate f16 versions as well as f32 and f64, and didn't realise we
could clean things up.)

Remove the now-unnecessary advsimd helpers and make the places that
generated calls to them use the vfp helpers instead. Many of the
helper functions were already unused.

(The remaining advsimd_ helpers are those which don't have vfp
versions.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-26-peter.maydell@linaro.org

3 months agofpu: Fix a comment in softfloat-types.h
Peter Maydell [Fri, 24 Jan 2025 16:27:41 +0000 (16:27 +0000)]
fpu: Fix a comment in softfloat-types.h

In softfloat-types.h a comment documents that if the float_status
field flush_to_zero is set then we flush denormalised results to 0
and set the inexact flag.  This isn't correct: the status flag that
we set when flush_to_zero causes us to flush an output to zero is
float_flag_output_denormal_flushed.

Correct the comment.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-22-peter.maydell@linaro.org

3 months agofpu: Rename float_flag_output_denormal to float_flag_output_denormal_flushed
Peter Maydell [Fri, 24 Jan 2025 16:27:40 +0000 (16:27 +0000)]
fpu: Rename float_flag_output_denormal to float_flag_output_denormal_flushed

Our float_flag_output_denormal exception flag is set when
the fpu code flushes an output denormal to zero. Rename
it to float_flag_output_denormal_flushed:
 * this keeps it parallel with the flag for flushing
   input denormals, which we just renamed
 * it makes it clearer that it doesn't mean "set when
   the output is a denormal"

Commit created with
 for f in `git grep -l float_flag_output_denormal`; do sed -i -e 's/float_flag_output_denormal/float_flag_output_denormal_flushed/' $f; done

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-21-peter.maydell@linaro.org

3 months agofpu: Rename float_flag_input_denormal to float_flag_input_denormal_flushed
Peter Maydell [Fri, 24 Jan 2025 16:27:39 +0000 (16:27 +0000)]
fpu: Rename float_flag_input_denormal to float_flag_input_denormal_flushed

Our float_flag_input_denormal exception flag is set when the fpu code
flushes an input denormal to zero.  This is what many guest
architectures (eg classic Arm behaviour) require, but it is not the
only donarmal-related reason we might want to set an exception flag.
The x86 behaviour (which we do not currently model correctly) wants
to see an exception flag when a denormal input is *not* flushed to
zero and is actually used in an arithmetic operation. Arm's FEAT_AFP
also wants these semantics.

Rename float_flag_input_denormal to float_flag_input_denormal_flushed
to make it clearer when it is set and to allow us to add a new
float_flag_input_denormal_used next to it for the x86/FEAT_AFP
semantics.

Commit created with
 for f in `git grep -l float_flag_input_denormal`; do sed -i -e 's/float_flag_input_denormal/float_flag_input_denormal_flushed/' $f; done

and manual editing of softfloat-types.h and softfloat.c to clean
up the indentation afterwards and to fix a comment which wasn't
using the full name of the flag.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-20-peter.maydell@linaro.org

3 months agotarget/arm: Remove now-unused vfp.fp_status_f16 and FPST_FPCR_F16
Peter Maydell [Fri, 24 Jan 2025 16:27:38 +0000 (16:27 +0000)]
target/arm: Remove now-unused vfp.fp_status_f16 and FPST_FPCR_F16

Now we have moved all the uses of vfp.fp_status_f16 and FPST_FPCR_F16
to the new A32 or A64 fields, we can remove these.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-19-peter.maydell@linaro.org

3 months agotarget/arm: Use FPST_A64_F16 in A64 decoder
Peter Maydell [Fri, 24 Jan 2025 16:27:37 +0000 (16:27 +0000)]
target/arm: Use FPST_A64_F16 in A64 decoder

In the A32 decoder, use FPST_A64_F16 rather than FPST_FPCR_F16.
By doing an automated conversion of the whole file we avoid possibly
using more than one fpst value in a set_rmode/op/restore_rmode
sequence.

Patch created with
  perl -p -i -e 's/FPST_FPCR_F16(?!_)/FPST_A64_F16/g' target/arm/tcg/translate-{a64,sve,sme}.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-18-peter.maydell@linaro.org

3 months agotarget/arm: Use FPST_A32_F16 in A32 decoder
Peter Maydell [Fri, 24 Jan 2025 16:27:36 +0000 (16:27 +0000)]
target/arm: Use FPST_A32_F16 in A32 decoder

In the A32 decoder, use FPST_A32_F16 rather than FPST_FPCR_F16.
By doing an automated conversion of the whole file we avoid possibly
using more than one fpst value in a set_rmode/op/restore_rmode
sequence.

Patch created with
  perl -p -i -e 's/FPST_FPCR_F16(?!_)/FPST_A32_F16/g' target/arm/tcg/translate-vfp.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-17-peter.maydell@linaro.org

3 months agotarget/arm: Use fp_status_f16_a64 in AArch64-only helpers
Peter Maydell [Fri, 24 Jan 2025 16:27:35 +0000 (16:27 +0000)]
target/arm: Use fp_status_f16_a64 in AArch64-only helpers

We directly use fp_status_f16 in a handful of helpers that are
AArch64-specific; switch to fp_status_f16_a64 for these.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-16-peter.maydell@linaro.org

3 months agotarget/arm: Use fp_status_f16_a32 in AArch32-only helpers
Peter Maydell [Fri, 24 Jan 2025 16:27:34 +0000 (16:27 +0000)]
target/arm: Use fp_status_f16_a32 in AArch32-only helpers

We directly use fp_status_f16 in a handful of helpers that
are AArch32-specific; switch to fp_status_f16_a32 for these.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-15-peter.maydell@linaro.org

3 months agotarget/arm: Define new fp_status_f16_a32 and fp_status_f16_a64
Peter Maydell [Fri, 24 Jan 2025 16:27:33 +0000 (16:27 +0000)]
target/arm: Define new fp_status_f16_a32 and fp_status_f16_a64

As the first part of splitting the existing fp_status_f16
into separate float_status fields for AArch32 and AArch64
(so that we can make FEAT_AFP control bits apply only
for AArch64), define the two new fp_status_f16_a32 and
fp_status_f16_a64 fields, but don't use them yet.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-14-peter.maydell@linaro.org

3 months agotarget/arm: Remove now-unused vfp.fp_status and FPST_FPCR
Peter Maydell [Fri, 24 Jan 2025 16:27:32 +0000 (16:27 +0000)]
target/arm: Remove now-unused vfp.fp_status and FPST_FPCR

Now we have moved all the uses of vfp.fp_status and FPST_FPCR
to either the A32 or A64 fields, we can remove these.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-13-peter.maydell@linaro.org

3 months agotarget/arm: Use FPST_A64 in A64 decoder
Peter Maydell [Fri, 24 Jan 2025 16:27:31 +0000 (16:27 +0000)]
target/arm: Use FPST_A64 in A64 decoder

In the A64 decoder, use FPST_A64 rather than FPST_FPCR.  By
doing an automated conversion of the whole file we avoid possibly
using more than one fpst value in a set_rmode/op/restore_rmode
sequence.

Patch created with

  perl -p -i -e 's/FPST_FPCR(?!_)/FPST_A64/g' target/arm/tcg/translate-{a64,sve,sme}.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-12-peter.maydell@linaro.org

3 months agotarget/arm: Use FPST_A32 in A32 decoder
Peter Maydell [Fri, 24 Jan 2025 16:27:30 +0000 (16:27 +0000)]
target/arm: Use FPST_A32 in A32 decoder

In the A32 decoder, use FPST_A32 rather than FPST_FPCR.  By
doing an automated conversion of the whole file we avoid possibly
using more than one fpst value in a set_rmode/op/restore_rmode
sequence.

Patch created with
  perl -p -i -e 's/FPST_FPCR(?!_)/FPST_A32/g' target/arm/tcg/translate-vfp.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-11-peter.maydell@linaro.org

3 months agotarget/arm: Use fp_status_a32 in vfp_cmp helpers
Peter Maydell [Fri, 24 Jan 2025 16:27:29 +0000 (16:27 +0000)]
target/arm: Use fp_status_a32 in vfp_cmp helpers

The helpers vfp_cmps, vfp_cmpes, vfp_cmpd, vfp_cmped are used only from
the A32 decoder; the A64 decoder uses separate vfp_cmps_a64 etc helpers
(because for A64 we update the main NZCV flags and for A32 we update
the FPSCR NZCV flags). So we can make these helpers use the fp_status_a32
field instead of fp_status.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-10-peter.maydell@linaro.org

3 months agotarget/arm: Use fp_status_a32 in vjvct helper
Peter Maydell [Fri, 24 Jan 2025 16:27:28 +0000 (16:27 +0000)]
target/arm: Use fp_status_a32 in vjvct helper

Use fp_status_a32 in the vjcvt helper function; this is called only
from the A32/T32 decoder and is not used inside a
set_rmode/restore_rmode sequence.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-9-peter.maydell@linaro.org

3 months agotarget/arm: Use fp_status_a64 or fp_status_a32 in is_ebf()
Peter Maydell [Tue, 28 Jan 2025 11:40:13 +0000 (11:40 +0000)]
target/arm: Use fp_status_a64 or fp_status_a32 in is_ebf()

In is_ebf(), we might be called for A64 or A32, but we have
the CPUARMState* so we can select fp_status_a64 or
fp_status_a32 accordingly.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
3 months agotarget/arm: Use vfp.fp_status_a64 in A64-only helper functions
Peter Maydell [Fri, 24 Jan 2025 16:27:27 +0000 (16:27 +0000)]
target/arm: Use vfp.fp_status_a64 in A64-only helper functions

Switch from vfp.fp_status to vfp.fp_status_a64 for helpers which:
 * directly reference an fp_status field
 * are called only from the A64 decoder
 * are not called inside a set_rmode/restore_rmode sequence

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20250124162836.2332150-8-peter.maydell@linaro.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
3 months agotarget/arm: Define new fp_status_a32 and fp_status_a64
Peter Maydell [Fri, 24 Jan 2025 16:27:26 +0000 (16:27 +0000)]
target/arm: Define new fp_status_a32 and fp_status_a64

We want to split the existing fp_status in the Arm CPUState into
separate float_status fields for AArch32 and AArch64.  (This is
because new control bits defined by FEAT_AFP only have an effect for
AArch64, not AArch32.) To make this split we will:
 * define new fp_status_a32 and fp_status_a64 which have
   identical behaviour to the existing fp_status
 * move existing uses of fp_status to fp_status_a32 or
   fp_status_a64 as appropriate
 * delete the old fp_status when it has no uses left

In this patch we add the new float_status fields.

We will also need to split fp_status_f16, but we will do that
as a separate series of patches.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-7-peter.maydell@linaro.org

3 months agotarget/arm: Use uint32_t in vfp_exceptbits_from_host()
Peter Maydell [Fri, 24 Jan 2025 16:27:25 +0000 (16:27 +0000)]
target/arm: Use uint32_t in vfp_exceptbits_from_host()

In vfp_exceptbits_from_host(), we accumulate the FPSR flags in
an "int", and our return type is also "int". However, the only
callsite returns the same information as a uint32_t, and
more generally we handle FPSR values in the code as uint32_t,
not int. Bring this function in to line with that convention.

There is no behaviour change because none of the FPSR bits
we set in this function are bit 31. The input argument to
the function remains 'int' because that is the return type
of the softfloat get_float_exception_flags().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-6-peter.maydell@linaro.org

3 months agotarget/arm: Use FPSR_ constants in vfp_exceptbits_from_host()
Peter Maydell [Fri, 24 Jan 2025 16:27:24 +0000 (16:27 +0000)]
target/arm: Use FPSR_ constants in vfp_exceptbits_from_host()

Use the FPSR_ named constants in vfp_exceptbits_from_host(),
rather than hardcoded magic numbers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-5-peter.maydell@linaro.org

3 months agotarget/arm: arm_reset_sve_state() should set FPSR, not FPCR
Peter Maydell [Fri, 24 Jan 2025 16:27:23 +0000 (16:27 +0000)]
target/arm: arm_reset_sve_state() should set FPSR, not FPCR

The pseudocode ResetSVEState() does:
    FPSR = ZeroExtend(0x0800009f<31:0>, 64);
but QEMU's arm_reset_sve_state() called vfp_set_fpcr() by accident.

Before the advent of FEAT_AFP, this was only setting a collection of
RES0 bits, which vfp_set_fpsr() would then ignore, so the only effect
was that we didn't actually set the FPSR the way we are supposed to
do.  Once FEAT_AFP is implemented, setting the bottom bits of FPSR
will change the floating point behaviour.

Call vfp_set_fpsr(), as we ought to.

(Note for stable backports: commit 7f2a01e7368f9 moved this function
from sme_helper.c to helper.c, but it had the same bug before the
move too.)

Cc: qemu-stable@nongnu.org
Fixes: f84734b87461 ("target/arm: Implement SMSTART, SMSTOP")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-4-peter.maydell@linaro.org

3 months agotests/functional: Add a test for the arm microbit machine
Thomas Huth [Fri, 24 Jan 2025 10:17:09 +0000 (11:17 +0100)]
tests/functional: Add a test for the arm microbit machine

We don't have any functional tests for this machine yet, thus let's
add a test with a MicroPython binary that is available online
(thanks to Joel Stanley for providing it, see:
 https://www.mail-archive.com/qemu-devel@nongnu.org/msg606064.html ).

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20250124101709.1591761-1-thuth@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 months agorust: qemu-api: add sub-subclass to the integration tests
Zhao Liu [Fri, 17 Jan 2025 10:59:55 +0000 (11:59 +0100)]
rust: qemu-api: add sub-subclass to the integration tests

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 months agorust/zeroable: Implement Zeroable with const_zero macro
Zhao Liu [Thu, 23 Jan 2025 18:07:28 +0000 (19:07 +0100)]
rust/zeroable: Implement Zeroable with const_zero macro

The `const_zero` crate provides a nice macro to zero type-specific
constants, which doesn't need to enumerates the fields one by one.

Introduce the `const_zero` macro to QEMU (along with its documentation), and
use it to simplify the implementation of `Zeroable` trait.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250123163143.679841-1-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 months agorust: qdev: make reset take a shared reference
Paolo Bonzini [Mon, 2 Dec 2024 11:40:18 +0000 (12:40 +0100)]
rust: qdev: make reset take a shared reference

Because register reset is within a borrow_mut() call, reset
does not need anymore a mut reference to the PL011State.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 months agorust: pl011: drop use of ControlFlow
Paolo Bonzini [Fri, 17 Jan 2025 17:13:30 +0000 (18:13 +0100)]
rust: pl011: drop use of ControlFlow

It is a poor match for what the code is doing, anyway.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>