git.maquefel.me Git - qemu.git/log

overcommit: introduce mem-lock=on-fault

Locking the memory without MCL_ONFAULT instantly prefaults any mmaped
anonymous memory with a write-fault, which introduces a lot of extra
overhead in terms of memory usage when all you want to do is to prevent
kcompactd from migrating and compacting QEMU pages. Add an option to
only lock pages lazily as they're faulted by the process by using
MCL_ONFAULT if asked.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Link: https://lore.kernel.org/r/20250212143920.1269754-5-d-tatianin@yandex-team.ru
Signed-off-by: Peter Xu <peterx@redhat.com>

system: introduce a new MlockState enum

Replace the boolean value enable_mlock with an enum and add a helper to
decide whether we should be calling os_mlock.

This is a stepping stone towards introducing a new mlock mode, which
will be the third possible state of this enum.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Link: https://lore.kernel.org/r/20250212143920.1269754-4-d-tatianin@yandex-team.ru
Signed-off-by: Peter Xu <peterx@redhat.com>

system/vl: extract overcommit option parsing into a helper

This will be extended in the future commits, let's move it out of line
right away so that it's easier to read.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Link: https://lore.kernel.org/r/20250212143920.1269754-3-d-tatianin@yandex-team.ru
Signed-off-by: Peter Xu <peterx@redhat.com>

os: add an ability to lock memory on_fault

This will be used in the following commits to make it possible to only
lock memory on fault instead of right away.

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Link: https://lore.kernel.org/r/20250212143920.1269754-2-d-tatianin@yandex-team.ru
[peterx: fail os_mlock(on_fault=1) when not supported]
[peterx: use G_GNUC_UNUSED instead of "(void)on_fault", per Dan]
Signed-off-by: Peter Xu <peterx@redhat.com>

system/physmem: poisoned memory discard on reboot

Repair poisoned memory location(s), calling ram_block_discard_range():
punching a hole in the backend file when necessary and regenerating
a usable memory.
If the kernel doesn't support the madvise calls used by this function
and we are dealing with anonymous memory, fall back to remapping the
location(s).

Signed-off-by: William Roche <william.roche@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250211212707.302391-3-william.roche@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>

system/physmem: handle hugetlb correctly in qemu_ram_remap()

The list of hwpoison pages used to remap the memory on reset
is based on the backend real page size.
To correctly handle hugetlb, we must mmap(MAP_FIXED) a complete
hugetlb page; hugetlb pages cannot be partially mapped.

Signed-off-by: William Roche <william.roche@oracle.com>
Co-developed-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20250211212707.302391-2-william.roche@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>

physmem: teach cpu_memory_rw_debug() to write to more memory regions

Right now, we only allow for writing to memory regions that allow direct
access using memcpy etc; all other writes are simply ignored. This
implies that debugging guests will not work as expected when writing
to MMIO device regions.

Let's extend cpu_memory_rw_debug() to write to more memory regions,
including MMIO device regions. Reshuffle the condition in
memory_access_is_direct() to make it easier to read and add a comment.

While this change implies that debug access can now also write to MMIO
devices, we now are also permit ELF image loads and similar users of
cpu_memory_rw_debug() to write to MMIO devices; currently we ignore
these writes.

Peter assumes [1] that there's probably a class of guest images, which
will start writing junk (likely zeroes) into device model registers; we
previously would silently ignore any such bogus ELF sections. Likely
these images are of questionable correctness and this can be ignored. If
ever a problem, we could make these cases use address_space_write_rom()
instead, which is left unchanged for now.

This patch is based on previous work by Stefan Zabka.

[1] https://lore.kernel.org/all/CAFEAcA_2CEJKFyjvbwmpt=on=GgMVamQ5hiiVt+zUr6AY3X=Xg@mail.gmail.com/

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/213
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250210084648.33798-8-david@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

hmp: use cpu_get_phys_page_debug() in hmp_gva2gpa()

We don't need the MemTxAttrs, so let's simply use the simpler function
variant.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250210084648.33798-7-david@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

memory: pass MemTxAttrs to memory_access_is_direct()

We want to pass another flag that will be stored in MemTxAttrs. So pass
MemTxAttrs directly.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250210084648.33798-6-david@redhat.com
[peterx: Fix MacOS builds]
Signed-off-by: Peter Xu <peterx@redhat.com>

physmem: disallow direct access to RAM DEVICE in address_space_write_rom()

As documented in commit 4a2e242bbb306 ("memory: Don't use memcpy for
ram_device regions"), we disallow direct access to RAM DEVICE regions.

This change implies that address_space_write_rom() and
cpu_memory_rw_debug() won't be able to write to RAM DEVICE regions. It
will also affect cpu_flush_icache_range(), but it's only used by
hw/core/loader.c after writing to ROM, so it is expected to not apply
here with RAM DEVICE.

This fixes direct access to these regions where we don't want direct
access. We'll extend cpu_memory_rw_debug() next to also be able to write to
these (and IO) regions.

This is a preparation for further changes.

Cc: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250210084648.33798-5-david@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

physmem: factor out direct access check into memory_region_supports_direct_access()

Let's factor the complete "directly accessible" check independent of
the "write" condition out so we can reuse it next.

We can now split up the checks RAM and ROMD check, so we really only check
for RAM DEVICE in case of RAM -- ROM DEVICE is neither RAM not RAM DEVICE.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250210084648.33798-4-david@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

physmem: factor out RAM/ROMD check in memory_access_is_direct()

Let's factor more of the generic "is this directly accessible" check,
independent of the "write" condition out.

Note that the "!mr->rom_device" check in the write case essentially
disallows the memory_region_is_romd() condition again. Further note that
RAM DEVICE regions are also RAM regions, so we can check for RAM+ROMD
first.

This is a preparation for further changes.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250210084648.33798-3-david@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

physmem: factor out memory_region_is_ram_device() check in memory_access_is_direct()

As documented in commit 4a2e242bbb306 ("memory: Don't use memcpy for
ram_device regions"), we disallow direct access to RAM DEVICE regions.

Let's make this clearer to prepare for further changes. Note that romd
regions will never be RAM DEVICE at the same time.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250210084648.33798-2-david@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

system/physmem: take into account fd_offset for file fallocate

Punching a hole in a file with fallocate needs to take into account the
fd_offset value for a correct file location.
But guest_memfd internal use doesn't currently consider fd_offset.

Fixes: 4b870dc4d0c0 ("hostmem-file: add offset option")
Signed-off-by: William Roche <william.roche@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250122194053.3103617-2-william.roche@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>

Merge tag 'pull-10.0-testing-and-gdstub-updates-100225-1' of https://gitlab.com/stsquad/qemu into staging

testing and gdbstub updates:

  - add a check-rust test to docker builds
  - re-factor the qtest logic to be cleaner
  - fix tests to not clock_step when no timers enabled
  - roll-up log prefix into qtest_send
  - cleaner error reporting when qtest_clock_set fails
  - revert old deadlock fix now tests are updated
  - only run full set of migration tests under HW acceleration
  - support late attachment to user-mode gdbstubs

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmeqBSsACgkQ+9DbCVqe
# KkQS/Af+K0hpdGc1msiuMsqmuESBvhoQniYZFLN1/pwe2KpG8i/+fq2fsCuxJhJ1
# 2TzPH7aj54p9MGCZf2k9JLhO22XldN+oezZMc1crhoWK0AtrWhnLs58I2oEPIsUo
# NmGO6Zfm98ge89o2y8GCvd0QXAtUf+jduDKnW0mfnOnw+w/mky5KzWS7/1091VGW
# 42LSY4KnqgdLSqLyuLBOrgADEjB1ChWS4/bSC+kEYSGrmNQB+n1KeIzzlJBGpOr0
# Z9yzmhMCm7TWdkFNPmnVfYH/7ZUNcpv6PtQSpkku4f6b/gybyvJBknHpM4i+Gpb5
# 87wSjljrCpdNm/9KFRjiJuUWdS/jCg==
# =UF0n
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 10 Feb 2025 08:54:51 EST
# gpg:                using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* tag 'pull-10.0-testing-and-gdstub-updates-100225-1' of https://gitlab.com/stsquad/qemu:
  tests/tcg: Add late gdbstub attach test
  docs/user: Document the %d placeholder and suspend=n QEMU_GDB features
  gdbstub: Allow late attachment
  osdep: Introduce qemu_kill_thread()
  user: Introduce host_interrupt_signal
  user: Introduce user/signal.h
  gdbstub: Try unlinking the unix socket before binding
  gdbstub: Allow the %d placeholder in the socket path
  tests/qtest/migration: Pick smoke tests
  tests/qtest/migration: Add --full option
  Revert "util/timer: avoid deadlock when shutting down"
  tests/qtest: tighten up the checks on clock_step
  tests/qtest: rename qtest_send_prefix and roll-up into qtest_send
  tests/qtest: simplify qtest_process_inbuf
  tests/qtest: don't step clock at start of npcm7xx periodic IRQ test
  tests/qtest: don't attempt to clock_step while waiting for virtio ISR
  tests/docker: replicate the check-rust-tools-nightly CI job

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging

Block layer patches

- Managing inactive nodes (enables QSD migration with shared storage)
- Fix swapped values for BLOCK_IO_ERROR 'device' and 'qom-path'
- vpc: Read images exported from Azure correctly
- scripts/qemu-gdb: Support coroutine dumps in coredumps
- Minor cleanups

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmek34IRHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9bDpxAAnTvwmdazAXG0g9GzqvrEB/+6rStjAsqE
# 9MTWV4WxyN41d0RXxN8CYKb8CXSiTRyw6r3CSGNYEI2eShe9e934PriSkZm41HyX
# n9Yh5YxqGZqitzvPtx62Ii/1KG+PcjQbfHuK1p4+rlKa0yQ2eGlio1JIIrZrCkBZ
# ikZcQUrhIyD0XV8hTQ2+Ysa+ZN6itjnlTQIG3gS3m8f8WR7kyUXD8YFMQFJFyjVx
# NrAIpLnc/ln9+5PZR9tje8U7XEn2KCgI5pgGaQnrd0h0G1H4ig8ogzYYnKTLhjU/
# AmQpS8np8Tyg6S1UZTiekEq0VuAhThEQc5b3sGbmHWH/R2ABMStyf18oCBAkPzZ7
# s6h+3XzTKKY2Q5Q3ZG/ANkUJjTNBhdj1fcaARvbSWsqsuk5CWX/I3jzvgihFtCSs
# eGu+b/bLeW6P7hu4qPHBcgLHuB1Fc7Rd2t4BoIGM1wcO2CeC9DzUKOiIMZOEJIh0
# GGqCkEWDHgckDTakD4/vSqm0UDKt6FSlQC9ga/ILBY3IB5HpHoArY58selymy28i
# X7MgAvbjdsmNuUuXDZZOiObcFt3j8jlmwPJpPyzXPQIiPX1RXeBPRhVAEeZCKn6Z
# tfHr72SJdMeVOGXVTvOrJ2iW+4g03rPdmkDFCUhpOwo62RODq7ahvCIXsNf3nEFR
# rSB3T1M/8EM=
# =iQLP
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 06 Feb 2025 11:12:50 EST
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (25 commits)
  block: remove unused BLOCK_OP_TYPE_DATAPLANE
  iotests: Add (NBD-based) tests for inactive nodes
  iotests: Add qsd-migrate case
  iotests: Add filter_qtest()
  nbd/server: Support inactive nodes
  block/export: Add option to allow export of inactive nodes
  block: Drain nodes before inactivating them
  block/export: Don't ignore image activation error in blk_exp_add()
  block: Support inactive nodes in blk_insert_bs()
  block: Add blockdev-set-active QMP command
  block: Add option to create inactive nodes
  block: Fix crash on block_resize on inactive node
  block: Don't attach inactive child to active node
  migration/block-active: Remove global active flag
  block: Inactivate external snapshot overlays when necessary
  block: Allow inactivating already inactive nodes
  block: Add 'active' field to BlockDeviceInfo
  block-backend: Fix argument order when calling 'qapi_event_send_block_io_error()'
  scripts/qemu-gdb: Support coroutine dumps in coredumps
  scripts/qemu-gdb: Simplify fs_base fetching for coroutines
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'pull-target-arm-20250210' of https://git.linaro.org/people/pmaydell/qemu-arm into staging

target-arm queue:
* Deprecate pxa2xx CPUs, iwMMXt emulation, -old-param option
* Drop unused AArch64DecodeTable typedefs
* Minor code cleanups
* hw/net/cadence_gem:  Fix the mask/compare/disable-mask logic
* linux-user: Do not define struct sched_attr if libc headers do

# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmeqH/sZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3lW6D/4r4SyxAzrjIQRLh3xydADN
# A9EsQ44Or/M7jJ7uzR5nkLldlHdKTccVZFj17BlK6DnklsTUVSUoxpHtzYTHE2Ar
# Q8iqV4dqoyDrYpqHWNQQvwQCBLbcj0CFQ1VjieG656m4uhImoeVMiH3xbFvMwqj0
# KpIWL/+jaRs5jgpnN7Ig4Zq3gVHVZWyOOjzIKF/l4hFchK4eao0oAWdWo/TtGPHB
# WyqkO1YZoZGBlT/7WXyKE5YXoXbd8m079NXcHmH6sy1/fSNXQ7qIlHGV/36kiJo1
# WnDgZ0KUOEl4thaeq731xtgGcwt9C9Qx8g9bJP42os7EzQZBtvXxJXWgQKpvpNVH
# Hmpsj0ed7oI1LH5DEPkqvYOEnnvEFt3skMbblhIZufnrAnojk9Q64v/Z1LNEIuuC
# j5sZrFZsKPsA2uNzsmqXyJxWwnU6IT5YNBZAzALFTwE8dNL/VMXfRYhhUEy0Ay3C
# jVXHk+sfOKo83YNswffagBeb/tRFDApgvRySxxL9TCONGl0HNkXqSuE+hssF8jyr
# AnZ3zxSrmWKZizuotvFwaP0bxP0Sa/yeR1lR6E1xu+iEEJKJ4dE5xpX4E3uf6tHk
# cfQQXFrhOzEwGn4qLDuqcgvhxRecZL7kNiFYidynKafIBw///J1cpaDYxxwh9v6O
# TZuJliw0uCo6z0sXxVIn1w==
# =MS2g
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 10 Feb 2025 10:49:15 EST
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg:                 aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* tag 'pull-target-arm-20250210' of https://git.linaro.org/people/pmaydell/qemu-arm:
  linux-user: Do not define struct sched_attr if libc headers do
  qemu-options: Deprecate -old-param command line option
  hw/net/cadence_gem:  Fix the mask/compare/disable-mask logic
  hw/cpu/arm: Declare CPU QOM types using DEFINE_TYPES() macro
  hw/cpu/arm: Alias 'num-cpu' property on TYPE_REALVIEW_MPCORE
  hw/arm/fsl-imx7: Add local 'mpcore/gic' variables
  hw/arm/fsl-imx6ul: Add local 'mpcore/gic' variables
  hw/arm/fsl-imx6: Add local 'mpcore/gic' variables
  hw/arm/boot: Propagate vCPU to arm_load_dtb()
  target/arm: Drop unused AArch64DecodeTable typedefs
  tests/tcg/arm: Remove test-arm-iwmmxt test
  target/arm: deprecate the pxa2xx CPUs and iwMMXt emulation

Conflicts:
- The iwMMXt deprecation notice conflicted with the 32-bit host
  operating system deprecation notice. Add both notices.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'pull-qapi-2025-02-10-v2' of https://repo.or.cz/qemu/armbru into staging

QAPI patches patches for 2025-02-10

# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmeqEXESHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZTaOEP/2VYKkb2VzPdWzyQcEx66MJ+1RjcEy1A
# JtD6mTdpEuti5NgrUUOSHjrd6P3DVNZL8SMPD21F4/I1t0u+ztfCtx65YKrKo8hV
# jCnYS5w2i/YT3Cpz052yEhUoPgxj4kQiR3gqbLkpBKV7lh6wZ3+gVTNW8DJzPW/R
# MmE9vkOCLhjmkodxRiVa7df73qMEm4nfbmQjM9SWBU55AC2xElptjJo0Sc7sMT3n
# HdoLjXKfjUCIpmI3LfbRvS3Tyxd9gQn/la2yf3gaXJ0qrbP4xyu5VCzAOla5myuC
# XyakLUu9DOsfNuHXvKX+M8jE7pf6wibLMfVhPigACob2LAa4Zo7LvCKqjhclTNhK
# +/PvTGrirnGweNWXz5/2tG97F7oSzX2m182LyuloQbaehXAtpAuHehSCQUet6HOu
# CEUOeV7D13nxcgxXT1GvQIqsTYRtIJvY8DM3tRoCAzDv/KNdXF4M/ybtUHmyHUkg
# kspwCRfQJ1sNRdmj7oBtmWvvbYBk/zKvt84yOQZFYocmofp18KVLDN+hzEAHvHQE
# 4t8yCktjrGGC0bCgIaQkBaeU7nxMWXBOOlYcejnXTR4VPTDTRKMAosmAotcd9d5H
# QgGjcMhbDPJHavi36JdJQgxuwl4LskwLCdenBfXhmH8ePIWhjIqqzcdDJy0UcH0x
# pX8L/Jsd42qD
# =jFK8
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 10 Feb 2025 09:47:13 EST
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* tag 'pull-qapi-2025-02-10-v2' of https://repo.or.cz/qemu/armbru:
  qapi: expose all schema features to code
  qapi: rename 'special_features' to 'features'
  qapi: change 'unsigned special_features' to 'uint64_t features'
  qapi: cope with feature names containing a '-'
  qapi/ui: Fix documentation of upper bound value in InputMoveEvent
  qapi: fix colon in Since tag section
  qapi: Move and rename qapi/qmp/dispatch.h to qapi/qmp-registry.h
  qapi: Move include/qapi/qmp/ to include/qobject/

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* tcg/optimize: optimize TSTNE using smask and zmask
* target/i386: fix exceptions for 0 * Inf + QNaN
* rust: cleanups to the configuration and the warnings
* rust: add developer docs

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmep0nsUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroN0aggAo8mKY4c7LGLF+5xGM1W/ZLxaBrMN
# 4/LGMV3UJJq+OIEb+e7ThtfyHzcAsYXdO2SIMJ6ffW58ukmwM1SycA8WUjG3JMya
# m05dVnI//D/G7iqYgyNlsiTbqazdr3P/Ha0ty10l9K9dL3SjB3Nm3xLD4XnD3tzM
# U+0yXrn1ngMZ3Y1knTuWwjoH7JT5QftwGCVZ5aeq0KlSAvWb8M8jupYFunLBcZ0z
# LkFSWXbxhw9HRkI+Lp6c2IjIBDEd7267tEfnXf+d29ykVnrdyIWgyYw08/Un72i+
# C5hNEUMiwcD7preTYX5YqDcxSASG1zWNFzEWGhTphMWf1O60f2PIXMWX5g==
# =0KKa
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 10 Feb 2025 05:18:35 EST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
  rust: restrict missing_const_for_fn to qemu_api crate
  rust: pl011: use default set of lints
  tcg/optimize: optimize TSTNE using smask and zmask
  tests/tcg/x86_64/fma: Test some x86 fused-multiply-add cases
  target/i386: Do not raise Invalid for 0 * Inf + QNaN
  rust: add clippy configuration file
  rust: add docs
  rust: include rust_version in Cargo.toml
  rust: remove unnecessary Cargo.toml metadata

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'pull-tcg-20250208' of https://gitlab.com/rth7680/qemu into staging

meson: Disallow 64-bit on 32-bit emulation

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmenwp8dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV92iQgAm6uXZQPlnRw6PkPg
# getS21AcsrypCdrik2YhCrFNtVoo5cLn6aJAWl6c10zJZO+Ap4FKRAKbGUaQ7Awv
# x+NRvo5Q/W+E+e8BeTvvuhQ6DaGkH5eMIgQC31w3s0Fh4siAEpNCXxb3WFyTEHiR
# qVh/jOKF6lHTGFke9BVbEv3iNVi+Hg7GXyVi3/HqWj3VTe0WzdB1eJDKF2Ja5dAJ
# H9UVOwxdxB9ztjjx1Y4SSfyftcq/myz9xx4yGPotW+85gk33HEMLMDVksLVLuyJN
# rKzcjrU8a64i+B3xP9f3t6Nwud1LoYm6bI2Wf80ScUK8qJ0XidNT96FkO6Phj/mH
# lW/nWA==
# =zdKE
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 08 Feb 2025 15:46:23 EST
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20250208' of https://gitlab.com/rth7680/qemu:
  meson: Deprecate 32-bit host support
  meson: Disallow 64-bit on 32-bit emulation
  target/*: Remove TARGET_LONG_BITS from cpu-param.h
  configure: Define TARGET_LONG_BITS in configs/targets/*.mak
  gitlab-ci: Replace aarch64 with arm in cross-i686-tci build
  meson: Disallow 64-bit on 32-bit HVF/NVMM/WHPX emulation
  meson: Disallow 64-bit on 32-bit Xen emulation
  meson: Disallow 64-bit on 32-bit KVM emulation
  meson: Drop tcg as a module

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

qapi: expose all schema features to code

This replaces use of the constants from the QapiSpecialFeatures
enum, with constants from the auto-generate QapiFeatures enum
in qapi-features.h

The 'deprecated' and 'unstable' features still have a little bit of
special handling, being force defined to be the 1st + 2nd features
in the enum, regardless of whether they're used in the schema. This
retains compatibility with common code that references the features
via the QapiSpecialFeatures constants.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250205123550.2754387-5-berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Imports tidied up with isort]
Signed-off-by: Markus Armbruster <armbru@redhat.com>

qapi: rename 'special_features' to 'features'

This updates the QAPI code generation to refer to 'features' instead
of 'special_features', in preparation for generalizing their exposure.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250205123550.2754387-4-berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Imports tidied up with isort]
Signed-off-by: Markus Armbruster <armbru@redhat.com>

qapi: change 'unsigned special_features' to 'uint64_t features'

The "special_features" field / parameter holds the subset of schema
features that are for internal code use. Specifically 'DEPRECATED'
and 'UNSTABLE'.

This special casing of internal features is going to be removed, so
prepare for that by renaming to 'features'. Using a fixed size type
is also best practice for bit fields.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250205123550.2754387-3-berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>

qapi: cope with feature names containing a '-'

When we shortly expose all feature names to code, it will be valid to
include a '-', which must be translated to a '_' for the enum constants.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250205123550.2754387-2-berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>

qapi/ui: Fix documentation of upper bound value in InputMoveEvent

The upper bound of pointer position in InputMoveEvent should be 0x7fff,
according to INPUT_EVENT_ABS_MAX.

Signed-off-by: Zhang Boyang <zhangboyang.id@gmail.com>
Message-ID: <20250116104433.12114-1-zhangboyang.id@gmail.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
[Phrasing tweak squashed in]
Signed-off-by: Markus Armbruster <armbru@redhat.com>

qapi: fix colon in Since tag section

As described in docs/devel/qapi-code-gen.rst line 998,
there should be no space between "Since" and ":".

Signed-off-by: Victor Toso <victortoso@redhat.com>
Message-ID: <20241217091504.16416-1-victortoso@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>

qapi: Move and rename qapi/qmp/dispatch.h to qapi/qmp-registry.h

The general expectation is that header files should follow the same
file/path naming scheme as the corresponding source file. There are
various historical exceptions to this practice in QEMU, with one of
the most notable being the include/qapi/qmp/ directory.

include/qapi/qmp/dispatch.h corresponds mostly to qapi/qmp-registry.c.
Move and rename it to include/qapi/qmp-registry.h.

Now just qerror.h is left in include/qapi/qmp/. Since it's deprecated
& (slowly) getting eliminated anyway, it isn't worth moving.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20241118151235.2665921-3-armbru@redhat.com>

qapi: Move include/qapi/qmp/ to include/qobject/

The general expectation is that header files should follow the same
file/path naming scheme as the corresponding source file. There are
various historical exceptions to this practice in QEMU, with one of
the most notable being the include/qapi/qmp/ directory. Most of the
headers there correspond to source files in qobject/.

This patch corrects most of that inconsistency by creating
include/qobject/ and moving the headers for qobject/ there.

This also fixes MAINTAINERS for include/qapi/qmp/dispatch.h:
scripts/get_maintainer.pl now reports "QAPI" instead of "No
maintainers found".

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com> #s390x
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20241118151235.2665921-2-armbru@redhat.com>
[Rebased]

tests/tcg: Add late gdbstub attach test

Add a small test to prevent regressions.
Make sure that host_interrupt_signal is not visible to the guest.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20250117001542.8290-9-iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-18-alex.bennee@linaro.org>

docs/user: Document the %d placeholder and suspend=n QEMU_GDB features

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20250117001542.8290-8-iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-17-alex.bennee@linaro.org>

gdbstub: Allow late attachment

Allow debugging individual processes in multi-process applications by
starting them with export QEMU_GDB=/tmp/qemu-%d.sock,suspend=n.
Currently one would have to attach to every process to ensure the app
makes progress.

In case suspend=n is not specified, the flow remains unchanged. If it
is specified, then accepting the client connection is delegated to a
thread. In the future this machinery may be reused for handling
reconnections and interruptions.

On accepting a connection, the thread schedules gdb_handlesig() on the
first CPU and wakes it up with host_interrupt_signal. Note that the
result of this gdb_handlesig() invocation is handled, as opposed to
many other existing call sites. These other call sites probably need to
be fixed separately.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20250117001542.8290-7-iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-16-alex.bennee@linaro.org>

osdep: Introduce qemu_kill_thread()

Add a function for sending signals to individual threads. It does not make
sense on Windows, so do not provide an implementation, so that if someone
uses it by accident, they will get a linker error.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20250117001542.8290-6-iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-15-alex.bennee@linaro.org>

user: Introduce host_interrupt_signal

Attaching to the gdbstub of a running process requires stopping its
threads. For threads that run on a CPU, cpu_exit() is enough, but the
only way to grab attention of a thread that is stuck in a long-running
syscall is to interrupt it with a signal.

Reserve a host realtime signal for this, just like it's already done
for TARGET_SIGABRT on Linux. This may reduce the number of available
guest realtime signals by one, but this is acceptable, since there are
quite a lot of them, and it's unlikely that there are apps that need
them all.

Set signal_pending for the safe_sycall machinery to prevent invoking
the syscall. This is a lie, since we don't queue a guest signal, but
process_pending_signals() can handle the absence of pending signals.
The syscall returns with QEMU_ERESTARTSYS errno, which arranges for
the automatic restart. This is important, because it helps avoiding
disturbing poorly written guests.

Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20250117001542.8290-5-iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-14-alex.bennee@linaro.org>

user: Introduce user/signal.h

gdbstub needs target_to_host_signal(), so move its declaration to a
public header.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20250117001542.8290-4-iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-13-alex.bennee@linaro.org>

gdbstub: Try unlinking the unix socket before binding

In case an emulated process execve()s another emulated process, bind()
will fail, because the socket already exists. So try deleting it. Use
the existing unix_listen() function which does this. Link qemu-user
with qemu-sockets.c and add the monitor_get_fd() stub.

Note that it is not possible to handle this in do_execv(): deleting
gdbserver_user_state.socket_path before safe_execve() is not correct,
because the latter may fail, and afterwards we may lose control.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250117001542.8290-3-iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-12-alex.bennee@linaro.org>

gdbstub: Allow the %d placeholder in the socket path

Just like for QEMU_LOG_FILENAME, replace %d with PID in the GDB socket
path. This allows running multi-process applications with, e.g.,
export QEMU_GDB=/tmp/qemu-%d.sock. Currently this is not possible,
since the first process will cause the subsequent ones to fail due to
not being able to bind() the GDB socket.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20250117001542.8290-2-iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-11-alex.bennee@linaro.org>

tests/qtest/migration: Pick smoke tests

Choose a few tests per group and move them from the full set to the
smoke set.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20250130184012.5711-3-farosas@suse.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-10-alex.bennee@linaro.org>

tests/qtest/migration: Add --full option

Add a new command line option to allow selecting between running the
full set of tests or a smaller set of tests. The default will be to
run the small set (i.e. no comand line option provided) so we can
reduce the amount of tests run by default. Only hosts which support
KVM for the target architecture being tested will run the complete set
of tests.

Adjust the meson.build file to pass in the --full option when
appropriate.

(for now, set the option unconditionally until the next patch actually
creates the small set)

Use cases:

configure --target-list=aarch64-softmmu,ppc64-softmmu,s390x-softmmu,x86_64-softmmu

                        | before - 615s/244 tests  | after - 244s/100 tests
------------------------+--------------------------+-----------------------------
make check              | full set for all archs   | full set for the KVM arch,
make check-qtest        |                          | small set for the rest
                        |                          |
qemu-system-$ARCH       | full set for $ARCH       | small set for $ARCH, KVM or
./migration-test        |                          | TCG automatically chosen
                        |                          |
qemu-system-$ARCH       | N/A                      | full set for $ARCH, KVM or
./migration-test --full |                          | TCG automatically chosen
                        |                          |
migration-compat-x86_64 | full set for x86_64      | small set for x86_64
CI job                  |                          |
------------------------+--------------------------+-----------------------------

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20250130184012.5711-2-farosas@suse.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-9-alex.bennee@linaro.org>

Revert "util/timer: avoid deadlock when shutting down"

This reverts commit bc02be4508d8753d1f6071b77d10f4661587df6f.

Now we catch attempts to clock_step to the next timer when none are
enabled we can revert the previous attempt to prevent deadlock. As
long as a new target time is given we will move time forward even if
no timers will fire. This is desirable for tests which are checking
that nothing changes when things are disabled.

Previously most tests got away with it because --enable-slirp always
has a timer running while the test is active.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-8-alex.bennee@linaro.org>

tests/qtest: tighten up the checks on clock_step

It is invalid to call clock_step with an implied time to step forward
as if no timers are running we won't be able to advance.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-7-alex.bennee@linaro.org>

tests/qtest: rename qtest_send_prefix and roll-up into qtest_send

qtest_send_prefix never actually sent something over the chardev, all
it does is print the timestamp to the QTEST_LOG when enabled. So
rename the function, make it static, remove the unused CharDev and
simplify all the call sites by handling that directly with
qtest_send (and qtest_log_send).

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-6-alex.bennee@linaro.org>

tests/qtest: simplify qtest_process_inbuf

Don't both creating a GString to temporarily hold our qtest command.
Instead do a simpler g_strndup and use autofree to clean up
afterwards.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-5-alex.bennee@linaro.org>

tests/qtest: don't step clock at start of npcm7xx periodic IRQ test

Until there are timers enabled the semantics of clock_step_next() will
fail. Since d524441a36 (system/qtest: properly feedback results of
clock_[step|set]) we will signal a FAIL if time doesn't advance.

Reviewed-by: Hao Wu <wuhaotsh@google.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-4-alex.bennee@linaro.org>

tests/qtest: don't attempt to clock_step while waiting for virtio ISR

This replicates the changes from 92cb8f8bf6 (tests/qtest: remove
clock_steps from virtio tests) as there are no timers in the virtio
code. We still busy wait and timeout though.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-3-alex.bennee@linaro.org>

tests/docker: replicate the check-rust-tools-nightly CI job

This allows people to run the test locally:

make docker-test-rust@fedora-rust-nightly

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250207153112.3939799-2-alex.bennee@linaro.org>

rust: restrict missing_const_for_fn to qemu_api crate

missing_const_for_fn is not necessarily useful or good. For example in
a private API you can always add const later, and in a public API
it can be unnecessarily restrictive to annotate everything with const
(blocking further improvements to the API).

Nevertheless, QEMU turns it on because qemu_api uses const quite
aggressively and therefore it can be handy to have as much as possible
annotated with const. Outside qemu_api though, not so much: devices
are self contained consumers and if there is nothing that could use
their functions in const contexts that were not anticipated.

Since missing_const_for_fn can be a bit noisy and trigger on trivial
functions that no one would ever call in const context, do not
turn it on everywhere and only keep it in qemu_api as a special case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

meson: Deprecate 32-bit host support

We deprecated i686 system mode support for qemu 8.0. However, to
make real cleanups to TCG we need to deprecate all 32-bit hosts.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

meson: Disallow 64-bit on 32-bit emulation

For system mode, we can rarely support the amount of RAM that
the guest requires. TCG emulation is restricted to round-robin
mode, which solves many of the atomicity issues, but not those
associated with virtio.  In any case, round-robin does nothing
to help the speed of emulation.

For user mode, most emulation does not succeed at all.  Most
of the time we cannot even load 64-bit non-PIE binaries due
to lack of a 64-bit address space.  Threads are run in
parallel, not round-robin, which means that atomicity
is not handled.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/*: Remove TARGET_LONG_BITS from cpu-param.h

This is now handled by the configs/targets/*.mak fragment.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

configure: Define TARGET_LONG_BITS in configs/targets/*.mak

Define TARGET_LONG_BITS in each target's configure fragment.
Do this without removing the define in target/*/cpu-param.h
so that errors are caught like so:

In file included from .../src/include/exec/cpu-defs.h:26,
                 from ../src/target/hppa/cpu.h:24,
                 from ../src/linux-user/qemu.h:4,
                 from ../src/linux-user/hppa/cpu_loop.c:21:
../src/target/hppa/cpu-param.h:11: error: "TARGET_LONG_BITS" redefined [-Werror]
   11 | #define TARGET_LONG_BITS              64
      |
In file included from .../src/include/qemu/osdep.h:36,
                 from ../src/linux-user/hppa/cpu_loop.c:20:
./hppa-linux-user-config-target.h:32: note: this is the location of the previous definition
   32 | #define TARGET_LONG_BITS 32
      |
cc1: all warnings being treated as errors

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

gitlab-ci: Replace aarch64 with arm in cross-i686-tci build

Configuration of 64-bit host on 32-bit guest will shortly
be denied. Use a 32-bit guest instead.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

meson: Disallow 64-bit on 32-bit HVF/NVMM/WHPX emulation

Require a 64-bit host binary to spawn a 64-bit guest.

For HVF this is trivially true because macOS 11 dropped
support for 32-bit applications entirely.

For NVMM, NetBSD only enables nvmm on x86_64:
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/dev/nvmm/Makefile?rev=1.1.6.2;content-type=text%2Fplain

For WHPX, we have already dropped support for 32-bit Windows.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

meson: Disallow 64-bit on 32-bit Xen emulation

Require a 64-bit host binary to spawn a 64-bit guest.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

meson: Disallow 64-bit on 32-bit KVM emulation

Require a 64-bit host binary to spawn a 64-bit guest.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

meson: Drop tcg as a module

This reverts commit dae0ec159f9 ("accel: build tcg modular").
The attempt was only enabled for x86, only modularized a small
portion of tcg, and in more than 3 years there have been no
follow-ups to improve the situation.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'hppa-system-for-v10-diva-artist-pull-request' of https://github.com/hdeller/qemu-hppa into staging

HPPA graphics and serial console enhancements

A small series of patches which enhance the graphics output on 64-bit hppa
machines. Allow disabling the artist graphic card and introduces drivers for
the Diva GSP (remote management) cards which are used in later 64-bit machines
and which we now use for serial console output.

The LMMIO regions of the Astro chip are now supported too, which is important
to support other graphic cards like an ATI PCI card with a 64-bit Linux kernel.

# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZ6Z1YQAKCRD3ErUQojoP
# X+LvAP0dXGZDtE9Lj5SWuZZVLd/g/KIqx7cvGcRFQSnmAEvqSAD/SIUmCzjxrHfD
# KOUS+DVaCd7xvSIEJtzch2zBL5jvuAw=
# =H3Wz
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 07 Feb 2025 16:04:33 EST
# gpg:                using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg:                 aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D  25F8 3E5F 3D04 A7A2 4603
#      Subkey fingerprint: BCE9 123E 1AD2 9F07 C049  BBDE F712 B510 A23A 0F5F

* tag 'hppa-system-for-v10-diva-artist-pull-request' of https://github.com/hdeller/qemu-hppa:
  target/hppa: Update SeaBIOS-hppa
  hw/pci-host/astro: Add LMMIO range support
  hw/hppa: Avoid creation of artist if disabled on command line
  artist: Allow disabling artist on command line
  hw/hppa: Wire up Diva GSP card
  hw/char: Add emulation of Diva GSP PCI management boards

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

linux-user: Do not define struct sched_attr if libc headers do

glibc 2.41+ has added [1] definitions for sched_setattr and
sched_getattr functions and struct sched_attr. Therefore, it needs
to be checked for here as well before defining sched_attr, to avoid
a compilation failure.

Define sched_attr conditionally only when SCHED_ATTR_SIZE_VER0 is
not defined.

[1] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=21571ca0d70302909cf72707b2a7736cf12190a0;hp=298bc488fdc047da37482f4003023cb9adef78f8

Signed-off-by: Khem Raj <raj.khem@gmail.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2799
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

qemu-options: Deprecate -old-param command line option

The '-old-param' command line option is specific to Arm targets; it
is very briefly documented as "old param mode". What this option
actually does is change the behaviour when directly booting a guest
kernel, so that command line arguments are passed to the kernel using
the extremely old "param_struct" ABI, rather than the newer ATAGS or
even newer DTB mechanisms.

This support was added back in 2007 to support an old vendor kernel
on the akita/terrier board types:
https://mail.gnu.org/archive/html/qemu-devel/2007-07/msg00344.html
Even then, it was an out-of-date mechanism from the kernel's
point of view -- the kernel has had a comment since 2001 marking
it as deprecated. As of mid-2024, the kernel only retained
param_struct support for the RiscPC and Footbridge platforms:
https://lore.kernel.org/linux-arm-kernel/2831c5a6-cfbf-4fe0-b51c-0396e5b0aeb7@app.fastmail.com/

None of the board types QEMU supports need param_struct support;
mark this option as deprecated.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20250127123113.2947620-1-peter.maydell@linaro.org

hw/net/cadence_gem:  Fix the mask/compare/disable-mask logic

Our current handling of the mask/compare logic in the Cadence
GEM ethernet device is wrong:
(1) we load the same byte twice from rx_buf when
     creating the compare value
(2) we ignore the DISABLE_MASK flag

The "Cadence IP for Gigabit Ethernet MAC Part Number: IP7014 IP Rev:
R1p12 - Doc Rev: 1.3 User Guide" states that if the DISABLE_MASK bit
in type2_compare_x_word_1 is set, the mask_value field in
type2_compare_x_word_0 is used as an additional 2 byte Compare Value.

Correct these bugs:
* in the !disable_mask codepath, use lduw_le_p() so we
   correctly load a 16-bit value for comparison
* in the disable_mask codepath, we load a full 4-byte value
   from rx_buf for the comparison, set the compare value to
   the whole of the cr0 register (i.e. the concatenation of
   the mask and compare fields), and set mask to 0xffffffff
   to force a 32-bit comparison

Signed-off-by: Andrew Yuan <andrew.yuan@jaguarmicro.com>
Message-id: 20241219061658.805-1-andrew.yuan@jaguarmicro.com
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[PMM: Expand commit message and comment]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/cpu/arm: Declare CPU QOM types using DEFINE_TYPES() macro

When multiple QOM types are registered in the same file,
it is simpler to use the the DEFINE_TYPES() macro. In
particular because type array declared with such macro
are easier to review.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20250130112615.3219-7-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/cpu/arm: Alias 'num-cpu' property on TYPE_REALVIEW_MPCORE

No need to duplicate and forward the 'num-cpu' property from
TYPE_ARM11MPCORE_PRIV to TYPE_REALVIEW_MPCORE, alias it with
QOM object_property_add_alias().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20250130112615.3219-6-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/arm/fsl-imx7: Add local 'mpcore/gic' variables

The A7MPCore forward the IRQs from its internal GIC.
To make the code clearer, add the 'mpcore' and 'gic'
variables.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250130112615.3219-5-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/arm/fsl-imx6ul: Add local 'mpcore/gic' variables

The A7MPCore forward the IRQs from its internal GIC.
To make the code clearer, add the 'mpcore' and 'gic'
variables. Rename 'd' variable as 'cpu'.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250130112615.3219-4-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/arm/fsl-imx6: Add local 'mpcore/gic' variables

The A9MPCore forward the IRQs from its internal GIC.
To make the code clearer, add the 'mpcore' and 'gic'
variables.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250130112615.3219-3-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hw/arm/boot: Propagate vCPU to arm_load_dtb()

In heterogeneous setup the first vCPU might not be
the one expected, better pass it explicitly.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20250130112615.3219-2-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target/arm: Drop unused AArch64DecodeTable typedefs

We removed the old table-based decoder in favour of decodetree, but
we left a couple of typedefs that are now unused; delete them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250128135046.4108775-1-peter.maydell@linaro.org

tests/tcg/arm: Remove test-arm-iwmmxt test

The test-arm-iwmmmxt test isn't testing what it thinks it's testing.

If you run it with a CPU type that supports iwMMXt then it will crash
immediately with a SIGILL, because (even with -marm) GCC will link it
against startup code that is in Thumb mode, and no iwMMXt CPU has
Thumb:

00010338 <_start>:
   10338:       f04f 0b00       mov.w   fp, #0
   1033c:       f04f 0e00       mov.w   lr, #0

If you run it with a CPU type which does *not* support iwMMXt, which
is what 'make check-tcg' does, then QEMU will not try to handle the
insns as iwMMXt.  Instead the translator turns them into illegal
instructions.  Then in the linux-user cpu_loop() code we identify
them as FPA11 instructions inside emulate_arm_fpa11(), because the
FPA11 happened to use the same coprocessor number as these iwMMXt
insns.  So we execute a completely different set of FPA11 insns,
which means we don't crash, but we will print garbage to stdout.
Then the test binary always exits with a 0 return code, so 'make
check-tcg' thinks the test passes.

Modern gnueabihf toolchains assume in their startup code that the CPU
is not so old as to not support Thumb, so there's no way to get them
to generate a binary that actually does what the test wants.  Since
we're deprecating iwMMXt emulation anyway, it's not worth trying to
salvage the test case to get it to really test the iwMMXt insns.

Delete the test entirely.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250127112715.2936555-3-peter.maydell@linaro.org

target/arm: deprecate the pxa2xx CPUs and iwMMXt emulation

The pxa2xx CPUs are now only useful with user-mode emulation, because
we dropped all the machine types that used them in 9.2.  (Technically
you could alse use "-cpu pxa270" with a board model like versatilepb
which doesn't sanity-check the CPU type, but that has never been a
supported config.)

To use them (or iwMMXt emulation) with QEMU user-mode you would need
to explicitly select them with the -cpu option or the QEMU_CPU
environment variable.  A google search finds no examples of anybody
doing this in the last decade; I don't believe the GCC folks are
using QEMU to test their iwMMXt codegen either.  In fact, GCC is in
the process of dropping support for iwMMXT entirely.

The iwMMXt emulation is thousands of lines of code in QEMU, and
is now the only bit of Arm insn decode which doesn't use decodetree.
We have no way to test or validate changes to it. This code is
just dead weight that is almost certainly not being used by anybody.
Mark it as deprecated.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250127112715.2936555-2-peter.maydell@linaro.org

rust: pl011: use default set of lints

Being the first crate added to QEMU, pl011 has a rather restrictive
Clippy setup.  This can be sometimes a bit too heavy on its suggestions,
for example

error: this could be a `const fn`
   --> hw/char/pl011/src/device.rs:382:5
    |
382 | /     fn set_read_trigger(&mut self) {
383 | |         self.read_trigger = 1;
384 | |     }
    | |_____^

Just use the standard set that is present in rust/Cargo.toml, with
just a small adjustment to allow upper case acronyms which are used
for register names.

Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

tcg/optimize: optimize TSTNE using smask and zmask

Generalize the existing optimization of "TSTNE x,sign" and "TSTNE x,-1".
This can be useful for example in the i386 frontend, which will generate
tests of zero-extended registers against 0xffffffff.

Ironically, on x86 hosts this is a very slight pessimization in the very
case it's meant to optimize because

brcond_i64 cc_dst,$0xffffffff,tsteq,$L1

(test %ebx, %ebx) is 1 byte smaller than

brcond_i64 cc_dst,$0x0,eq,$L1

(test %rbx, %rbx). However, in general it is an improvement, especially
if it avoids placing a large immediate in the constant pool.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

tests/tcg/x86_64/fma: Test some x86 fused-multiply-add cases

Add a test case which tests some corner case behaviour of
fused-multiply-add on x86:
* 0 * Inf + SNaN should raise Invalid
* 0 * Inf + QNaN shouldh not raise Invalid
* tininess should be detected after rounding

There is also one currently-disabled test case:
* flush-to-zero should be done after rounding

This is disabled because QEMU's emulation currently does this
incorrectly (and so would fail the test). The test case is kept in
but disabled, as the justification for why the test running harness
has support for testing both with and without FTZ set.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20250116112536.4117889-3-peter.maydell@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: Do not raise Invalid for 0 * Inf + QNaN

In commit 8adcff4ae7 ("fpu: handle raising Invalid for infzero in
pick_nan_muladd") we changed the handling of 0 * Inf + QNaN to always
raise the Invalid exception regardless of target architecture.  (This
was a change affecting hppa, i386, sh4 and tricore.) However, this
was incorrect for i386, which documents in the SDM section 14.5.2
that for the 0 * Inf + NaN case that it will only raise the Invalid
exception when the input is an SNaN.  (This is permitted by the IEEE
754-2008 specification, which documents that whether we raise Invalid
for 0 * Inf + QNaN is implementation defined.)

Adjust the softfloat pick_nan_muladd code to allow the target to
suppress the raising of Invalid for the inf * zero + NaN case (as an
extra flag orthogonal to its choice for when to use the default NaN),
and enable that for x86.

We do not revert here the behaviour change for hppa, sh4 or tricore:
* The sh4 manual is clear that it should signal Invalid
* The tricore manual is a bit vague but doesn't say it shouldn't
* The hppa manual doesn't talk about fused multiply-add corner
   cases at all

Cc: qemu-stable@nongnu.org
Fixes: 8adcff4ae7 (""fpu: handle raising Invalid for infzero in pick_nan_muladd")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20250116112536.4117889-2-peter.maydell@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: add clippy configuration file

Configure the minimum supported Rust version (though strictly speaking
that's redundant with Cargo.toml), and the list of CamelCase identifiers
that are not Rust types.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: add docs

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'pull-9p-20250207' of https://github.com/cschoenebeck/qemu into staging

9pfs changes:

* Greg Kurz steps back as maintainer of 9pfs.

* Make multidevs=remap default option (instead of multidevs=warn)
  and update documentation related to this option.

* Improve tracing (i.e. usefulness of log output content).

# -----BEGIN PGP SIGNATURE-----
#
# iQJLBAABCgA1FiEEltjREM96+AhPiFkBNMK1h2Wkc5UFAmel2AEXHHFlbXVfb3Nz
# QGNydWRlYnl0ZS5jb20ACgkQNMK1h2Wkc5UOuw//YMqU1N1JV7ppHMOKgu9GBSPI
# xFS5fhGHQd0b2DhXlX0m0vlwCN5m7up7Q1NwkgXzfaIV3wdU8XH1DgMXonjQdBaT
# ipIGdCR9aeGCUlFjlkQCEUVooDAuyU2zqIpfzvX7eCUFI96S2DI+jjmSqOhlJSsz
# yJnTJOK0LuoMpaFDxHJiEuTfoMycUWT78hTHRE8h/UWfTcKrzz5uIajq05bA1QLd
# XRtMa2k8ZWZf+uOcky3Qq/g5UHCc9LUidnVvTBejRQeEeE+qhLX9CzTUF+elNRiF
# BKPjQLfdyTMeleOj3KWxAkIV0GP4LA5naEi7Li56Kx/qOSySOnlbfUR0rXm1L58t
# yJdtavJEXFiRDa1BPvZeuDEGbSf229gOvQtD9yhURtw9XtdhZ8WuX7edes9/S5xz
# nc0slBD4UMkUL0VqD5LX4nwW/IJbUGD/sgqg1sMMXC+cSptPWCTQjf56VGZHg0Kf
# oasIe2pKyOu0NEIpxpolu+AGlgGsOHgEknBdG5GJ/dFzKb+9sukVEFxzZP/eRQjp
# d7ShWkn1o96vMgReFaYIXboVXNdeePDjfvLLD8+xKp3KEvidGEQs1PGPWyU0juIT
# 2en3PwZfYvm3Wfys3dAl7g1XJOzFM177SHKYKbEEeziIeRNN5oTjPxAIu0+M2yVK
# y2Khd01ZJQBGhiYYgVw=
# =Tr40
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 07 Feb 2025 04:53:05 EST
# gpg:                using RSA key 96D8D110CF7AF8084F88590134C2B58765A47395
# gpg:                issuer "qemu_oss@crudebyte.com"
# gpg: Good signature from "Christian Schoenebeck <qemu_oss@crudebyte.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: ECAB 1A45 4014 1413 BA38  4926 30DB 47C3 A012 D5F4
#      Subkey fingerprint: 96D8 D110 CF7A F808 4F88  5901 34C2 B587 65A4 7395

* tag 'pull-9p-20250207' of https://github.com/cschoenebeck/qemu:
  MAINTAINERS: Mark me as reviewer only for 9pfs
  9pfs: improve v9fs_open() tracing
  9pfs: make multidevs=remap default
  9pfs: improve v9fs_walk() tracing

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

MAINTAINERS: Mark me as reviewer only for 9pfs

I still review 9pfs changes from time to time but I'm definitely
not able to do actual maintainer work. Drop my tree on the way
as I'll obviously not use it anymore, and it has been left
untouched since May 2020.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <20250115100849.259612-1-groug@kaod.org>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>

rust: include rust_version in Cargo.toml

Tell clippy the minimum supported Rust version for QEMU.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: remove unnecessary Cargo.toml metadata

Some items of Cargo.toml (readme, homepage, repository) are
only present because of clippy::cargo warnings being enabled in
rust/hw/char/pl011/src/lib.rs. But these items are not
particularly useful and would be all the same for all Cargo.toml
files in the QEMU workspace. Clean them up.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

rust: add --rust-target option for bindgen

Without it, recent bindgen will give an error

error: extern block cannot be declared unsafe

if rustc is not new enough to support the "unsafe extern" construct.

Cc: qemu-rust@nongnu.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20250206111514.2134895-1-pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

9pfs: improve v9fs_open() tracing

Improve tracing of 9p 'Topen' request type by showing open() flags as
human-readable text.

E.g. trace output:

  v9fs_open tag 0 id 12 fid 2 mode 100352

would become:

  v9fs_open tag=0 id=12 fid=2 mode=100352(RDONLY|NONBLOCK|DIRECTORY|
  TMPFILE|NDELAY)

Therefor add a new utility function qemu_open_flags_tostr() that converts
numeric open() flags from host's native O_* flag constants to a string
presentation.

9p2000.L and 9p2000.u protocol variants use different numeric 'mode'
constants for 'Topen' requests. Instead of writing string conversion code
for both protocol variants, use the already existing conversion functions
that convert the mode flags from respective protocol constants to host's
native open() numeric flag constants and pass that result to the new
string conversion function qemu_open_flags_tostr().

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <E1tTgDR-000oRr-9g@kylie.crudebyte.com>

9pfs: make multidevs=remap default

1a6ed33cc5 introduced option multidevs=remap|forbid|warn and made
"warn" the default option.

As it turned out though, e.g. by several reports in conjunction with
following 9p client issue:

https://github.com/torvalds/linux/commit/850925a8133c73c4a2453c360b2c3beb3bab67c9

Many people are just ignoring this warning, or even do not notice the
warning at all. Therefore make multidevs=remap the new default option to
prevent people to run into such kind of severe misbehaviours in the first
place.

From performance PoV the runtime overhead of multidevs=remap is
neglectable with few or even just only one device being shared with the
same 9p export, expected to be constant Theta(1). The inode numbers
emitted to guest also just loose one bit (since 6b6aa8285d) for the 1st
device being shared.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <09cc84e5561f66b6a8cf49b3532c6c78a6acc806.1734876877.git.qemu_oss@crudebyte.com>

9pfs: improve v9fs_walk() tracing

'Twalk' is the most important request type in the 9p protocol to look out
for when debugging 9p communication. That's because it is the only part
of the 9p protocol which actually deals with human-readable path names,
whereas all other 9p request types work on numeric file IDs (FIDs) only.

Improve tracing of 'Twalk' requests, e.g. let's say client wanted to walk
to "/home/bob/src", then improve trace output from:

v9fs_walk tag 0 id 110 fid 0 newfid 1 nwnames 3

to:

v9fs_walk tag=0 id=110 fid=0 newfid=1 nwnames=3 wnames={home, bob, src}

To achieve this, add a new helper function trace_v9fs_walk_wnames() which
converts the received V9fsString array of individual path elements into a
comma-separated string presentation for being passed to the tracing system.
As this conversion is somewhat expensive, this conversion function is only
called if tracing of event 'v9fs_walk' is currently enabled.

Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <E1tJamT-007Cqk-9E@kylie.crudebyte.com>

block: remove unused BLOCK_OP_TYPE_DATAPLANE

BLOCK_OP_TYPE_DATAPLANE prevents BlockDriverState from being used by
virtio-blk/virtio-scsi with IOThread. Commit b112a65c52aa ("block:
declare blockjobs and dataplane friends!") eliminated the main reason
for this blocker in 2014.

Nowadays the block layer supports I/O from multiple AioContexts, so
there is even less reason to block IOThread users. Any legitimate
reasons related to interference would probably also apply to
non-IOThread users.

The only remaining users are bdrv_op_unblock(BLOCK_OP_TYPE_DATAPLANE)
calls after bdrv_op_block_all(). If we remove BLOCK_OP_TYPE_DATAPLANE
their behavior doesn't change.

Existing bdrv_op_block_all() callers that don't explicitly unblock
BLOCK_OP_TYPE_DATAPLANE seem to do so simply because no one bothered to
rather than because it is necessary to keep BLOCK_OP_TYPE_DATAPLANE
blocked.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250203182529.269066-1-stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Add (NBD-based) tests for inactive nodes

This tests different types of operations on inactive block nodes
(including graph changes, block jobs and NBD exports) to make sure that
users manually activating and inactivating nodes doesn't break things.

Support for inactive nodes in other export types will have to come with
separate test cases because they have different dependencies like blkio
or root permissions and we don't want to disable this basic test when
they are not fulfilled.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20250204211407.381505-17-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Add qsd-migrate case

Test that it's possible to migrate a VM that uses an image on shared
storage through qemu-storage-daemon.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20250204211407.381505-16-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Add filter_qtest()

The open-coded form of this filter has been copied into enough tests
that it's better to move it into iotests.py.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20250204211407.381505-15-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

nbd/server: Support inactive nodes

In order to support running an NBD export on inactive nodes, we must
make sure to return errors for any operations that aren't allowed on
inactive nodes. Reads are the only operation we know we need for
inactive images, so to err on the side of caution, return errors for
everything else, even if some operations could possibly be okay.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20250204211407.381505-14-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/export: Add option to allow export of inactive nodes

Add an option in BlockExportOptions to allow creating an export on an
inactive node without activating the node. This mode needs to be
explicitly supported by the export type (so that it doesn't perform any
operations that are forbidden for inactive nodes), so this patch alone
doesn't allow this option to be successfully used yet.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250204211407.381505-13-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Drain nodes before inactivating them

So far the assumption has always been that if we try to inactivate a
node, it is already idle. This doesn't hold true any more if we allow
inactivating exported nodes because we can't know when new external
requests come in.

Drain the node around setting BDRV_O_INACTIVE so that requests can't
start operating on an active node and then in the middle it suddenly
becomes inactive. With this change, it's enough for exports to check
for new requests that they operate on an active node (or, like reads,
are allowed even on an inactive node).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20250204211407.381505-12-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/export: Don't ignore image activation error in blk_exp_add()

Currently, block exports can't handle inactive images correctly.
Incoming write requests would run into assertion failures. Make sure
that we return an error when creating an export can't activate the
image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250204211407.381505-11-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Support inactive nodes in blk_insert_bs()

Device models have a relatively complex way to set up their block
backends, in which blk_attach_dev() sets blk->disable_perm = true.
We want to support inactive images in exports, too, so that
qemu-storage-daemon can be used with migration. Because they don't use
blk_attach_dev(), they need another way to set this flag. The most
convenient is to do this automatically when an inactive node is attached
to a BlockBackend that can be inactivated.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250204211407.381505-10-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Add blockdev-set-active QMP command

The system emulator tries to automatically activate and inactivate block
nodes at the right point during migration. However, there are still
cases where it's necessary that the user can do this manually.

Images are only activated on the destination VM of a migration when the
VM is actually resumed. If the VM was paused, this doesn't happen
automatically. The user may want to perform some operation on a block
device (e.g. taking a snapshot or starting a block job) without also
resuming the VM yet. This is an example where a manual command is
necessary.

Another example is VM migration when the image files are opened by an
external qemu-storage-daemon instance on each side. In this case, the
process that needs to hand over the images isn't even part of the
migration and can't know when the migration completes. Management tools
need a way to explicitly inactivate images on the source and activate
them on the destination.

This adds a new blockdev-set-active QMP command that lets the user
change the status of individual nodes (this is necessary in
qemu-storage-daemon because it could be serving multiple VMs and only
one of them migrates at a time). For convenience, operating on all
devices (like QEMU does automatically during migration) is offered as an
option, too, and can be used in the context of single VM.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250204211407.381505-9-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Add option to create inactive nodes

In QEMU, nodes are automatically created inactive while expecting an
incoming migration (i.e. RUN_STATE_INMIGRATE). In qemu-storage-daemon,
the notion of runstates doesn't exist. It also wouldn't necessarily make
sense to introduce it because a single daemon can serve multiple VMs
that can be in different states.

Therefore, allow the user to explicitly open images as inactive with a
new option. The default is as before: Nodes are usually active, except
when created during RUN_STATE_INMIGRATE.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250204211407.381505-8-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Fix crash on block_resize on inactive node

In order for block_resize to fail gracefully on an inactive node instead
of crashing with an assertion failure in bdrv_co_write_req_prepare()
(called from bdrv_co_truncate()), we need to check for inactive nodes
also when they are attached as a root node and make sure that
BLK_PERM_RESIZE isn't among the permissions allowed for inactive nodes.
To this effect, don't enumerate the permissions that are incompatible
with inactive nodes any more, but allow only BLK_PERM_CONSISTENT_READ
for them.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250204211407.381505-7-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Don't attach inactive child to active node

An active node makes unrestricted use of its children and would possibly
run into assertion failures when it operates on an inactive child node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250204211407.381505-6-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

migration/block-active: Remove global active flag

Block devices have an individual active state, a single global flag
can't cover this correctly. This becomes more important as we allow
users to manually manage which nodes are active or inactive.

Now that it's allowed to call bdrv_inactivate_all() even when some
nodes are already inactive, we can remove the flag and just
unconditionally call bdrv_inactivate_all() and, more importantly,
bdrv_activate_all() before we make use of the nodes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250204211407.381505-5-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Inactivate external snapshot overlays when necessary

Putting an active block node on top of an inactive one is strictly
speaking an invalid configuration and the next patch will turn it into a
hard error.

However, taking a snapshot while disk images are inactive after
completing migration has an important use case: After migrating to a
file, taking an external snapshot is what is needed to take a full VM
snapshot.

In order for this to keep working after the later patches, change
creating a snapshot such that it automatically inactivates an overlay
that is added on top of an already inactive node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250204211407.381505-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Allow inactivating already inactive nodes

What we wanted to catch with the assertion is cases where the recursion
finds that a child was inactive before its parent. This should never
happen. But if the user tries to inactivate an image that is already
inactive, that's harmless and we don't want to fail the assertion.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250204211407.381505-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Add 'active' field to BlockDeviceInfo

This allows querying from QMP (and also HMP) whether an image is
currently active or inactive (in the sense of BDRV_O_INACTIVE).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250204211407.381505-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block-backend: Fix argument order when calling 'qapi_event_send_block_io_error()'

Commit 7452162adec25c10 introduced 'qom-path' argument to BLOCK_IO_ERROR
event but when the event is instantiated in 'send_qmp_error_event()' the
arguments for 'device' and 'qom_path' in
qapi_event_send_block_io_error() were reversed :

Generated code for sending event:

  void qapi_event_send_block_io_error(const char *qom_path,
                                      const char *device,
                                      const char *node_name,
                                      IoOperationType operation,
                                      [...]

Call inside send_qmp_error_event():

     qapi_event_send_block_io_error(blk_name(blk),
                                    blk_get_attached_dev_path(blk),
                                    bs ? bdrv_get_node_name(bs) : NULL, optype,
                                    [...]

This results into reporting the QOM path as the device alias and vice
versa which in turn breaks libvirt, which expects the device alias being
either a valid alias or empty (which would make libvirt do the lookup by
node-name instead).

Cc: qemu-stable@nongnu.org
Fixes: 7452162adec2 ("qapi: add qom-path to BLOCK_IO_ERROR event")
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Message-ID: <09728d784888b38d7a8f09ee5e9e9c542c875e1e.1737973614.git.pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>