
aspeed/smc: Reintroduce "dram-base" property for AST2700

The Aspeed SMC device model used to have a 'sdram_base' property. It
was removed by commit d177892d4a48 ("aspeed/smc: Remove unused
"sdram-base" property") because previous changes simplified the DMA
transaction model to use an offset in RAM and not the physical
address.

The AST2700 SoC has a larger address space (64-bit), and a new register,
DMA DRAM Side Address High Part (0x7C), is introduced to deal with the
high bits of the DMA address. To be able to compute the offset of the
DMA transaction, as done on the other SoCs, we will need to know where
the DRAM is mapped in the address space. Re-introduce a "dram-base"
property to hold this value.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Jamin Lin <jamin_lin@aspeedtech.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
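
The offset computation described above can be modelled in a few lines. This is a minimal sketch, not the device model's code: it assumes the new high-part register supplies the address bits above bit 31, and the function name, the register values and the DRAM base used here are all illustrative.

```python
def dma_dram_offset(addr_low: int, addr_high: int, dram_base: int) -> int:
    """Return the DRAM offset of a 64-bit DMA address split across two registers.

    addr_low  -- value of the existing 32-bit DMA DRAM side address register
    addr_high -- value of the new high-part register (assumed to hold bits 63:32)
    dram_base -- where DRAM is mapped in the address space ("dram-base")
    """
    dma_addr = ((addr_high & 0xFFFFFFFF) << 32) | (addr_low & 0xFFFFFFFF)
    if dma_addr < dram_base:
        raise ValueError("DMA address is below the DRAM base")
    return dma_addr - dram_base


# Illustrative values only: assume DRAM is mapped at 0x4_0000_0000.
ASSUMED_DRAM_BASE = 0x4_0000_0000
offset = dma_dram_offset(0x0010_0000, 0x4, ASSUMED_DRAM_BASE)
print(hex(offset))
```

Without the base, the full 64-bit address could not be turned back into the RAM offset that the simplified DMA transaction model expects.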

Merge tag 'virtio-grants-v8-tag' of https://gitlab.com/sstabellini/qemu into staging

virtio-grants-v8

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEE0E4zq6UfZ7oH0wrqiU+PSHDhrpAFAmZqEk4ACgkQiU+PSHDh
# rpBaBxAA1jTfkty2RWJ0LfU5ekxnEWSx63zVzDWESFOQRjp/rOk/FhHbqbHzXISk
# cbHjz2PX6mNSOiFoSOWsNP7Utg+7xPH34D+D/EH59bmrXYFHCXxYjIK/T8T2Jr2p
# /qx3x/qxGRXFq38WFHvLhdK/0obdOuF3M6W/Zz82z8ruo7uHBX4XuCsF2rWV0ydb
# mvfAh+iMwh1JQN/g/vHIf0h+2RQjGCfsez+xVnG4rSeE4UWn/4iaU5c6SJ80arwE
# mwlnDOysEXwIZuy0fi+RX8o4tUie8rcS19+rBoMskXCAJXQblV/Aqhq4qww2DtA+
# kjL7HTHZrccZOJME9dj5gIUHvjAa9wxDZ5luelNVGY+VNO1hWXfk8Rcl9rtvOmNZ
# FKwcj3HW0ggQQMlH5+QizFQhNM3iRoirzX3t9Vw3uNbmwyTjSHcN3qVBExeCLAaT
# +N6t+aBfCOL5ZVskFb6YYxvWe3gLSghFH4cN/l0VLngzuGFl4BUNny5aNaENQYbX
# OSwH3rsE45j6X4B0gtwBXWFC31WpA1wPBwKYwcPZNmKWl30oJsXUs9UrTMHu4H/Z
# NnpFTgGYBaPCqlhkdIVQkOTpY9q85LzxQ8A+uwBUK+4uZwnw9rPXf+If8kyX/5eL
# 1AlVfBAG9uSVT/+AqxW/49jQ6jHRQ9ZgL9y6H0N0Ql3nrQBMasI=
# =4mj9
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 12 Jun 2024 02:25:34 PM PDT
# gpg:                using RSA key D04E33ABA51F67BA07D30AEA894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>" [unknown]
# gpg:                 aka "Stefano Stabellini <sstabellini@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3  0AEA 894F 8F48 70E1 AE90

* tag 'virtio-grants-v8-tag' of https://gitlab.com/sstabellini/qemu:
  hw/arm: xen: Enable use of grant mappings
  xen: mapcache: Add support for grant mappings
  xen: mapcache: Pass the ram_addr offset to xen_map_cache()
  xen: mapcache: Unmap first entries in buckets
  xen: mapcache: Make MCACHE_BUCKET_SHIFT runtime configurable

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'migration-20240614-pull-request' of https://gitlab.com/farosas/qemu into staging

Migration pull request

- Nick's reenabling of ppc64 tests + speed improvements
- Yuan's IAA/QPL compression support for multifd
- Shameer's UADK compression support for multifd

# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmZseDcQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxncT/D/0RkSBDyY7Mg+WLIUkbXBFKxnCwpiDiub4K
# FsesQfTU8IBLTHSkAMeTipZ8MMg1odfTcB6CCzRpXdJ4V07UGCxKEV77WftiomUm
# bA/FmkvQRQh2iuEESV+6ciomvI33085TuZLguMQCsER1gv3BPCVjLZ3n7/oTm9MD
# IdLJx9x5vLKLgT1pfHJt0x9joER77Vk7JN97fuHHvcWBlUnZ1vsmWf3ZQnnWLJNf
# bg5TSlmxV1x/iGJh0GDIVyZHgBJ1jWKA7qONHxACP4mF14WFCVaQ8DYS+yL6Ggs3
# vAdOjTECE7kAbb6zk33NoZ8GO39xzrGTvYoxOGEnOCB8pco/dHyr01mdiH/NM+uF
# +OTymQhO8LqJ1VGPvkDfQy2CZmb7DbkER5Y/0zBPaUJCjqNlEQUoq5UfCJDPp5Am
# u5e29QQLWA1j4rsIA7L4HUP8KEuJrnANMSGaomJIjbR/rbLXwb0k5Fr9DL4J4bIu
# z6e+SMrY+0SMAmx5u9WG7HhVTw8yvZM1PnrvCvFGX35nNB0VJ0//lejLGNOXjcXm
# QZcytlkyKeLwn6mRJWCWlasbW07/lNegNRqBP394awFtG8OYKDgrHfTtxtJcLIiK
# buLmZezuI4XVPA2WxmK+viCAfPTukpnoLaQr1yxGH22VThqwjfcEyAHQFccSvY3y
# F3n9dtwpUQ==
# =HAv2
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 14 Jun 2024 10:04:55 AM PDT
# gpg:                using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg:                issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg:                 aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3  64CF C798 DC74 1BEC 319D

* tag 'migration-20240614-pull-request' of https://gitlab.com/farosas/qemu:
  tests/migration-test: add uadk compression test
  migration/multifd: Switch to no compression when no hardware support
  migration/multifd: Add UADK based compression and decompression
  migration/multifd: Add UADK initialization
  migration/multifd: add uadk compression framework
  configure: Add uadk option
  docs/migration: add uadk compression feature
  tests/migration-test: add qpl compression test
  migration/multifd: implement qpl compression and decompression
  migration/multifd: implement initialization of qpl compression
  migration/multifd: add qpl compression method
  configure: add --enable-qpl build option
  migration/multifd: put IOV initialization into compression method
  docs/migration: add qpl compression feature
  tests/qtest/migration-test: Use custom asm bios for ppc64
  tests/qtest/migration-test: Enable on ppc64 TCG
  tests/qtest/migration-test: Quieten ppc64 QEMU warnings
  tests/qtest: Move common define from libqos-spapr.h to new ppc-util.h

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tests/migration-test: add uadk compression test

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

migration/multifd: Switch to no compression when no hardware support

Send raw packets over if UADK hardware support is not available. This is to
satisfy the QEMU qtest CI, which may run on platforms that don't have UADK
hardware support. A subsequent patch will add support for the uadk migration
qtest.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

migration/multifd: Add UADK based compression and decompression

Use the UADK wd_do_comp_sync() API to (de)compress a normal page using
the hardware accelerator.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

migration/multifd: Add UADK initialization

Initialize UADK session and allocate buffers required. The actual
compression/decompression will only be done in a subsequent patch.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

migration/multifd: add uadk compression framework

Add the skeleton to support the uadk compression method.
Complete functionality will be added in subsequent patches.

Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

configure: Add uadk option

Add --enable-uadk and --disable-uadk options to enable and disable
the UADK compression accelerator. This is for using UADK based hardware
accelerators for live migration.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

docs/migration: add uadk compression feature

Document UADK (User Space Accelerator Development Kit) library details
and how to use it for migration.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zhangfei Gao <zhangfei.gao@linaro.org>
[s/Qemu/QEMU in docs]
Signed-off-by: Fabiano Rosas <farosas@suse.de>

tests/migration-test: add qpl compression test

Add qpl to the compression method tests for multifd migration.

The qpl compression supports a software path and a hardware
path (IAA device), and the hardware path is used first by
default. If the hardware path is unavailable, it will
automatically fall back to the software path for testing.

Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

migration/multifd: implement qpl compression and decompression

QPL compression and decompression will use the IAA hardware path if the IAA
hardware is available. Otherwise the QPL library software path is used.

The hardware path will automatically fall back to the QPL software path if
the IAA queues are busy. In some scenarios, this may happen frequently,
such as when 4 channels are configured but only one IAA device is available.
In the case of insufficient IAA hardware resources, retry and fallback can
help optimize performance:

1. Retry + SW fallback:
    total time: 14649 ms
    downtime: 25 ms
    throughput: 17666.57 mbps
    pages-per-second: 1509647

2. No fallback, always wait for work queues to become available
    total time: 18381 ms
    downtime: 25 ms
    throughput: 13698.65 mbps
    pages-per-second: 859607

If both the hardware and software paths fail, the uncompressed page is
sent directly.

Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
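
The fallback order described in this commit (hardware with retry, then software, then the uncompressed page) can be sketched as follows. This is a hypothetical model, not the QPL or QEMU API: the callables, the retry count and the use of zlib as a stand-in for the software path are all assumptions made for illustration.

```python
import zlib
from typing import Callable, Optional, Tuple

# A compressor returns the compressed bytes, or None if it could not do the job.
Compressor = Callable[[bytes], Optional[bytes]]


def compress_page(page: bytes,
                  hw_compress: Compressor,
                  sw_compress: Compressor,
                  hw_retries: int = 3) -> Tuple[str, bytes]:
    """Try the hardware path a few times, then software, then send the page raw."""
    for _ in range(hw_retries):
        out = hw_compress(page)
        if out is not None:
            return "hw", out
    out = sw_compress(page)
    if out is not None:
        return "sw", out
    return "raw", page


def busy_hardware(page: bytes) -> Optional[bytes]:
    """Stand-in for a hardware queue that is always busy."""
    return None


page = bytes(4096)
path, payload = compress_page(page, busy_hardware, zlib.compress)
print(path, len(payload))
```

In the figures quoted in the commit message, retry plus software fallback finished in 14649 ms against 18381 ms when always waiting for a work queue, roughly 20% less total time.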

migration/multifd: implement initialization of qpl compression

During initialization, a software job is allocated to each channel
for software path fallback when the IAA hardware is unavailable or
the hardware job submission fails. If the IAA hardware is available,
multiple hardware jobs are allocated for batch processing.

Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

migration/multifd: add qpl compression method

Add the Query Processing Library (QPL) compression method.

Introduce qpl as a new multifd migration compression method. It can
use the In-Memory Analytics Accelerator (IAA) to accelerate compression and
decompression, which not only reduces the network bandwidth requirement
but also reduces host CPU overhead for compression and decompression.

How to enable qpl compression during migration:
migrate_set_parameter multifd-compression qpl

No qpl compression level parameter is added because qpl only supports
level one, so users do not need to specify a compression level.

Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
[fixed docs spacing in migration.json]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
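
The monitor command in this commit message has a QMP counterpart, migrate-set-parameters. The sketch below only builds the JSON messages; the helper name is made up, and it assumes the multifd capability also has to be switched on and that the same settings are applied on both the source and the destination.

```python
import json
from typing import List


def qmp_enable_qpl() -> List[str]:
    """Build QMP messages that turn on multifd and select qpl compression."""
    commands = [
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [
             {"capability": "multifd", "state": True}]}},
        {"execute": "migrate-set-parameters",
         "arguments": {"multifd-compression": "qpl"}},
    ]
    return [json.dumps(command) for command in commands]


messages = qmp_enable_qpl()
for message in messages:
    print(message)
```

No compression level appears in the parameters, which matches the note that qpl only supports level one.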

configure: add --enable-qpl build option

Add --enable-qpl and --disable-qpl options to enable and disable
the QPL compression method for multifd migration.

The Query Processing Library (QPL) is an open-source library
that supports data compression and decompression features. It
is based on the deflate compression algorithm and uses Intel
In-Memory Analytics Accelerator (IAA) hardware for compression
and decompression acceleration.

For more on live migration with IAA, please refer to the document
docs/devel/migration/qpl-compression.rst

Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

migration/multifd: put IOV initialization into compression method

Different compression methods may require different numbers of IOVs.
With the streaming compression of zlib and zstd, all pages are
compressed into one data block, so two IOVs are needed: one for the
packet header and one for the compressed data block.

Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

docs/migration: add qpl compression feature

Add an introduction to the Intel Query Processing Library (QPL)
compression method.

Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

tests/qtest/migration-test: Use custom asm bios for ppc64

Similar to other archs, build a custom bios memory updater. Running the
test with OF code is a cool trick, but SLOF takes a long time to boot.
This reduces test time by around 3x (150s to 50s).

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

tests/qtest/migration-test: Enable on ppc64 TCG

ppc64 with TCG seems to no longer be failing this test, perhaps since
commit 03bfc2188f061 ("physmem: Fix migration dirty bitmap coherency
with TCG memory access") which is not ppc specific but was seen to hit
ppc64 quite easily.

Let's enable it again.

The s390x problem has been identified so mention it while we are
adjusting the comment.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

tests/qtest/migration-test: Quieten ppc64 QEMU warnings

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

tests/qtest: Move common define from libqos-spapr.h to new ppc-util.h

The spapr QEMU machine defaults are useful outside libqos, so create a
new header for ppc specific qtests and move them there.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>

Merge tag 'pull-request-2024-06-12' of https://gitlab.com/thuth/qemu into staging

* Fix loongarch64 avocado test
* Make qtests more flexible with regards to non-available CPU models
* Improvements for the test-smp-parse unit test

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmZpoEoRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVF6g/+JYTRKmaIduQIP9g2+NkM+qMTbjI9Ow47
# 8Vdj/ePMXNWOZsgMPkCUdisYeMZEC+XMcDN1xvwZXwLTMTJRacZCSFRpeN4P0m0W
# 6aQ28+tPNgx+B9Eh2kc4TpxbiqSH8u5u4GEN4Y07rcX/3YbYyjFgZD8orRu/nJ+H
# 0wV7Riq9csi1BkLxrgKaHocFSOl4eOga4OFi+u4wIn/xoW3MN0laxe4iuoQRMZPf
# gJLPRhEija4lto8iIKNxJbTABB0wEcWRWtgqcbHxdatqh1lPTPBpWxmdD/v1LJn+
# H/eO+oh05NQdlhw7+xfWF9PD+MpIePbZ28oNb3X3uURROTdcxpBAgpPipv07FsT4
# LmU2nIBQ4FcpDOkhLnLmBmFBNO6uDCzuGzxFRhX1SIiGMABqTDOKynBQSgQI2iB0
# 5J47XUwHtnOoCvf4SRA/MZG8zNSQZdJbnuOBLgZ+vsCG14mWM2NbfSUwRkH6pd/J
# fEbODuzHZoYgUTxjR9+WMbINAbNjMy+SP2sGZIBzcAIIkybKynOy58LoCyNT684U
# ean9bnc65908PJxEfsQ6k9kNwkK4GwOqZi+X383nVgMJ9+3dDw8M76IVU59hsq1n
# wnz4VgFcRdXMYhj9zghaCgH2Ezw8gZHILXH+RlX0Bav4LQ5vSZQ6tRNwM4+rfXBe
# okF1Sxmz31U=
# =s7+V
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 12 Jun 2024 06:19:06 AM PDT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]

* tag 'pull-request-2024-06-12' of https://gitlab.com/thuth/qemu:
  tests/tcg/s390x: Allow specifying extra QEMU options on the command line
  tests/unit/test-smp-parse: Test the full 8-levels topology hierarchy
  tests/unit/test-smp-parse: Test "modules" and "dies" combination case
  tests/unit/test-smp-parse: Test "modules" parameter in -smp
  tests/unit/test-smp-parse: Make test cases aware of module level
  tests/unit/test-smp-parse: Use default parameters=0 when not set in -smp
  tests/unit/test-smp-parse: Fix an invalid topology case
  tests/unit/test-smp-parse: Fix comment of parameters=1 case
  tests/unit/test-smp-parse: Fix comments of drawers and books case
  test: Remove libibumad dependence
  meson: Remove libibumad dependence
  tests/qtest/x86: check for availability of older cpu models before running tests
  tests/qtest/libqtest: add qtest_has_cpu_model() api
  qtest/x86/numa-test: do not use the obsolete 'pentium' cpu
  tests/avocado: Update LoongArch bios file

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging

Block layer patches

- crypto: Fix crash when used with multiqueue devices
- linux-aio: add IO_CMD_FDSYNC command support
- copy-before-write: Avoid integer overflows for timeout > 4s
- Fix crash with QMP block_resize and iothreads
- qemu-io: add cvtnum() error handling for zone commands
- Code cleanup

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmZoitoRHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9Z2ng/+KVz0P1M6fjdI0yJSwAla3PVRfB0BjZ+k
# pwoUaHholVB4lmhU8OhtUHgEPK/jIZVdgwfG2se8WHR3eAzEVTWqt5mRIjOVUX4b
# N29G6gTHt8p64YKSbiqnyK2IC7qhY/o3hQ+d8frk+tqstc2tzFHDtjkWtYROdl/X
# iNW6zXy1rz5qIyJ80QWvBs7CfQuvElzK0GN2QusSZDEUJYiLhVS6QfjNmRfJI5yT
# /eDoHAjMJycxy+8YpEj1QEdEcFV7dS0BCr6qeWeAg50Gej1xlDeknejG+Cro2A1z
# MJu4blqMhzzjG9YIS90wCDOxXYdifa1VQSIpV6zpU1ExToXFOVtF3h06Hu0aHiBu
# hU4UnTsQSLmlQXbSbFwlVgRdGfAxvIxp6EuWtPteSAfnxAlxoQbqnV6uN/RsFnsr
# R+zSiNx+20IDj4befzcQWNWpWNnTloRR01/iucncEpZZEu0/E58Y4bFAWBexMOhz
# MgYTXTVgR+WPuyR8FXyXX32dQBQMb5grSnseXwOBhi3ULrMqjLinR60B+XbWgy/g
# mE/oLc+uttAk1EbHH/8od8vjvtDHdl9FrfsPaPDlJTiexqNZHxiDE3WVdhvaPsTF
# wJ0CB7pdvrWIAVwmSpfksVoyL2HQx2ILjGSQbKPvYEZqSoUMr7+7Z0SkTQ1i706b
# xODS2wm+h0Q=
# =hMLb
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 11 Jun 2024 10:35:22 AM PDT
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]

* tag 'for-upstream' of https://repo.or.cz/qemu/kevin:
  crypto/block: drop qcrypto_block_open() n_threads argument
  block/crypto: create ciphers on demand
  linux-aio: add IO_CMD_FDSYNC command support
  block/copy-before-write: use uint64_t for timeout in nanoseconds
  qemu-io: add cvtnum() error handling for zone commands
  aio: warn about iohandler_ctx special casing
  Revert "monitor: use aio_co_reschedule_self()"
  block: drop force_dup parameter of raw_reconfigure_getfd()

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu into staging

Pull request

Cleanups from Philippe Mathieu-Daudé.

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmZnNCQACgkQnKSrs4Gr
# c8hRQgf/WDNO0IvplK4U9PO5+Zm165xqY6lttfgniJzT2jb4p/dg0LiNOSqHx53Q
# 2eM/YJl7GxSXwnIESqNVuVxixh8DvExmOtM8RJm3HyJWtZoKfgMOV/dzHEhST3xj
# PglIEwL5Cm14skhQAVhJXzFlDjZ8seoz+YCbLhcYWk2B3an+5PvFySbp4iHS9cXJ
# lZUZx/aa9xjviwzMbsMxzFt3rA22pgNaxemV40FBIMWC0H+jP5pgBdZXE2n8jJvB
# 9eXZyG1kdkJKXO2DMhPYuG4rEEWOhV6dckXzmxCQEbHlGTH7X3Pn1F5B3+agi9g3
# 39U1Z+WFb8JFLOQMCQ3jlcbkIfULzQ==
# =wqXR
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 10 Jun 2024 10:13:08 AM PDT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]

* tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu:
  tracetool: Forbid newline character in event format
  hw/vfio: Remove newline character in trace events
  hw/usb: Remove newline character in trace events
  hw/sh4: Remove newline character in trace events
  backends/tpm: Remove newline character in trace event
  tracetool: Remove unused vcpu.py script

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tests/tcg/s390x: Allow specifying extra QEMU options on the command line

The use case for this is `make check-tcg EXTFLAGS="-accel kvm"`,
which allows validating the system TCG testcases on real hardware.
EXTFLAGS name is borrowed from tests/tcg/xtensa/Makefile.softmmu-target.
While at it, use += instead of = in order to be consistent with the
other architectures.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240522184116.35975-1-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/unit/test-smp-parse: Test the full 8-levels topology hierarchy

With the module level, QEMU now supports an 8-level topology hierarchy.
Cover "modules" in SMP_CONFIG_WITH_FULL_TOPO related cases.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-9-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
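
For readers unfamiliar with the hierarchy, the eight levels multiply together to give the CPU count. The sketch below is a simplified model rather than the parser under test: it treats any omitted level as 1 and does not infer missing values the way the real -smp parsing can.

```python
from math import prod

# The eight topology levels, outermost first.
LEVELS = ("drawers", "books", "sockets", "dies",
          "clusters", "modules", "cores", "threads")


def max_cpus(topology: dict) -> int:
    """Multiply the per-level counts; a level that is not given counts as 1."""
    unknown = set(topology) - set(LEVELS)
    if unknown:
        raise ValueError(f"unknown topology levels: {sorted(unknown)}")
    return prod(topology.get(level, 1) for level in LEVELS)


full_topology = {level: 2 for level in LEVELS}
print(max_cpus(full_topology))
```

A full 8-level case with two units at every level therefore describes 2^8 = 256 CPUs.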

tests/unit/test-smp-parse: Test "modules" and "dies" combination case

Since i386 PC machine supports both "modules" and "dies" in -smp, add the
"modules" and "dies" combination test case to match the actual topology
usage scenario.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-8-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/unit/test-smp-parse: Test "modules" parameter in -smp

Cover the module cases in test-smp-parse.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-7-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/unit/test-smp-parse: Make test cases aware of module level

-smp now supports the module level.

It is necessary to consider the effects of modules in the test cases to
ensure that the calculations are correct. This also prepares for
adding module test cases.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-6-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/unit/test-smp-parse: Use default parameters=0 when not set in -smp

-smp accepts parameters=1 whether or not the level is supported by the
machine. To avoid test scenarios where a parameter defaulting to 1
masks errors, explicitly set undesired parameters to 0.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-5-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/unit/test-smp-parse: Fix an invalid topology case

Adjust the "cpus" parameter to match the comment configuration.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-4-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/unit/test-smp-parse: Fix comment of parameters=1 case

SMP_CONFIG_WITH_FULL_TOPO doesn't support the module level yet, so the
parameter should indicate "clusters".

Additionally, reorder the parameters of -smp to match the topology
hierarchy order.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-3-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/unit/test-smp-parse: Fix comments of drawers and books case

Fix the comments to match the actual configurations.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-2-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

test: Remove libibumad dependence

Remove libibumad dependence from the test environment.

Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240611105427.61395-3-pizhenwei@bytedance.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

meson: Remove libibumad dependence

RDMA based migration has no dependence on libibumad. libibverbs and
librdmacm are enough.
libibumad was used by rdmacm-mux, which has already been removed. The
dependence was left in by mistake.

Fixes: 1dfd42c4264b ("hw/rdma: Remove deprecated pvrdma device and rdmacm-mux helper")
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240611105427.61395-2-pizhenwei@bytedance.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/qtest/x86: check for availability of older cpu models before running tests

It is better to check if some older cpu models like 486, athlon, pentium,
penryn, phenom, core2duo etc are available before running their corresponding
tests. Some downstream distributions may no longer support these older cpu
models.

Signature of add_feature_test() has been modified to return void as
FeatureTestArgs* was not used by the caller.

One minor correction. Replaced 'phenom' with '486' in the test
'x86/cpuid/auto-level/phenom/arat' matching the cpu used.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240610155303.7933-4-anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/qtest/libqtest: add qtest_has_cpu_model() api

Add a new test API, qtest_has_cpu_model(), to check the availability of
some cpu models in the current QEMU binary. The specific architecture of the
QEMU binary is selected using the QTEST_QEMU_BINARY environment variable.
This API is useful for running tests against some older cpu models after
checking that QEMU actually supports these models.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240610155303.7933-3-anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
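
The idea behind such a check can be modelled outside the C test library. The sketch below is hypothetical and is not the libqtest implementation: it assumes the list of models is shaped like the reply to the QMP query-cpu-definitions command, where each entry carries a "name" field, and the sample reply is invented.

```python
from typing import Dict, List


def has_cpu_model(cpu_definitions: List[Dict], name: str) -> bool:
    """Return True if a CPU model called `name` is in the definitions list."""
    return any(entry.get("name") == name for entry in cpu_definitions)


# Invented reply for illustration; a real binary would report many more models.
sample_reply = [{"name": "max"}, {"name": "486"}, {"name": "core2duo"}]

wanted = ["486", "pentium", "core2duo"]
runnable = [model for model in wanted if has_cpu_model(sample_reply, model)]
print(runnable)
```

Tests for models that a downstream build has dropped are then skipped instead of failing.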

qtest/x86/numa-test: do not use the obsolete 'pentium' cpu

'pentium' cpu is old and obsolete and should be avoided for running tests if
it's not strictly needed. Use 'max' cpu instead for generic non-cpu specific
numa test.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-ID: <20240610155303.7933-2-anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/avocado: Update LoongArch bios file

With the old bios, the VM boots up only 1 cpu, causing the test case to fail.
Update the bios to solve this problem.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Message-ID: <20240604030058.2327145-1-gaosong@loongson.cn>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tracetool: Forbid newline character in event format

Events aren't designed to be multi-line. Multiple events
can be used instead. Prevent formats from spanning multiple
lines by forbidding the newline character.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Mads Ynddal <mads@ynddal.dk>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240606103943.79116-6-philmd@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
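
A check of this kind is short to express. The sketch below is illustrative and is not the tracetool code: it assumes the format string is examined as written in a trace-events file, where a newline appears as the two characters backslash and n, and both the function name and the error message are made up.

```python
def check_event_format(fmt: str) -> None:
    """Reject a trace event format string that contains a newline escape."""
    # The file holds the escape sequence itself, i.e. a backslash followed by 'n'.
    if "\\n" in fmt:
        raise ValueError("trace event format must not contain a newline")


check_event_format('"addr 0x%x size %d"')      # accepted, returns None

try:
    check_event_format(r'"first part\nsecond part"')
except ValueError as err:
    print("rejected:", err)
```

An event that needs two lines of output is instead split into two events, as the tpm_util_show_buffer change in this series does.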

hw/vfio: Remove newline character in trace events

Trace events aren't designed to be multi-line.
Remove the newline characters.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Mads Ynddal <mads@ynddal.dk>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240606103943.79116-5-philmd@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

hw/usb: Remove newline character in trace events

Trace events aren't designed to be multi-line.
Remove the newline characters.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Mads Ynddal <mads@ynddal.dk>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240606103943.79116-4-philmd@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

hw/sh4: Remove newline character in trace events

Trace events aren't designed to be multi-line. Remove
the newline character which doesn't bring much value.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Mads Ynddal <mads@ynddal.dk>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240606103943.79116-3-philmd@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

backends/tpm: Remove newline character in trace event

Split the 'tpm_util_show_buffer' event in two to avoid
using a newline character.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Mads Ynddal <mads@ynddal.dk>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-id: 20240606103943.79116-2-philmd@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

tracetool: Remove unused vcpu.py script

vcpu.py is pointless since commit 89aafcf2a7 ("trace:
remove code that depends on setting vcpu"), remove it.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-id: 20240606102631.78152-1-philmd@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

crypto/block: drop qcrypto_block_open() n_threads argument

The n_threads argument is no longer used since the previous commit.
Remove it.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240527155851.892885-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/crypto: create ciphers on demand

Ciphers are pre-allocated by qcrypto_block_init_cipher() depending on
the given number of threads. The -device
virtio-blk-pci,iothread-vq-mapping= feature allows users to assign
multiple IOThreads to a virtio-blk device, but the association between
the virtio-blk device and the block driver happens after the block
driver is already open.

When the number of threads given to qcrypto_block_init_cipher() is
smaller than the actual number of threads at runtime, the
block->n_free_ciphers > 0 assertion in qcrypto_block_pop_cipher() can
fail.

Get rid of the n_threads argument of qcrypto_block_init_cipher() and
allocate ciphers on demand.

Reported-by: Qing Wang <qinwang@redhat.com>
Buglink: https://issues.redhat.com/browse/RHEL-36159
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240527155851.892885-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
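
The difference between a pre-sized pool and on-demand allocation can be shown with a small model. This is a sketch of the general technique only, not the qcrypto code: the class, its method names and the factory are hypothetical.

```python
import threading


class CipherPool:
    """Hand out cipher objects, creating a new one whenever none is free."""

    def __init__(self, factory):
        self._factory = factory          # callable that builds one cipher
        self._free = []                  # ciphers returned and ready for reuse
        self._lock = threading.Lock()
        self.created = 0

    def pop(self):
        with self._lock:
            if self._free:
                return self._free.pop()
            self.created += 1
        # Nothing was free: build a new cipher instead of asserting.
        return self._factory()

    def push(self, cipher):
        with self._lock:
            self._free.append(cipher)


pool = CipherPool(object)               # `object` stands in for a real cipher
first = pool.pop()
second = pool.pop()                     # a second user arrives; no assertion
pool.push(first)
reused = pool.pop()
print(pool.created, reused is first)
```

Because the pool grows with demand, it no longer matters that the number of IOThreads is only known after the block driver has been opened.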

linux-aio: add IO_CMD_FDSYNC command support

Libaio defines IO_CMD_FDSYNC command to sync all outstanding
asynchronous I/O operations, by flushing out file data to the
disk storage. Enable linux-aio to submit such aio request.

When using aio=native without fdsync() support, QEMU creates
pthreads, and destroying these pthreads results in TLB flushes.
In a real-time guest environment, TLB flushes cause a latency
spike. This patch helps to avoid such spikes.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Message-ID: <20240425070412.37248-1-ppandit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/copy-before-write: use uint64_t for timeout in nanoseconds

rather than the uint32_t for which the maximum is slightly more than 4
seconds and larger values would overflow. The QAPI interface allows
specifying the number of seconds, so only values 0 to 4 are safe right
now; other values lead to a much lower timeout than a user expects.

The block_copy() call where this is used already takes a uint64_t for
the timeout, so no change required there.
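
The wrap-around can be checked with a little arithmetic. The sketch
below only models the integer truncation; the helper name is
hypothetical and not part of QEMU:

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def timeout_ns(seconds, bits):
    """Convert seconds to nanoseconds, truncated to an unsigned field of
    the given width (32 models uint32_t, 64 models uint64_t)."""
    return (seconds * NANOSECONDS_PER_SECOND) & ((1 << bits) - 1)


# 4 s still fits in 32 bits (2**32 ns is about 4.29 s).
four_seconds = timeout_ns(4, 32)   # 4_000_000_000
# 5 s wraps around and becomes roughly 0.7 s.
five_seconds = timeout_ns(5, 32)   # 705_032_704
# With 64 bits the value is preserved.
five_seconds_64 = timeout_ns(5, 64)  # 5_000_000_000
```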

Fixes: 6db7fd1ca9 ("block/copy-before-write: implement cbw-timeout option")
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20240429141934.442154-1-f.ebner@proxmox.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-io: add cvtnum() error handling for zone commands

cvtnum() parses positive int64_t values and returns a negative errno on
failure. Print errors and return early when cvtnum() fails.

While we're at it, also reject nr_zones values greater than or equal to
2^32 since they cannot be represented.
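
The checking pattern looks roughly like the sketch below. It is a
stand-in written for illustration: parse_positive() only mimics the
"value or negative errno" contract described above (it accepts plain
decimal only), and the specific errno values chosen are assumptions.

```python
import errno

UINT32_LIMIT = 1 << 32  # nr_zones must be strictly below this


def parse_positive(text):
    """Stand-in for cvtnum(): return the value, or a negative errno."""
    try:
        value = int(text, 10)
    except ValueError:
        return -errno.EINVAL
    if value < 0:
        return -errno.EINVAL
    if value >= (1 << 63):
        return -errno.ERANGE  # does not fit in int64_t
    return value


def parse_nr_zones(text):
    """Parse nr_zones, returning early with a negative errno on failure."""
    value = parse_positive(text)
    if value < 0:
        return value              # propagate the parse error
    if value >= UINT32_LIMIT:
        return -errno.ERANGE      # cannot be represented in 32 bits
    return value
```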

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Cc: Sam Li <faithilikerun@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240507180558.377233-1-stefanha@redhat.com>
Reviewed-by: Sam Li <faithilikerun@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

aio: warn about iohandler_ctx special casing

The main loop has two AioContexts: qemu_aio_context and iohandler_ctx.
The main loop runs them both, but nested aio_poll() calls on
qemu_aio_context exclude iohandler_ctx.

Which one should qemu_get_current_aio_context() return when called from
the main loop? Document that it's always qemu_aio_context.

This has subtle effects on functions that use
qemu_get_current_aio_context(). For example, aio_co_reschedule_self()
does not work when moving from iohandler_ctx to qemu_aio_context because
qemu_get_current_aio_context() does not differentiate these two
AioContexts.

Document this in order to reduce the chance of future bugs.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240506190622.56095-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Revert "monitor: use aio_co_reschedule_self()"

Commit 1f25c172f837 ("monitor: use aio_co_reschedule_self()") was a code
cleanup that uses aio_co_reschedule_self() instead of open coding
coroutine rescheduling.

Bug RHEL-34618 was reported and Kevin Wolf <kwolf@redhat.com> identified
the root cause. I missed that aio_co_reschedule_self() ->
qemu_get_current_aio_context() only knows about
qemu_aio_context/IOThread AioContexts and not about iohandler_ctx. It
does not function correctly when going back from the iohandler_ctx to
qemu_aio_context.

Go back to open coding the AioContext transitions to avoid this bug.

This reverts commit 1f25c172f83704e350c0829438d832384084a74d.

Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-34618
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240506190622.56095-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: drop force_dup parameter of raw_reconfigure_getfd()

Since commit 72373e40fbc, this parameter is always passed as 'false'
from the caller.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
Message-ID: <20240430170213.148558-1-den@openvz.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Merge tag 'bsd-user-misc-2024q2-pull-request' of gitlab.com:bsdimp/qemu into staging

bsd-user: Baby Steps towards eliminating qemu_host_page_size, et al

First baby-steps towards eliminating qemu_host_page_size: tackle the reserve_va
calculation (which is easier to copy from linux-user than to fix).

# -----BEGIN PGP SIGNATURE-----
# Comment: GPGTools - https://gpgtools.org
#
# iQIzBAABCgAdFiEEIDX4lLAKo898zeG3bBzRKH2wEQAFAmZl3pgACgkQbBzRKH2w
# EQBfpg//U4YdJAA0H4okwPtowP1wIK1gpWvVd5FIN17pCXLKT4FR4efhWeEnQh8U
# +dXvkCpX/MnhBkStYoGZBmYe1rNKkEAn8BPCsQqX4y3af5RzKyKWo0gZXOjN3L9e
# ixmeFcg/7BTwnSbcO02xd9BOPPaRiFBDSidh28gr/1sxpXRxlbQHzIUpTBncDaN6
# 4w5DnF+b1RFHCz05ytrP517cj7E32Ig9S/cVMmBd1pGJiLnHiOp/peMprCL6tnI+
# YNBzttCbRPNH2z0zVd9En/hDnVirGPYX+LXg0Djkw3I+stJj4jwbJTuDG+5Lzghp
# YrYfiU6x7OG9ywjFJgY1/pExVT1cwkNjuGCXL+F4R49R5LfIEHq5/MlQp+tjpYYO
# g5WmpiLnFpFosmXIPJmxr16zqm2sLD+P0Jr/kdIz58fTWmIQeKwi/Vu/73h4kxST
# vjBbhC3eg56lQDaospc4h8+RehmI6LdSWYx0kxv2JKpXH3lQPqsDSrOcm9hEbWYS
# DeV++vkyQcXrbCnwomfxG1U+dVYBlJ1L1wClxc/1WD9KxXXJIwlvGmIu3o3c2+xj
# BM6eRe3evWioqdqhc2lY+XxATwbIUxiect6ml+F6E0KJxlm3Ajqy6qw49G+uhZxa
# XWUEIYGDd6/xHMlBeo6FKUpe/Ez/i3eCFXr4AD4iO7AtTuukrO4=
# =3EaH
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 09 Jun 2024 09:55:52 AM PDT
# gpg:                using RSA key 2035F894B00AA3CF7CCDE1B76C1CD1287DB01100
# gpg: Good signature from "Warner Losh <wlosh@netflix.com>" [unknown]
# gpg:                 aka "Warner Losh <imp@bsdimp.com>" [unknown]
# gpg:                 aka "Warner Losh <imp@freebsd.org>" [unknown]
# gpg:                 aka "Warner Losh <imp@village.org>" [unknown]
# gpg:                 aka "Warner Losh <wlosh@bsdimp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2035 F894 B00A A3CF 7CCD  E1B7 6C1C D128 7DB0 1100

* tag 'bsd-user-misc-2024q2-pull-request' of gitlab.com:bsdimp/qemu:
  bsd-user: Catch up to run-time reserved_va math
  bsd-user: port linux-user:ff8a8bbc2ad1 for variable page sizes
  linux-user: Adjust comment to reflect the code.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

hw/arm: xen: Enable use of grant mappings

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

xen: mapcache: Add support for grant mappings

Add a second mapcache for grant mappings. The mapcache for
grants needs to work with XC_PAGE_SIZE granularity since
we can't map larger ranges than what has been granted to us.

Like with foreign mappings (xen_memory), machines using grants
are expected to initialize the xen_grants MR and map it
into their address-map accordingly.

CC: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

xen: mapcache: Pass the ram_addr offset to xen_map_cache()

Pass the ram_addr offset to xen_map_cache.
This is in preparation for adding grant mappings that need
to compute the address within the RAMBlock.

No functional changes.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

xen: mapcache: Unmap first entries in buckets

When invalidating memory ranges, if we happen to hit the first
entry in a bucket we were never unmapping it. This was harmless
for foreign mappings but now that we're looking to reuse the
mapcache for transient grant mappings, we must unmap entries
when invalidated.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

xen: mapcache: Make MCACHE_BUCKET_SHIFT runtime configurable

Make MCACHE_BUCKET_SHIFT runtime configurable per cache instance.
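
To see why a per-instance shift matters, the sketch below splits an
address into a bucket index and an offset within the bucket. The helper
names and the shift values (12 for page-sized buckets, 20 for larger
ones) are illustrative assumptions, not values taken from this patch:

```python
def bucket_index(addr, bucket_shift):
    """Index of the bucket that covers addr."""
    return addr >> bucket_shift


def bucket_offset(addr, bucket_shift):
    """Offset of addr within its bucket."""
    return addr & ((1 << bucket_shift) - 1)


addr = 0x12345678
# Page-sized buckets (shift 12): fine-grained, one page per bucket.
small = (bucket_index(addr, 12), bucket_offset(addr, 12))
# Larger buckets (shift 20): fewer buckets, each covering 1 MiB.
large = (bucket_index(addr, 20), bucket_offset(addr, 20))
```

A cache that must not map more than what was granted would use the
smaller shift, while a cache for foreign mappings can keep a larger one.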

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

bsd-user: Catch up to run-time reserved_va math

Catch up to linux-user's 8f67b9c694d0, 13c13397556a, 2f7828b57293, and
95059f9c313a by Richard Henderson which made reserved_va a run-time
calculation, defaulting to nothing except in the case of 64-bit host
32-bit target. Also include the adjustment of the comment heading that
work submitted in the same patch stream. Since this is a direct copy,
squash it into one patch rather than follow the Linux evolution since
breaking this down further at this point doesn't make sense for this
"new code".

Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

bsd-user: port linux-user:ff8a8bbc2ad1 for variable page sizes

Bring in Richard Henderson's ff8a8bbc2ad1 to finalize the page size to
allow TARGET_PAGE_BITS_VARY. bsd-user's "blitz" fork has aarch64
support, which is now variable page size. Add support for it here, even
though it's effectively a nop in upstream qemu.

Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

linux-user: Adjust comment to reflect the code.

If the user didn't specify reserved_va, there's an else branch for a
64-bit host with a 32-bit (or smaller) target that reserves 32 bits of
address space. Update the comments to reflect this, and rejustify the
comment to 80 columns.
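
The decision described by the comment can be sketched as follows. The
function name is hypothetical, and representing the reservation as the
highest reserved address (2^32 - 1) is an assumption made for this
illustration:

```python
def default_reserved_va(user_value, host_bits, target_bits):
    """Pick reserved_va: honor the user's choice, otherwise reserve
    32 bits of address space only for a 64-bit host running a target
    of 32 bits or fewer."""
    if user_value is not None:
        return user_value
    if host_bits == 64 and target_bits <= 32:
        return (1 << 32) - 1
    return 0  # nothing reserved by default
```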

Signed-off-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'pull-hex-20240608' of https://github.com/quic/qemu into staging

idef-parser cleanup, HVX & PC-alignment fixes

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEEPWaq5HRZSCTIjOD4GlSvuOVkbDIFAmZk/L4ACgkQGlSvuOVk
# bDLKag//ZzAuoChJOkz7EPeRzFKWuz4QL9cXA6+FrWRoD43geXiJ/eDihlLIjFvr
# JN2deqaYZTyqlfbWR1BUIgkPxYnwBFlaqCnOO4xgbAaUJSxtdmkuWr8GBlftZt2s
# PV2Nm9pDjDOEJrnfbSA2f3nMkMa7e64N+tXZ5Svt8pJC8DOZg3oI3KXWX6uZZ5YA
# 9DAGgiHBlZONKQk/EebQ1DAcc+RDu68f+UtzsQ9Q4MiO/Mga/Z2u5wdOdrXmk5Lh
# ba6W4sLqBNU8oB6hkA5sy+5EhlzPIhX1+G1c21fRSlLR74BFK8ByZ802kWSVY1j/
# /MS01yH46Kb3aFVqpMvoYzBZ+kGlbMVKYY4c9AXtrH5tojHQ83ijnl2V/0y+s+i8
# f6bqErchbDZPM8H6vVDdbUewx3Sq/KA7WhiK9GCgnHWc0Z5kj15l121vJr6JVMwS
# fkccK1s8fOTUNCZNJiu4czakNQTGsf4jWGjcOo7EREstIXin0E/cUxZKrJWYshzc
# 88Ys1pxSk+1f7ajla4+uQ3oDw+RDqkA1unUA5cfJz/61ho5TWx6dcd5XKziNk7o4
# PyOhxfoLSV9j5+XczAO+nugpN0zQUHb7lz2k0sNiypScbXVSIw/ebKgYMVlLyMSf
# yEZTh8p+rbzmmJbkJBB5X/8kpU0qyp6fK5dRv1wvNPau0ExBwcs=
# =CwAl
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 08 Jun 2024 05:52:14 PM PDT
# gpg:                using RSA key 3D66AAE474594824C88CE0F81A54AFB8E5646C32
# gpg: Good signature from "Brian Cain (QUIC) <quic_bcain@quicinc.com>" [unknown]
# gpg:                 aka "Brian Cain <bcain@kernel.org>" [unknown]
# gpg:                 aka "Brian Cain (QuIC) <bcain@quicinc.com>" [unknown]
# gpg:                 aka "Brian Cain (CAF) <bcain@codeaurora.org>" [unknown]
# gpg:                 aka "bcain" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6350 20F9 67A7 7164 79EF  49E0 175C 464E 541B 6D47
#      Subkey fingerprint: 3D66 AAE4 7459 4824 C88C  E0F8 1A54 AFB8 E564 6C32

* tag 'pull-hex-20240608' of https://github.com/quic/qemu:
  target/hexagon: idef-parser simplify predicate init
  target/hexagon: idef-parser fix leak of init_list
  target/hexagon: idef-parser remove undefined functions
  target/hexagon: idef-parser remove unused defines
  Hexagon: add PC alignment check and exception
  Hexagon: fix HVX store new

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: idef-parser simplify predicate init

Only predicate instruction arguments need to be initialized by
idef-parser. This commit removes registers from the init_list and
simplifies gen_inst_init_args() slightly.

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240523125901.27797-5-anjo@rev.ng>
Signed-off-by: Brian Cain <bcain@quicinc.com>

target/hexagon: idef-parser fix leak of init_list

gen_inst_init_args() is called for instructions using a predicate as an
rvalue. Upon first call, init_list, the list of arguments which might
need initialization, is freed to indicate that they have been
processed. For instructions without an rvalue predicate,
gen_inst_init_args() isn't called and init_list will never be freed.

Free init_list from free_instruction() if it hasn't already been freed.
A comment in free_instruction is also updated.

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240523125901.27797-4-anjo@rev.ng>
Signed-off-by: Brian Cain <bcain@quicinc.com>

target/hexagon: idef-parser remove undefined functions

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240523125901.27797-3-anjo@rev.ng>
Signed-off-by: Brian Cain <bcain@quicinc.com>

target/hexagon: idef-parser remove unused defines

Before switching to GArray/g_string_printf we used fixed size arrays for
output buffers and instruction arguments, among other things.

Macros defining the sizes of these buffers were left behind; remove
them.

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <20240523125901.27797-2-anjo@rev.ng>
Signed-off-by: Brian Cain <bcain@quicinc.com>

Hexagon: add PC alignment check and exception

The Hexagon Programmer's Reference Manual says that the exception 0x1e
should be raised upon an unaligned program counter. Let's implement that
and also add some tests.
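
The check amounts to a mask test on the program counter. The sketch
below is a simplified model, assuming a 4-byte instruction alignment;
the constant and class names are hypothetical:

```python
PC_NOT_ALIGNED_CAUSE = 0x1E  # exception code named in the text above


class GuestException(Exception):
    """Toy stand-in for raising an exception in the guest."""

    def __init__(self, cause):
        super().__init__(f"guest exception 0x{cause:x}")
        self.cause = cause


def check_pc_alignment(pc, alignment=4):
    """Return pc if it is aligned, otherwise raise exception 0x1e."""
    if pc & (alignment - 1):
        raise GuestException(PC_NOT_ALIGNED_CAUSE)
    return pc
```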

Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-Id: <277b7aeda2c717a96d4dde936b3ac77707cb6517.1714755107.git.quic_mathbern@quicinc.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>

Hexagon: fix HVX store new

At 09a7e7db0f (Hexagon (target/hexagon) Remove uses of
op_regs_generated.h.inc, 2024-03-06), we've changed the logic of
check_new_value() to use the new pre-calculated
packet->insn[...].dest_idx instead of calculating the index on the fly
using opcode_reginfo[...]. The dest_idx index is calculated roughly like
the following:

    for reg in iset[tag]["syntax"]:
        if reg.is_written():
            dest_idx = regno
            break

Thus, we take the first register that is writable. Before that,
however, we also used to follow an alphabetical order on the register
type: 'd', 'e', 'x', and 'y'. No longer following that makes us select
the wrong register index and the HVX store new instruction does not
update the memory as expected.
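
The difference between the two selection rules can be shown with a toy
model. Registers are represented here as (type, is_written) pairs in
syntax order; this representation and the function names are
assumptions for illustration, not the generator's real data structures:

```python
TYPE_ORDER = "dexy"  # alphabetical order of the written register types


def first_written(regs):
    """Index of the first written register in syntax order."""
    for regno, (_rtype, written) in enumerate(regs):
        if written:
            return regno
    return None


def first_written_by_type(regs):
    """Index of the written register whose type sorts first
    ('d' before 'e' before 'x' before 'y')."""
    candidates = [
        (TYPE_ORDER.index(rtype), regno)
        for regno, (rtype, written) in enumerate(regs)
        if written
    ]
    return min(candidates)[1] if candidates else None


# An 'x' register appears before a 'd' register in the syntax.
regs = [("x", True), ("d", True)]
syntax_choice = first_written(regs)         # picks index 0
type_choice = first_written_by_type(regs)   # picks index 1
```

When both rules agree nothing changes, which is why only some
instructions were affected.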

Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Reviewed-by: Taylor Simpson <ltaylorsimpson@gmail.com>
Message-Id: <f548dc1c240819c724245e887f29f918441e9125.1716220379.git.quic_mathbern@quicinc.com>
Signed-off-by: Brian Cain <bcain@quicinc.com>

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* scsi-disk: Don't silently truncate serial number
* backends/hostmem: Report error on unavailable qemu_madvise() features or unaligned memory sizes
* target/i386: fixes and documentation for INHIBIT_IRQ/TF/RF and debugging
* i386/hvf: Adds support for INVTSC cpuid bit
* i386/hvf: Fixes for dirty memory tracking
* i386/hvf: Use hv_vcpu_interrupt() and hv_vcpu_run_until()
* hvf: Cleanups
* stubs: fixes for --disable-system build
* i386/kvm: support for FRED
* i386/kvm: fix MCE handling on AMD hosts

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZkF2oUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPNlQf+N9y6Eh0nMEEQ69twtV8ytglTY+uX
# FsogvnsXHNMVubOWmmeItM6kFXTAkR9cmFaL8dqI1Gs03xEQdQXbF1KejJZOAZVl
# RQMOW8Fg2Afr+0lwqCXHvhsmZ4hr5yUkRndyucA/E9AO2uGrtgwsWGDBGaHJOZIA
# lAsEMOZgKjXHZnefXjhMrvpk/QNovjEV6f1RHX3oKZjKSI5/G4IqGSmwNYToot8p
# 2fgs4Qti4+1gNyM2oBLq7cCMjMS61tSxOMH4uqVoIisjyckPlAFRvc+DXtKsUAAs
# 9AgM++pNgpB0IXv67czRUNdRoK7OI8I0ULhI4qHXi6Yg2QYAHqpQ6WL4Lg==
# =RP7U
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 08 Jun 2024 01:33:46 AM PDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (42 commits)
  python: mkvenv: remove ensure command
  Revert "python: use vendored tomli"
  i386: Add support for overflow recovery
  i386: Add support for SUCCOR feature
  i386: Fix MCE support for AMD hosts
  docs: i386: pc: Avoid mentioning limit of maximum vCPUs
  target/i386: Add get/set/migrate support for FRED MSRs
  target/i386: enumerate VMX nested-exception support
  vmxcap: add support for VMX FRED controls
  target/i386: mark CR4.FRED not reserved
  target/i386: add support for FRED in CPUID enumeration
  hvf: Makes assert_hvf_ok report failed expression
  i386/hvf: Updates API usage to use modern vCPU run function
  i386/hvf: In kick_vcpu use hv_vcpu_interrupt to force exit
  i386/hvf: Fixes dirty memory tracking by page granularity RX->RWX change
  hvf: Consistent types for vCPU handles
  i386/hvf: Fixes some compilation warnings
  i386/hvf: Adds support for INVTSC cpuid bit
  stubs/meson: Fix qemuutil build when --disable-system
  scsi-disk: Don't silently truncate serial number
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

python: mkvenv: remove ensure command

This was used to bootstrap the venv with a TOML parser, after which
ensuregroup is used. Now that we expect it to be present as a system
package (either tomli or, for Python 3.11, tomllib), it is not needed
anymore.

Note that this means that, when implemented, the hypothetical "isolated"
mode that does not use any system packages will only work with Python
3.11+.
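
For reference, the usual way to pick up whichever TOML parser is
available looks like the sketch below. The helper name is hypothetical;
tomllib is the standard-library module in Python 3.11+, and tomli is the
separately installed package for older versions:

```python
import importlib


def load_toml_module():
    """Return tomllib if available, else tomli, else None."""
    for name in ("tomllib", "tomli"):
        try:
            return importlib.import_module(name)
        except ModuleNotFoundError:
            continue
    return None


toml = load_toml_module()
if toml is not None:
    parsed = toml.loads('name = "qemu"')
```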

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Revert "python: use vendored tomli"

Now that Ubuntu 20.04 is not included anymore, there is no need to ship
it as part of QEMU; Ubuntu 22.04 includes it and Leap users anyway
need to install all the required dependencies from PyPI.

This mostly reverts commit ec77ee7634de123b7c899739711000fd21dab68b,
with just some changes to the wording.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386: Add support for overflow recovery

Add cpuid bit definition for overflow recovery. This is needed in the case
where a deferred error has been sent to the guest, a guest process accesses the
poisoned memory, but the machine_check_poll function has not yet handled the
original deferred error. If overflow recovery is not set in this case, when we
handle the uncorrected error from the poisoned memory access, the overflow bit
will be set and will result in the guest being shut down.

By the time the MCE reaches the guest, the overflow has been handled
by the host and has not caused a shutdown, so include the bit unconditionally.

Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-4-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386: Add support for SUCCOR feature

Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to
be exposed to guests to allow them to handle machine check exceptions on AMD
hosts.

----
v2:
- Add "succor" feature word.
- Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.

Reported-by: William Roche <william.roche@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-3-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386: Fix MCE support for AMD hosts

For the most part, AMD hosts can use the same MCE injection code as Intel, but
there are instances where the qemu implementation is Intel specific. First, MCE
delivery works differently on AMD and does not support broadcast. Second,
kvm_mce_inject generates MCEs that include a number of Intel specific status
bits. Modify kvm_mce_inject to properly generate MCEs on AMD platforms.

Reported-by: William Roche <william.roche@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-2-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

docs: i386: pc: Avoid mentioning limit of maximum vCPUs

Different versions of the PC machine support different maximum numbers
of vCPUs, and different features also limit the maximum number of vCPUs
(for example, if x2apic is not enabled in the TCG case, a maximum of
255 vCPUs is supported).

It is difficult to list the maximum vCPUs under all restrictions. Thus,
to avoid confusion, avoid mentioning specific maximum vCPU number
limitations here.

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240606085436.2028900-1-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: Add get/set/migrate support for FRED MSRs

FRED CPU states are managed in 9 new FRED MSRs, in addition to a few
existing CPU registers and MSRs, e.g., CR4.FRED and MSR_IA32_PL0_SSP.

Save/restore/migrate FRED MSRs if FRED is exposed to the guest.

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-7-xin3.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: enumerate VMX nested-exception support

Allow VMX nested-exception support to be exposed in KVM guests, thus
nested KVM guests can enumerate it.

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-6-xin3.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

vmxcap: add support for VMX FRED controls

Report secondary vm-exit controls and the VMX controls used to
save/load FRED MSRs.

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-5-xin3.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: mark CR4.FRED not reserved

The CR4.FRED bit, i.e., CR4[32], is no longer a reserved bit when FRED
is exposed to guests, otherwise it is still a reserved bit.
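
In mask terms, the rule can be sketched like this. The helper names and
the idea of passing in a base reserved mask are assumptions for
illustration; only the bit position comes from the text above:

```python
CR4_FRED = 1 << 32  # CR4[32]


def cr4_reserved_bits(base_reserved, fred_exposed):
    """Reserved-bit mask for CR4, with bit 32 reserved unless FRED
    is exposed to the guest."""
    if fred_exposed:
        return base_reserved & ~CR4_FRED
    return base_reserved | CR4_FRED


def cr4_write_allowed(value, base_reserved, fred_exposed):
    """A write is allowed only if it sets no reserved bit."""
    return (value & cr4_reserved_bits(base_reserved, fred_exposed)) == 0
```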

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20231109072012.8078-3-xin3.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: add support for FRED in CPUID enumeration

FRED, i.e., the Intel flexible return and event delivery architecture,
defines simple new transitions that change privilege level (ring
transitions).

The new transitions defined by the FRED architecture are FRED event
delivery and, for returning from events, two FRED return instructions.
FRED event delivery can effect a transition from ring 3 to ring 0, but
it is used also to deliver events incident to ring 0.  One FRED
instruction (ERETU) effects a return from ring 0 to ring 3, while the
other (ERETS) returns while remaining in ring 0.  Collectively, FRED
event delivery and the FRED return instructions are FRED transitions.

In addition to these transitions, the FRED architecture defines a new
instruction (LKGS) for managing the state of the GS segment register.
The LKGS instruction can be used by 64-bit operating systems that do
not use the new FRED transitions.

WRMSRNS is an instruction that behaves exactly like WRMSR, with the
only difference being that it is not a serializing instruction by
default.  Under certain conditions, WRMSRNS may replace WRMSR to improve
performance.  FRED uses it to switch RSP0 in a faster manner.

Search for the latest FRED spec in most search engines with this search
pattern:

  site:intel.com FRED (flexible return and event delivery) specification

The CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[17] enumerates FRED, and
the CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[18] enumerates LKGS, and
the CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[19] enumerates WRMSRNS.

Add CPUID definitions for FRED/LKGS/WRMSRNS, and expose them to KVM guests.

Because FRED relies on LKGS and WRMSRNS, add that to feature dependency
map.
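
As an illustration of the enumeration and the dependency, the sketch
below decodes an EAX value from CPUID.(EAX=7,ECX=1). The function names
are hypothetical, and the additional dependency on long mode mentioned
in the maintainer note is not modelled:

```python
FRED_BIT = 17
LKGS_BIT = 18
WRMSRNS_BIT = 19


def decode_cpuid_7_1_eax(eax):
    """Extract the FRED, LKGS and WRMSRNS flags from EAX."""
    return {
        "fred": bool((eax >> FRED_BIT) & 1),
        "lkgs": bool((eax >> LKGS_BIT) & 1),
        "wrmsrns": bool((eax >> WRMSRNS_BIT) & 1),
    }


def fred_usable(eax):
    """FRED relies on LKGS and WRMSRNS, so require all three."""
    flags = decode_cpuid_7_1_eax(eax)
    return flags["fred"] and flags["lkgs"] and flags["wrmsrns"]
```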

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-2-xin3.li@intel.com>
[Fix order of dependencies, add dependencies from LM to FRED. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hvf: Makes assert_hvf_ok report failed expression

When a macOS Hypervisor.framework call checked by assert_hvf_ok() fails,
QEMU exits printing the error value, but not the location in the code,
as regular assert() macro expansions would.

This change turns assert_hvf_ok() into a macro similar to other
assertions, which expands to a call to the corresponding _impl()
function together with information about the expression that failed
the assertion and its location in the code.

Additionally, stringifying the numeric hv_return_t code is factored
into a helper function that can be reused for diagnostics and debugging
outside of assertions.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-ID: <20240605112556.43193-8-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/hvf: Updates API usage to use modern vCPU run function

macOS 10.15 introduced the more efficient hv_vcpu_run_until() function
to supersede hv_vcpu_run(). According to the documentation, there is no
longer any reason to use the latter on modern host OS versions, especially
after 11.0 added support for an indefinite deadline.

Observed behaviour of the newer function is that as documented, it exits
much less frequently - and most of the original function’s exits seem to
have been effectively pointless.

Another reason to use the new function is that it is a prerequisite for
using newer features such as in-kernel APIC support. (Not covered by
this patch.)

This change implements the upgrade by selecting one of three code paths
at compile time: two static code paths for the new and old functions
respectively, when building for targets where the new function is either
not available, or where the built executable won’t run on older
platforms lacking the new function anyway. The third code path selects
dynamically based on runtime detected availability of the weakly-linked
symbol.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-ID: <20240605112556.43193-7-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/hvf: In kick_vcpu use hv_vcpu_interrupt to force exit

When interrupting a vCPU thread, this patch actually tells the hypervisor to
stop running guest code on that vCPU.

Calling hv_vcpu_interrupt actually forces a vCPU exit, analogously to
hv_vcpus_exit on aarch64. Alternatively, if the vCPU thread is not
running the VM, it will immediately cause an exit when it attempts
to do so.

Previously, hvf_kick_vcpu_thread relied upon hv_vcpu_run returning very
frequently, including many spurious exits, which made it less of a problem that
nothing was actively done to stop the vCPU thread running guest code.
The newer, more efficient hv_vcpu_run_until exits much more rarely, so a true
"kick" is needed before switching to that.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-ID: <20240605112556.43193-6-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/hvf: Fixes dirty memory tracking by page granularity RX->RWX change

When using x86 macOS Hypervisor.framework as accelerator, detection of
dirty memory regions is implemented by marking logged memory region
slots as read-only in the EPT, then setting the dirty flag when a
guest write causes a fault. The area marked dirty should then be marked
writable in order for subsequent writes to succeed without a VM exit.

However, dirty bits are tracked on a per-page basis, whereas the fault
handler was marking the whole logged memory region as writable. This
change fixes the fault handler so that only the protection of the single
faulting page is changed.

(Note: the dirty page tracking appeared to work despite this error
because HVF’s hv_vcpu_run() function generated unnecessary EPT fault
exits, which ended up causing the dirty marking handler to run even
when the memory region had been marked RW. When using
hv_vcpu_run_until(), a change planned for a subsequent commit, these
spurious exits no longer occur, so dirty memory tracking malfunctions.)

Additionally, the dirty page is set to permit code execution, the same
as all other guest memory; changing memory protection from RX to RW not
RWX appears to have been an oversight.
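
The page-granular range can be derived from the faulting guest physical
address by masking off the low bits. This is a minimal sketch, assuming
a 4 KiB page size; the function name is hypothetical:

```python
PAGE_SIZE = 4096  # assumed page size for this illustration


def faulting_page_range(gpa, page_size=PAGE_SIZE):
    """Return (start, length) of the single page containing gpa."""
    start = gpa & ~(page_size - 1)
    return start, page_size


# Only this one page would be re-protected and marked dirty,
# not the whole memory region that contains it.
page = faulting_page_range(0x12345)
```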

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-5-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hvf: Consistent types for vCPU handles

macOS Hypervisor.framework uses different types for identifying vCPUs,
hv_vcpu_t or hv_vcpuid_t, depending on host architecture. They are not
just differently named typedefs for the same primitive type, but
reference different-width integers.

Instead of using an integer type and casting where necessary, this
change introduces a typedef which resolves the active architecture’s
hvf typedef. It also removes a now-unnecessary cast.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-4-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/hvf: Fixes some compilation warnings

A bunch of function definitions used empty parentheses instead of (void)
syntax, yielding the following warning when building with clang on macOS:

warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]

In addition to fixing these function headers, it also fixes what appears
to be a typo causing a variable to be unused after initialisation.

warning: variable 'entry_ctls' set but not used [-Wunused-but-set-variable]

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-3-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

i386/hvf: Adds support for INVTSC cpuid bit

This patch adds the INVTSC bit to the Hypervisor.framework accelerator's
CPUID bit passthrough allow-list. Previously, specifying +invtsc in the CPU
configuration would fail with the following warning despite the host CPU
advertising the feature:

qemu-system-x86_64: warning: host doesn't support requested feature:
CPUID.80000007H:EDX.invtsc [bit 8]

x86 macOS itself relies on a fixed rate TSC for its own Mach absolute time
timestamp mechanism, so there's no reason we can't enable this bit for guests.
When the feature is enabled, a migration blocker is installed.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20240605112556.43193-2-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

stubs/meson: Fix qemuutil build when --disable-system

Building fails when QEMU is configured without system, user, tools and
guest-agent:

./configure --disable-system --disable-user --disable-tools \
--disable-guest-agent

The linker reports:

/usr/bin/ld: libqemuutil.a.p/util_error-report.c.o: in function `error_printf':
/media/liuzhao/data/qemu-cook/build/../util/error-report.c:38: undefined reference to `error_vprintf'
/usr/bin/ld: libqemuutil.a.p/util_error-report.c.o: in function `vreport':
/media/liuzhao/data/qemu-cook/build/../util/error-report.c:215: undefined reference to `error_vprintf'
collect2: error: ld returned 1 exit status

This is because tests/bench and tests/unit both need qemuutil, which
requires error_vprintf stub when system is disabled.

Add error_vprintf stub into stub_ss for all cases other than disabling
system.

Fixes: 3a15604900c4 ("stubs: include stubs only if needed")
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240605152549.1795762-1-zhao1.liu@intel.com>
[Include error-printf.c unconditionally. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

scsi-disk: Don't silently truncate serial number

Before this commit, scsi-disk accepts a string of arbitrary length for
its "serial" property. However, the value visible on the guest is
actually truncated to 36 characters. This limitation doesn't come from
the SCSI specification, it is an arbitrary limit that was initially
picked as 20 and later bumped to 36 by commit 48b62063.

Similarly, device_id was introduced as a copy of the serial number,
limited to 20 characters, but commit 48b62063 forgot to actually bump
it.

As long as we silently truncate the given string, extending the limit is
actually not a harmless change, but breaks the guest ABI. This is the
most important reason why commit 48b62063 was really wrong (and it's
also why we can't change device_id to be in sync with the serial number
again and use 36 characters now, it would be another guest ABI
breakage).

In order to avoid future breakage, don't silently truncate the serial
number string any more, but just error out if it would be truncated.
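
The difference between truncating and rejecting can be sketched as follows. This is only an illustration: demo_check_serial and DEMO_SERIAL_MAX are hypothetical names, and the value 36 is taken from the description above:

```c
#include <stddef.h>
#include <string.h>

#define DEMO_SERIAL_MAX 36 /* limit mentioned in the commit message */

/* Return 0 if the serial fits, -1 if it would have been truncated. */
static int demo_check_serial(const char *serial)
{
    return strlen(serial) > DEMO_SERIAL_MAX ? -1 : 0;
}
```

A caller would report an error for a -1 result instead of copying only the first 36 characters.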

Buglink: https://issues.redhat.com/browse/RHEL-3542
Suggested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240604161755.63448-1-kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hostmem: simplify the code for merge and dump properties

No semantic change, just simpler control flow.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

machine, hostmem: improve error messages for unsupported features

Detect early unsupported MADV_MERGEABLE and MADV_DONTDUMP, and print a clearer
error message that points to the deficiency of the host.

Cc: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

backends/hostmem: Report error when memory size is unaligned

If memory-backend-{file,ram} has a size that's not aligned to the
underlying page size, it is not only wasteful but may also lead
to hard-to-debug behaviour. For instance, with
memory-backend-file and hugepages, madvise() and mbind() fail.
Rightfully so, as a page is the smallest unit they can work with. And
even though an error is reported, the root cause is not very
clear:

qemu-system-x86_64: Couldn't set property 'dump' on 'memory-backend-file': Invalid argument

After this commit:

qemu-system-x86_64: backend 'memory-backend-file' memory size must be multiple of 2 MiB
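
The check itself is a divisibility test. A minimal sketch, with a hypothetical helper name and the page size passed in by the caller:

```c
#include <stdbool.h>
#include <stdint.h>

/* True when size is a whole number of pages; pagesize must be non-zero. */
static bool demo_size_is_aligned(uint64_t size, uint64_t pagesize)
{
    return size % pagesize == 0;
}
```

With 2 MiB hugepages, a 4 MiB backend passes this test while a backend of 4 MiB plus 4 KiB does not.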

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Message-ID: <b5b9f9c6bba07879fb43f3c6f496c69867ae3716.1717584048.git.mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

osdep: Make qemu_madvise() return ENOSYS on unsupported OSes

Not every OS supports madvise(), or even posix_madvise(). In
that case, errno should be set to ENOSYS as it reflects the cause
better.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <b381c23bd8f413f1453a2c1a66e0979beaf27433.1717584048.git.mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

osdep: Make qemu_madvise() to set errno in all cases

The unspoken premise of qemu_madvise() is that errno is set on
error. And it is mostly the case except for posix_madvise() which
is documented to return either zero (on success) or a positive
error number. This means we must set errno ourselves. And while
at it, make the function return a negative value on error, just
like other error paths do.
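
The conversion described here can be sketched in isolation. The helper name demo_normalize_ret is hypothetical; it takes a posix_madvise()-style result rather than calling the function itself:

```c
#include <errno.h>

/*
 * Hypothetical helper: convert a posix_madvise()-style result (zero on
 * success, positive error number on failure) into the usual convention
 * of returning -1 with errno set.
 */
static int demo_normalize_ret(int ret)
{
    if (ret == 0) {
        return 0;
    }
    errno = ret;
    return -1;
}
```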

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <af17113e7c1f2cc909ffd36d23f5a411b63b8764.1717584048.git.mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

meson: Don't even detect posix_madvise() on Darwin

On Darwin, posix_madvise() has the same return semantics as plain
madvise() [1]. That's not really what our usage expects.
Fortunately, madvise() is available and preferred anyways so we
may stop detecting posix_madvise() on Darwin.

1: https://opensource.apple.com/source/xnu/xnu-7195.81.3/bsd/man/man2/madvise.2.auto.html

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Message-ID: <00f71753bdeb8c0f049fda05fb63b84bb5502fb3.1717584048.git.mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

machine: default -M mem-merge to off if QEMU_MADV_MERGEABLE is not available

Otherwise, starting any guest on a non-Linux host results in

qemu-system-arm: Couldn't set property 'merge' on 'memory-backend-ram': Invalid argument

Cc: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: fix size of EBP writeback in gen_enter()

The calculation of FrameTemp is done using the size indicated by mo_pushpop()
before being written back to EBP, but the final writeback to EBP is done using
the size indicated by mo_stacksize().

In the case where mo_pushpop() is MO_32 and mo_stacksize() is MO_16 then the
final writeback to EBP is done using MO_16 which can leave junk in the top
16 bits of EBP after executing ENTER.

Change the writeback of EBP to use the same size indicated by mo_pushpop() to
ensure that the full value is written back.
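
The effect of the narrower writeback can be modelled on plain integers. This is a sketch with hypothetical helper names, not the TCG code from the patch:

```c
#include <stdint.h>

/* Write only the low 16 bits of val into reg, keeping reg's upper half. */
static uint32_t demo_writeback16(uint32_t reg, uint32_t val)
{
    return (reg & 0xffff0000u) | (val & 0x0000ffffu);
}

/* Write all 32 bits of val into reg; the old contents are discarded. */
static uint32_t demo_writeback32(uint32_t reg, uint32_t val)
{
    (void)reg;
    return val;
}
```

If EBP held 0xdead0000 and the computed frame value is 0x00011ff0, the 16-bit writeback leaves 0xdead1ff0 in EBP, whereas the 32-bit writeback stores the full 0x00011ff0.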

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198
Message-ID: <20240606095319.229650-5-mark.cave-ayland@ilande.co.uk>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: fix SP when taking a memory fault during POP

When OS/2 Warp configures its segment descriptors, many of them are configured with
the P flag clear to allow for a fault-on-demand implementation. In the case where
the stack value is POPped into the segment registers, the SP is incremented before
calling gen_helper_load_seg() to validate the segment descriptor:

IN:
0xffef2c0c: 66 07 popl %es

OP:
ld_i32 loc9,env,$0xfffffffffffffff8
sub_i32 loc9,loc9,$0x1
brcond_i32 loc9,$0x0,lt,$L0
st16_i32 loc9,env,$0xfffffffffffffff8
st8_i32 $0x1,env,$0xfffffffffffffffc

---- 0000000000000c0c 0000000000000000
ext16u_i64 loc0,rsp
add_i64 loc0,loc0,ss_base
ext32u_i64 loc0,loc0
qemu_ld_a64_i64 loc0,loc0,noat+un+leul,5
add_i64 loc3,rsp,$0x4
deposit_i64 rsp,rsp,loc3,$0x0,$0x10
extrl_i64_i32 loc5,loc0
call load_seg,$0x0,$0,env,$0x0,loc5
add_i64 rip,rip,$0x2
ext16u_i64 rip,rip
exit_tb $0x0
set_label $L0
exit_tb $0x7fff58000043

If helper_load_seg() generates a fault when validating the segment descriptor then,
because the SP has already been incremented, the topmost word of the stack is
overwritten by the arguments the CPU pushes onto the stack before entering the fault
handler. As a consequence things rapidly go wrong upon return from the fault handler
due to the corrupted stack.

Update the logic for the existing writeback condition so that a POP into the segment
registers also calls helper_load_seg() first before incrementing the SP, so that if a
fault occurs the SP remains unaltered.
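
The ordering can be illustrated with a toy model. Everything below is hypothetical (the struct, the helper names, and the rule that selector 0xffff faults); it only shows that SP is committed after validation succeeds:

```c
#include <stdbool.h>
#include <stdint.h>

struct demo_cpu {
    uint16_t sp;
    uint16_t es;
};

/* Hypothetical stand-in for descriptor validation; 0xffff always faults. */
static bool demo_load_seg(struct demo_cpu *cpu, uint16_t selector)
{
    if (selector == 0xffff) {
        return false;
    }
    cpu->es = selector;
    return true;
}

/* Validate first, and commit the new SP only if no fault was raised. */
static bool demo_pop_es(struct demo_cpu *cpu, uint16_t selector, uint16_t size)
{
    if (!demo_load_seg(cpu, selector)) {
        return false; /* SP left unaltered for the fault handler */
    }
    cpu->sp = (uint16_t)(cpu->sp + size);
    return true;
}
```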

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198
Message-ID: <20240606095319.229650-4-mark.cave-ayland@ilande.co.uk>
Fixes: cc1d28bdbe0 ("target/i386: move 00-5F opcodes to new decoder", 2024-05-07)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>