git.maquefel.me Git - qemu.git/log

vfio/migration: Refactor vfio_devices_all_dirty_tracking() logic

During dirty page log sync, vfio_devices_all_dirty_tracking() is used to
check if dirty tracking has been started in order to avoid errors. The
current logic checks if migration is in ACTIVE or DEVICE states to
ensure dirty tracking has been started.

However, recently there has been an effort to simplify the migration
status API and reduce it to a single migration_is_running() function.

To accommodate this, refactor vfio_devices_all_dirty_tracking() logic so
it won't use migration_is_active() and migration_is_device(). Instead,
use internal VFIO dirty tracking flags.

As a side effect, now that migration status is no longer used to detect
dirty tracking status, VFIO log syncs are untied from migration. This
will make calc-dirty-rate more accurate as now it will also include VFIO
dirty pages.

While at it, as VFIODevice->dirty_tracking is now used to detect dirty
tracking status, add a comment that states how it's protected.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Tested-by: Joao Martins <joao.m.martins@oracle.com>
Link: https://lore.kernel.org/r/20241218134022.21264-3-avihaih@nvidia.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>

vfio/container: Add dirty tracking started flag

Add a flag to VFIOContainerBase that indicates whether dirty tracking
has been started for the container or not.

This will be used in the following patches to allow dirty page syncs
only if dirty tracking has been started.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Tested-by: Joao Martins <joao.m.martins@oracle.com>
Link: https://lore.kernel.org/r/20241218134022.21264-2-avihaih@nvidia.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>

vfio/igd: add x-igd-gms option back to set DSM region size for guest

DSM region is likely to store framebuffer in Windows, a small DSM
region may cause display issues (e.g. half of the screen is black).
Since 971ca22f041b ("vfio/igd: don't set stolen memory size to zero"),
the x-igd-gms option was functionally removed, QEMU uses host's
original value, which is determined by DVMT Pre-Allocated option in
Intel FSP of host bios.

However, some vendors do not expose this config item to users. In
such cases, x-igd-gms option can be used to manually set the data
stolen memory size for guest. So this commit brings this option back,
keeping its old behavior. When it is not specified, QEMU uses host's
value.

When DVMT Pre-Allocated option is available in host BIOS, user should
set DSM region size there instead of using x-igd-gms option.

Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-11-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>

vfio/igd: emulate BDSM in mmio bar0 for gen 6-10 devices

A recent commit in i915 driver [1] claims the BDSM register at 0x1080c0
of mmio bar0 has been there since gen 6. Mirror this register to the 32
bit BDSM register at 0x5c in pci config space for gen6-10 devices.

[1] https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-7-ville.syrjala@linux.intel.com

Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-10-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>

vfio/igd: emulate GGC register in mmio bar0

The GGC register at 0x50 of pci config space is a mirror of the same
register at 0x108040 of mmio bar0 [1]. i915 driver also reads that
register from mmio bar0 instead of config space. As GGC is programmed
and emulated by qemu, the mmio address should also be emulated, in the
same way of BDSM register.

[1] 4.1.28, 12th Generation Intel Core Processors Datasheet Volume 2
https://www.intel.com/content/www/us/en/content-details/655259

Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-9-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>

vfio/igd: add macro for declaring mirrored registers

igd devices have multipe registers mirroring mmio address and pci
config space, more than a single BDSM register. To support this,
the read/write functions are made common and a macro is defined to
simplify the declaration of MemoryRegionOps.

Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-8-tomitamoeko@gmail.com
[ clg : Fixed conversion specifier on 32-bit platform ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>

vfio/igd: add Alder/Raptor/Rocket/Ice/Jasper Lake device ids

All gen 11 and 12 igd devices have 64 bit BDSM register at 0xC0 in its
config space, add them to the list to support igd passthrough on Alder/
Raptor/Rocket/Ice/Jasper Lake platforms.

Tested legacy mode of igd passthrough works properly on both linux and
windows guests with AlderLake-S GT1 (8086:4680).

Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-7-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>

vfio/igd: add Gemini Lake and Comet Lake device ids

Both Gemini Lake and Comet Lake are gen 9 devices. Many user reports
on internet shows legacy mode of igd passthrough works as qemu treats
them as gen 8 devices by default before e433f208973f ("vfio/igd:
return an invalid generation for unknown devices").

Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-6-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>

vfio/igd: canonicalize memory size calculations

Add helper functions igd_gtt_memory_size() and igd_stolen_size() for
calculating GTT stolen memory and Data stolen memory size in bytes,
and use macros to replace the hardware-related magic numbers for
better readability.

Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-5-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>

vfio/igd: align generation with i915 kernel driver

Define the igd device generations according to i915 kernel driver to
avoid confusion, and adjust comment placement to clearly reflect the
relationship between ids and devices.

The condition of how GTT stolen memory size is calculated is changed
accordingly as GGMS is in multiple of 2 starting from gen 8.

Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-4-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>

vfio/igd: remove unsupported device ids

Since e433f208973f ("vfio/igd: return an invalid generation for unknown
devices"), the default return of igd_gen() was changed to unsupported.
There is no need to filter out those unsupported devices.

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Link: https://lore.kernel.org/r/20241206122749.9893-3-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>

vfio/igd: fix GTT stolen memory size calculation for gen 8+

On gen 8 and later devices, the GTT stolen memory size when GGMS equals
0 is 0 (no preallocated memory) rather than 1MB [1].

[1] 3.1.13, 5th Generation Intel Core Processor Family Datasheet Vol. 2
https://www.intel.com/content/www/us/en/content-details/330835

Fixes: c4c45e943e51 ("vfio/pci: Intel graphics legacy mode assignment")
Reported-By: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20241206122749.9893-2-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>

Merge tag 'pull-tcg-20241224' of https://gitlab.com/rth7680/qemu into staging

tcg/optimize: Remove in-flight mask data from OptContext
fpu: Add float*_muladd_scalbn
fpu: Remove float_muladd_halve_result
fpu: Add float_round_nearest_even_max
fpu: Add float_muladd_suppress_add_product_zero
target/hexagon: Use float32_muladd
accel/tcg: Move gen_intermediate_code to TCGCPUOps.translate_core

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmdrE7QdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+l2Qf/aECUfMn07wns7WjX
# ebWxzIRKp//ktsIJg9InL8zrCStyRqrBj0VQE9LUfO2Vhvqf8faUdh+uh2ek/Ewa
# f1hfo0kDK7e7oWnCicSbHmdC0FQIrKpg2i+YXIsbd4XWOkmFAhkNenISuQfCrL3k
# 3UYAA12seK9uCls+fljvhK6iid3h+4ReDFW7DPg7mumFCCz6CwzYYW/4cnhcAmOn
# qVehtts8W+6SFMjTE04S8NV8OBaMisf8AbCcZf2PedRl1cHGSumLOjvjOxcQU8Hw
# nGUjL8/hYWkEetzU4YzJyfHOe6F9lPJBMnDattwIswwYrTOD/Sq7VbBWFbW0EwUy
# 7XIZ8Q==
# =DZgo
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 24 Dec 2024 15:04:04 EST
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20241224' of https://gitlab.com/rth7680/qemu: (72 commits)
  accel/tcg: Move gen_intermediate_code to TCGCPUOps.translate_core
  target/hexagon: Simplify internal_mpyhh setup
  target/hexagon: Use mulu64 for int128_mul_6464
  target/hexagon: Remove Double
  target/hexagon: Remove Float
  target/hexagon: Expand GEN_XF_ROUND
  target/hexagon: Remove internal_fmafx
  target/hexagon: Use float32_muladd for helper_sffm[as]_lib
  target/hexagon: Use float32_muladd_scalbn for helper_sffma_sc
  target/hexagon: Use float32_muladd for helper_sffms
  target/hexagon: Use float32_muladd for helper_sffma
  target/hexagon: Use float32_mul in helper_sfmpy
  softfloat: Add float_muladd_suppress_add_product_zero
  softfloat: Add float_round_nearest_even_max
  softfloat: Remove float_muladd_halve_result
  target/sparc: Use float*_muladd_scalbn
  target/arm: Use float*_muladd_scalbn
  softfloat: Add float{16,32,64}_muladd_scalbn
  tcg/optimize: Move fold_cmp_vec, fold_cmpsel_vec into alphabetic sort
  tcg/optimize: Move fold_bitsel_vec into alphabetic sort
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

accel/tcg: Move gen_intermediate_code to TCGCPUOps.translate_core

Convert all targets simultaneously, as the gen_intermediate_code
function disappears from the target. While there are possible
workarounds, they're larger than simply performing the conversion.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: Simplify internal_mpyhh setup

Initialize x with accumulated via direct assignment,
rather than multiplying by 1.

Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: Use mulu64 for int128_mul_6464

No need to open-code 64x64->128-bit multiplication.

Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: Remove Double

This structure, with bitfields, is incorrect for big-endian.
Use extract64 and deposit64 instead.

Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: Remove Float

This structure, with bitfields, is incorrect for big-endian.
Use the existing float32_getexp_raw which uses extract32.

Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: Expand GEN_XF_ROUND

This massive macro is now only used once.
Expand it for use only by float64.

Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: Remove internal_fmafx

The function is now unused.

Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: Use float32_muladd for helper_sffm[as]_lib

There are multiple special cases for this instruction.
(1) The saturate to normal maximum instead of overflow to infinity is
    handled by the new float_round_nearest_even_max rounding mode.
(2) The 0 * n + c special case is handled by the new
    float_muladd_suppress_add_product_zero flag.
(3) The Inf - Inf -> 0 special case can be detected after the fact
    by examining float_flag_invalid_isi.

Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: Use float32_muladd_scalbn for helper_sffma_sc

This instruction has a special case that 0 * x + c returns c
without the normal sign folding that comes with 0 + -0.
Use the new float_muladd_suppress_add_product_zero to
describe this.

Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: Use float32_muladd for helper_sffms

There are no special cases for this instruction. Since hexagon
always uses default-nan mode, explicitly negating the first
input is unnecessary. Use float_muladd_negate_product instead.

Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: Use float32_muladd for helper_sffma

There are no special cases for this instruction.

Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: Use float32_mul in helper_sfmpy

There are no special cases for this instruction.
Remove internal_mpyf as unused.

Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

softfloat: Add float_muladd_suppress_add_product_zero

Certain Hexagon instructions suppress changes to the result
when the product of fma() is a true zero.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

softfloat: Add float_round_nearest_even_max

This rounding mode is used by Hexagon.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

softfloat: Remove float_muladd_halve_result

All uses have been convered to float*_muladd_scalbn.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/sparc: Use float*_muladd_scalbn

Use the scalbn interface instead of float_muladd_halve_result.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/arm: Use float*_muladd_scalbn

Use the scalbn interface instead of float_muladd_halve_result.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

softfloat: Add float{16,32,64}_muladd_scalbn

We currently have a flag, float_muladd_halve_result, to scale
the result by 2**-1. Extend this to handle arbitrary scaling.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Move fold_cmp_vec, fold_cmpsel_vec into alphabetic sort

The big comment just above says functions should be sorted.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Move fold_bitsel_vec into alphabetic sort

The big comment just above says functions should be sorted.
Add forward declarations as needed.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Re-enable sign-mask optimizations

All instances of s_mask have been converted to the new
representation. We can now re-enable usage.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Remove z_mask, s_mask from OptContext

All mask setting is now done with parameters via fold_masks_*.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use finish_folding as default in tcg_optimize

All non-default cases now finish folding within each function.
Do the same with the default case and assert it is done after.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use finish_folding in fold_bitsel_vec

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_zs in fold_xor

Avoid the use of the OptContext slots. Find TempOptInfo once.
Remove fold_masks as the function becomes unused.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use finish_folding in fold_tcg_ld_memcopy

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_zs in fold_tcg_ld

Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use finish_folding in fold_sub, fold_sub_vec

Duplicate fold_sub_vec into fold_sub instead of calling it,
now that fold_sub_vec always returns true.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Simplify sign bit test in fold_shift

Merge the two conditions, sign != 0 && !(z_mask & sign),
by testing ~z_mask & sign. If sign == 0, the logical and
will produce false.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_zs, fold_masks_s in fold_shift

Avoid the use of the OptContext slots. Find TempOptInfo once.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_zs in fold_sextract

Avoid the use of the OptContext slots. Find TempOptInfo once.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use finish_folding in fold_cmpsel_vec

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use finish_folding in fold_cmp_vec

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_z in fold_setcond2

Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_s in fold_negsetcond

Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_z in fold_setcond

Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Distinguish simplification in fold_setcond_zmask

Change return from bool to int; distinguish between
complete folding, simplification, and no change.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use finish_folding in fold_remainder

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Return true from fold_qemu_st, fold_tcg_st

Stores have no output operands, and so need no further work.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_zs in fold_qemu_ld

Avoid the use of the OptContext slots.

Be careful not to call fold_masks_zs when the memory operation
is wide enough to require multiple outputs, so split into two
functions: fold_qemu_ld_1reg and fold_qemu_ld_2reg.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_zs in fold_orc

Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_zs in fold_or

Avoid the use of the OptContext slots. Find TempOptInfo once.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_s in fold_not

Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_s in fold_nor

Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_z in fold_neg_no_const

Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_s in fold_nand

Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use finish_folding in fold_mul*

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_zs in fold_movcond

Avoid the use of the OptContext slots. Find TempOptInfo once.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_z in fold_extu

Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_zs in fold_exts

Avoid the use of the OptContext slots. Find TempOptInfo once.
Explicitly sign-extend z_mask instead of doing that manually.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use finish_folding in fold_extract2

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_z in fold_extract

Avoid the use of the OptContext slots. Find TempOptInfo once.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_s in fold_eqv

Add fold_masks_s as a trivial wrapper around fold_masks_zs.
Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use finish_folding in fold_dup, fold_dup2

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use finish_folding in fold_divide

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Compute sign mask in fold_deposit

The input which overlaps the sign bit of the output can
have its input s_mask propagated to the output s_mask.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_and and fold_masks_z in fold_deposit

Avoid the use of the OptContext slots. Find TempOptInfo once.
When we fold to and, use fold_and.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_z in fold_ctpop

Add fold_masks_z as a trivial wrapper around fold_masks_zs.
Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_zs in fold_count_zeros

Avoid the use of the OptContext slots. Find TempOptInfo once.
Compute s_mask from the union of the maximum count and the
op2 fallback for op1 being zero.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_zs in fold_bswap

Avoid the use of the OptContext slots. Find TempOptInfo once.
Always set s_mask along the BSWAP_OS path, since the result is
being explicitly sign-extended.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_zs in fold_andc

Avoid the use of the OptContext slots. Find TempOptInfo once.
Avoid double inversion of the value of second const operand.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use fold_masks_zs in fold_and

Avoid the use of the OptContext slots. Find TempOptInfo once.
Sink mask computation below fold_affected_mask early exit.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Introduce const value accessors for TempOptInfo

Introduce ti_is_const, ti_const_val, ti_is_const_val.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Use finish_folding in fold_add, fold_add_vec, fold_addsub2

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Change representation of s_mask

Change the representation from sign bit repetitions to all bits equal
to the sign bit, including the sign bit itself.

The previous format has a problem in that it is difficult to recreate
a valid sign mask after a shift operation: the "repetitions" part of
the previous format meant that applying the same shift as for the value
lead to an off-by-one value.

The new format, including the sign bit itself, means that the sign mask
can be manipulated in exactly the same way as the value, canonicalization
is easier.

Canonicalize the s_mask in fold_masks_zs, rather than requiring callers
to do so. Treat 0 as a non-canonical but typeless input for no sign
information, which will be reset as appropriate for the data type.
We can easily fold in the data from z_mask while canonicalizing.

Temporarily disable optimizations using s_mask while each operation is
converted to use fold_masks_zs and to the new form.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Augment s_mask from z_mask in fold_masks_zs

Consider the passed s_mask to be a minimum deduced from
either existing s_mask or from a sign-extension operation.
We may be able to deduce more from the set of known zeros.
Remove identical logic from several opcode folders.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Split out fold_masks_zs

Add a routine to which masks can be passed directly, rather than
storing them into OptContext. To be used in upcoming patches.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Copy mask writeback to fold_masks

Use of fold_masks should be restricted to those opcodes that
can reliably make use of it -- those with a single output,
and from higher-level folders that set up the masks.
Prepare for conversion of each folder in turn.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Split out fold_affected_mask

There are only a few logical operations which can compute
an "affected" mask. Split out handling of this optimization
to a separate function, only to be called when applicable.

Remove the a_mask field from OptContext, as the mask is
no longer stored anywhere.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

tcg/optimize: Split out finish_bb, finish_ebb

Call them directly from the opcode switch statement in tcg_optimize,
rather than in finish_folding based on opcode flags. Adjust folding
of conditional branches to match.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

plugins: optimize cpu_index code generation

When running with a single vcpu, we can return a constant instead of a
load when accessing cpu_index.
A side effect is that all tcg operations using it are optimized, most
notably scoreboard access.
When running a simple loop in user-mode, the speedup is around 20%.

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241128213843.1023080-1-pierrick.bouvier@linaro.org>

tests/tcg: Do not use inttypes.h in multiarch/system/memory.c

make check-tcg fails on Fedora with the following error message:

    alpha-linux-gnu-gcc [...] qemu/tests/tcg/multiarch/system/memory.c -o memory [...]
    qemu/tests/tcg/multiarch/system/memory.c:17:10: fatal error: inttypes.h: No such file or directory
       17 | #include <inttypes.h>
          |          ^~~~~~~~~~~~
    compilation terminated.

The reason is that Fedora has cross-compilers, but no cross-glibc
headers. Fix by hardcoding the format specifiers and dropping the
include.

An alternative fix would be to introduce a configure check for
inttypes.h. But this would make it impossible to use Fedora
cross-compilers for softmmu tests, which used to work so far.

Fixes: ecbcc9ead2f8 ("tests/tcg: add a system test to check memory instrumentation")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20241010085906.226249-1-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Merge tag 'mem-2024-12-21' of https://github.com/davidhildenbrand/qemu into staging

Hi,

"Host Memory Backends" and "Memory devices" queue ("mem"):
- Fixup handling of virtio-mem unplug during system resets, as
  preparation for s390x support (especially kdump in the Linux guest)
- virtio-mem support for s390x

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmdnFD4RHGRhdmlkQHJl
# ZGhhdC5jb20ACgkQTd4Q9wD/g1rWBBAAp7WkYaNAjRy1PgpjNZ3z1gUJc/vk+skJ
# xVgGodA8txrJOFpNrbTyfhrdLs2TV4oWDvB/zrZRRtuxvur3O1EhFd9k6EqXuydr
# 0FunvLvVJwRHfEZycjN4aacQMRH3CJw07OaTzexeSl5UR/6w5PRofwUK4HX7W/Ka
# arqomGa3OJrs1+WgkV0Qcn4vh9HLRVv3iNC2Xo4W1wOCr1Du9zSPn9oC7zOQ0EO4
# ZC//7QsdkNRjUX/yMXMkhlSXx3b/RmRg2DBrxo7BZXg27VwGu4uHxL4LRBZiB2A7
# V9MqFOcVKzPMkXKTRjrgZ0vXQx1MPJ6WprEihMzMpYU6DrpA7KN/l8Ca8H24B2ln
# h7+bmkDsHVVcWovE9ii/9cMRfws6uWXXg3KoA8RQ8IbX1tU02lblw2uHhXEzcoge
# npqp/Z5LAiKVMetEnNnLH5thjut5PAEjuqD00cmZAMy4DNngLX2bGSdzMeVBkDMa
# 78ehLGRplm3t7ibUfaZaMKe6UD9tFrcD6XKsvUTXXHNbYO8ynbx58WOxSZmY98zU
# n3JNQRqtXYjBVlH3Dqm47vOTZHgOzFv3raa8BmSLpcBDeTXCTcUIl20s77dGw/vT
# r5YNCMN7O4YPFKUoRK9604QTgw6qlYaRTQlJD09usprGqVylb6gQtfZZuZkYDMp8
# sEI77QHsePA=
# =HDxr
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 21 Dec 2024 14:17:18 EST
# gpg:                using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A
# gpg:                issuer "david@redhat.com"
# gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown]
# gpg:                 aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full]
# gpg:                 aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D  FCCA 4DDE 10F7 00FF 835A

* tag 'mem-2024-12-21' of https://github.com/davidhildenbrand/qemu:
  s390x: virtio-mem support
  s390x/virtio-ccw: add support for virtio based memory devices
  s390x: remember the maximum page size
  s390x/pv: prepare for memory devices
  s390x/s390-virtio-ccw: prepare for memory devices
  s390x/s390-skeys: prepare for memory devices
  s390x/s390-stattrib-kvm: prepare for memory devices and sparse memory layouts
  s390x/s390-hypercall: introduce DIAG500 STORAGE_LIMIT
  s390x: introduce s390_get_memory_limit()
  s390x/s390-virtio-ccw: move setting the maximum guest size from sclp to machine code
  s390x: rename s390-virtio-hcall* to s390-hypercall*
  s390x/s390-virtio-hcall: prepare for more diag500 hypercalls
  s390x/s390-virtio-hcall: remove hypercall registration mechanism
  s390x/s390-virtio-ccw: don't crash on weird RAM sizes
  virtio-mem: unplug memory only during system resets, not device resets

Conflicts:
- hw/s390x/s390-stattrib-kvm.c
  sysemu/ -> system/ header rename conflict.
- hw/s390x/virtio-ccw-mem.c
  Make Property array const and removed DEFINE_PROP_END_OF_LIST() to
  conform to the latest conventions.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

s390x: virtio-mem support

Let's add our virtio-mem-ccw proxy device and wire it up. We should
be supporting everything (e.g., device unplug, "dynamic-memslots") that
we already support for the virtio-pci variant.

With a Linux guest that supports virtio-mem (and has automatic memory
onlining properly configured) the following example will work:

1. Start a VM with 4G initial memory and a virtio-mem device with a maximum
   capacity of 16GB:

   qemu/build/qemu-system-s390x \
    --enable-kvm \
    -m 4G,maxmem=20G \
    -nographic \
    -smp 8 \
    -hda Fedora-Server-KVM-40-1.14.s390x.qcow2 \
    -chardev socket,id=monitor,path=/var/tmp/monitor,server,nowait \
    -mon chardev=monitor,mode=readline \
    -object memory-backend-ram,id=mem0,size=16G,reserve=off \
    -device virtio-mem-ccw,id=vmem0,memdev=mem0,dynamic-memslots=on

2. Query the current size of virtio-mem device:

    (qemu) info memory-devices
    Memory device [virtio-mem]: "vmem0"
      memaddr: 0x100000000
      node: 0
      requested-size: 0
      size: 0
      max-size: 17179869184
      block-size: 1048576
      memdev: /objects/mem0

3. Request to grow it to 8GB (hotplug 8GB):

    (qemu) qom-set vmem0 requested-size 8G
    (qemu) info memory-devices
    Memory device [virtio-mem]: "vmem0"
      memaddr: 0x100000000
      node: 0
      requested-size: 8589934592
      size: 8589934592
      max-size: 17179869184
      block-size: 1048576
      memdev: /objects/mem0

4. Request to grow to 16GB (hotplug another 8GB):

    (qemu) qom-set vmem0 requested-size 16G
    (qemu) info memory-devices
    Memory device [virtio-mem]: "vmem0"
      memaddr: 0x100000000
      node: 0
      requested-size: 17179869184
      size: 17179869184
      max-size: 17179869184
      block-size: 1048576
      memdev: /objects/mem0

5. Try to hotunplug all memory again, shrinking to 0GB:

    (qemu) qom-set vmem0 requested-size 0G
    (qemu) info memory-devices
    Memory device [virtio-mem]: "vmem0"
      memaddr: 0x100000000
      node: 0
      requested-size: 0
      size: 0
      max-size: 17179869184
      block-size: 1048576
      memdev: /objects/mem0

6. If it worked, unplug the device

    (qemu) device_del vmem0
    (qemu) info memory-devices
    (qemu) object_del mem0

7. Hotplug a new device with a smaller capacity and directly size it to 1GB

    (qemu) object_add memory-backend-ram,id=mem0,size=8G,reserve=off
    (qemu) device_add virtio-mem-ccw,id=vmem0,memdev=mem0,\
                      dynamic-memslots=on,requested-size=1G
    (qemu) info memory-devices
    Memory device [virtio-mem]: "vmem0"
      memaddr: 0x100000000
      node: 0
      requested-size: 1073741824
      size: 1073741824
      max-size: 8589934592
      block-size: 1048576
      memdev: /objects/mem0

Trying to use a virtio-mem device backed by hugetlb into a !hugetlb VM
correctly results in the error:
   ... Memory device uses a bigger page size than initial memory

Note that the virtio-mem driver in Linux will supports 1 MiB (pageblock)
granularity.

Message-ID: <20241219144115.2820241-15-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>

Merge tag 'exec-20241220' of https://github.com/philmd/qemu into staging

Accel & Exec patch queue

- Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander)
- Add '-d invalid_mem' logging option (Zoltan)
- Create QOM containers explicitly (Peter)
- Rename sysemu/ -> system/ (Philippe)
- Re-orderning of include/exec/ headers (Philippe)
  Move a lot of declarations from these legacy mixed bag headers:
    . "exec/cpu-all.h"
    . "exec/cpu-common.h"
    . "exec/cpu-defs.h"
    . "exec/exec-all.h"
    . "exec/translate-all"
  to these more specific ones:
    . "exec/page-protection.h"
    . "exec/translation-block.h"
    . "user/cpu_loop.h"
    . "user/guest-host.h"
    . "user/page-protection.h"

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t
# wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt
# KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K
# A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8
# 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe///
# 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r
# xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl
# VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay
# ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP
# 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd
# +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6
# x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo=
# =cjz8
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 20 Dec 2024 11:45:20 EST
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* tag 'exec-20241220' of https://github.com/philmd/qemu: (59 commits)
  util/qemu-timer: fix indentation
  meson: Do not define CONFIG_DEVICES on user emulation
  system/accel-ops: Remove unnecessary 'exec/cpu-common.h' header
  system/numa: Remove unnecessary 'exec/cpu-common.h' header
  hw/xen: Remove unnecessary 'exec/cpu-common.h' header
  target/mips: Drop left-over comment about Jazz machine
  target/mips: Remove tswap() calls in semihosting uhi_fstat_cb()
  target/xtensa: Remove tswap() calls in semihosting simcall() helper
  accel/tcg: Un-inline translator_is_same_page()
  accel/tcg: Include missing 'exec/translation-block.h' header
  accel/tcg: Move tcg_cflags_has/set() to 'exec/translation-block.h'
  accel/tcg: Restrict curr_cflags() declaration to 'internal-common.h'
  qemu/coroutine: Include missing 'qemu/atomic.h' header
  exec/translation-block: Include missing 'qemu/atomic.h' header
  accel/tcg: Declare cpu_loop_exit_requested() in 'exec/cpu-common.h'
  exec/cpu-all: Include 'cpu.h' earlier so MMU_USER_IDX is always defined
  target/sparc: Move sparc_restore_state_to_opc() to cpu.c
  target/sparc: Uninline cpu_get_tb_cpu_state()
  target/loongarch: Declare loongarch_cpu_dump_state() locally
  user: Move various declarations out of 'exec/exec-all.h'
  ...

Conflicts:
hw/char/riscv_htif.c
hw/intc/riscv_aplic.c
target/s390x/cpu.c

Apply sysemu header path changes to not in the pull request.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

s390x/virtio-ccw: add support for virtio based memory devices

Let's implement support for abstract virtio based memory devices, using
the virtio-pci implementation as an orientation. Wire them up in the
machine hotplug handler, taking care of s390x page size limitations.

As we neither support virtio-mem or virtio-pmem yet, the code is
effectively unused. We'll implement support for virtio-mem based on this
next.

Note that we won't wire up the virtio-pci variant (should currently be
impossible due to lack of support for MSI-X), but we'll add a safety net
to reject plugging them in the pre-plug handler.

Message-ID: <20241219144115.2820241-14-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>

s390x: remember the maximum page size

Let's remember the value (successfully) set via s390_set_max_pagesize().
This will be helpful to reject hotplugged memory devices that would exceed
this initially set page size.

Handle it just like how we handle s390_get_memory_limit(), storing it in
the machine, and moving the handling to machine code.

Message-ID: <20241219144115.2820241-13-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>

s390x/pv: prepare for memory devices

Let's avoid checking for the maxram_size, and instead rely on the memory
limit determined in s390_memory_init(), that might be larger than
maxram_size, for example due to alignment purposes.

This check now correctly mimics what the kernel will check in
kvm_s390_pv_set_aside(), whereby a VM <= 2 GiB VM would end up using
a segment type ASCE.

Message-ID: <20241219144115.2820241-12-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>

s390x/s390-virtio-ccw: prepare for memory devices

Let's prepare our address space for memory devices if enabled via
"maxmem" and if we have CONFIG_MEM_DEVICE enabled at all. Note that
CONFIG_MEM_DEVICE will be selected automatically once we add support
for devices.

Just like on other architectures, the region container for memory devices
is placed directly above our initial memory. For now, we only align the
start address of the region up to 1 GiB, but we won't add any additional
space to the region for internal alignment purposes; this can be done in
the future if really required.

The RAM size returned via SCLP is not modified, as this only
covers initial RAM (and standby memory we don't implement) and not memory
devices; clarify that in the docs of read_SCP_info(). Existing OSes without
support for memory devices will keep working as is, even when memory
devices would be attached the VM.

Guest OSs which support memory devices, such as virtio-mem, will
consult diag500(), to find out the maximum possible pfn. Guest OSes that
don't support memory devices, don't have to be changed and will continue
relying on information provided by SCLP.

There are no remaining maxram_size users in s390x code, and the remaining
ram_size users only care about initial RAM:
* hw/s390x/ipl.c
* hw/s390x/s390-hypercall.c
* hw/s390x/sclp.c
* target/s390x/kvm/pv.c

Message-ID: <20241219144115.2820241-11-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>

s390x/s390-skeys: prepare for memory devices

With memory devices, we will have storage keys for memory that
exceeds the initial ram size.

The TODO already states that current handling is subopimal,
but we won't worry about improving that (TCG-only) thing for now.

Message-ID: <20241219144115.2820241-10-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>

s390x/s390-stattrib-kvm: prepare for memory devices and sparse memory layouts

With memory devices, we will have storage attributes for memory that
exceeds the initial ram size. Further, we can easily have memory holes,
for which there (currently) are no storage attributes.

In particular, with memory holes, KVM_S390_SET_CMMA_BITS will fail to set
some storage attributes.

So let's do it like we handle storage keys migration, relying on
guest_phys_blocks_append(). However, in contrast to storage key
migration, we will handle it on the migration destination.

This is a preparation for virtio-mem support. Note that ever since the
"early migration" feature was added (x-early-migration), the state
of device blocks (plugged/unplugged) is migrated early such that
guest_phys_blocks_append() will properly consider all currently plugged
memory blocks and skip any unplugged ones.

In the future, we should try getting rid of the large temporary buffer
and also not send any attributes for any memory holes, just so they
get ignored on the destination.

Message-ID: <20241219144115.2820241-9-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>

s390x/s390-hypercall: introduce DIAG500 STORAGE_LIMIT

A guest OS that supports memory hotplug / memory devices must during
boot be aware of the maximum possible physical memory address that it might
have to handle at a later stage during its runtime.

For example, the maximum possible memory address might be required to
prepare the kernel virtual address space accordingly (e.g., select page
table hierarchy depth).

On s390x there is currently no such mechanism that is compatible with
paravirtualized memory devices, because the whole SCLP interface was
designed around the idea of "storage increments" and "standby memory".
Paravirtualized memory devices we want to support, such as virtio-mem, have
no intersection with any of that, but could co-exist with them in the
future if ever needed.

In particular, a guest OS must never detect and use device memory
without the help of a proper device driver. Device memory must not be
exposed in any firmware-provided memory map (SCLP or diag260 on s390x).
For this reason, these memory devices will be places in memory *above*
the "maximum storage increment" exposed via SCLP.

Let's provide a new diag500 subcode to query the memory limit determined in
s390_memory_init().

Message-ID: <20241219144115.2820241-8-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>

s390x: introduce s390_get_memory_limit()

Let's add s390_get_memory_limit(), to query what has been successfully
set via s390_set_memory_limit(). Allow setting the limit only once.

We'll remember the limit in the machine state. Move
s390_set_memory_limit() to machine code, merging it into
set_memory_limit(), because this really is a machine property.

Message-ID: <20241219144115.2820241-7-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>

s390x/s390-virtio-ccw: move setting the maximum guest size from sclp to machine code

Nowadays, it feels more natural to have that code located in
s390_memory_init(), where we also have direct access to the machine
object.

While at it, use the actual RAM size, not the maximum RAM size which
cannot currently be reached without support for any memory devices.
Consequently update s390_pv_vm_try_disable_async() to rely on the RAM size
as well, to avoid temporary issues while we further rework that
handling.

set_memory_limit() is temporary, we'll merge it with
s390_set_memory_limit() next.

Message-ID: <20241219144115.2820241-6-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>

s390x: rename s390-virtio-hcall* to s390-hypercall*

Let's make it clearer that we are talking about general
QEMU/KVM-specific hypercalls.

Message-ID: <20241219144115.2820241-5-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>

s390x/s390-virtio-hcall: prepare for more diag500 hypercalls

Let's generalize, abstracting the virtio bits. diag500 is now a generic
hypercall to handle QEMU/KVM specific things. Explicitly specify all
already defined subcodes, including legacy ones (so we know what we can
use for new hypercalls).

Move the PGM_SPECIFICATION injection into the renamed function
handle_diag_500(), so we can turn it into a void function.

We'll rename the files separately, so git properly detects the rename.

Message-ID: <20241219144115.2820241-4-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>

s390x/s390-virtio-hcall: remove hypercall registration mechanism

Nowadays, we only have a single machine type in QEMU, everything is based
on virtio-ccw and the traditional virtio machine does no longer exist. No
need to dynamically register diag500 handlers. Move the two existing
handlers into s390-virtio-hcall.c.

Message-ID: <20241219144115.2820241-3-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>