git.maquefel.me Git - qemu.git/log

target/arm: Fix SVE SDOT/UDOT/USDOT (4-way, indexed)

Our implementation of the indexed version of SVE SDOT/UDOT/USDOT got
the calculation of the inner loop terminator wrong.  Although we
correctly account for the element size when we calculate the
terminator for the first iteration:
   intptr_t segend = MIN(16 / sizeof(TYPED), opr_sz_n);
we don't do that when we move it forward after the first inner loop
completes.  The intention is that we process the vector in 128-bit
segments, which for a 64-bit element size should mean (1, 2), (3, 4),
(5, 6), etc.  This bug meant that we would iterate (1, 2), (3, 4, 5,
6), (7, 8, 9, 10) etc and apply the wrong indexed element to some of
the operations, and also index off the end of the vector.

You don't see this bug if the vector length is small enough that we
don't need to iterate the outer loop, i.e.  if it is only 128 bits,
or if it is the 64-bit special case from AA32/AA64 AdvSIMD.  If the
vector length is 256 bits then we calculate the right results for the
elements in the vector but do index off the end of the vector. Vector
lengths greater than 256 bits see wrong answers. The instructions
that produce 32-bit results behave correctly.

Fix the recalculation of 'segend' for subsequent iterations, and
restore a version of the comment that was lost in the refactor of
commit 7020ffd656a5 that explains why we only need to clamp segend to
opr_sz_n for the first iteration, not the later ones.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2595
Fixes: 7020ffd656a5 ("target/arm: Macroize helper_gvec_{s,u}dot_idx_{b,h}")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241101185544.2130972-1-peter.maydell@linaro.org

target/arm: Add new MMU indexes for AArch32 Secure PL1&0

Our current usage of MMU indexes when EL3 is AArch32 is confused.
Architecturally, when EL3 is AArch32, all Secure code runs under the
Secure PL1&0 translation regime:
* code at EL3, which might be Mon, or SVC, or any of the
   other privileged modes (PL1)
* code at EL0 (Secure PL0)

This is different from when EL3 is AArch64, in which case EL3 is its
own translation regime, and EL1 and EL0 (whether AArch32 or AArch64)
have their own regime.

We claimed to be mapping Secure PL1 to our ARMMMUIdx_EL3, but didn't
do anything special about Secure PL0, which meant it used the same
ARMMMUIdx_EL10_0 that NonSecure PL0 does.  This resulted in a bug
where arm_sctlr() incorrectly picked the NonSecure SCTLR as the
controlling register when in Secure PL0, which meant we were
spuriously generating alignment faults because we were looking at the
wrong SCTLR control bits.

The use of ARMMMUIdx_EL3 for Secure PL1 also resulted in the bug that
we wouldn't honour the PAN bit for Secure PL1, because there's no
equivalent _PAN mmu index for it.

Fix this by adding two new MMU indexes:
* ARMMMUIdx_E30_0 is for Secure PL0
* ARMMMUIdx_E30_3_PAN is for Secure PL1 when PAN is enabled
The existing ARMMMUIdx_E3 is used to mean "Secure PL1 without PAN"
(and would be named ARMMMUIdx_E30_3 in an AArch32-centric scheme).

These extra two indexes bring us up to the maximum of 16 that the
core code can currently support.

This commit:
* adds the new MMU index handling to the various places
   where we deal in MMU index values
* adds assertions that we aren't AArch32 EL3 in a couple of
   places that currently use the E10 indexes, to document why
   they don't also need to handle the E30 indexes
* documents in a comment why regime_has_2_ranges() doesn't need
   updating

Notes for backporting: this commit depends on the preceding revert of
4c2c04746932; that revert and this commit should probably be
backported to everywhere that we originally backported 4c2c04746932.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2326
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2588
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241101142845.1712482-3-peter.maydell@linaro.org

Revert "target/arm: Fix usage of MMU indexes when EL3 is AArch32"

This reverts commit 4c2c0474693229c1f533239bb983495c5427784d.

This commit tried to fix a problem with our usage of MMU indexes when
EL3 is AArch32, using what it described as a "more complicated
approach" where we share the same MMU index values for Secure PL1&0
and NonSecure PL1&0. In theory this should work, but the change
didn't account for (at least) two things:

(1) The design change means we need to flush the TLBs at any point
where the CPU state flips from one to the other.  We already flush
the TLB when SCR.NS is changed, but we don't flush the TLB when we
take an exception from NS PL1&0 into Mon or when we return from Mon
to NS PL1&0, and the commit didn't add any code to do that.

(2) The ATS12NS* address translate instructions allow Mon code (which
is Secure) to do a stage 1+2 page table walk for NS.  I thought this
was OK because do_ats_write() does a page table walk which doesn't
use the TLBs, so because it can pass both the MMU index and also an
ARMSecuritySpace argument we can tell the table walk that we want NS
stage1+2, not S.  But that means that all the code within the ptw
that needs to find e.g.  the regime EL cannot do so only with an
mmu_idx -- all these functions like regime_sctlr(), regime_el(), etc
would need to pass both an mmu_idx and the security_space, so they
can tell whether this is a translation regime controlled by EL1 or
EL3 (and so whether to look at SCTLR.S or SCTLR.NS, etc).

In particular, because regime_el() wasn't updated to look at the
ARMSecuritySpace it would return 1 even when the CPU was in Monitor
mode (and the controlling EL is 3).  This meant that page table walks
in Monitor mode would look at the wrong SCTLR, TCR, etc and would
generally fault when they should not.

Rather than trying to make the complicated changes needed to rescue
the design of 4c2c04746932, we revert it in order to instead take the
route that that commit describes as "the most straightforward" fix,
where we add new MMU indexes EL30_0, EL30_3, EL30_3_PAN to correspond
to "Secure PL1&0 at PL0", "Secure PL1&0 at PL1", and "Secure PL1&0 at
PL1 with PAN".

This revert will re-expose the "spurious alignment faults in
Secure PL0" issue #2326; we'll fix it again in the next commit.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 20241101142845.1712482-2-peter.maydell@linaro.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

softfloat: Remove fallback rule from pickNaN()

Now that all targets have been converted to explicitly set a NaN
propagation rule, we can remove the set of target ifdefs (which now
list every target) and clean up the references to fallback behaviour
for float_2nan_prop_none.

The "default" case in the switch will catch any remaining places
where status->float_2nan_prop_rule was not set by the target.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-22-peter.maydell@linaro.org

target/rx: Explicitly set 2-NaN propagation rule

Set the NaN propagation rule explicitly for the float_status word
used in the rx target.

This not the architecturally correct behaviour, but since this is a
no-behaviour-change patch, we leave a TODO note to that effect.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-21-peter.maydell@linaro.org

target/openrisc: Explicitly set 2-NaN propagation rule

Set the NaN propagation rule explicitly for the float_status word
used in the openrisc target.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-20-peter.maydell@linaro.org

target/microblaze: Explicitly set 2-NaN propagation rule

Set the NaN propagation rule explicitly for the float_status word
used in the microblaze target.

This is probably not the architecturally correct behaviour,
but since this is a no-behaviour-change patch, we leave a
TODO note to that effect.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-19-peter.maydell@linaro.org

target/microblaze: Move setting of float rounding mode to reset

Although the floating point rounding mode for Microblaze is always
nearest-even, we cannot set it just once in the CPU initfn. This is
because env->fp_status is in the part of the CPU state struct that is
zeroed on reset.

Move the call to set_float_rounding_mode() into the reset fn.

(This had no guest-visible effects because it happens that the
float_round_nearest_even enum value is 0, so when the struct was
zeroed it didn't corrupt the setting.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-18-peter.maydell@linaro.org

target/alpha: Explicitly set 2-NaN propagation rule

Set the NaN propagation rule explicitly for the float_status word
used in this target.

This is a no-behaviour-change commit, so we retain the existing
behaviour of x87-style pick-largest-significand NaN propagation.
This is however not the architecturally correct handling, so we leave
a TODO note to that effect.

We also leave a TODO note pointing out that all this code in the cpu
initfn (including the existing setting up of env->flags and the FPCR)
should be in a currently non-existent CPU reset function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-17-peter.maydell@linaro.org

target/i386: Set 2-NaN propagation rule explicitly

Set the NaN propagation rule explicitly for the float_status words
used in the x86 target.

This is a no-behaviour-change commit, so we retain the existing
behaviour of using the x87-style "prefer QNaN over SNaN, then prefer
the NaN with the larger significand" for MMX and SSE. This is
however not the documented hardware behaviour, so we leave a TODO
note about what we should be doing instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-16-peter.maydell@linaro.org

target/xtensa: Explicitly set 2-NaN propagation rule

Set the NaN propagation rule explicitly in xtensa_use_first_nan().

(When we convert the softfloat pickNaNMulAdd routine to also
select a NaN propagation rule at runtime, we will be able to
remove the use_first_nan flag because the propagation rules
will handle everything.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-15-peter.maydell@linaro.org

target/xtensa: Factor out calls to set_use_first_nan()

In xtensa we currently call set_use_first_nan() in a lot of
places where we want to switch the NaN-propagation handling.
We're about to change the softfloat API we use to do that,
so start by factoring all the calls out into a single
xtensa_use_first_nan() function.

The bulk of this change was done with
sed -i -e 's/set_use_first_nan($[^,]*$,[^)]*)/xtensa_use_first_nan(env, \1)/' target/xtensa/fpu_helper.c

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-14-peter.maydell@linaro.org

target/sparc: Explicitly set 2-NaN propagation rule

Set the NaN propagation rule explicitly in the float_status
words we use.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-13-peter.maydell@linaro.org

target/sparc: Move cpu_put_fsr(env, 0) call to reset

Currently we call cpu_put_fsr(0) in sparc_cpu_realizefn(), which
initializes various fields in the CPU struct:
* fsr_cexc_ftt
* fcc[]
* fsr_qne
* fsr
It also sets the rounding mode in env->fp_status.

This is largely pointless, because when we later reset the CPU
this will zero out all the fields up until the "end_reset_fields"
label, which includes all of these (but not fp_status!)

Move the cpu_put_fsr(env, 0) call to reset, because that expresses
the logical requirement: we want to reset FSR to 0 on every reset.
This isn't a behaviour change because the fields are all zero anyway.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-12-peter.maydell@linaro.org

target/m68k: Initialize float_status fields in gdb set/get functions

In cf_fpu_gdb_get_reg() and cf_fpu_gdb_set_reg() we use a temporary
float_status variable to pass to floatx80_to_float64() and
float64_to_floatx80(), but we don't initialize it, meaning that those
functions could access uninitialized data. Zero-init the structs.

(We don't need to set a NaN-propagation rule here because we
don't use these with a 2-argument fpu operation.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-11-peter.maydell@linaro.org

target/m68k: Explicitly set 2-NaN propagation rule

Explicitly set the 2-NaN propagation rule on env->fp_status
and on the temporary fp_status that we use in frem (since
we pass that to a division operation function).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

target/ppc: Explicitly set 2-NaN propagation rule

Set the 2-NaN propagation rule explicitly in env->fp_status
and env->vec_status.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-9-peter.maydell@linaro.org

target/s390x: Explicitly set 2-NaN propagation rule

Set the 2-NaN propagation rule explicitly in env->fpu_status.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-8-peter.maydell@linaro.org

target/hppa: Explicitly set 2-NaN propagation rule

Set the 2-NaN propagation rule explicitly in env->fp_status.

Really we only need to do this at CPU reset (after reset has zeroed
out most of the CPU state struct, which typically includes fp_status
fields). However target/hppa does not currently implement CPU reset
at all, so leave a TODO comment to note that this could be moved if
we ever do implement reset.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-7-peter.maydell@linaro.org

target/loongarch: Explicitly set 2-NaN propagation rule

Set the 2-NaN propagation rule explicitly in the float_status word we
use.

(There are a couple of places in fpu_helper.c where we create a
dummy float_status word with "float_status *s = { };", but these
are only used for calling float*_is_quiet_nan() so it doesn't
matter that we don't set a 2-NaN propagation rule there.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-6-peter.maydell@linaro.org

target/mips: Explicitly set 2-NaN propagation rule

Set the 2-NaN propagation rule explicitly in the float_status words
we use.

For active_fpu.fp_status, we do this in a new fp_reset() function
which mirrors the existing msa_reset() function in doing "first call
restore to set the fp status parts that depend on CPU state, then set
the fp status parts that are constant".

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20241025141254.2141506-5-peter.maydell@linaro.org

target/arm: Explicitly set 2-NaN propagation rule

Set the 2-NaN propagation rule explicitly in the float_status words
we use. We wrap this plus the pre-existing setting of the
tininess-before-rounding flag in a new function
arm_set_default_fp_behaviours() to avoid repetition, since we have a
lot of float_status words at this point.

The situation with FPA11 emulation in linux-user is a little odd, and
arguably "correct" behaviour there would be to exactly match a real
Linux kernel's FPA11 emulation. However FPA11 emulation is
essentially dead at this point and so it seems better to continue
with QEMU's current behaviour and leave a comment describing the
situation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-4-peter.maydell@linaro.org

tests/fp: Explicitly set 2-NaN propagation rule

Explicitly set a 2-NaN propagation rule in the softfloat tests. In
meson.build we put -DTARGET_ARM in fpcflags, and so we should select
here the Arm propagation rule of float_2nan_prop_s_ab.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-3-peter.maydell@linaro.org

softfloat: Allow 2-operand NaN propagation rule to be set at runtime

IEEE 758 does not define a fixed rule for which NaN to pick as the
result if both operands of a 2-operand operation are NaNs.  As a
result different architectures have ended up with different rules for
propagating NaNs.

QEMU currently hardcodes the NaN propagation logic into the binary
because pickNaN() has an ifdef ladder for different targets.  We want
to make the propagation rule instead be selectable at runtime,
because:
* this will let us have multiple targets in one QEMU binary
* the Arm FEAT_AFP architectural feature includes letting
   the guest select a NaN propagation rule at runtime
* x86 specifies different propagation rules for x87 FPU ops
   and for SSE ops, and specifying the rule in the float_status
   would let us emulate this, instead of wrongly using the
   x87 rules everywhere

In this commit we add an enum for the propagation rule, the field in
float_status, and the corresponding getters and setters.  We change
pickNaN to honour this, but because all targets still leave this
field at its default 0 value, the fallback logic will pick the rule
type with the old ifdef ladder.

It's valid not to set a propagation rule if default_nan_mode is
enabled, because in that case there's no need to pick a NaN; all the
callers of pickNaN() catch this case and skip calling it.  So we can
already assert that we don't get into the "no rule defined" codepath
for our four targets which always set default_nan_mode: Hexagon,
RiscV, SH4 and Tricore, and for the one target which does not have FP
at all: avr.  These targets will not need to be updated to call
set_float_2nan_prop_rule().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-2-peter.maydell@linaro.org

Merge tag 'pull-request-2024-11-04' of https://gitlab.com/thuth/qemu into staging

* Remove the redundant macOS-15 CI job
* Various fixes, improvements and additions for the functional test suite
* Restore the sh4eb target
* Fix the OpenBSD VM test
* Re-enable the pci-bridge device on s390x
* Minor clean-ups / fixes for the next-cube machine

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmcoyoQRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVwRg/+M8RWxOW5M2GmEfAj/e1IatLS2eXek6fE
# YOCPxvc5VK5rjXzcRKNqNKP53gBkF0PRho68b3IkBI6ylDOdzdRcDYsi8CSLWbG4
# O6heGJRzn9HyIS+UShAoqoj9l7lxODcZvEJK2ueiy/Hri/Zc4TpullLhSgAPKTgn
# Ln75nd+hWwS9e0df1BSOBax2iEU/2j1yuBVCcFgFHH8K39Wqrs6Xtyay9yPjYLUg
# pHNGObikrLF47KGI5yZ22/iVgwr5yhd3KzycjbxHVccCqZSsGl2xkCBwKNlIodRO
# RMhTzUhOMi/RSjvdSbM5d2Nh4aCJ5mNzzWSUklHdYWnrMOv6uECJ0h2o0ve5L4kT
# jtTGTcLe8a+JsDs+UxeVWqqlUf4w8Vv0DRky6D6ln25hcqrOveJE++o58FHFt/AX
# jEolRU5k2tMpOSMgE3wAi5BVCttpI3Idly/IC+rntMjQOTwdKPlgfcBIqQmXI6M8
# dM6oUf9WnIr/CAt7qG6QjCONjeBmuMlZV4+v7xdqFsJpwCTyo6k3LwoHx3pTC73z
# 6x0SmpeDoTzdw6B7O1HlLNllW7hd2/5GQ5qTH+E1pKAktkOf3MQeSD6qQEMjwH7T
# e7hNUV+APgtDqpnQ0xcTL5AwNAkDGKoKBmaIp0vlwGUET55fw5N0Wb6Oo9LOgeFl
# yqi5GxIuJu4=
# =CTOw
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Nov 2024 13:22:12 GMT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* tag 'pull-request-2024-11-04' of https://gitlab.com/thuth/qemu:
  tests/functional: Convert the OrangePi tests to the functional framework
  tests/functional: Convert BananaPi tests to the functional framework
  tests/functional: Convert the tcg_plugins test
  next-cube: remove cpu parameter from next_scsi_init()
  next-cube: fix up compilation when DEBUG_NEXT is enabled
  hw/s390x: Re-enable the pci-bridge device on s390x
  tests/functional: Fix the s390x and ppc64 tuxrun tests
  tests/vm/openbsd: Remove the "Time appears wrong" workaround
  tests/functional: Add a test for sh4eb
  Revert "Remove the unused sh4eb target"
  tests/functional: make cached asset files read-only
  tests/functional: make tuxrun disk images writable
  .gitlab-ci.d/cirrus: Remove the macos-15 job

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge tag 'seabios-hppa-v17-pull-request' of https://github.com/hdeller/qemu-hppa into staging

SeaBIOS-hppa v17 pull request

Please pull a single commit, which updates SeaBIOS-hppa
to version 17.

If comes with some important firmware and SCSI fixes and
prepares for futher development to support 64-bit HP-UX
and MPE/UX in the future.

New PDC functions & general enhancements:
- Add PDC_MODEL_GET_INSTALL_KERNEL firmware call
- Add PDC_PAT_EVENT firmware call
- Support ENTRY_IO_BOOTOUT
- Prefer memory-access over io-access of GSP serial port
- Disable LMMIO_DIRECT0 range during modification
- Small optimizations in IODC call

Fixes:
- esp-scsi: indicate acceptance of MESSAGE IN phase data
- Avoid crash when booting without SCSI controller
- Remove exec flag from hppa-firmware.img
- Fix LMMIO detection for PCI cards on Astro/Elroy
- Avoid trashing MPE IPL bootloader stack
- HP-UX 11 64-bit saves number of RAM pages in PAGE0 at 0x33c
- Fix return value of PDC_CACHE/PDC_CACHE_RET_SPID for space id hashing
- Allow PDC functions to act when called in narrow mode
- pcidevice: Use portaddr_t for io port addresses

Cleanups:
- Change default make target to "parisc"
- Clean the "out-64" directory on "make clean"

# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZyfV0AAKCRD3ErUQojoP
# X63XAP9sxLngebfWXdb9YI4+3N2xBpT772tQha3QYdejF0QvrAEAwpB8g8MFHHz3
# QKZfvPERw2nBhjtpf+Dl9iexoKh8YQI=
# =MjU+
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 03 Nov 2024 19:58:08 GMT
# gpg:                using EDDSA key BCE9123E1AD29F07C049BBDEF712B510A23A0F5F
# gpg: Good signature from "Helge Deller <deller@gmx.de>" [unknown]
# gpg:                 aka "Helge Deller <deller@kernel.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4544 8228 2CD9 10DB EF3D  25F8 3E5F 3D04 A7A2 4603
#      Subkey fingerprint: BCE9 123E 1AD2 9F07 C049  BBDE F712 B510 A23A 0F5F

* tag 'seabios-hppa-v17-pull-request' of https://github.com/hdeller/qemu-hppa:
  target/hppa: Update SeaBIOS-hppa to version 17

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge tag 'pull-loongarch-20241102' of https://gitlab.com/gaosong/qemu into staging

pull-loongarch-20241102

# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZyXbXgAKCRBAov/yOSY+
# 37a9BADZ7vI2idWNXdH+mLNDZNSOxfdKp6ggNgKS3S48Hi2zR72MEhwvR9dGlHDL
# 98agrbV7/jI9Z+0dLAxvlyl1MvXfnn2sXYgUuZp6IAaQzFBa11HBAK7UFh3sTA4A
# gD4oPwl8AdJiFvDN6vNjS+dO0ls+j/YMaoLkAKLv15dlWtg4Rw==
# =EZnr
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 02 Nov 2024 07:57:18 GMT
# gpg:                using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C  6C2C 40A2 FFF2 3926 3EDF

* tag 'pull-loongarch-20241102' of https://gitlab.com/gaosong/qemu:
  target/loongarch: Add steal time support on migration
  hw/loongarch/boot: Use warn_report when no kernel filename
  linux-headers: Update to Linux v6.12-rc5
  linux-headers: loongarch: Add kvm_para.h
  linux-headers: Add unistd_64.h
  target/loongarch/kvm: Implement LoongArch PMU extension
  target/loongarch: Implement lbt registers save/restore function
  target/loongarch: Add loongson binary translation feature

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

tests/functional: Convert the OrangePi tests to the functional framework

Move the OrangePi tests from tests/avocado/boot_linux_console.py into
a new file dedicated for OrangePi tests in the functional framework
and update the hash sums of the assets to sha256 along the way.
For the buildroot image and the Armbian image, we've got to switch to
a newer version since the old images have been removed from the server,
and the NetBSD image has been moved to the archive, so we need to update
this URL as well.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20241029092440.25021-3-thuth@redhat.com>

tests/functional: Convert BananaPi tests to the functional framework

Move the BananaPi tests from tests/avocado/boot_linux_console.py into
a new file dedicated for Banana Pi tests in the functional framework.
Update the hash sums of the assets to sha256 along the way and fix the
broken link for the buildroot image from storage.kernelci.org.

(Note: The test_arm_bpim2u_openwrt_22_03_3 test is currently broken
due to a regression in commit 4c2c047469 ("target/arm: Fix usage of MMU
indexes when EL3 is AArch32") - it works if that commit gets reverted)

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20241029092440.25021-2-thuth@redhat.com>

tests/functional: Convert the tcg_plugins test

A straight forward conversion, only the usual changes were required
here (i.e. adjustment for asset downloading, machine selection).

Message-ID: <20241023051754.813412-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

next-cube: remove cpu parameter from next_scsi_init()

The parameter is not used.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20241023085852.1061031-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <thuth@redhat.com>

next-cube: fix up compilation when DEBUG_NEXT is enabled

These were accidentally introduced by my last series.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-ID: <20241023085852.1061031-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Thomas Huth <thuth@redhat.com>

hw/s390x: Re-enable the pci-bridge device on s390x

Commit e779e5c05a ("hw/pci-bridge: Add a Kconfig switch for the
normal PCI bridge") added a config switch for the pci-bridge, so
that the device is not included in the s390x target anymore (since
the pci-bridge is not really useful on s390x).

However, it seems like libvirt is still adding pci-bridge devices
automatically to the guests' XML definitions (when adding a PCI
device to a non-zero PCI bus), so these guests are now broken due
to the missing pci-bridge in the QEMU binary.

To avoid disruption of the users, let's re-enable the pci-bridge
device on s390x for the time being.

Message-ID: <20241024130405.62134-1-thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/functional: Fix the s390x and ppc64 tuxrun tests

I forgot to add the tests to the meson.build file and looks
like I even managed to somehow mix up the hashsums in the
ppc64 test!

Message-ID: <20241023141919.930689-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/vm/openbsd: Remove the "Time appears wrong" workaround

Seems like the server now reports the right time again, so we have
to drop the workaround to get the installer working again.

Message-ID: <20241023072414.827732-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/functional: Add a test for sh4eb

Now that we are aware of binaries that are available for sh4eb,
we should make sure that there are no regressions with this
target and test it regularly in our CI.

Message-ID: <20241024082735.42324-3-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>

Revert "Remove the unused sh4eb target"

This reverts commit 73ceb12960e686b763415f0880cc5171ccce01cf.

The "r2d" machine can work in big endian mode, see:

https://lore.kernel.org/qemu-devel/d6755445-1060-48a8-82b6-2f392c21f9b9@landley.net/

So the reasoning for removing sh4eb was wrong.

Message-ID: <20241024082735.42324-2-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Rob Landley <rob@landley.net>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/functional: make cached asset files read-only

This ensures that if a functional test runs QEMU with a writable
disk pointing to a cached asset, an error will be reported, rather
than silently modifying the cache file.

As an example, tweaking test_sbsaref.py to set snapshot=off,
results in a clear error:

Command: ./build/qemu-system-aarch64 ...snip... -drive file=/var/home/berrange/.cache/qemu/download/44cdbae275ef1bb6dab1d5fbb59473d4f741e1c8ea8a80fd9e906b531d6ad461,format=raw,snapshot=off -cpu max,pauth=off
Output: qemu-system-aarch64: Could not open '/var/home/berrange/.cache/qemu/download/44cdbae275ef1bb6dab1d5fbb59473d4f741e1c8ea8a80fd9e906b531d6ad461': Permission denied

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20241025092659.2312118-3-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

tests/functional: make tuxrun disk images writable

The zstd command will preserve the input archive permissions on the
output file. So when we decompress the readonly cached image, the
resulting per-test run private disk image will also be readonly.
We need it to be writable, so make it so.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20241025092659.2312118-2-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>

.gitlab-ci.d/cirrus: Remove the macos-15 job

Cirrus-CI stopped providing the possibility to run macOS 15 jobs.
Quoting https://cirrus-ci.org/guide/macOS/ :

"Cirrus CI Cloud only allows ghcr.io/cirruslabs/macos-runner:sonoma image ..."

If you still try to run a Sequoia image, it gets automatically "upgraded"
to Sonoma instead. So the macos-15 job in the QEMU CI now does not
make sense anymore, thus let's remove it.

Message-ID: <20241021124722.139348-1-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>

Merge tag 'migration-20241030-pull-request' of https://gitlab.com/peterx/qemu into staging

Migration pull request for softfreeze

v2:
- Patch "migration: Move cpu-throttle.c from system to migration",
  fix build on MacOS, and subject spelling

NOTE: checkpatch.pl could report a false positive on this branch:

  WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
  #21:
   {include/sysemu => migration}/cpu-throttle.h | 0

That's covered by "F: migration/" entry.

Changelog:

- Peter's cleanup patch on migrate_fd_cleanup()
- Peter's cleanup patch to introduce thread name macros
- Hanna's error path fix for vmstate subsection save()s
- Hyman's auto converge enhancement on background dirty sync
- Peter's additional tracepoints for save state entries
- Thomas's build fix for OpenBSD in dirtyrate.c
- Peter's deprecation of query-migrationthreads command
- Peter's cleanup/fixes from the "export misc.h" series
- Maciej's two small patches from multifd+vfio series

# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZyTbVRIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wan3wD+L4TVNDc34Hy4mvWu7u1lCOePX0GBdUEc
# oEeBGblwbrcBAIR8d+5z9O5YcWH1coozG1aUC4qCtSHHk5TGbJk4/UUD
# =XB5Q
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 01 Nov 2024 13:44:53 GMT
# gpg:                using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg:                issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg:                 aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D  D1A9 3B5F CCCD F3AB D706

* tag 'migration-20241030-pull-request' of https://gitlab.com/peterx/qemu:
  migration/multifd: Zero p->flags before starting filling a packet
  migration/ram: Add load start trace event
  migration: Drop migration_is_idle()
  migration: Drop migration_is_setup_or_active()
  migration: Unexport ram_mig_init()
  migration: Unexport dirty_bitmap_mig_init()
  migration: Take migration object refcount earlier for threads
  migration: Deprecate query-migrationthreads command
  migration/dirtyrate: Silence warning about strcpy() on OpenBSD
  tests/migration: Add case for periodic ramblock dirty sync
  migration: Support periodic RAMBlock dirty bitmap sync
  migration: Remove "rs" parameter in migration_bitmap_sync_precopy
  migration: Move cpu-throttle.c from system to migration
  migration: Stop CPU throttling conditionally
  accel/tcg/icount-common: Remove the reference to the unused header file
  migration: Ensure vmstate_save() sets errp
  migration: Put thread names together with macros
  migration: Cleanup migrate_fd_cleanup() on accessing to_dst_file

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target/hppa: Update SeaBIOS-hppa to version 17

This is SeaBIOS for the hppa architecture v17.
If comes with some important firmware and SCSI fixes and
prepares for futher development to support 64-bit HP-UX
and MPE/UX in the future.

New PDC functions & general enhancements:
- Add PDC_MODEL_GET_INSTALL_KERNEL firmware call
- Add PDC_PAT_EVENT firmware call
- Support ENTRY_IO_BOOTOUT
- Prefer memory-access over io-access of GSP serial port
- Disable LMMIO_DIRECT0 range during modification
- Small optimizations in IODC call

Fixes:
- esp-scsi: indicate acceptance of MESSAGE IN phase data
- Avoid crash when booting without SCSI controller
- Remove exec flag from hppa-firmware.img
- Fix LMMIO detection for PCI cards on Astro/Elroy
- Avoid trashing MPE IPL bootloader stack
- HP-UX 11 64-bit saves number of RAM pages in PAGE0 at 0x33c
- Fix return value of PDC_CACHE/PDC_CACHE_RET_SPID for space id hashing
- Allow PDC functions to act when called in narrow mode
- pcidevice: Use portaddr_t for io port addresses

Cleanups:
- Change default make target to "parisc"
- Clean the "out-64" directory on "make clean"

Signed-off-by: Helge Deller <deller@gmx.de>

Merge tag 'for-upstream-i386' of https://gitlab.com/bonzini/qemu into staging

* target/i386: new feature bits for AMD processors
* target/i386/tcg: improvements around flag handling
* target/i386: add AVX10 support
* target/i386: add GraniteRapids-v2 model
* dockerfiles: add libcbor
* New nitro-enclave machine type
* qom: cleanups to object_new
* configure: detect 64-bit MIPS for rust
* configure: deprecate 32-bit MIPS

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmcjvkQUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPIKgf/etNpO2T+eLFtWN/Qd5eopBXqNd9k
# KmeK9EgW9lqx2IPGNen33O+uKpb/TsMmubSsSF+YxTp7pmkc8+71f3rBMaIAD02r
# /paHSMVw0+f12DAFQz1jdvGihR7Mew0wcF/UdEt737y6vEmPxLTyYG3Gfa4NSZwT
# /V5jTOIcfUN/UEjNgIp6NTuOEESKmlqt22pfMapgkwMlAJYeeJU2X9eGYE86wJbq
# ZSXNgK3jL9wGT2XKa3e+OKzHfFpSkrB0JbQbdico9pefnBokN/hTeeUJ81wBAc7u
# i00W1CEQVJ5lhBc121d4AWMp83ME6HijJUOTMmJbFIONPsITFPHK1CAkng==
# =D4nR
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 31 Oct 2024 17:28:36 GMT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream-i386' of https://gitlab.com/bonzini/qemu: (49 commits)
  target/i386: Introduce GraniteRapids-v2 model
  target/i386: Add AVX512 state when AVX10 is supported
  target/i386: Add feature dependencies for AVX10
  target/i386: add CPUID.24 features for AVX10
  target/i386: add AVX10 feature and AVX10 version property
  target/i386: return bool from x86_cpu_filter_features
  target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits
  target/i386: cpu: set correct supported XCR0 features for TCG
  target/i386: use + to put flags together
  target/i386: use higher-precision arithmetic to compute CF
  target/i386: use compiler builtin to compute PF
  target/i386: make flag variables unsigned
  target/i386: add a note about gen_jcc1
  target/i386: add a few more trivial CCPrepare cases
  target/i386: optimize TEST+Jxx sequences
  target/i386: optimize computation of ZF from CC_OP_DYNAMIC
  target/i386: Wrap cc_op_live with a validity check
  target/i386: Introduce cc_op_size
  target/i386: Rearrange CCOp
  target/i386: remove CC_OP_CLR
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target/loongarch: Add steal time support on migration

With pv steal time supported, VM machine needs get physical address
of each vcpu and notify new host during migration. Here two
functions kvm_get_stealtime/kvm_set_stealtime, and guest steal time
physical address is only updated on KVM_PUT_FULL_STATE stage.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240930064040.753929-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

hw/loongarch/boot: Use warn_report when no kernel filename

When we run “qemu-system-loongarch64 -qmp stdio -vnc none -S”,
we get an error message “Need kernel filename” and then we can't use qmp cmd to query some information.
So, we just throw a warning and then the cpus starts running from address VIRT_FLASH0_BASE.

Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20241030012359.4040817-1-gaosong@loongson.cn>

linux-headers: Update to Linux v6.12-rc5

update linux-headers to v6.12-rc5. Pass to compile on aarch64, arm,
loongarch64, x86_64, i386, riscv64,riscv32 softmmu and linux-user.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20241028023809.1554405-4-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

linux-headers: loongarch: Add kvm_para.h

KVM LBT supports on LoongArch depends on the linux-header file
kvm_para.h, add header file kvm_para.h here.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20241028023809.1554405-3-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

linux-headers: Add unistd_64.h

since 6.11, unistd.h includes header file unistd_64.h directly on
some platforms, here add unistd_64.h on these platforms. Affected
platforms are ARM64, LoongArch64 and Riscv. Otherwise there will
be compiling error such as:

linux-headers/asm/unistd.h:3:10: fatal error: asm/unistd_64.h: No such file or directory
#include <asm/unistd_64.h>

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20241028023809.1554405-2-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

target/loongarch/kvm: Implement LoongArch PMU extension

Implement PMU extension for LoongArch kvm mode. Use OnOffAuto type
variable pmu to check the PMU feature. If the PMU Feature is not supported
with KVM host, it reports error if there is pmu=on command line.

If there is no any command line about pmu parameter, it checks whether
KVM host supports the PMU Feature and set the corresponding value in cpucfg.

This patch is based on lbt patch located at
https://lore.kernel.org/qemu-devel/20240904061859.86615-1-maobibo@loongson.cn

Co-developed-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240918082315.2345034-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

target/loongarch: Implement lbt registers save/restore function

Six registers scr0 - scr3, eflags and ftop are added in percpu vmstate.
And two functions kvm_loongarch_get_lbt/kvm_loongarch_put_lbt are added
to save/restore lbt registers.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240929070405.235200-3-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

target/loongarch: Add loongson binary translation feature

Loongson Binary Translation (LBT) is used to accelerate binary
translation, which contains 4 scratch registers (scr0 to scr3), x86/ARM
eflags (eflags) and x87 fpu stack pointer (ftop).

Now LBT feature is added in kvm mode, not supported in TCG mode since
it is not emulated. Feature variable lbt is added with OnOffAuto type,
If lbt feature is not supported with KVM host, it reports error if there
is lbt=on command line.

If there is no any command line about lbt parameter, it checks whether
KVM host supports lbt feature and set the corresponding value in cpucfg.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240929070405.235200-2-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>

migration/multifd: Zero p->flags before starting filling a packet

This way there aren't stale flags there.

p->flags can't contain SYNC to be sent at the next RAM packet since syncs
are now handled separately in multifd_send_thread.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/r/1c96b6cdb797e6f035eb1a4ad9bfc24f4c7f5df8.1730203967.git.maciej.szmigiero@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>

migration/ram: Add load start trace event

There's a RAM load complete trace event but there wasn't its start equivalent.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/94ddfa7ecb83a78f73b82867dd30c8767592d257.1730203967.git.maciej.szmigiero@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>

migration: Drop migration_is_idle()

Now with the current migration_is_running(), it will report exactly the
opposite of what will be reported by migration_is_idle().

Drop migration_is_idle(), instead use "!migration_is_running()" which
should be identical on functionality.

In reality, most of the idle check is inverted, so it's even easier to
write with "migrate_is_running()" check.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20241024213056.1395400-6-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

migration: Drop migration_is_setup_or_active()

This helper is mostly the same as migration_is_running(), except that one
has COLO reported as true, the other has CANCELLING reported as true.

Per my past years experience on the state changes, none of them should
matter.

To make it slightly safer, report both COLO || CANCELLING to be true in
migration_is_running(), then drop the other one. We kept the 1st only
because the name is simpler, and clear enough.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20241024213056.1395400-5-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

migration: Unexport ram_mig_init()

It's only used within migration/.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20241024213056.1395400-4-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

migration: Unexport dirty_bitmap_mig_init()

It's only used within migration/, so it shouldn't be exported.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20241024213056.1395400-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

migration: Take migration object refcount earlier for threads

Both migration thread or background snapshot thread will take a refcount of
the migration object at the entrace of the thread function.

That makes sense, because it protects the object from being freed by the
main thread in migration_shutdown() later, but it might still race with it
if the thread is scheduled too late. Consider the case right after
pthread_create() happened, VM shuts down with the object released, but
right after that the migration thread finally got created, referencing
MigrationState* in the opaque pointer which is already freed.

The only 100% safe way to make sure it won't get freed is taking the
refcount right before the thread is created, meanwhile when BQL is held.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20241024213056.1395400-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

migration: Deprecate query-migrationthreads command

Per previous discussion [1,2], this patch deprecates query-migrationthreads
command.

To summarize, the major reason of the deprecation is due to no sensible way
to consume the API properly:

  (1) The reported list of threads are incomplete (ignoring destination
      threads and non-multifd threads).

  (2) For CPU pinning, there's no way to properly pin the threads with
      the API if the threads will start running right away after migration
      threads can be queried, so the threads will always run on the default
      cores for a short window.

  (3) For VM debugging, one can use "-name $VM,debug-threads=on" instead,
      which will provide proper names for all migration threads.

[1] https://lore.kernel.org/r/20240930195837.825728-1-peterx@redhat.com
[2] https://lore.kernel.org/r/20241011153417.516715-1-peterx@redhat.com

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Markus Armbruster <armbru@redhat.com>
Link: https://lore.kernel.org/r/20241022194501.1022443-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

migration/dirtyrate: Silence warning about strcpy() on OpenBSD

The linker on OpenBSD complains:

ld: warning: dirtyrate.c:447 (../src/migration/dirtyrate.c:447)(...):
warning: strcpy() is almost always misused, please use strlcpy()

It's currently not a real problem in this case since both arrays
have the same size (256 bytes). But just in case somebody changes
the size of the source array in the future, let's better play safe
and use g_strlcpy() here instead, with an additional check that the
string has been copied as a whole.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Hyman Huang <yong.huang@smartx.com>
Link: https://lore.kernel.org/r/20241022063402.184213-1-thuth@redhat.com
[peterx: Fix over-80 chars]
Signed-off-by: Peter Xu <peterx@redhat.com>

tests/migration: Add case for periodic ramblock dirty sync

Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/cb61504f1a1e9d5f2ca4dac12e518deb076ce9f3.1729146786.git.yong.huang@smartx.com
Signed-off-by: Peter Xu <peterx@redhat.com>

migration: Support periodic RAMBlock dirty bitmap sync

When VM is configured with huge memory, the current throttle logic
doesn't look like to scale, because migration_trigger_throttle()
is only called for each iteration, so it won't be invoked for a long
time if one iteration can take a long time.

The periodic dirty sync aims to fix the above issue by synchronizing
the ramblock from remote dirty bitmap and, when necessary, triggering
the CPU throttle multiple times during a long iteration.

This is a trade-off between synchronization overhead and CPU throttle
impact.

Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/f61f1b3653f2acf026901103e1c73d157d38b08f.1729146786.git.yong.huang@smartx.com
[peterx: make prev_cnt global, and reset for each migration]
Signed-off-by: Peter Xu <peterx@redhat.com>

migration: Remove "rs" parameter in migration_bitmap_sync_precopy

The global static variable ram_state in fact is referred to by the
"rs" parameter in migration_bitmap_sync_precopy. For ease of calling
by the callees, use the global variable directly in
migration_bitmap_sync_precopy and remove "rs" parameter.

The migration_bitmap_sync_precopy will be exported in the next commit.

Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/283c335d61463bf477160da91b24da45cdaf3e43.1729146786.git.yong.huang@smartx.com
Signed-off-by: Peter Xu <peterx@redhat.com>

migration: Move cpu-throttle.c from system to migration

Move cpu-throttle.c from system to migration since it's
only used for migration; this makes us avoid exporting the
util functions and variables in misc.h but export them in
migration.h when implementing the periodic ramblock dirty
sync feature in the upcoming commits.

Since CPU throttle timers are only used in migration, move
their registry to migration_object_init.

Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/c1b3efaa0cb49e03d422e9da97bdb65cc3d234d1.1729146786.git.yong.huang@smartx.com
[peterx: Fix build on MacOS on cocoa.m, not move cpu-throttle.h yet]
[peterx: Fix subject spelling, per pm215]
Signed-off-by: Peter Xu <peterx@redhat.com>

migration: Stop CPU throttling conditionally

Since CPU throttling only occurs when auto-converge
is on, stop it conditionally.

Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/f0c787080bb9ab0c37952f0ca5bfaa525d5ddd14.1729146786.git.yong.huang@smartx.com
Signed-off-by: Peter Xu <peterx@redhat.com>

accel/tcg/icount-common: Remove the reference to the unused header file

Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/5e33b423d0b8506e5cb33fff42b50aa301b7731b.1729146786.git.yong.huang@smartx.com
Signed-off-by: Peter Xu <peterx@redhat.com>

migration: Ensure vmstate_save() sets errp

migration/savevm.c contains some calls to vmstate_save() that are
followed by migrate_set_error() if the integer return value indicates an
error.  migrate_set_error() requires that the `Error *` object passed to
it is set.  Therefore, vmstate_save() is assumed to always set *errp on
error.

Right now, that assumption is not met: vmstate_save_state_v() (called
internally by vmstate_save()) will not set *errp if
vmstate_subsection_save() or vmsd->post_save() fail.  Fix that by adding
an *errp parameter to vmstate_subsection_save(), and by generating a
generic error in case post_save() fails (as is already done for
pre_save()).

Without this patch, qemu will crash after vmstate_subsection_save() or
post_save() have failed inside of a vmstate_save() call (unless
migrate_set_error() then happen to discard the new error because
s->error is already set).  This happens e.g. when receiving the state
from a virtio-fs back-end (virtiofsd) fails.

Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Link: https://lore.kernel.org/r/20241015170437.310358-1-hreitz@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

migration: Put thread names together with macros

Keep migration thread names together, so it's easier to see a list of all
possible migration threads.

Still two functional changes below besides the macro defintions:

  - There's one dirty rate thread that we overlooked before, now we add
  that too and name it as "mig/dirtyrate" following the old rules.

  - The old name "mig/src/rp-thr" has "-thr" but it may not be useful if
  it's a thread name anyway, while "rp" can be slightly hard to read.
  Taking this chance to rename it to "mig/src/return", hopefully a better
  name.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Link: https://lore.kernel.org/r/20241011153652.517440-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

migration: Cleanup migrate_fd_cleanup() on accessing to_dst_file

The cleanup function can in many cases needs cleanup on its own.

The major thing we want to do here is not referencing to_dst_file when
without the file mutex. When at it, touch things elsewhere too to make it
look slightly better in general.

One thing to mention is, migration_thread has its own "running" boolean, so
it doesn't need to rely on to_dst_file being non-NULL. Multifd has a
dependency so it needs to be skipped if to_dst_file is not yet set; add a
richer comment for such reason.

Resolves: Coverity CID 1527402
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240919163042.116767-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>

target/i386: Introduce GraniteRapids-v2 model

Update GraniteRapids CPU model to add AVX10 and the missing features(ss,
tsc-adjust, cldemote, movdiri, movdir64b).

Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-7-tao1.su@linux.intel.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20241031085233.425388-9-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: Add AVX512 state when AVX10 is supported

AVX10 state enumeration in CPUID leaf D and enabling in XCR0 register
are identical to AVX512 state regardless of the supported vector lengths.

Given that some E-cores will support AVX10 but not support AVX512, add
AVX512 state components to guest when AVX10 is enabled.

Based on a patch by Tao Su <tao1.su@linux.intel.com>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-8-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: Add feature dependencies for AVX10

Since the highest supported vector length for a processor implies that
all lesser vector lengths are also supported, add the dependencies of
the supported vector lengths. If all vector lengths aren't supported,
clear AVX10 enable bit as well.

Note that the order of AVX10 related dependencies should be kept as:
        CPUID_24_0_EBX_AVX10_128     -> CPUID_24_0_EBX_AVX10_256,
        CPUID_24_0_EBX_AVX10_256     -> CPUID_24_0_EBX_AVX10_512,
        CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
        CPUID_7_1_EDX_AVX10          -> CPUID_24_0_EBX,
so that prevent user from setting weird CPUID combinations, e.g. 256-bits
and 512-bits are supported but 128-bits is not, no vector lengths are
supported but AVX10 enable bit is still set.

Since AVX10_128 will be reserved as 1, adding these dependencies has the
bonus that when user sets -cpu host,-avx10-128, CPUID_7_1_EDX_AVX10 and
CPUID_24_0_EBX will be disabled automatically.

Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-5-tao1.su@linux.intel.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20241031085233.425388-7-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: add CPUID.24 features for AVX10

Introduce features for the supported vector bit lengths.

Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com
Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-6-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: add AVX10 feature and AVX10 version property

When AVX10 enable bit is set, the 0x24 leaf will be present as "AVX10
Converged Vector ISA leaf" containing fields for the version number and
the supported vector bit lengths.

Introduce avx10-version property so that avx10 version can be controlled
by user and cpu model. Per spec, avx10 version can never be 0, the default
value of avx10-version is set to 0 to determine whether it is specified by
user. The default can come from the device model or, for the max model,
from KVM's reported value.

Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com
Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-5-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: return bool from x86_cpu_filter_features

Prepare for filtering non-boolean features such as AVX10 version.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-4-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits

Right now, QEMU is using the "feature" and "bits" fields of ExtSaveArea
to query the accelerator for the support status of extended save areas.
This is a problem for AVX10, which attaches two feature bits (AVX512F
and AVX10) to the same extended save states.

To keep the AVX10 hacks to the minimum, limit usage of esa->features
and esa->bits. Instead, just query the accelerator for the 0xD leaf.
Do it in common code and clear esa->size if an extended save state is
unsupported.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-3-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: cpu: set correct supported XCR0 features for TCG

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-2-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: use + to put flags together

This gives greater opportunity for reassociation on x86 targets,
since addition can use the LEA instruction.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: use higher-precision arithmetic to compute CF

If the operands of the arithmetic instruction fit within a half-register,
it's easiest to use a comparison instruction to compute the carry.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: use compiler builtin to compute PF

This removes the 256 byte parity table from the executable.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: make flag variables unsigned

This makes it easier for the compiler to understand which bits are set,
and it also removes "cltq" instructions to canonicalize the output value
as 32-bit signed.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: add a note about gen_jcc1

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: add a few more trivial CCPrepare cases

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: optimize TEST+Jxx sequences

Mostly used for TEST+JG and TEST+JLE, but it is easy to cover
also JBE/JA and JL/JGE; shaves about 0.5% TCG ops.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: optimize computation of ZF from CC_OP_DYNAMIC

Most uses of CC_OP_DYNAMIC are for CMP/JB/JE or similar sequences.
We can optimize many of them to avoid computation of the flags.
This eliminates both TCG ops to set up the new cc_op, and helper
instructions because evaluating just ZF is much cheaper.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: Wrap cc_op_live with a validity check

Assert that op is known and that cc_op_live_ is populated.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: Introduce cc_op_size

Replace arithmetic on cc_op with a helper function.
Assert that the op has a size and that it is valid
for the configuration.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240701025115.1265117-6-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: Rearrange CCOp

Give the first few enumerators explicit integer constants,
align the BWLQ enumerators.

This will be used to simplify ((op - CC_OP_*B) & 3).

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240701025115.1265117-4-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: remove CC_OP_CLR

Just use CC_OP_EFLAGS; it is not that likely that the flags computed by
CC_OP_CLR survive the end of the basic block, in which case there is no
need to spill cc_op_src.

cc_op_src now does need spilling if the XOR is followed by a memory
operation, but this only costs 0.2% extra TCG ops. They will be recouped
by simplifications in how QEMU evaluates ZF at runtime, which are even
greater with this change.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: Tidy cc_op_str usage

Make const. Use the read-only strings directly; do not copy
them into an on-stack buffer with snprintf. Allow for holes
in the cc_op_str array, now present with CC_OP_POPCNT.

Fixes: 460231ad369 ("target/i386: give CC_OP_POPCNT low bits corresponding to MO_TL")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240701025115.1265117-2-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: use tcg_gen_ext_tl when applicable

Prefer it to gen_ext_tl in the common case where the destination is known.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ci: always invoke meson through pyvenv

Do not assume that the distro-installed meson is compatible with the one
in the virtual environment.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

docs/nitro-enclave: Documentation for nitro-enclave machine type

Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20241008211727.49088-7-dorjoychy111@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

machine/nitro-enclave: New machine type for AWS Nitro Enclaves

AWS nitro enclaves[1] is an Amazon EC2[2] feature that allows creating
isolated execution environments, called enclaves, from Amazon EC2
instances which are used for processing highly sensitive data. Enclaves
have no persistent storage and no external networking. The enclave VMs
are based on the Firecracker microvm with a vhost-vsock device for
communication with the parent EC2 instance that spawned it and a Nitro
Secure Module (NSM) device for cryptographic attestation. The parent
instance VM always has CID 3 while the enclave VM gets a dynamic CID.

An EIF (Enclave Image Format)[3] file is used to boot an AWS nitro enclave
virtual machine. This commit adds support for AWS nitro enclave emulation
using a new machine type option '-M nitro-enclave'. This new machine type
is based on the 'microvm' machine type, similar to how real nitro enclave
VMs are based on Firecracker microvm. For nitro-enclave to boot from an
EIF file, the kernel and ramdisk(s) are extracted into a temporary kernel
and a temporary initrd file which are then hooked into the regular x86
boot mechanism along with the extracted cmdline. The EIF file path should
be provided using the '-kernel' QEMU option.

In QEMU, the vsock emulation for nitro enclave is added using vhost-user-
vsock as opposed to vhost-vsock. vhost-vsock doesn't support sibling VM
communication which is needed for nitro enclaves. So for the vsock
communication to CID 3 to work, another process that does the vsock
emulation in  userspace must be run, for example, vhost-device-vsock[4]
from rust-vmm, with necessary vsock communication support in another
guest VM with CID 3. Using vhost-user-vsock also enables the possibility
to implement some proxying support in the vhost-user-vsock daemon that
will forward all the packets to the host machine instead of CID 3 so
that users of nitro-enclave can run the necessary applications in their
host machine instead of running another whole VM with CID 3. The following
mandatory nitro-enclave machine option has been added related to the
vhost-user-vsock device.
  - 'vsock': The chardev id from the '-chardev' option for the
vhost-user-vsock device.

AWS Nitro Enclaves have built-in Nitro Secure Module (NSM) device which
has been added using the virtio-nsm device added in a previous commit.
In Nitro Enclaves, all the PCRs start in a known zero state and the first
16 PCRs are locked from boot and reserved. The PCR0, PCR1, PCR2 and PCR8
contain the SHA384 hashes related to the EIF file used to boot the VM
for validation. The following optional nitro-enclave machine options
have been added related to the NSM device.
  - 'id': Enclave identifier, reflected in the module-id of the NSM
device. If not provided, a default id will be set.
  - 'parent-role': Parent instance IAM role ARN, reflected in PCR3
of the NSM device.
  - 'parent-id': Parent instance identifier, reflected in PCR4 of the
NSM device.

[1] https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave.html
[2] https://aws.amazon.com/ec2/
[3] https://github.com/aws/aws-nitro-enclaves-image-format
[4] https://github.com/rust-vmm/vhost-device/tree/main/vhost-device-vsock

Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20241008211727.49088-6-dorjoychy111@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

core/machine: Make create_default_memdev machine a virtual method

This is in preparation for the next commit where the nitro-enclave
machine type will need to instead use a memfd backend, for the built-in
vhost-user-vsock device to work.

Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20241008211727.49088-5-dorjoychy111@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

hw/core: Add Enclave Image Format (EIF) related helpers

An EIF (Enclave Image Format)[1] file is used to boot an AWS nitro
enclave[2] virtual machine. The EIF file contains the necessary kernel,
cmdline, ramdisk(s) sections to boot.

Some helper functions have been introduced for extracting the necessary
sections from an EIF file and then writing them to temporary files as
well as computing SHA384 hashes from the section data. These will be
used in the following commit to add support for nitro-enclave machine
type in QEMU.

The files added in this commit are not compiled yet but will be added
to the hw/core/meson.build file in the following commit where
CONFIG_NITRO_ENCLAVE will be introduced.

[1] https://github.com/aws/aws-nitro-enclaves-image-format
[2] https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave.html

Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20241008211727.49088-4-dorjoychy111@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

device/virtio-nsm: Support for Nitro Secure Module device

Nitro Secure Module (NSM)[1] device is used in AWS Nitro Enclaves[2]
for stripped down TPM functionality like cryptographic attestation.
The requests to and responses from NSM device are CBOR[3] encoded.

This commit adds support for NSM device in QEMU. Although related to
AWS Nitro Enclaves, the virito-nsm device is independent and can be
used in other machine types as well. The libcbor[4] library has been
used for the CBOR encoding and decoding functionalities.

[1] https://lists.oasis-open.org/archives/virtio-comment/202310/msg00387.html
[2] https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave.html
[3] http://cbor.io/
[4] https://libcbor.readthedocs.io/en/latest/

Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20241008211727.49088-3-dorjoychy111@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

tests/lcitool: Update libvirt-ci and add libcbor dependency

libcbor dependecy is necessary for adding virtio-nsm and nitro-enclave
machine support in the following commits. libvirt-ci has already been
updated with the dependency upstream and this commit updates libvirt-ci
submodule in QEMU to latest upstream. Also the libcbor dependency has
been added to tests/lcitool/projects/qemu.yml.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20241008211727.49088-2-dorjoychy111@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386/hvf: fix handling of XSAVE-related CPUID bits

The call to xgetbv() is passing the ecx value for cpuid function 0xD,
index 0. The xgetbv call thus returns false (OSXSAVE is bit 27, which is
well out of the range of CPUID[0xD,0].ECX) and eax is not modified. While
fixing it, cache the whole computation of supported XCR0 bits since it
will be used for more than just CPUID leaf 0xD.

Furthermore, unsupported subleafs of CPUID 0xD (including all those
corresponding to zero bits in host's XCR0) must be hidden; if OSXSAVE
is not set at all, the whole of CPUID leaf 0xD plus the XSAVE bit must
be hidden.

Finally, unconditionally drop XSTATE_BNDREGS_MASK and XSTATE_BNDCSR_MASK;
real hardware will only show them if the MPX bit is set in CPUID;
this is never the case for hvf_get_supported_cpuid() because QEMU's
Hypervisor.framework support does not handle the VMX fields related to
MPX (even in the unlikely possibility that the host has MPX enabled).
So hide those bits in the new cache_host_xcr0().

Cc: Phil Dennis-Jordan <lists@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

target/i386: Expose new feature bits in CPUID 8000_0021_EAX/EBX

Newer AMD CPUs support ERAPS (Enhanced Return Address Prediction Security)
feature that enables the auto-clear of RSB entries on a TLB flush, context
switches and VMEXITs. The number of default RSP entries is reflected in
RapSize.

Add the feature bit and feature word to support these features.

CPUID_Fn80000021_EAX
Bits   Feature Description
24     ERAPS:
       Indicates support for enhanced return address predictor security.

CPUID_Fn80000021_EBX
Bits   Feature Description
31-24  Reserved
23:16  RapSize:
       Return Address Predictor size. RapSize x 8 is the minimum number
       of CALL instructions software needs to execute to flush the RAP.
15-00  MicrocodePatchSize. Read-only.
       Reports the size of the Microcode patch in 16-byte multiples.
       If 0, the size of the patch is at most 5568 (15C0h) bytes.

Link: https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/programmer-references/57238.zip
Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/7c62371fe60af1e9bbd853f5f8e949bf2d908bd0.1729807947.git.babu.moger@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>