qemu.git
5 weeks agofpu: Always decide snan_bit_is_one() at runtime
Peter Maydell [Mon, 24 Feb 2025 11:15:22 +0000 (11:15 +0000)]
fpu: Always decide snan_bit_is_one() at runtime

Currently we have a compile-time shortcut where we return a hardcode
value from snan_bit_is_one() on everything except MIPS, because we
know that's the only target that needs to change
status->no_signaling_nans at runtime.

Remove the ifdef, so we always look at the status flag.  This means
we must update the two targets (HPPA and SH4) that were previously
hardcoded to return true so that they set the status flag correctly.

This has no behavioural change, but will be necessary if we want to
build softfloat once for all targets.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250224111524.1101196-11-peter.maydell@linaro.org
Message-id: 20250217125055.160887-9-peter.maydell@linaro.org

5 weeks agofpu: Always decide no_signaling_nans() at runtime
Peter Maydell [Mon, 24 Feb 2025 11:15:21 +0000 (11:15 +0000)]
fpu: Always decide no_signaling_nans() at runtime

Currently we have a compile-time shortcut where we
return false from no_signaling_nans() on everything except
Xtensa, because we know that's the only target that
might ever set status->no_signaling_nans.

Remove the ifdef, so we always look at the status flag;
this has no behavioural change, but will be necessary
if we want to build softfloat once for all targets.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250224111524.1101196-10-peter.maydell@linaro.org
Message-id: 20250217125055.160887-8-peter.maydell@linaro.org

5 weeks agofpu: Move m68k_denormal fmt flag into floatx80_behaviour
Peter Maydell [Mon, 24 Feb 2025 11:15:20 +0000 (11:15 +0000)]
fpu: Move m68k_denormal fmt flag into floatx80_behaviour

Currently we compile-time set an 'm68k_denormal' flag in the FloatFmt
for floatx80 for m68k.  This controls our handling of what the Intel
documentation calls a "pseudo-denormal": a value where the exponent
field is zero and the explicit integer bit is set.

For x86, the x87 FPU is supposed to accept a pseudo-denormal as
input, but never generate one on output.  For m68k, these values are
permitted on input and may be produced on output.

Replace the flag in the FloatFmt with a flag indicating whether the
float format has an explicit bit (which will be true for floatx80 for
all targets, and false for every other float type).  Then we can gate
the handling of these pseudo-denormals on the setting of a
floatx80_behaviour flag.

As far as I can see from the code we don't actually handle the
x86-mandated "accept on input but don't generate" behaviour, because
the handling in partsN(canonicalize) looked at fmt->m68k_denormal.
So I have added TODO comments to that effect.

This commit doesn't change any behaviour for any target.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250224111524.1101196-9-peter.maydell@linaro.org
Message-id: 20250217125055.160887-7-peter.maydell@linaro.org

5 weeks agofpu: Make floatx80 invalid encoding settable at runtime
Peter Maydell [Mon, 24 Feb 2025 11:15:19 +0000 (11:15 +0000)]
fpu: Make floatx80 invalid encoding settable at runtime

Because floatx80 has an explicit integer bit, this permits some
odd encodings where the integer bit is not set correctly for the
floating point value type. In In Intel terminology the
 categories are:
  exp == 0, int = 0, mantissa == 0 : zeroes
  exp == 0, int = 0, mantissa != 0 : denormals
  exp == 0, int = 1 : pseudo-denormals
  0 < exp < 0x7fff, int = 0 : unnormals
  0 < exp < 0x7fff, int = 1 : normals
  exp == 0x7fff, int = 0, mantissa == 0 : pseudo-infinities
  exp == 0x7fff, int = 1, mantissa == 0 : infinities
  exp == 0x7fff, int = 0, mantissa != 0 : pseudo-NaNs
  exp == 0x7fff, int = 1, mantissa == 0 : NaNs

The usual IEEE cases of zero, denormal, normal, inf and NaN are always valid.
x87 permits as input also pseudo-denormals.
m68k permits all those and also pseudo-infinities, pseudo-NaNs and unnormals.

Currently we have an ifdef in floatx80_invalid_encoding() to select
the x86 vs m68k behaviour.  Add new floatx80_behaviour flags to
select whether pseudo-NaN and unnormal are valid, and use these
(plus the existing pseudo_inf_valid flag) to decide whether these
encodings are invalid at runtime.

We leave pseudo-denormals as always-valid, since both x86 and m68k
accept them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250224111524.1101196-8-peter.maydell@linaro.org
Message-id: 20250217125055.160887-6-peter.maydell@linaro.org

5 weeks agofpu: Pass float_status to floatx80_invalid_encoding()
Peter Maydell [Mon, 24 Feb 2025 11:15:18 +0000 (11:15 +0000)]
fpu: Pass float_status to floatx80_invalid_encoding()

The definition of which floatx80 encodings are invalid is
target-specific.  Currently we handle this with an ifdef, but we
would like to defer this decision to runtime.  In preparation, pass a
float_status argument to floatx80_invalid_encoding().

We will change the implementation from ifdef to looking at
the status argument in the following commit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250224111524.1101196-7-peter.maydell@linaro.org

5 weeks agofpu: Make targets specify whether floatx80 Inf can have Int bit clear
Peter Maydell [Mon, 24 Feb 2025 11:15:17 +0000 (11:15 +0000)]
fpu: Make targets specify whether floatx80 Inf can have Int bit clear

In Intel terminology, a floatx80 Infinity with the explicit integer
bit clear is a "pseudo-infinity"; for x86 these are not valid
infinity values.  m68k is looser and does not care whether the
Integer bit is set or clear in an infinity.

Move this setting to runtime rather than using an ifdef in
floatx80_is_infinity().

Since this was the last use of the floatx80_infinity global constant,
we remove it and its definition here.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250224111524.1101196-6-peter.maydell@linaro.org
Message-id: 20250217125055.160887-5-peter.maydell@linaro.org

5 weeks agofpu: Pass float_status to floatx80_is_infinity()
Peter Maydell [Mon, 24 Feb 2025 11:15:16 +0000 (11:15 +0000)]
fpu: Pass float_status to floatx80_is_infinity()

Unlike the other float formats, whether a floatx80 value is
considered to be an Infinity is target-dependent.  (On x86 if the
explicit integer bit is clear this is a "pseudo-infinity" and not a
valid infinity; m68k does not care about the value of the integer
bit.)

Currently we select this target-specific logic at compile time with
an ifdef.  We're going to want to do this at runtime, so change the
floatx80_is_infinity() function to take a float_status.

This commit doesn't change any logic; we'll do that in the
next commit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250224111524.1101196-5-peter.maydell@linaro.org

5 weeks agotarget/i386: Avoid using floatx80_infinity global const
Peter Maydell [Mon, 24 Feb 2025 11:15:15 +0000 (11:15 +0000)]
target/i386: Avoid using floatx80_infinity global const

The global const floatx80_infinity is (unlike all the other
float*_infinity values) target-specific, because whether the explicit
Integer bit is set or not varies between m68k and i386.  We want to
be able to compile softfloat once for multiple targets, so we can't
continue to use a single global whose value needs to be different
between targets.

Replace the direct uses of floatx80_infinity in target/i386 with
calls to the new floatx80_default_inf() function. Note that because
we can ask the function for either a negative or positive infinity,
we don't need to change the sign of a positive infinity via
floatx80_chs() for the negative-Inf case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250224111524.1101196-4-peter.maydell@linaro.org
Message-id: 20250217125055.160887-4-peter.maydell@linaro.org

5 weeks agotarget/m68k: Avoid using floatx80_infinity global const
Peter Maydell [Mon, 24 Feb 2025 11:15:14 +0000 (11:15 +0000)]
target/m68k: Avoid using floatx80_infinity global const

The global const floatx80_infinity is (unlike all the other
float*_infinity values) target-specific, because whether the explicit
Integer bit is set or not varies between m68k and i386.  We want to
be able to compile softfloat once for multiple targets, so we can't
continue to use a single global whose value needs to be different
between targets.

Replace the direct uses of floatx80_infinity in target/m68k with
calls to the new floatx80_default_inf() function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250224111524.1101196-3-peter.maydell@linaro.org
Message-id: 20250217125055.160887-3-peter.maydell@linaro.org

5 weeks agofpu: Make targets specify floatx80 default Inf at runtime
Peter Maydell [Mon, 24 Feb 2025 11:15:13 +0000 (11:15 +0000)]
fpu: Make targets specify floatx80 default Inf at runtime

Currently we hardcode at compile time whether the floatx80 default
Infinity value has the explicit integer bit set or not (x86 sets it;
m68k does not).  To be able to compile softfloat once for all targets
we'd like to move this setting to runtime.

Define a new FloatX80Behaviour enum which is a set of flags that
define the target's floatx80 handling.  Initially we define just one
flag, for whether the default Infinity has the Integer bit set or
not, but we will expand this in future commits to cover the other
floatx80 target specifics that we currently make compile-time
settings.

Define a new function floatx80_default_inf() which returns the
appropriate default Infinity value of the given sign, and use it in
the code that was previously directly using the compile-time constant
floatx80_infinity_{low,high} values when packing an infinity into a
floatx80.

Since floatx80 is highly unlikely to be supported in any new
architecture, and the existing code is generally written as "default
to like x87, with an ifdef for m68k", we make the default value for
the floatx80 behaviour flags be "what x87 does".  This means we only
need to change the m68k target to specify the behaviour flags.

(Other users of floatx80 are the Arm NWFPE emulation, which is
obsolete and probably not actually doing the right thing anyway, and
the PPC xsrqpxp insn.  Making the default be "like x87" avoids our
needing to review and test for behaviour changes there.)

We will clean up the remaining uses of the floatx80_infinity global
constant in subsequent commits.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250224111524.1101196-2-peter.maydell@linaro.org
Message-id: 20250217125055.160887-2-peter.maydell@linaro.org

5 weeks agohw/core/machine.c: Make -machine dumpdtb=file.dtb with no DTB an error
Peter Maydell [Thu, 6 Feb 2025 15:12:14 +0000 (15:12 +0000)]
hw/core/machine.c: Make -machine dumpdtb=file.dtb with no DTB an error

Currently if the user requests via -machine dumpdtb=file.dtb that we
dump the DTB, but the machine doesn't have a DTB, we silently ignore
the option.  This is confusing to users, and is a legacy of the old
board-specific implementation of the option, where if the execution
codepath didn't go via a call to qemu_fdt_dumpdtb() we would never
handle the option.

Now we handle the option in one place in machine.c, we can provide
the user with a useful message if they asked us to dump a DTB when
none exists.  qmp_dumpdtb() already produces this error; remove the
logic in handle_machine_dumpdtb() that was there specifically to
avoid hitting it.

While we're here, beef up the error message a bit with a hint, and
make it consistent about "an FDT" rather than "a FDT".  (In the
qmp_dumpdtb() case this needs an ERRP_GUARD to make
error_append_hint() work when the caller passes error_fatal.)

Note that the three places where we might report "doesn't have an
FDT" are hit in different situations:

(1) in handle_machine_dumpdtb(), if CONFIG_FDT is not set: this is
because the QEMU binary was built without libfdt at all. The
build system will not let you build with a machine type that
needs an FDT but no libfdt, so here we know both that the machine
doesn't use FDT and that QEMU doesn't have the support:

(2) in the device_tree-stub.c qmp_dumpdtb(): this is used when
we had libfdt at build time but the target architecture didn't
enable any machines which did "select DEVICE_TREE", so here we
know that the machine doesn't use FDT.

(3) in qmp_dumpdtb(), if current_machine->fdt is NULL all we know
is that this machine never set it. That might be because it doesn't
use FDT, or it might be because the user didn't pass an FDT
on the command line and the machine doesn't autogenerate an FDT.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2733
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250206151214.2947842-7-peter.maydell@linaro.org

5 weeks agohw: Centralize handling of -machine dumpdtb option
Peter Maydell [Thu, 6 Feb 2025 15:12:13 +0000 (15:12 +0000)]
hw: Centralize handling of -machine dumpdtb option

Currently we handle the 'dumpdtb' machine sub-option ad-hoc in every
board model that has an FDT.  It's up to the board code to make sure
it calls qemu_fdt_dumpdtb() in the right place.

This means we're inconsistent and often just ignore the user's
command line argument:
 * if the board doesn't have an FDT at all
 * if the board supports FDT, but there happens not to be one
   present (usually because of a missing -fdt option)

This isn't very helpful because it gives the user no clue why their
option was ignored.

However, in order to support the QMP/HMP dumpdtb commands we require
now that every FDT machine stores a pointer to the FDT in
MachineState::fdt.  This means we can handle -machine dumpdtb
centrally by calling the qmp_dumpdtb() function, unifying its
handling with the QMP/HMP commands.  All the board code calls to
qemu_fdt_dumpdtb() can then be removed.

For this commit we retain the existing behaviour that if there
is no FDT we silently ignore the -machine dumpdtb option.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
5 weeks agohw/mips/boston: Support dumpdtb monitor commands
Peter Maydell [Thu, 6 Feb 2025 15:12:12 +0000 (15:12 +0000)]
hw/mips/boston: Support dumpdtb monitor commands

The boston machine doesn't set MachineState::fdt to the DTB blob that
it has loaded or created, which means that the QMP/HMP dumpdtb
monitor commands don't work.

Setting MachineState::fdt is easy in the non-FIT codepath: we can
simply do so immediately before loading the DTB into guest memory.
The FIT codepath is a bit more awkward as currently the FIT loader
throws away the memory that the FDT was in after it loads it into
guest memory.  So we add a void *pfdt argument to load_fit() for it
to store the FDT pointer into.

There is some readjustment required of the pointer handling in
loader-fit.c, so that it applies 'const' only where it should (e.g.
the data pointer we get back from fdt_getprop() is const, because
it's into the middle of the input FDT data, but the pointer that
fit_load_image_alloc() should not be const, because it's freshly
allocated memory that the caller can change if it likes).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250206151214.2947842-5-peter.maydell@linaro.org

5 weeks agohw/mips/boston: Check for error return from boston_fdt_filter()
Peter Maydell [Thu, 6 Feb 2025 15:12:11 +0000 (15:12 +0000)]
hw/mips/boston: Check for error return from boston_fdt_filter()

The function boston_fdt_filter() can return NULL on errors (in which
case it will print an error message).  When we call this from the
non-FIT-image codepath, we aren't checking the return value, so we
will plough on with a NULL pointer, and segfault in fdt_totalsize().
Check for errors here.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250206151214.2947842-4-peter.maydell@linaro.org

5 weeks agohw/openrisc: Support monitor dumpdtb command
Peter Maydell [Thu, 6 Feb 2025 15:12:10 +0000 (15:12 +0000)]
hw/openrisc: Support monitor dumpdtb command

The openrisc machines don't set MachineState::fdt to point to their
DTB blob.  This means that although the command line '-machine
dumpdtb=file.dtb' option works, the equivalent QMP and HMP monitor
commands do not, but instead produce the error "This machine doesn't
have a FDT".

Set MachineState::fdt in openrisc_load_fdt(), when we write it to
guest memory.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250206151214.2947842-3-peter.maydell@linaro.org

5 weeks agomonitor/hmp-cmds.c: Clean up hmp_dumpdtb printf
Peter Maydell [Thu, 6 Feb 2025 15:12:09 +0000 (15:12 +0000)]
monitor/hmp-cmds.c: Clean up hmp_dumpdtb printf

In hmp_dumpdtb(), we print a message when the command succeeds.  This
message is missing the trailing \n, so the HMP command prompt is
printed immediately after it.  We also weren't capitalizing 'DTB', or
quoting the filename in the message.  Fix these nits.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250206151214.2947842-2-peter.maydell@linaro.org

5 weeks agohw/arm/virt: Support larger highmem MMIO regions
Matthew R. Ochs [Fri, 21 Feb 2025 14:54:19 +0000 (06:54 -0800)]
hw/arm/virt: Support larger highmem MMIO regions

The MMIO region size required to support virtualized environments with
large PCI BAR regions can exceed the hardcoded limit configured in QEMU.
For example, a VM with multiple NVIDIA Grace-Hopper GPUs passed through
requires more MMIO memory than the amount provided by VIRT_HIGH_PCIE_MMIO
(currently 512GB). Instead of updating VIRT_HIGH_PCIE_MMIO, introduce a
new parameter, highmem-mmio-size, that specifies the MMIO size required
to support the VM configuration.

Example usage with 1TB MMIO region size:
-machine virt,gic-version=3,highmem-mmio-size=1T

Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Message-id: 20250221145419.1281890-1-mochs@nvidia.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
5 weeks agohw/arm/smmuv3: Fill u.f_cd_fetch.addr for SMMU_EVT_F_CD_FETCH
Nicolin Chen [Thu, 20 Feb 2025 21:38:31 +0000 (13:38 -0800)]
hw/arm/smmuv3: Fill u.f_cd_fetch.addr for SMMU_EVT_F_CD_FETCH

When we fill in the SMMUEventInfo for SMMU_EVT_F_CD_FETCH we write
the address into the f_ste_fetch member of the union, but then when
we come to read it back in smmuv3_record_event() we will (correctly)
be using the f_cd_fetch member.

This is more like a cosmetics fix since the f_cd_fetch and f_ste_fetch are
basically the same field since they are in the exact same union with exact
same type, but it's conceptually wrong. Use the correct union member.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Message-id: 20250220213832.80289-1-nicolinc@nvidia.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agoMerge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into...
Stefan Hajnoczi [Fri, 21 Feb 2025 21:06:39 +0000 (05:06 +0800)]
Merge tag 'for_upstream' of https://git./virt/kvm/mst/qemu into staging

virtio,pc,pci: features, fixes, cleanups

Features:

SR-IOV emulation for pci
virtio-mem-pci support for s390
interleave support for cxl
big endian support for vdpa svq
new QAPI events for vhost-user

Also vIOMMU reset order fixups are in.
Fixes, cleanups all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 # -----BEGIN PGP SIGNATURE-----
 #
 # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAme4b8sPHG1zdEByZWRo
 # YXQuY29tAAoJECgfDbjSjVRpHKcIAKPJsVqPdda2dJ7b7FdyRT0Q+uwezXqaGHd4
 # 7Lzih1wsxYNkwIAyPtEb76/21qiS7BluqlUCfCB66R9xWjP5/KfvAFj4/r4AEduE
 # fxAgYzotNpv55zcRbcflMyvQ42WGiZZHC+o5Lp7vDXUP3pIyHrl0Ydh5WmcD+hwS
 # BjXvda58TirQpPJ7rUL+sSfLih17zQkkDcfv5/AgorDy1wK09RBKwMx/gq7wG8yJ
 # twy8eBY2CmfmFD7eTM+EKqBD2T0kwLEeLfS/F/tl5Fyg6lAiYgYtCbGLpAmWErsg
 # XZvfZmwqL7CNzWexGvPFnnLyqwC33WUP0k0kT88Y5wh3/h98blw=
 # =tej8
 # -----END PGP SIGNATURE-----
 # gpg: Signature made Fri 21 Feb 2025 20:21:31 HKT
 # gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
 # gpg:                issuer "mst@redhat.com"
 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
 # gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
 # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
 #      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (41 commits)
  docs/devel/reset: Document reset expectations for DMA and IOMMU
  hw/vfio/common: Add a trace point in vfio_reset_handler
  hw/arm/smmuv3: Move reset to exit phase
  hw/i386/intel-iommu: Migrate to 3-phase reset
  hw/virtio/virtio-iommu: Migrate to 3-phase reset
  vhost-user-snd: correct the calculation of config_size
  net: vhost-user: add QAPI events to report connection state
  hw/virtio/virtio-nsm: Respond with correct length
  vdpa: Fix endian bugs in shadow virtqueue
  MAINTAINERS: add more files to `vhost`
  cryptodev/vhost: allocate CryptoDevBackendVhost using g_mem0()
  vhost-iova-tree: Update documentation
  vhost-iova-tree, svq: Implement GPA->IOVA & partial IOVA->HVA trees
  vhost-iova-tree: Implement an IOVA-only tree
  amd_iommu: Use correct bitmask to set capability BAR
  amd_iommu: Use correct DTE field for interrupt passthrough
  hw/virtio: reset virtio balloon stats on machine reset
  mem/cxl_type3: support 3, 6, 12 and 16 interleave ways
  hw/mem/cxl_type3: Ensure errp is set on realization failure
  hw/mem/cxl_type3: Fix special_ops memory leak on msix_init_exclusive_bar() failure
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6 weeks agodocs/devel/reset: Document reset expectations for DMA and IOMMU
Eric Auger [Tue, 18 Feb 2025 18:25:35 +0000 (19:25 +0100)]
docs/devel/reset: Document reset expectations for DMA and IOMMU

To avoid any translation faults, the IOMMUs are expected to be
reset after the devices they protect. Document that we expect
DMA requests to be stopped during the 'enter' or 'hold' phase
while IOMMUs should be reset during the 'exit' phase.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20250218182737.76722-6-eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/vfio/common: Add a trace point in vfio_reset_handler
Eric Auger [Tue, 18 Feb 2025 18:25:34 +0000 (19:25 +0100)]
hw/vfio/common: Add a trace point in vfio_reset_handler

To ease the debug of reset sequence, let's add a trace point
in vfio_reset_handler()

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20250218182737.76722-5-eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/arm/smmuv3: Move reset to exit phase
Eric Auger [Tue, 18 Feb 2025 18:25:33 +0000 (19:25 +0100)]
hw/arm/smmuv3: Move reset to exit phase

Currently the iommu may be reset before the devices
it protects. For example this happens with virtio-scsi-pci.
when system_reset is issued from qmp monitor: spurious
"virtio: zero sized buffers are not allowed" warnings can
be observed. This happens because outstanding DMA requests
are still happening while the SMMU gets reset.

This can also happen with VFIO devices. In that case
spurious DMA translation faults can be observed on host.

Make sure the SMMU is reset in the 'exit' phase after
all DMA capable devices have been reset during the 'enter'
or 'hold' phase.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20250218182737.76722-4-eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/i386/intel-iommu: Migrate to 3-phase reset
Eric Auger [Tue, 18 Feb 2025 18:25:32 +0000 (19:25 +0100)]
hw/i386/intel-iommu: Migrate to 3-phase reset

Currently the IOMMU may be reset before the devices
it protects. For example this happens with virtio devices
but also with VFIO devices. In this latter case this
produces spurious translation faults on host.

Let's use 3-phase reset mechanism and reset the IOMMU on
exit phase after all DMA capable devices have been reset
on 'enter' or 'hold' phase.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Zhenzhong Duan <zhenzhong.duan@intel.com>

Message-Id: <20250218182737.76722-3-eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/virtio/virtio-iommu: Migrate to 3-phase reset
Eric Auger [Tue, 18 Feb 2025 18:25:31 +0000 (19:25 +0100)]
hw/virtio/virtio-iommu: Migrate to 3-phase reset

Currently the iommu may be reset before the devices
it protects. For example this happens with virtio-net.

Let's use 3-phase reset mechanism and reset the IOMMU on
exit phase after all DMA capable devices have been
reset during the 'enter' or 'hold' phase.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20250218182737.76722-2-eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agovhost-user-snd: correct the calculation of config_size
Matias Ezequiel Vara Larsen [Mon, 17 Feb 2025 13:12:55 +0000 (14:12 +0100)]
vhost-user-snd: correct the calculation of config_size

Use virtio_get_config_size() rather than sizeof(struct
virtio_snd_config) for the config_size in the vhost-user-snd frontend.
The frontend shall rely on device features for the size of the device
configuration space. The presence of `controls` in the config space
depends on VIRTIO_SND_F_CTLS according to the specification (v1.3):
`
5.14.4 Device Configuration Layout
...

controls
(driver-read-only) indicates a total number of all available control
elements if VIRTIO_SND_F_CTLS has been negotiated.
`
This fixes an issue introduced by commit ab0c7fb2 ("linux-headers:
update to current kvm/next") in which the optional field `controls` is
added to the virtio_snd_config structure. This breaks vhost-user-device
backends that do not implement the `controls` field.

Fixes: ab0c7fb22b ("linux-headers: update to current kvm/next")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2805
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Matias Ezequiel Vara Larsen <mvaralar@redhat.com>
Message-Id: <20250217131255.829892-1-mvaralar@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Dorinda Bassey <dbassey@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agonet: vhost-user: add QAPI events to report connection state
Laurent Vivier [Mon, 17 Feb 2025 09:25:50 +0000 (10:25 +0100)]
net: vhost-user: add QAPI events to report connection state

The netdev reports NETDEV_VHOST_USER_CONNECTED event when
the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
when it is disconnected.

The NETDEV_VHOST_USER_CONNECTED event includes the chardev id.

This allows a system manager like libvirt to detect when the server
fails.

For instance with passt:

{ 'execute': 'qmp_capabilities' }
{ "return": { } }

[killing passt here]

{ "timestamp": { "seconds": 1739538634, "microseconds": 920450 },
  "event": "NETDEV_VHOST_USER_DISCONNECTED",
  "data": { "netdev-id": "netdev0" } }

[automatic reconnection with reconnect-ms]

{ "timestamp": { "seconds": 1739538638, "microseconds": 354181 },
  "event": "NETDEV_VHOST_USER_CONNECTED",
  "data": { "netdev-id": "netdev0", "chardev-id": "chr0" } }

Tested-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20250217092550.1172055-1-lvivier@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/virtio/virtio-nsm: Respond with correct length
Alexander Graf [Thu, 13 Feb 2025 11:45:41 +0000 (11:45 +0000)]
hw/virtio/virtio-nsm: Respond with correct length

When we return a response packet from NSM, we need to indicate its
length according to the content of the response. Prior to this patch, we
returned the length of the source buffer, which may confuse guest code
that relies on the response size.

Fix it by returning the response payload size instead.

Fixes: bb154e3e0cc715 ("device/virtio-nsm: Support for Nitro Secure Module device")
Reported-by: Vikrant Garg <vikrant1garg@gmail.com>
Signed-off-by: Alexander Graf <graf@amazon.com>
Message-Id: <20250213114541.67515-1-graf@amazon.com>
Reviewed-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Fixes: bb154e3e0cc715 (&quot;device/virtio-nsm: Support for Nitro Secure Module device&quot;)<br>
Reported-by: Vikrant Garg <vikrant1garg@gmail.com>
Signed-off-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Vikrant Garg <vikrant1garg@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agovdpa: Fix endian bugs in shadow virtqueue
Konstantin Shkolnyy [Wed, 12 Feb 2025 16:49:23 +0000 (10:49 -0600)]
vdpa: Fix endian bugs in shadow virtqueue

VDPA didn't work on a big-endian machine due to missing/incorrect
CPU<->LE data format conversions.

Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com>
Message-Id: <20250212164923.1971538-1-kshk@linux.ibm.com>
Fixes: 10857ec0ad ("vhost: Add VhostShadowVirtqueue")
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agoMAINTAINERS: add more files to `vhost`
Stefano Garzarella [Tue, 11 Feb 2025 14:42:57 +0000 (15:42 +0100)]
MAINTAINERS: add more files to `vhost`

While sending a patch for backends/cryptodev-vhost.c I noticed that
Michael wasn`t in CC so I took a look at the files listed under `vhost`
and tried to fix it increasing the coverage by adding new files.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20250211144259.117910-1-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agocryptodev/vhost: allocate CryptoDevBackendVhost using g_mem0()
Stefano Garzarella [Tue, 11 Feb 2025 13:55:23 +0000 (14:55 +0100)]
cryptodev/vhost: allocate CryptoDevBackendVhost using g_mem0()

The function `vhost_dev_init()` expects the `struct vhost_dev`
(passed as a parameter) to be fully initialized. This is important
because some parts of the code check whether `vhost_dev->config_ops`
is NULL to determine if it has been set (e.g. later via
`vhost_dev_set_config_notifier`).

To ensure this initialization, it’s better to allocate the entire
`CryptoDevBackendVhost` structure (which includes `vhost_dev`) using
`g_mem0()`, following the same approach used for other vhost devices,
such as in `vhost_net_init()`.

Fixes: 042cea274c ("cryptodev: add vhost-user as a new cryptodev backend")
Cc: qemu-stable@nongnu.org
Reported-by: myluo24@m.fudan.edu.cn
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20250211135523.101203-1-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agovhost-iova-tree: Update documentation
Jonah Palmer [Mon, 17 Feb 2025 14:49:34 +0000 (09:49 -0500)]
vhost-iova-tree: Update documentation

Minor update to some of the documentation / comments in
hw/virtio/vhost-iova-tree.c.

Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20250217144936.3589907-4-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agovhost-iova-tree, svq: Implement GPA->IOVA & partial IOVA->HVA trees
Jonah Palmer [Mon, 17 Feb 2025 14:49:33 +0000 (09:49 -0500)]
vhost-iova-tree, svq: Implement GPA->IOVA & partial IOVA->HVA trees

Creates and supports a GPA->IOVA tree and a partial IOVA->HVA tree by
splitting up guest-backed memory maps and host-only memory maps from the
full IOVA->HVA tree. That is, any guest-backed memory maps are now
stored in the GPA->IOVA tree and host-only memory maps stay in the
IOVA->HVA tree.

Also propagates the GPAs (in_addr/out_addr) of a VirtQueueElement to
vhost_svq_translate_addr() to translate GPAs to IOVAs via the GPA->IOVA
tree (when descriptors are backed by guest memory). For descriptors
backed by host-only memory, the existing partial SVQ IOVA->HVA tree is
used.

GPAs are unique in the guest's address space, ensuring unambiguous IOVA
translations. This avoids the issue where different GPAs map to the same
HVA, causing the original HVA->IOVA translation to potentially return an
IOVA associated with the wrong intended GPA.

Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20250217144936.3589907-3-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agovhost-iova-tree: Implement an IOVA-only tree
Jonah Palmer [Mon, 17 Feb 2025 14:49:32 +0000 (09:49 -0500)]
vhost-iova-tree: Implement an IOVA-only tree

Creates and supports an IOVA-only tree to support a SVQ IOVA->HVA and
GPA->IOVA tree for host-only and guest-backed memory, respectively, in
the next patch.

The IOVA allocator still allocates an IOVA range but now adds this range
to the IOVA-only tree as well as to the full IOVA->HVA tree.

In the next patch, the full IOVA->HVA tree will be split into a partial
SVQ IOVA->HVA tree and a GPA->IOVA tree. The motivation behind having an
IOVA-only tree was to have a single tree that would keep track of all
allocated IOVA ranges between the partial SVQ IOVA->HVA and GPA->IOVA
trees.

Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20250217144936.3589907-2-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agoamd_iommu: Use correct bitmask to set capability BAR
Sairaj Kodilkar [Fri, 7 Feb 2025 04:53:54 +0000 (10:23 +0530)]
amd_iommu: Use correct bitmask to set capability BAR

AMD IOMMU provides the base address of control registers through
IVRS table and PCI capability. Since this base address is of 64 bit,
use 32 bits mask (instead of 16 bits) to set BAR low and high.

Fixes: d29a09ca68 ("hw/i386: Introduce AMD IOMMU")
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Message-Id: <20250207045354.27329-3-sarunkod@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agoamd_iommu: Use correct DTE field for interrupt passthrough
Sairaj Kodilkar [Fri, 7 Feb 2025 04:53:53 +0000 (10:23 +0530)]
amd_iommu: Use correct DTE field for interrupt passthrough

Interrupt passthrough is determine by the bits 191,190,187-184.
These bits are part of the 3rd quad word (i.e. index 2) in DTE. Hence
replace dte[3] by dte[2].

Fixes: b44159fe0 ("x86_iommu/amd: Add interrupt remap support when VAPIC is not enabled")
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Message-Id: <20250207045354.27329-2-sarunkod@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/virtio: reset virtio balloon stats on machine reset
Daniel P. Berrangé [Tue, 4 Feb 2025 09:42:02 +0000 (09:42 +0000)]
hw/virtio: reset virtio balloon stats on machine reset

When a machine is first booted, all virtio balloon stats are initialized
to their default value -1 (18446744073709551615 when represented as
unsigned).

They remain that way while the firmware is loading, and early phase of
guest OS boot, until the virtio-balloon driver is activated. Thereafter
the reported stats reflect the guest OS activity.

When a machine reset is performed, however, the virtio-balloon stats are
left unchanged by QEMU, despite the guest OS no longer updating them,
nor indeed even still existing.

IOW, the mgmt app keeps getting stale stats until the guest OS starts
once more and loads the virtio-balloon driver (if ever). At that point
the app will see a discontinuity in the reported values as they sudden
jump from the stale value to the new value. This jump is indigituishable
from a valid data update.

While there is an "last-updated" field to report on the freshness of
the stats, that does not unambiguously tell the mgmt app whether the
stats are still conceptually relevant to the current running workload.

It is more conceptually useful to reset the stats to their default
values on machine reset, given that the previous guest workload the
stats reflect no longer exists. The mgmt app can now clearly identify
that there are is no stats information available from the current
executing workload.

The 'last-updated' time is also reset back to 0.

IOW, on every machine reset, the virtio stats are in the same clean
state they were when the macine first powered on.

A functional test is added to validate this behaviour with a real
world guest OS.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20250204094202.2183262-1-berrange@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agomem/cxl_type3: support 3, 6, 12 and 16 interleave ways
Yao Xingtao [Mon, 3 Feb 2025 16:19:08 +0000 (16:19 +0000)]
mem/cxl_type3: support 3, 6, 12 and 16 interleave ways

Since the kernel does not check the interleave capability, a
3-way, 6-way, 12-way or 16-way region can be create normally.

Applications can access the memory of 16-way region normally because
qemu can convert hpa to dpa correctly for the power of 2 interleave
ways, after kernel implementing the check, this kind of region will
not be created any more.

For non power of 2 interleave ways, applications could not access the
memory normally and may occur some unexpected behaviors, such as
segmentation fault.

So implements this feature is needed.

Link: https://lore.kernel.org/linux-cxl/3e84b919-7631-d1db-3e1d-33000f3f3868@fujitsu.com/
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20250203161908.145406-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/mem/cxl_type3: Ensure errp is set on realization failure
Li Zhijian [Mon, 3 Feb 2025 16:19:07 +0000 (16:19 +0000)]
hw/mem/cxl_type3: Ensure errp is set on realization failure

Simply pass the errp to its callee which will set errp if needed, to
enhance error reporting for CXL Type 3 device initialization by setting
the errp when realization functions fail.

Previously, failing to set `errp` could result in errors being overlooked,
causing the system to mistakenly treat failure scenarios as successful and
potentially leading to redundant cleanup operations in ct3_exit().

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20250203161908.145406-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/mem/cxl_type3: Fix special_ops memory leak on msix_init_exclusive_bar() failure
Li Zhijian [Mon, 3 Feb 2025 16:19:06 +0000 (16:19 +0000)]
hw/mem/cxl_type3: Fix special_ops memory leak on msix_init_exclusive_bar() failure

Address a memory leak issue by ensuring `regs->special_ops` is freed when
`msix_init_exclusive_bar()` encounters an error during CXL Type3 device
initialization.

Additionally, this patch renames err_address_space_free to err_msix_uninit
for better clarity and logical flow

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20250203161908.145406-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/mem/cxl_type3: Add paired msix_uninit_exclusive_bar() call
Li Zhijian [Mon, 3 Feb 2025 16:19:05 +0000 (16:19 +0000)]
hw/mem/cxl_type3: Add paired msix_uninit_exclusive_bar() call

msix_uninit_exclusive_bar() should be paired with msix_init_exclusive_bar()

Ensure proper resource cleanup by adding the missing
`msix_uninit_exclusive_bar()` call for the Type3 CXL device.

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20250203161908.145406-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/cxl: Introduce CXL_T3_MSIX_VECTOR enumeration
Li Zhijian [Mon, 3 Feb 2025 16:19:04 +0000 (16:19 +0000)]
hw/cxl: Introduce CXL_T3_MSIX_VECTOR enumeration

Introduce the `CXL_T3_MSIX_VECTOR` enumeration to specify MSIX vector
assignments specific to the Type 3 (T3) CXL device.

The primary goal of this change is to encapsulate the MSIX vector uses
that are unique to the T3 device within an enumeration, improving code
readability and maintenance by avoiding magic numbers. This organizational
change allows for more explicit references to each vector’s role, thereby
reducing the potential for misconfiguration.

It also modified `mailbox_reg_init_common` to accept the `msi_n` parameter,
reflecting the new MSIX vector setup.

This pertains to the T3 device privately; other endpoints should refrain from
using it, despite its public accessibility to all of them.

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20250203161908.145406-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agotests/qtest/vhost-user-test: Use modern virtio for vhost-user tests
Thomas Huth [Mon, 3 Feb 2025 12:43:46 +0000 (13:43 +0100)]
tests/qtest/vhost-user-test: Use modern virtio for vhost-user tests

All other vhost-user tests here use modern virtio, too, so let's
adjust the vhost-user-net test accordingly.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250203124346.169607-1-thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/i386/microvm: Fix crash that occurs when introspecting the microvm machine
Thomas Huth [Thu, 23 Jan 2025 20:47:08 +0000 (21:47 +0100)]
hw/i386/microvm: Fix crash that occurs when introspecting the microvm machine

QEMU currently crashes when you try to inspect the properties of the
microvm machine:

 $ echo '{ "execute": "qmp_capabilities" }
         { "execute": "qom-list-properties","arguments":
           { "typename": "microvm-machine"}}' | \
   ./qemu-system-x86_64 -qmp stdio
 {"QMP": {"version": {"qemu": {"micro": 50, "minor": 2, "major": 9},
  "package": "v9.2.0-1072-g60af367187-dirty"}, "capabilities": ["oob"]}}
 {"return": {}}
 qemu-system-x86_64: ../qemu/hw/i386/acpi-microvm.c:250:
  void acpi_setup_microvm(MicrovmMachineState *):
   Assertion `x86ms->fw_cfg' failed.
 Aborted (core dumped)

This happens because the microvm machine adds a machine_done (and a
powerdown_req) notifier in their instance_init function - however, the
instance_init of machines are not only called for machines that are
realized, but also for machines that are introspected, so in this case
the listener is added for a microvm machine that is never realized. And
since there is already a running machine, the listener function is
triggered immediately, causing a crash since it was not for the right
machine it was meant for.

Such listener functions must never be installed from an instance_init
function. Let's do it from microvm_machine_state_init() instead - this
function is the MachineClass->init() function instead, i.e. guaranteed
to be only called once in the lifetime of a QEMU process.

Since the microvm_machine_done() and microvm_powerdown_req() were
defined quite late in the microvm.c file, we have to move them now
also earlier, so that we can get their function pointers from
microvm_machine_state_init() without having to introduce a separate
prototype for those functions earlier.

Reviewed-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250123204708.1560305-1-thuth@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/i386/pc: Fix crash that occurs when introspecting TYPE_PC_MACHINE machines
Thomas Huth [Fri, 17 Jan 2025 19:21:06 +0000 (20:21 +0100)]
hw/i386/pc: Fix crash that occurs when introspecting TYPE_PC_MACHINE machines

QEMU currently crashes when you try to inspect the machines based on
TYPE_PC_MACHINE for their properties:

 $ echo '{ "execute": "qmp_capabilities" }
         { "execute": "qom-list-properties","arguments":
                      { "typename": "pc-q35-10.0-machine"}}' \
   | ./qemu-system-x86_64 -M pc -qmp stdio
 {"QMP": {"version": {"qemu": {"micro": 50, "minor": 2, "major": 9},
  "package": "v9.2.0-1070-g87e115c122-dirty"}, "capabilities": ["oob"]}}
 {"return": {}}
 Segmentation fault (core dumped)

This happens because TYPE_PC_MACHINE machines add a machine_init-
done_notifier in their instance_init function - but instance_init
of machines are not only called for machines that are realized,
but also for machines that are introspected, so in this case the
listener is added for a q35 machine that is never realized. But
since there is already a running pc machine, the listener function
is triggered immediately, causing a crash since it was not for the
right machine it was meant for.

Such listener functions must never be installed from an instance_init
function. Let's do it from pc_basic_device_init() instead - this
function is called from the MachineClass->init() function instead,
i.e. guaranteed to be only called once in the lifetime of a QEMU
process.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2779
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250117192106.471029-1-thuth@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/pci: Assert a bar is not registered multiple times
Nicholas Piggin [Fri, 17 Jan 2025 17:28:41 +0000 (03:28 +1000)]
hw/pci: Assert a bar is not registered multiple times

Nothing should be doing this, but it doesn't get caught by
pci_register_bar(). Add an assertion to prevent misuse.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20250117172842.406338-3-npiggin@gmail.com>
Reviewed-by: Phil Dennis-Jordan <phil@philjordan.eu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/pci/msix: Warn on PBA writes
Nicholas Piggin [Fri, 17 Jan 2025 17:28:40 +0000 (03:28 +1000)]
hw/pci/msix: Warn on PBA writes

Of the MSI-X PBA pending bits, the PCI Local Bus Specification says:

  Software should never write, and should only read
  Pending Bits. If software writes to Pending Bits, the
  result is undefined.

Log a GUEST_ERROR message if the PBA is written to by software.

Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Cc: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20250117172842.406338-2-npiggin@gmail.com>
Reviewed-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agoqtest/libqos/pci: Do not write to PBA memory
Nicholas Piggin [Fri, 17 Jan 2025 17:22:40 +0000 (03:22 +1000)]
qtest/libqos/pci: Do not write to PBA memory

The PCI Local Bus Specification says the result of writes to MSI-X
PBA memory is undefined. QEMU implements them as no-ops, so remove
the pointless write from qpci_msix_pending().

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20250117172244.406206-2-npiggin@gmail.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agopcie_sriov: Register VFs after migration
Akihiko Odaki [Thu, 16 Jan 2025 09:01:02 +0000 (18:01 +0900)]
pcie_sriov: Register VFs after migration

pcie_sriov doesn't have code to restore its state after migration, but
igb, which uses pcie_sriov, naively claimed its migration capability.

Add code to register VFs after migration and fix igb migration.

Fixes: 3a977deebe6b ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250116-reuse-v20-11-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agopcie_sriov: Remove num_vfs from PCIESriovPF
Akihiko Odaki [Thu, 16 Jan 2025 09:01:01 +0000 (18:01 +0900)]
pcie_sriov: Remove num_vfs from PCIESriovPF

num_vfs is not migrated so use PCI_SRIOV_CTRL_VFE and PCI_SRIOV_NUM_VF
instead.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250116-reuse-v20-10-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agopcie_sriov: Release VFs failed to realize
Akihiko Odaki [Thu, 16 Jan 2025 09:01:00 +0000 (18:01 +0900)]
pcie_sriov: Release VFs failed to realize

Release VFs failed to realize just as we do in unregister_vfs().

Fixes: 7c0fa8dff811 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250116-reuse-v20-9-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agopcie_sriov: Reuse SR-IOV VF device instances
Akihiko Odaki [Thu, 16 Jan 2025 09:00:59 +0000 (18:00 +0900)]
pcie_sriov: Reuse SR-IOV VF device instances

Disable SR-IOV VF devices by reusing code to power down PCI devices
instead of removing them when the guest requests to disable VFs. This
allows to realize devices and report VF realization errors at PF
realization time.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250116-reuse-v20-8-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agopcie_sriov: Ensure VF addr does not overflow
Akihiko Odaki [Thu, 16 Jan 2025 09:00:58 +0000 (18:00 +0900)]
pcie_sriov: Ensure VF addr does not overflow

pci_new() aborts when creating a VF with addr >= PCI_DEVFN_MAX.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250116-reuse-v20-7-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agopcie_sriov: Do not manually unrealize
Akihiko Odaki [Thu, 16 Jan 2025 09:00:57 +0000 (18:00 +0900)]
pcie_sriov: Do not manually unrealize

A device gets automatically unrealized when being unparented.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250116-reuse-v20-6-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6 weeks agos390x/pci: Check for multifunction after device realization
Akihiko Odaki [Thu, 16 Jan 2025 09:00:56 +0000 (18:00 +0900)]
s390x/pci: Check for multifunction after device realization

The SR-IOV PFs set the multifunction bit during device realization so
check them after that. There is no functional change because we
explicitly ignore the multifunction bit for SR-IOV devices.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250116-reuse-v20-5-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agos390x/pci: Allow plugging SR-IOV devices
Akihiko Odaki [Thu, 16 Jan 2025 09:00:55 +0000 (18:00 +0900)]
s390x/pci: Allow plugging SR-IOV devices

The guest cannot use VFs due to the lack of multifunction support but
can use PFs.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250116-reuse-v20-4-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agos390x/pci: Avoid creating zpci for VFs
Akihiko Odaki [Thu, 16 Jan 2025 09:00:54 +0000 (18:00 +0900)]
s390x/pci: Avoid creating zpci for VFs

VFs are automatically created by PF, and creating zpci for them will
result in unexpected usage of fids. Currently QEMU does not support
multifunction for s390x so we don't need zpci for VFs anyway.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250116-reuse-v20-3-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/ppc/spapr_pci: Do not reject VFs created after a PF
Akihiko Odaki [Thu, 16 Jan 2025 09:00:53 +0000 (18:00 +0900)]
hw/ppc/spapr_pci: Do not reject VFs created after a PF

A PF may automatically create VFs and the PF may be function 0.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Message-Id: <20250116-reuse-v20-2-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/ppc/spapr_pci: Do not create DT for disabled PCI device
Akihiko Odaki [Thu, 16 Jan 2025 09:00:52 +0000 (18:00 +0900)]
hw/ppc/spapr_pci: Do not create DT for disabled PCI device

Disabled means it is a disabled SR-IOV VF and hidden from the guest.
Do not create DT when starting the system and also keep the disabled PCI
device not linked to DRC, which generates DT in case of hotplug.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Shivaprasad G Bhat<sbhat@linux.ibm.com>
Tested-by: Shivaprasad G Bhat<sbhat@linux.ibm.com>
Message-Id: <20250116-reuse-v20-1-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agohw/net: Fix NULL dereference with software RSS
Akihiko Odaki [Thu, 16 Jan 2025 07:30:43 +0000 (16:30 +0900)]
hw/net: Fix NULL dereference with software RSS

When an eBPF program cannot be attached, virtio_net_load_ebpf() returns
false, and virtio_net_device_realize() enters the code path to handle
errors because of this, but it causes NULL dereference because no error
is generated.

Change virtio_net_load_ebpf() to return false only when a fatal error
occurred.

Fixes: b5900dff14e5 ("hw/net: report errors from failing to use eBPF RSS FDs")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250116-software-v1-1-9e5161b534d8@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agodocs/about: Change notes on x86 machine type deprecation into a general one
Thomas Huth [Thu, 16 Jan 2025 06:46:44 +0000 (07:46 +0100)]
docs/about: Change notes on x86 machine type deprecation into a general one

We now have a general note about versioned machine types getting
deprecated and removed at the beginning of the deprecated.rst file,
so we should also have a general note about this in removed-features.rst
(which will also apply to versioned non-x86 machine types) instead of
listing individual old machine types in the document.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250116064644.65670-1-thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 weeks agoMerge tag 'pull-target-arm-20250220' of https://git.linaro.org/people/pmaydell/qemu...
Stefan Hajnoczi [Thu, 20 Feb 2025 23:14:03 +0000 (07:14 +0800)]
Merge tag 'pull-target-arm-20250220' of https://git.linaro.org/people/pmaydell/qemu-arm into staging

target-arm queue:
 * Fix some incorrect syndrome values in various sysreg traps
 * Clean up sysreg trap code to avoid similar future bugs
 * Make boards/SoCs using a9mpcore and a15mpcore objects specify
   number of GIC interrupts explicitly
 * Kconfig: Extract CONFIG_USB_CHIPIDEA from CONFIG_IMX
 * target/arm: Use uint32_t in t32_expandimm_imm()
 * New board model: NPCM845 Evaluation board "npcm845-evb"

# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAme3Vk8ZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3t8QD/48yOUtFwUMFrwbbszK5rss
# Ghhtk0ylKIxPQirgzjLUNq4hV2MZEtdmyiqaEllvA0aS839oWhFW0hbsGmoL/TVF
# tqXVP3Y1dMubmNHCGcgiFqaxaInnNSC9S1ALiEm5a37g519706WLXPVLqJJ9t31b
# uWHKT1hqxstzWSExhGrEkSEghcgN3u1KyCz0zyq9bk/F3OFWZfHNH6JqutQX18Ua
# 5HtcD1Pum6WjayBc3y4AYVYH4xyQclY7LPR+zKNf2d5GuZ+J6MlXMyfCuE2/J//m
# wHAtAoeuFhi/HFHR4vQP4L7HrhFrECbjfWha85F/rmiOAo6LnbICyPt4tAPe5So3
# FCtSHfht9ToulBqULE+F/AWVCdt8UeDRgOANSHFsMkxYiUK8QpMv8A2AtwJUqiMp
# WbAzw31f6SgANgFQObhoRNE3QyX8V53ZJAsPhDooTxwMiglqVTM3Xux8W2zz9FdU
# BTwCy23efBqKf4RWfeHjAXctGshePI1mTBJmvEKG5G5ligMNeg7ZiQqqfRVBagc/
# gpsQKNjpN9MVVds3thUvMCYO/9NOeeAtcVA2vW7qf7HrYaM72UngCPWjhNfAj/9I
# 9hxgqEnKC6qoD/zMyFv+XwqNlL1PuD79rbvN8TWFd/f8iBIYOY6WVEYmsi7WGugI
# WzYI93RqFaQhrpyHDcRVGw==
# =djUd
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 21 Feb 2025 00:20:31 HKT
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg:                 aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* tag 'pull-target-arm-20250220' of https://git.linaro.org/people/pmaydell/qemu-arm: (41 commits)
  docs/system/arm: Add Description for NPCM8XX SoC
  hw/arm: Add NPCM845 Evaluation board
  hw/arm: Add NPCM8XX SoC
  hw/net: Add NPCM8XX PCS Module
  hw/misc: Support NPCM8XX CLK Module Registers
  hw/misc: Add nr_regs and cold_reset_values to NPCM CLK
  hw/misc: Move NPCM7XX CLK to NPCM CLK
  hw/misc: Rename npcm7xx_clk to npcm_clk
  hw/misc: Support 8-bytes memop in NPCM GCR module
  hw/misc: Store DRAM size in NPCM8XX GCR Module
  hw/misc: Add support for NPCM8XX GCR
  hw/misc: Add nr_regs and cold_reset_values to NPCM GCR
  hw/misc: Move NPCM7XX GCR to NPCM GCR
  hw/misc: Rename npcm7xx_gcr to npcm_gcr
  hw/ssi: Make flash size a property in NPCM7XX FIU
  pc-bios: Add NPCM8XX vBootrom
  roms: Update vbootrom to 1287b6e
  target/arm: Use uint32_t in t32_expandimm_imm()
  Kconfig: Extract CONFIG_USB_CHIPIDEA from CONFIG_IMX
  hw/cpu/arm_mpcore: Remove default values for GIC external IRQs
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6 weeks agodocs/system/arm: Add Description for NPCM8XX SoC
Hao Wu [Wed, 19 Feb 2025 18:46:08 +0000 (10:46 -0800)]
docs/system/arm: Add Description for NPCM8XX SoC

NPCM8XX SoC is the successor of the NPCM7XX. It features quad-core
Cortex-A35 (Armv8, 64-bit) CPUs and some additional peripherals.

This document describes the NPCM8XX SoC and an evaluation board
(NPCM 845 EVB).

Signed-off-by: Hao Wu <wuhaotsh@google.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20250219184609.1839281-18-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/arm: Add NPCM845 Evaluation board
Hao Wu [Wed, 19 Feb 2025 18:46:07 +0000 (10:46 -0800)]
hw/arm: Add NPCM845 Evaluation board

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Message-id: 20250219184609.1839281-17-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/arm: Add NPCM8XX SoC
Hao Wu [Wed, 19 Feb 2025 18:46:06 +0000 (10:46 -0800)]
hw/arm: Add NPCM8XX SoC

Signed-off-by: Hao Wu <wuhaotsh@google.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20250219184609.1839281-16-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/net: Add NPCM8XX PCS Module
Hao Wu [Wed, 19 Feb 2025 18:46:05 +0000 (10:46 -0800)]
hw/net: Add NPCM8XX PCS Module

The PCS exists in NPCM8XX's GMAC1 and is used to control the SGMII
PHY. This implementation contains all the default registers and
the soft reset feature that are required to load the Linux kernel
driver. Further features have not been implemented yet.

Signed-off-by: Hao Wu <wuhaotsh@google.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20250219184609.1839281-15-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/misc: Support NPCM8XX CLK Module Registers
Hao Wu [Wed, 19 Feb 2025 18:46:04 +0000 (10:46 -0800)]
hw/misc: Support NPCM8XX CLK Module Registers

NPCM8XX adds a few new registers and have a different set of reset
values to the CLK modules. This patch supports them.

This patch doesn't support the new clock values generated by these
registers. Currently no modules use these new clock values so they
are not necessary at this point.
Implementation of these clocks might be required when implementing
these modules.

Reviewed-by: Titus Rwantare <titusr@google.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Message-id: 20250219184609.1839281-14-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/misc: Add nr_regs and cold_reset_values to NPCM CLK
Hao Wu [Wed, 19 Feb 2025 18:46:03 +0000 (10:46 -0800)]
hw/misc: Add nr_regs and cold_reset_values to NPCM CLK

These 2 values are different between NPCM7XX and NPCM8XX
CLKs. So we add them to the class and assign different values
to them.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Message-id: 20250219184609.1839281-13-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/misc: Move NPCM7XX CLK to NPCM CLK
Hao Wu [Wed, 19 Feb 2025 18:46:02 +0000 (10:46 -0800)]
hw/misc: Move NPCM7XX CLK to NPCM CLK

A lot of NPCM7XX and NPCM8XX CLK modules share the same code,
this commit moves the NPCM7XX CLK to NPCM CLK for these
properties.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Message-id: 20250219184609.1839281-12-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/misc: Rename npcm7xx_clk to npcm_clk
Hao Wu [Wed, 19 Feb 2025 18:46:01 +0000 (10:46 -0800)]
hw/misc: Rename npcm7xx_clk to npcm_clk

NPCM7XX and NPCM8XX have a different set of CLK registers. This
commit changes the name of the clk files to be used by both
NPCM7XX and NPCM8XX CLK modules.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Message-id: 20250219184609.1839281-11-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/misc: Support 8-bytes memop in NPCM GCR module
Hao Wu [Wed, 19 Feb 2025 18:46:00 +0000 (10:46 -0800)]
hw/misc: Support 8-bytes memop in NPCM GCR module

The NPCM8xx GCR device can be accessed with 64-bit memory operations.
This patch supports that.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Reviewed-by: Philippe Mathieu-Daude <philmd@linaro.org>
Message-id: 20250219184609.1839281-10-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/misc: Store DRAM size in NPCM8XX GCR Module
Hao Wu [Wed, 19 Feb 2025 18:45:59 +0000 (10:45 -0800)]
hw/misc: Store DRAM size in NPCM8XX GCR Module

NPCM8XX boot block stores the DRAM size in SCRPAD_B register in GCR
module. Since we don't simulate a detailed memory controller, we
need to store this information directly similar to the NPCM7XX's
INCTR3 register.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Message-id: 20250219184609.1839281-9-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/misc: Add support for NPCM8XX GCR
Hao Wu [Wed, 19 Feb 2025 18:45:58 +0000 (10:45 -0800)]
hw/misc: Add support for NPCM8XX GCR

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Message-id: 20250219184609.1839281-8-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/misc: Add nr_regs and cold_reset_values to NPCM GCR
Hao Wu [Wed, 19 Feb 2025 18:45:57 +0000 (10:45 -0800)]
hw/misc: Add nr_regs and cold_reset_values to NPCM GCR

These 2 values are different between NPCM7XX and NPCM8XX
GCRs. So we add them to the class and assign different values
to them.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Message-id: 20250219184609.1839281-7-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/misc: Move NPCM7XX GCR to NPCM GCR
Hao Wu [Wed, 19 Feb 2025 18:45:56 +0000 (10:45 -0800)]
hw/misc: Move NPCM7XX GCR to NPCM GCR

A lot of NPCM7XX and NPCM8XX GCR modules share the same code,
this commit moves the NPCM7XX GCR to NPCM GCR for these
properties.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Message-id: 20250219184609.1839281-6-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/misc: Rename npcm7xx_gcr to npcm_gcr
Hao Wu [Wed, 19 Feb 2025 18:45:55 +0000 (10:45 -0800)]
hw/misc: Rename npcm7xx_gcr to npcm_gcr

NPCM7XX and NPCM8XX have a different set of GCRs and the GCR module
needs to fit both. This commit changes the name of the GCR module.
Future commits will add the support for NPCM8XX GCRs.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Message-id: 20250219184609.1839281-5-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/ssi: Make flash size a property in NPCM7XX FIU
Hao Wu [Wed, 19 Feb 2025 18:45:54 +0000 (10:45 -0800)]
hw/ssi: Make flash size a property in NPCM7XX FIU

This allows different FIUs to have different flash sizes, useful
in NPCM8XX which has multiple different sized FIU modules.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Reviewed-by: Philippe Mathieu-Daude <philmd@linaro.org>
Message-id: 20250219184609.1839281-4-wuhaotsh@google.com
[PMM: flash_size must be a uint64_t to build on 32-bit hosts]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agopc-bios: Add NPCM8XX vBootrom
Hao Wu [Wed, 19 Feb 2025 18:45:53 +0000 (10:45 -0800)]
pc-bios: Add NPCM8XX vBootrom

The bootrom is a minimal bootrom used to load an NPCM8XX image.
The source code is located in the same repo as the NPCM7XX one:
github.com/google/vbootrom/tree/master/npcm8xx.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Message-id: 20250219184609.1839281-3-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agoroms: Update vbootrom to 1287b6e
Hao Wu [Wed, 19 Feb 2025 18:45:52 +0000 (10:45 -0800)]
roms: Update vbootrom to 1287b6e

This newer vbootrom supports NPCM8xx. Similar to the NPCM7XX one
it supports loading the UBoot from the SPI device and not more.

We updated the npcm7xx bootrom to be compiled from this version.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hao Wu <wuhaotsh@google.com>
Message-id: 20250219184609.1839281-2-wuhaotsh@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agotarget/arm: Use uint32_t in t32_expandimm_imm()
Stephen Longfield [Wed, 19 Feb 2025 16:55:34 +0000 (16:55 +0000)]
target/arm: Use uint32_t in t32_expandimm_imm()

In t32_expandimm_imm(), we take an 8 bit value XY and construct a
32-bit value which might be of the form XY, 00XY00XY, XY00XY00, or
XYXYXYXY.  We do this with multiplications, and we use an 'int' type.
For the cases where we're setting the high byte of the 32-bit value
to XY, this means that we do an integer multiplication that might
overflow, and rely on the -fwrapv semantics to keep this from being
undefined behaviour.

It's clearer to use an unsigned type here, because we're really
doing operations on the value considered as a set of bits. The
result is the same.

The return value from the function remains 'int', because this
is a decodetree !function function, and follows the API for those
functions.

Signed-off-by: Stephen Longfield <slongfield@google.com>
Signed-off-by: Roque Arcudia Hernandez <roqueh@google.com>
Message-id: 20250219165534.3387376-1-slongfield@google.com
[PMM: Rewrote the commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agoKconfig: Extract CONFIG_USB_CHIPIDEA from CONFIG_IMX
Bernhard Beschow [Sun, 9 Feb 2025 10:36:04 +0000 (11:36 +0100)]
Kconfig: Extract CONFIG_USB_CHIPIDEA from CONFIG_IMX

TYPE_CHIPIDEA models an IP block which is also used in TYPE_ZYNQ_MACHINE which
itself is not an IMX device. CONFIG_ZYNQ selects CONFIG_USB_EHCI_SYSBUS while
TYPE_CHIPIDEA is a separate compilation unit, so only works by accident if
CONFIG_IMX is given. Fix that by extracting CONFIG_USB_CHIPIDEA from CONFIG_IMX.

cc: qemu-stable@nongnu.org
Fixes: 616ec12d0fcc "hw/arm/xilinx_zynq: Fix USB port instantiation"
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-id: 20250209103604.29545-1-shentey@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/cpu/arm_mpcore: Remove default values for GIC external IRQs
Philippe Mathieu-Daudé [Wed, 12 Feb 2025 15:43:33 +0000 (16:43 +0100)]
hw/cpu/arm_mpcore: Remove default values for GIC external IRQs

Implicit default values are often hard to figure out, better
be explicit. Now that all boards explicitly set the number of
GIC external IRQs, remove the default values (displaying an
error message if it is out of range).

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250212154333.28644-9-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/arm/highbank: Specify explicitly the GIC has 128 external IRQs
Philippe Mathieu-Daudé [Wed, 12 Feb 2025 15:43:32 +0000 (16:43 +0100)]
hw/arm/highbank: Specify explicitly the GIC has 128 external IRQs

When not specified, Cortex-A9MP configures its GIC with 64 external
IRQs, (see commit a32134aad89 "arm:make the number of GIC interrupts
configurable"), and Cortex-15MP to 128 (see commit  528622421eb
"hw/cpu/a15mpcore: Correct default value for num-irq").
The Caldexa Highbank board however expects a fixed set of 128
interrupts (see the fixed IRQ length when this board was added in
commit 2488514cef2 ("arm: SoC model for Calxeda Highbank"). Add the
GIC_EXT_IRQS definition (with a comment) to make that explicit.

Except explicitly setting a property value to its same implicit
value, there is no logical change intended.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250212154333.28644-8-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/arm/vexpress: Specify explicitly the GIC has 64 external IRQs
Philippe Mathieu-Daudé [Wed, 12 Feb 2025 15:43:31 +0000 (16:43 +0100)]
hw/arm/vexpress: Specify explicitly the GIC has 64 external IRQs

When not specified, Cortex-A9MP configures its GIC with 64 external
IRQs, (see commit a32134aad89 "arm:make the number of GIC interrupts
configurable"), and Cortex-15MP to 128 (see commit  528622421eb
"hw/cpu/a15mpcore: Correct default value for num-irq").
The Versatile Express board however expects a fixed set of 64
interrupts (see the fixed IRQ length when this board was added in
commit 2055283bcc8 ("hw/vexpress: Add model of ARM Versatile Express
board"). Add the GIC_EXT_IRQS definition (with a comment) to make
that explicit.

Except explicitly setting a property value to its same implicit
value, there is no logical change intended.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250212154333.28644-7-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/arm/xilinx_zynq: Specify explicitly the GIC has 64 external IRQs
Philippe Mathieu-Daudé [Wed, 12 Feb 2025 15:43:30 +0000 (16:43 +0100)]
hw/arm/xilinx_zynq: Specify explicitly the GIC has 64 external IRQs

Looking at the Zynq 7000 SoC Technical Reference Manual (UG585 v1.14)
on Appendix A: Register Details, the mpcore Interrupt Controller Type
Register (ICDICTR) has the IT_Lines_Number field read-only with value
0x2, described as:

  IT_Lines_Number

          b00010 = the distributor provides 96 interrupts,
                   64 external interrupt lines.

Add a GIC_EXT_IRQS definition (with a comment) to make the number of
GIC external IRQs explicit.

Except explicitly setting a property value to its same implicit
value, there is no logical change intended.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250212154333.28644-6-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/arm/xilinx_zynq: Replace IRQ_OFFSET -> GIC_INTERNAL
Philippe Mathieu-Daudé [Wed, 12 Feb 2025 15:43:29 +0000 (16:43 +0100)]
hw/arm/xilinx_zynq: Replace IRQ_OFFSET -> GIC_INTERNAL

We already have a definition to distinct GIC internal
IRQs versus external ones, use it. No logical changes.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250212154333.28644-5-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/arm/realview: Specify explicitly the GIC has 64 external IRQs
Philippe Mathieu-Daudé [Wed, 12 Feb 2025 15:43:28 +0000 (16:43 +0100)]
hw/arm/realview: Specify explicitly the GIC has 64 external IRQs

When not specified, Cortex-A9MP configures its GIC with 64 external
IRQs (see commit a32134aad89 "arm:make the number of GIC interrupts
configurable"). Add the GIC_EXT_IRQS definition (with a comment)
to make that explicit.

Except explicitly setting a property value to its same implicit
value, there is no logical change intended.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250212154333.28644-4-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/arm/exynos4210: Specify explicitly the GIC has 64 external IRQs
Philippe Mathieu-Daudé [Wed, 12 Feb 2025 15:43:27 +0000 (16:43 +0100)]
hw/arm/exynos4210: Specify explicitly the GIC has 64 external IRQs

When not specified, Cortex-A9MP configures its GIC with 64 external
IRQs (see commit a32134aad89 "arm:make the number of GIC interrupts
configurable"). Add the GIC_EXT_IRQS definition (with a comment)
to make that explicit.

Except explicitly setting a property value to its same implicit
value, there is no logical change intended.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250212154333.28644-3-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agohw/arm/exynos4210: Replace magic 32 by proper 'GIC_INTERNAL' definition
Philippe Mathieu-Daudé [Wed, 12 Feb 2025 15:43:26 +0000 (16:43 +0100)]
hw/arm/exynos4210: Replace magic 32 by proper 'GIC_INTERNAL' definition

The 32 IRQ lines skipped are the GIC internal ones.
Use the GIC_INTERNAL definition for clarity.
No logical change.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250212154333.28644-2-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 weeks agotarget/arm: Correct errors in WFI/WFE trapping
Peter Maydell [Thu, 30 Jan 2025 18:23:09 +0000 (18:23 +0000)]
target/arm: Correct errors in WFI/WFE trapping

The code for WFI/WFE trapping has several errors:
 * it wasn't using arm_sctlr(), so it would look at SCTLR_EL1
   even if the CPU was in the EL2&0 translation regime
 * it was raising UNDEF, not Monitor Trap, for traps to
   AArch32 EL3 because of SCR.{TWE,TWI}
 * it was not honouring SCR.{TWE,TWI} when running in
   AArch32 at EL3 not in Monitor mode
 * it checked SCR.{TWE,TWI} even on v7 CPUs which don't have
   those bits

Fix these bugs.

Cc: qemu-stable@nongnu.org
Fixes: b1eced713d99 ("target-arm: Add WFx instruction trap support")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-15-peter.maydell@linaro.org

6 weeks agotarget/arm: Rename CP_ACCESS_TRAP_UNCATEGORIZED to CP_ACCESS_UNDEFINED
Peter Maydell [Thu, 30 Jan 2025 18:23:08 +0000 (18:23 +0000)]
target/arm: Rename CP_ACCESS_TRAP_UNCATEGORIZED to CP_ACCESS_UNDEFINED

CP_ACCESS_TRAP_UNCATEGORIZED is technically an accurate description
of what this return value from a cpreg accessfn does, but it's liable
to confusion because it doesn't match how the Arm ARM pseudocode
indicates this case. What it does is an EXCP_UDEF with a zero
("uncategorized") syndrome value, which is what an UNDEFINED instruction
does. The pseudocode uses "UNDEFINED" to show this; rename our
constant to CP_ACCESS_UNDEFINED to make the parallel clearer.

Commit created with
sed -i -e 's/CP_ACCESS_TRAP_UNCATEGORIZED/CP_ACCESS_UNDEFINED/' $(git grep -l CP_ACCESS_TRAP_UNCATEGORIZED)

plus manual editing of the comment.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-14-peter.maydell@linaro.org

6 weeks agotarget/arm: Remove CP_ACCESS_TRAP handling
Peter Maydell [Thu, 30 Jan 2025 18:23:07 +0000 (18:23 +0000)]
target/arm: Remove CP_ACCESS_TRAP handling

There are no longer any uses of CP_ACCESS_TRAP in access functions,
because we have converted them all to use either CP_ACCESS_TRAP_EL1
or CP_ACCESS_TRAP_UNCATEGORIZED, as appropriate. Remove the handling
of bare CP_ACCESS_TRAP from the access_check_cp_reg() helper, so that
it now asserts if an access function returns a value requesting a
trap without a target EL.

Rename CP_ACCESS_TRAP to CP_ACCESS_TRAP_BIT, to make it clearer
that this is an internal-only definition, not something that
it makes sense to return from an access function. This should
help to avoid future bugs where we return the wrong syndrome
value by mistake.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-13-peter.maydell@linaro.org

6 weeks agotarget/arm: Use TRAP_UNCATEGORIZED for XScale CPAR traps
Peter Maydell [Thu, 30 Jan 2025 18:23:06 +0000 (18:23 +0000)]
target/arm: Use TRAP_UNCATEGORIZED for XScale CPAR traps

On XScale CPUs, there is no EL2 or AArch64, so no syndrome register.
These traps are just UNDEFs in the traditional AArch32 sense, so
CP_ACCESS_TRAP_UNCATEGORIZED is more accurate than CP_ACCESS_TRAP.
This has no visible behavioural change, because the guest doesn't
have a way to see the syndrome value we generate.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-12-peter.maydell@linaro.org

6 weeks agotarget/arm: Use CP_ACCESS_TRAP_EL1 for traps that are always to EL1
Peter Maydell [Thu, 30 Jan 2025 18:23:05 +0000 (18:23 +0000)]
target/arm: Use CP_ACCESS_TRAP_EL1 for traps that are always to EL1

We currently use CP_ACCESS_TRAP in a number of access functions where
we know we're currently at EL0; in this case the "usual target EL"
is EL1, so CP_ACCESS_TRAP and CP_ACCESS_TRAP_EL1 behave the same.
Use CP_ACCESS_TRAP_EL1 to more closely match the pseudocode for
this sort of check.

Note that in the case of the access functions foc cacheop to
PoC or PoU, the code was correct but the comment was wrong:
SCTLR_EL1.UCI traps for DC CVAC, DC CIVAC, DC CVAP, DC CVADP,
DC CVAU and IC IVAU should be system access traps, not UNDEFs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-11-peter.maydell@linaro.org

6 weeks agotarget/arm: Support CP_ACCESS_TRAP_EL1 as a CPAccessResult
Peter Maydell [Thu, 30 Jan 2025 18:23:04 +0000 (18:23 +0000)]
target/arm: Support CP_ACCESS_TRAP_EL1 as a CPAccessResult

In the CPAccessResult enum, the CP_ACCESS_TRAP* values indicate the
equivalent of the pseudocode AArch64.SystemAccessTrap(..., 0x18),
causing a trap to a specified exception level with a syndrome value
giving information about the failing instructions.  In the
pseudocode, such traps are always taken to a specified target EL.  We
support that for target EL of 2 or 3 via CP_ACCESS_TRAP_EL2 and
CP_ACCESS_TRAP_EL3, but the only way to take the access trap to EL1
currently is to use CP_ACCESS_TRAP, which takes the trap to the
"usual target EL" (EL1 if in EL0, otherwise to the current EL).

Add CP_ACCESS_TRAP_EL1 so that access functions can follow the
pseudocode more closely.

(Note that for the common case in the pseudocode of "trap to
EL2 if HCR_EL2.TGE is set, otherwise trap to EL1", we handle
this in raise_exception(), so access functions don't need to
special case it and can use CP_ACCESS_TRAP_EL1.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-10-peter.maydell@linaro.org

6 weeks agohw/intc/arm_gicv3_cpuif(): Remove redundant tests of is_a64()
Peter Maydell [Thu, 30 Jan 2025 18:23:03 +0000 (18:23 +0000)]
hw/intc/arm_gicv3_cpuif(): Remove redundant tests of is_a64()

In the gicv3_{irq,fiq,irqfiq}_access() functions, in the
arm_current_el(env) == 3 case we do the following test:
    if (!is_a64(env) && !arm_is_el3_or_mon(env)) {
        r = CP_ACCESS_TRAP_EL3;
    }

In this check, the "!is_a64(env)" is redundant, because if
we are at EL3 and in AArch64 then arm_is_el3_or_mon() will
return true and we will skip the if() body anyway.

Remove the unnecessary tests.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-9-peter.maydell@linaro.org

6 weeks agotarget/arm: Honour SDCR.TDCC and SCR.TERR in AArch32 EL3 non-Monitor modes
Peter Maydell [Thu, 30 Jan 2025 18:23:02 +0000 (18:23 +0000)]
target/arm: Honour SDCR.TDCC and SCR.TERR in AArch32 EL3 non-Monitor modes

There are not many traps in AArch32 which should trap to Monitor
mode, but these trap bits should trap not just lower ELs to Monitor
mode but also the non-Monitor modes running at EL3 (i.e.  Secure
System, Secure Undef, etc).

We get this wrong because the relevant access functions implement the
AArch64-style logic of
   if (el < 3 && trap_bit_set) {
       return CP_ACCESS_TRAP_EL3;
   }
which won't trap the non-Monitor modes at EL3.

Correct this error by using arm_is_el3_or_mon() instead, which
returns true when the CPU is at AArch64 EL3 or AArch32 Monitor mode.
(Since the new callsites are compiled also for the linux-user mode,
we need to provide a dummy implementation for CONFIG_USER_ONLY.)

This affects only:
 * trapping of ERRIDR via SCR.TERR
 * trapping of the debug channel registers via SDCR.TDCC
 * trapping of GICv3 registers via SCR.IRQ and SCR.FIQ
   (which we already used arm_is_el3_or_mon() for)

This patch changes the handling of SCR.TERR and SDCR.TDCC. This
patch only changes guest-visible behaviour for "-cpu max" on
the qemu-system-arm binary, because SCR.TERR
and SDCR.TDCC (and indeed the entire SDCR register) only arrived
in Armv8, and the only guest CPU we support which has any v8
features and also starts in AArch32 EL3 is the 32-bit 'max'.

Other uses of CP_ACCESS_TRAP_EL3 don't need changing:

 * uses in code paths that can't happen when EL3 is AArch32:
   access_trap_aa32s_el1, cpacr_access, cptr_access, nsacr_access
 * uses which are in accessfns for AArch64-only registers:
   gt_stimer_access, gt_cntpoff_access, access_hxen, access_tpidr2,
   access_smpri, access_smprimap, access_lor_ns, access_pauth,
   access_mte, access_tfsr_el2, access_scxtnum, access_fgt
 * trap bits which exist only in the AArch64 version of the
   trap register, not the AArch32 one:
   access_tpm, pmreg_access, access_dbgvcr32, access_tdra,
   access_tda, access_tdosa (TPM, TDA and TDOSA exist only in
   MDCR_EL3, not in SDCR, and we enforce this in sdcr_write())

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-8-peter.maydell@linaro.org

6 weeks agohw/intc/arm_gicv3_cpuif: Don't downgrade monitor traps for AArch32 EL3
Peter Maydell [Thu, 30 Jan 2025 18:23:01 +0000 (18:23 +0000)]
hw/intc/arm_gicv3_cpuif: Don't downgrade monitor traps for AArch32 EL3

In the gicv3_{irq,fiq,irqfiq}_access() functions, there is a check
which downgrades a CP_ACCESS_TRAP_EL3 to CP_ACCESS_TRAP if EL3 is not
AArch64.  This has been there since the GIC was first implemented,
but it isn't right: if we are trapping because of SCR.IRQ or SCR.FIQ
then we definitely want to be going to EL3 (doing
AArch32.TakeMonitorTrapException() in pseudocode terms).  We might
want to not take a trap at all, but we don't ever want to go to the
default target EL, because that would mean, for instance, taking a
trap to Hyp mode if the trapped access was made from Hyp mode.

(This might have been an attempt to work around our failure to
properly implement Monitor Traps.)

Remove the bogus check.

Cc: qemu-stable@nongnu.org
Fixes: 359fbe65e01e ("hw/intc/arm_gicv3: Implement GICv3 CPU interface registers")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-7-peter.maydell@linaro.org

6 weeks agotarget/arm: Make CP_ACCESS_TRAPs to AArch32 EL3 be Monitor traps
Peter Maydell [Thu, 30 Jan 2025 18:23:00 +0000 (18:23 +0000)]
target/arm: Make CP_ACCESS_TRAPs to AArch32 EL3 be Monitor traps

In system register access pseudocode the common pattern for
AArch32 registers with access traps to EL3 is:

at EL1 and EL2:
  if HaveEL(EL3) && !ELUsingAArch32(EL3) && (SCR_EL3.TERR == 1) then
     AArch64.AArch32SystemAccessTrap(EL3, 0x03);
  elsif HaveEL(EL3) && ELUsingAArch32(EL3) && (SCR.TERR == 1) then
     AArch32.TakeMonitorTrapException();
at EL3:
  if (PSTATE.M != M32_Monitor) && (SCR.TERR == 1) then
     AArch32.TakeMonitorTrapException();

(taking as an example the ERRIDR access pseudocode).

This implements the behaviour of (in this case) SCR.TERR that
"Accesses to the specified registers from modes other than Monitor
mode generate a Monitor Trap exception" and of SCR_EL3.TERR that
"Accesses of the specified Error Record registers at EL2 and EL1
are trapped to EL3, unless the instruction generates a higher
priority exception".

In QEMU we don't implement this pattern correctly in two ways:
 * in access_check_cp_reg() we turn the CP_ACCESS_TRAP_EL3 into
   an UNDEF, not a trap to Monitor mode
 * in the access functions, we check trap bits like SCR.TERR
   only when arm_current_el(env) < 3 -- this is correct for
   AArch64 EL3, but misses the "trap non-Monitor-mode execution
   at EL3 into Monitor mode" case for AArch32 EL3

In this commit we fix the first of these two issues, by
making access_check_cp_reg() handle CP_ACCESS_TRAP_EL3
as a Monitor trap. This is a kind of exception that we haven't
yet implemented(!), so we need a new EXCP_MON_TRAP for it.

This diverges from the pseudocode approach, where every access check
function explicitly checks for "if EL3 is AArch32" and takes a
monitor trap; if we wanted to be closer to the pseudocode we could
add a new CP_ACCESS_TRAP_MONITOR and make all the accessfns use it
when appropriate.  But because there are no non-standard cases in the
pseudocode (i.e.  where either it raises a Monitor trap that doesn't
correspond to an AArch64 SystemAccessTrap or where it raises a
SystemAccessTrap that doesn't correspond to a Monitor trap), handling
this all in one place seems less likely to result in future bugs
where we forgot again about this special case when writing an
accessor.

(The cc of stable here is because "hw/intc/arm_gicv3_cpuif: Don't
downgrade monitor traps for AArch32 EL3" which is also cc:stable
will implicitly use the new EXCP_MON_TRAP code path.)

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-6-peter.maydell@linaro.org

6 weeks agotarget/arm: Report correct syndrome for UNDEFINED LOR sysregs when NS=0
Peter Maydell [Thu, 30 Jan 2025 18:22:59 +0000 (18:22 +0000)]
target/arm: Report correct syndrome for UNDEFINED LOR sysregs when NS=0

The pseudocode for the accessors for the LOR sysregs says they
are UNDEFINED if SCR_EL3.NS is 0. We were reporting the wrong
syndrome value here; use CP_ACCESS_TRAP_UNCATEGORIZED.

Cc: qemu-stable@nongnu.org
Fixes: 2d7137c10faf ("target/arm: Implement the ARMv8.1-LOR extension")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-5-peter.maydell@linaro.org

6 weeks agotarget/arm: Report correct syndrome for UNDEFINED S1E2 AT ops at EL3
Peter Maydell [Thu, 30 Jan 2025 18:22:58 +0000 (18:22 +0000)]
target/arm: Report correct syndrome for UNDEFINED S1E2 AT ops at EL3

The pseudocode for AT S1E2R and AT S1E2W says that they should be
UNDEFINED if executed at EL3 when EL2 is not enabled. We were
incorrectly using CP_ACCESS_TRAP and reporting the wrong exception
syndrome as a result. Use CP_ACCESS_TRAP_UNCATEGORIZED.

Cc: qemu-stable@nongnu.org
Fixes: 2a47df953202e1 ("target-arm: Wire up AArch64 EL2 and EL3 address translation ops")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-4-peter.maydell@linaro.org