git.maquefel.me Git - qemu.git/log

hw/ppc/e500: Implement pflash handling

Allows e500 boards to have their root file system reside on flash using
only builtin devices located in the eLBC memory region.

Note that the flash memory area is only created when a -pflash argument is
given, and that the size is determined by the given file. The idea is to
put users into control.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221018210146.193159-6-shentey@gmail.com>
[danielhb: use memory_region_size() in mmio_size]
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

hw/sd/sdhci: Rename ESDHC_* defines to USDHC_*

The device model's functions start with "usdhc_", so rename the defines
accordingly for consistency.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20221018210146.193159-5-shentey@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

hw/sd/sdhci-internal: Unexport ESDHC defines

These defines aren't used outside of sdhci.c, so can be defined there.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20221018210146.193159-4-shentey@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

hw/block/pflash_cfi0{1, 2}: Error out if device length isn't a power of two

According to the JEDEC standard the device length is communicated to an
OS as an exponent (power of two).

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20221018210146.193159-3-shentey@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

docs/system/ppc/ppce500: Use qemu-system-ppc64 across the board(s)

The documentation suggests that there is a qemu-system-ppc32 binary
while the 32 bit version is actually just named qemu-system-ppc. Settle
on qemu-system-ppc64 which also works for 32 bit machines and causes
less clutter in the documentation.

Found-by: BALATON Zoltan <balaton@eik.bme.hu>
Suggested-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20221018210146.193159-2-shentey@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Increment PMC5 with inline insns

Profiling QEMU during Fedora 35 for PPC64 boot revealed that
6.39% of total time was being spent in helper_insns_inc(), on a
POWER9 machine. To avoid calling this helper every time PMCs had
to be incremented, an inline implementation of PMC5 increment and
check for overflow was developed. This led to a reduction of
about 12% in Fedora's boot time.

Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20221025202424.195984-4-leandro.lupori@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Add new PMC HFLAGS

Add 2 new PMC related HFLAGS:
- HFLAGS_PMCJCE - value of MMCR0 PMCjCE bit
- HFLAGS_PMC_OTHER - set if a PMC other than PMC5-6 is enabled

These flags allow further optimization of PMC5 update code, by
allowing frequently tested conditions to be performed at
translation time.

Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20221025202424.195984-3-leandro.lupori@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

ppc4xx_sdram: Add errp parameter to ppc4xx_sdram_banks()

Do not exit from ppc4xx_sdram_banks() but report error via an errp
parameter instead.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <04bb3445439c2f37b99e74b3fdf4e62c2e6f7e04.1666194485.git.balaton@eik.bme.hu>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

ppc4xx_sdram: Convert DDR SDRAM controller to new bank handling

Use the generic bank handling introduced in previous patch in the DDR
SDRAM controller too. This also fixes previously broken region unmap
due to sdram_ddr_unmap_bcr() ignoring container region so it crashed
with an assert when the guest tried to disable the controller.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <fc7c50e365d0027a659111e9cd67f9b93113a163.1666194485.git.balaton@eik.bme.hu>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

ppc4xx_sdram: Generalise bank setup

Currently only base and size are set on initial bank creation and bcr
value is computed on mapping the region. Set bcr at init so the bcr
encoding method becomes local to the controller model and mapping and
unmapping can operate on the bank so it can be shared between
different controller models. This patch converts the DDR2 controller.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <51b957b4b2d714a1072aa2589b979e08411640df.1666194485.git.balaton@eik.bme.hu>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

ppc4xx_sdram: Rename local state variable for brevity

Rename the sdram local state variable to s in dcr read/write functions
and reset methods for better readability and to match realize methods.
Other places not converted will be changed or removed in subsequent
patches.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <8e7539cb1fccd7556b68351c4dcf62534c3a69cf.1666194485.git.balaton@eik.bme.hu>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

ppc4xx_sdram: Use hwaddr for memory bank size

This resolves the target_ulong dependency that's clearly wrong and was
also noted in a fixme comment.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <92fdc5f9cc76bf45831428b3ec8d9fc6241b7190.1666194485.git.balaton@eik.bme.hu>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

ppc4xx_sdram: Move ppc4xx_sdram_banks() to ppc4xx_sdram.c

This function is only used by the ppc4xx memory controller models so
it can be made static.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <b1504a82157a586aa284e8ee3b427b9a07b24169.1666194485.git.balaton@eik.bme.hu>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

ppc4xx_devs.c: Move DDR SDRAM controller model to ppc4xx_sdram.c

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <3ea98072dbeb757942e25dcfcdd6a7a47738d2ca.1666194485.git.balaton@eik.bme.hu>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

ppc440_uc.c: Move DDR2 SDRAM controller model to ppc4xx_sdram.c

In order to move PPC4xx SDRAM controller models together move out the
DDR2 controller model from ppc440_uc.c into a new ppc4xx_sdram.c file.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <2f2900f93e997480e54b7bf9c32bb482a0fb1022.1666194485.git.balaton@eik.bme.hu>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: move the p*_interrupt_powersave methods to excp_helper.c

Move the methods to excp_helper.c and make them static.

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20221021142156.4134411-4-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: unify cpu->has_work based on cs->interrupt_request

Now that cs->interrupt_request indicates if there is any unmasked
interrupt, checking if the CPU has work to do can be simplified to a
single check that works for all CPU models.

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20221021142156.4134411-3-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: introduce ppc_maybe_interrupt

This new method will check if any pending interrupt was unmasked and
then call cpu_interrupt/cpu_reset_interrupt accordingly. Code that
raises/lowers or masks/unmasks interrupts should call this method to
keep CPU_INTERRUPT_HARD coherent with env->pending_interrupts.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20221021142156.4134411-2-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: remove ppc_store_lpcr from CONFIG_USER_ONLY builds

Writes to LPCR are hypervisor privileged.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-27-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: add power-saving interrupt masking logic to p7_next_unmasked_interrupt

Export p7_interrupt_powersave and use it in p7_next_unmasked_interrupt.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-26-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: move power-saving interrupt masking out of cpu_has_work_POWER7

Move the interrupt masking logic out of cpu_has_work_POWER7 in a new
method, p7_interrupt_powersave, that only returns an interrupt if it can
wake the processor from power-saving mode.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-25-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: remove generic architecture checks from p7_deliver_interrupt

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-24-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: remove unused interrupts from p7_deliver_interrupt

Remove the following unused interrupts from the POWER7 interrupt
processing method:
- PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p;
- Hypervisor Virtualization: introduced in Power ISA v3.0;
- Hypervisor Doorbell and Event-Based Branch: introduced in
  Power ISA v2.07;
- Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined
  for embedded CPUs;
- Doorbell and Critical Doorbell Interrupt: processor does not implement
  the Embedded.Processor Control category;
- Programmable Interval Timer: 40x-only;
- PPC_INTERRUPT_THERM: only raised for 970 and POWER5p;

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-23-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: create an interrupt deliver method for POWER7

The new method is identical to ppc_deliver_interrupt, processor-specific
code will be added/removed in the following patches.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-22-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: remove unused interrupts from p7_next_unmasked_interrupt

Remove the following unused interrupts from the POWER7 interrupt masking
method:
- PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p;
- Hypervisor Virtualization: introduced in Power ISA v3.0;
- Hypervisor Doorbell and Event-Based Branch: introduced in
  Power ISA v2.07;
- Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined
  for embedded CPUs;
- Doorbell and Critical Doorbell Interrupt: processor does not implement
  the Embedded.Processor Control category;
- Programmable Interval Timer: 40x-only;
- PPC_INTERRUPT_THERM: only raised for 970 and POWER5p;

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-21-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: create an interrupt masking method for POWER7

The new method is identical to ppc_next_unmasked_interrupt_generic,
processor-specific code will be added/removed in the following patches.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-20-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: add power-saving interrupt masking logic to p8_next_unmasked_interrupt

Export p8_interrupt_powersave and use it in p8_next_unmasked_interrupt.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-19-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: move power-saving interrupt masking out of cpu_has_work_POWER8

Move the interrupt masking logic out of cpu_has_work_POWER8 in a new
method, p8_interrupt_powersave, that only returns an interrupt if it can
wake the processor from power-saving mode.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-18-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: remove generic architecture checks from p8_deliver_interrupt

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-17-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: remove unused interrupts from p8_deliver_interrupt

Remove the following unused interrupts from the POWER8 interrupt
processing method:
- PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p;
- Debug Interrupt: removed in Power ISA v2.07;
- Hypervisor Virtualization: introduced in Power ISA v3.0;
- Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined
for embedded CPUs;
- Critical Doorbell: processor does not implement the
"Embedded.Processor Control" category;
- Programmable Interval Timer: 40x-only;
- PPC_INTERRUPT_THERM: only raised for 970 and POWER5p;

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-16-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: create an interrupt deliver method for POWER8

The new method is identical to ppc_deliver_interrupt, processor-specific
code will be added/removed in the following patches.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-15-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: remove unused interrupts from p8_next_unmasked_interrupt

Remove the following unused interrupts from the POWER8 interrupt masking
method:
- PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970, and POWER5p;
- Debug Interrupt: removed in Power ISA v2.07;
- Hypervisor Virtualization: introduced in Power ISA v3.0;
- Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined
for embedded CPUs;
- Critical Doorbell: processor does not implement the "Embedded.Processor
Control" category;
- Programmable Interval Timer: 40x-only;
- PPC_INTERRUPT_THERM: only raised for 970 and POWER5p;

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-14-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: create an interrupt masking method for POWER8

The new method is identical to ppc_next_unmasked_interrupt_generic,
processor-specific code will be added/removed in the following patches.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-13-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: add power-saving interrupt masking logic to p9_next_unmasked_interrupt

Export p9_interrupt_powersave and use it in p9_next_unmasked_interrupt.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-12-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: move power-saving interrupt masking out of cpu_has_work_POWER9

Move the interrupt masking logic out of cpu_has_work_POWER9 in a new
method, p9_interrupt_powersave, that only returns an interrupt if it can
wake the processor from power-saving mode.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-11-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: remove generic architecture checks from p9_deliver_interrupt

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-10-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: remove unused interrupts from p9_deliver_interrupt

Remove the following unused interrupts from the POWER9 interrupt
processing method:
- PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p;
- Debug Interrupt: removed in Power ISA v2.07;
- Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined
for embedded CPUs;
- Critical Doorbell Interrupt: removed in Power ISA v3.0;
- Programmable Interval Timer: 40x-only.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-9-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: create an interrupt deliver method for POWER9/POWER10

The new method is identical to ppc_deliver_interrupt, processor-specific
code will be added/removed in the following patches.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-8-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: remove unused interrupts from p9_next_unmasked_interrupt

Remove the following unused interrupts from the POWER9 interrupt masking
method:
- PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p;
- Debug Interrupt: removed in Power ISA v2.07;
- Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined
for embedded CPUs;
- Critical Doorbell Interrupt: removed in Power ISA v3.0;
- Programmable Interval Timer: 40x-only.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-7-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: create an interrupt masking method for POWER9/POWER10

The new method is identical to ppc_next_unmasked_interrupt_generic,
processor-specific code will be added/removed in the following patches.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-6-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: prepare to split interrupt masking and delivery by excp_model

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-5-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: split interrupt masking and delivery from ppc_hw_interrupt

Split ppc_hw_interrupt into an interrupt masking method,
ppc_next_unmasked_interrupt, and an interrupt processing method,
ppc_deliver_interrupt.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221011204829.1641124-4-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: always use ppc_set_irq to set env->pending_interrupts

Use ppc_set_irq to raise/clear interrupts to ensure CPU_INTERRUPT_HARD
will be set/reset accordingly.

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20221011204829.1641124-3-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: define PPC_INTERRUPT_* values directly

This enum defines the bit positions in env->pending_interrupts for each
interrupt. However, except for the comparison in kvmppc_set_interrupt,
the values are always used as (1 << PPC_INTERRUPT_*). Define them
directly like that to save some clutter. No functional change intended.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20221011204829.1641124-2-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Use gvec to decode XVTSTDC[DS]P

Used gvec to translate XVTSTDCSP and XVTSTDCDP.

xvtstdcsp:
rept    loop    imm     master version  prev version        current version
25      4000    0       0,206200        0,040730 (-80.2%)    0,040740 (-80.2%)
25      4000    1       0,205120        0,053650 (-73.8%)    0,053510 (-73.9%)
25      4000    3       0,206160        0,058630 (-71.6%)    0,058570 (-71.6%)
25      4000    51      0,217110        0,191490 (-11.8%)    0,192320 (-11.4%)
25      4000    127     0,206160        0,191490 (-7.1%)     0,192640 (-6.6%)
8000    12      0       1,234719        0,418833 (-66.1%)    0,386365 (-68.7%)
8000    12      1       1,232417        1,435979 (+16.5%)    1,462792 (+18.7%)
8000    12      3       1,232760        1,766073 (+43.3%)    1,743990 (+41.5%)
8000    12      51      1,239281        1,319562 (+6.5%)     1,423479 (+14.9%)
8000    12      127     1,231708        1,315760 (+6.8%)     1,426667 (+15.8%)

xvtstdcdp:
rept    loop    imm     master version  prev version    current version
25      4000    0       0,159930        0,040830 (-74.5%)    0,040610 (-74.6%)
25      4000    1       0,160640        0,053670 (-66.6%)    0,053480 (-66.7%)
25      4000    3       0,160020        0,063030 (-60.6%)    0,062960 (-60.7%)
25      4000    51      0,160410        0,128620 (-19.8%)    0,127470 (-20.5%)
25      4000    127     0,160330        0,127670 (-20.4%)    0,128690 (-19.7%)
8000    12      0       1,190365        0,422146 (-64.5%)    0,388417 (-67.4%)
8000    12      1       1,191292        1,445312 (+21.3%)    1,428698 (+19.9%)
8000    12      3       1,188687        1,980656 (+66.6%)    1,975354 (+66.2%)
8000    12      51      1,191250        1,264500 (+6.1%)     1,355083 (+13.8%)
8000    12      127     1,197313        1,266729 (+5.8%)     1,349156 (+12.7%)

Overall, these instructions are the hardest ones to measure performance
as the gvec implementation is affected by the immediate. Above there are
5 different scenarios when it comes to immediate and 2 when it comes to
rept/loop combination. The immediates scenarios are: all bits are 0
therefore the target register should just be changed to 0, with 1 bit
set, with 2 bits set in a combination the new implementation can deal
with using gvec, 4 bits set and the new implementation can't deal with
it using gvec and all bits set. The rept/loop scenarios are high loop
and low rept (so it should spend more time executing it than translating
it) and high rept low loop (so it should spend more time translating it
than executing this code).
These comparisons are between the upstream version, a previous similar
implementation and a one with a cleaner code(this one).
For a comparison with o previous different implementation:
<20221010191356.83659-13-lucas.araujo@eldorado.org.br>

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-13-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Moved XSTSTDC[QDS]P to decodetree

Moved XSTSTDCSP, XSTSTDCDP and XSTSTDCQP to decodetree and moved some of
its decoding away from the helper as previously the DCMX, XB and BF were
calculated in the helper with the help of cpu_env, now that part was
moved to the decodetree with the rest.

xvtstdcsp:
rept    loop    master             patch
8       12500   1,85393600         1,94683600 (+5.0%)
25      4000    1,78779800         1,92479000 (+7.7%)
100     1000    2,12775000         2,28895500 (+7.6%)
500     200     2,99655300         3,23102900 (+7.8%)
2500    40      6,89082200         7,44827500 (+8.1%)
8000    12     17,50585500        18,95152100 (+8.3%)

xvtstdcdp:
rept    loop    master             patch
8       12500   1,39043100         1,33539800 (-4.0%)
25      4000    1,35731800         1,37347800 (+1.2%)
100     1000    1,51514800         1,56053000 (+3.0%)
500     200     2,21014400         2,47906000 (+12.2%)
2500    40      5,39488200         6,68766700 (+24.0%)
8000    12     13,98623900        18,17661900 (+30.0%)

xvtstdcdp:
rept    loop    master             patch
8       12500   1,35123800         1,34455800 (-0.5%)
25      4000    1,36441200         1,36759600 (+0.2%)
100     1000    1,49763500         1,54138400 (+2.9%)
500     200     2,19020200         2,46196400 (+12.4%)
2500    40      5,39265700         6,68147900 (+23.9%)
8000    12     14,04163600        18,19669600 (+29.6%)

As some values are now decoded outside the helper and passed to it as an
argument the number of arguments of the helper increased, the number
of TCGop needed to load the arguments increased. I suspect that's why
the slow-down in the tests with a high REPT but low LOOP.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-12-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Moved XVTSTDC[DS]P to decodetree

Moved XVTSTDCSP and XVTSTDCDP to decodetree an restructured the helper
to be simpler and do all decoding in the decodetree (so XB, XT and DCMX
are all calculated outside the helper).

Obs: The tests in this one are slightly different, these are the sum of
these instructions with all possible immediate and those instructions
are repeated 10 times.

xvtstdcsp:
rept    loop    master             patch
8       12500   2,76402100         2,70699100 (-2.1%)
25      4000    2,64867100         2,67884100 (+1.1%)
100     1000    2,73806300         2,78701000 (+1.8%)
500     200     3,44666500         3,61027600 (+4.7%)
2500    40      5,85790200         6,47475500 (+10.5%)
8000    12     15,22102100        17,46062900 (+14.7%)

xvtstdcdp:
rept    loop    master             patch
8       12500   2,11818000         1,61065300 (-24.0%)
25      4000    2,04573400         1,60132200 (-21.7%)
100     1000    2,13834100         1,69988100 (-20.5%)
500     200     2,73977000         2,48631700 (-9.3%)
2500    40      5,05067000         5,25914100 (+4.1%)
8000    12     14,60507800        15,93704900 (+9.1%)

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-11-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Use gvec to decode XVCPSGN[SD]P

Moved XVCPSGNSP and XVCPSGNDP to decodetree and used gvec to translate
them.

xvcpsgnsp:
rept    loop    master             patch
8       12500   0,00561400         0,00537900 (-4.2%)
25      4000    0,00562100         0,00400000 (-28.8%)
100     1000    0,00696900         0,00416300 (-40.3%)
500     200     0,02211900         0,00840700 (-62.0%)
2500    40      0,09328600         0,02728300 (-70.8%)
8000    12      0,27295300         0,06867800 (-74.8%)

xvcpsgndp:
rept    loop    master             patch
8       12500   0,00556300         0,00584200 (+5.0%)
25      4000    0,00482700         0,00431700 (-10.6%)
100     1000    0,00585800         0,00464400 (-20.7%)
500     200     0,01565300         0,00839700 (-46.4%)
2500    40      0,05766500         0,02430600 (-57.8%)
8000    12      0,19875300         0,07947100 (-60.0%)

Like the previous instructions there seemed to be a improvement on
translation time.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-10-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Use gvec to decode XV[N]ABS[DS]P/XVNEG[DS]P

Moved XVABSSP, XVABSDP, XVNABSSP,XVNABSDP, XVNEGSP and XVNEGDP to
decodetree and used gvec to translate them.

xvabssp:
rept    loop    master             patch
8       12500   0,00477900         0,00476000 (-0.4%)
25      4000    0,00442800         0,00353300 (-20.2%)
100     1000    0,00478700         0,00366100 (-23.5%)
500     200     0,00973200         0,00649400 (-33.3%)
2500    40      0,03165200         0,02226700 (-29.7%)
8000    12      0,09315900         0,06674900 (-28.3%)

xvabsdp:
rept    loop    master             patch
8       12500   0,00475000         0,00474400 (-0.1%)
25      4000    0,00355600         0,00367500 (+3.3%)
100     1000    0,00444200         0,00366000 (-17.6%)
500     200     0,00942700         0,00732400 (-22.3%)
2500    40      0,02990000         0,02308500 (-22.8%)
8000    12      0,08770300         0,06683800 (-23.8%)

xvnabssp:
rept    loop    master             patch
8       12500   0,00494500         0,00492900 (-0.3%)
25      4000    0,00397700         0,00338600 (-14.9%)
100     1000    0,00421400         0,00353500 (-16.1%)
500     200     0,01048000         0,00707100 (-32.5%)
2500    40      0,03251500         0,02238300 (-31.2%)
8000    12      0,08889100         0,06469800 (-27.2%)

xvnabsdp:
rept    loop    master             patch
8       12500   0,00511000         0,00492700 (-3.6%)
25      4000    0,00398800         0,00381500 (-4.3%)
100     1000    0,00390500         0,00365900 (-6.3%)
500     200     0,00924800         0,00784600 (-15.2%)
2500    40      0,03138900         0,02391600 (-23.8%)
8000    12      0,09654200         0,05684600 (-41.1%)

xvnegsp:
rept    loop    master             patch
8       12500   0,00493900         0,00452800 (-8.3%)
25      4000    0,00369100         0,00366800 (-0.6%)
100     1000    0,00371100         0,00380000 (+2.4%)
500     200     0,00991100         0,00652300 (-34.2%)
2500    40      0,03025800         0,02422300 (-19.9%)
8000    12      0,09251100         0,06457600 (-30.2%)

xvnegdp:
rept    loop    master             patch
8       12500   0,00474900         0,00454400 (-4.3%)
25      4000    0,00353100         0,00325600 (-7.8%)
100     1000    0,00398600         0,00366800 (-8.0%)
500     200     0,01032300         0,00702400 (-32.0%)
2500    40      0,03125000         0,02422400 (-22.5%)
8000    12      0,09475100         0,06173000 (-34.9%)

This one to me seemed the opposite of the previous instructions, as it
looks like there was an improvement in the translation time (itself not
a surprise as operations were done twice before so there was the need to
translate twice as many TCGop)

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-9-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Move VABSDU[BHW] to decodetree and use gvec

Moved VABSDUB, VABSDUH and VABSDUW to decodetree and use gvec to
translate them.

vabsdub:
rept    loop    master             patch
8       12500   0,03601600         0,00688500 (-80.9%)
25      4000    0,03651000         0,00532100 (-85.4%)
100     1000    0,03666900         0,00595300 (-83.8%)
500     200     0,04305800         0,01244600 (-71.1%)
2500    40      0,06893300         0,04273700 (-38.0%)
8000    12      0,14633200         0,12660300 (-13.5%)

vabsduh:
rept    loop    master             patch
8       12500   0,02172400         0,00687500 (-68.4%)
25      4000    0,02154100         0,00531500 (-75.3%)
100     1000    0,02235400         0,00596300 (-73.3%)
500     200     0,02827500         0,01245100 (-56.0%)
2500    40      0,05638400         0,04285500 (-24.0%)
8000    12      0,13166000         0,12641400 (-4.0%)

vabsduw:
rept    loop    master             patch
8       12500   0,01646400         0,00688300 (-58.2%)
25      4000    0,01454500         0,00475500 (-67.3%)
100     1000    0,01545800         0,00511800 (-66.9%)
500     200     0,02168200         0,01114300 (-48.6%)
2500    40      0,04571300         0,04138800 (-9.5%)
8000    12      0,12209500         0,12178500 (-0.3%)

Same as VADDCUW and VSUBCUW, overall performance gain but it uses more
TCGop (4 before the patch, 6 after).

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-8-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Move VAVG[SU][BHW] to decodetree and use gvec

Moved the instructions VAVGUB, VAVGUH, VAVGUW, VAVGSB, VAVGSH, VAVGSW,
to decodetree and use gvec with them. For these one the right shift
had to be made before the sum as to avoid an overflow, so add 1 at the
end if any of the entries had 1 in its LSB as to replicate the "+ 1"
before the shift described by the ISA.

vavgub:
rept    loop    master             patch
8       12500   0,02616600         0,00754200 (-71.2%)
25      4000    0,02530000         0,00637700 (-74.8%)
100     1000    0,02604600         0,00790100 (-69.7%)
500     200     0,03189300         0,01838400 (-42.4%)
2500    40      0,06006900         0,06851000 (+14.1%)
8000    12      0,13941000         0,20548500 (+47.4%)

vavguh:
rept    loop    master             patch
8       12500   0,01818200         0,00780600 (-57.1%)
25      4000    0,01789300         0,00641600 (-64.1%)
100     1000    0,01899100         0,00787200 (-58.5%)
500     200     0,02527200         0,01828400 (-27.7%)
2500    40      0,05361800         0,06773000 (+26.3%)
8000    12      0,12886600         0,20291400 (+57.5%)

vavguw:
rept    loop    master             patch
8       12500   0,01423100         0,00776600 (-45.4%)
25      4000    0,01780800         0,00638600 (-64.1%)
100     1000    0,02085500         0,00787000 (-62.3%)
500     200     0,02737100         0,01828800 (-33.2%)
2500    40      0,05572600         0,06774200 (+21.6%)
8000    12      0,13101700         0,20311600 (+55.0%)

vavgsb:
rept    loop    master             patch
8       12500   0,03006000         0,00788600 (-73.8%)
25      4000    0,02882200         0,00637800 (-77.9%)
100     1000    0,02958000         0,00791400 (-73.2%)
500     200     0,03548800         0,01860400 (-47.6%)
2500    40      0,06360000         0,06850800 (+7.7%)
8000    12      0,13816500         0,20550300 (+48.7%)

vavgsh:
rept    loop    master             patch
8       12500   0,01965900         0,00776600 (-60.5%)
25      4000    0,01875400         0,00638700 (-65.9%)
100     1000    0,01952200         0,00786900 (-59.7%)
500     200     0,02562000         0,01760300 (-31.3%)
2500    40      0,05384300         0,06742800 (+25.2%)
8000    12      0,13240800         0,20330000 (+53.5%)

vavgsw:
rept    loop    master             patch
8       12500   0,01407700         0,00775600 (-44.9%)
25      4000    0,01762300         0,00640000 (-63.7%)
100     1000    0,02046500         0,00788500 (-61.5%)
500     200     0,02745600         0,01843000 (-32.9%)
2500    40      0,05375500         0,06820500 (+26.9%)
8000    12      0,13068300         0,20304900 (+55.4%)

These results to me seems to indicate that with gvec the results have a
slower translation but faster execution.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-7-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Move VPRTYB[WDQ] to decodetree and use gvec

Moved VPRTYBW and VPRTYBD to use gvec and both of them and VPRTYBQ to
decodetree. VPRTYBW and VPRTYBD now also use .fni4 and .fni8,
respectively.

vprtybw:
rept    loop    master             patch
8       12500   0,01198900         0,00703100 (-41.4%)
25      4000    0,01070100         0,00571400 (-46.6%)
100     1000    0,01123300         0,00678200 (-39.6%)
500     200     0,01601500         0,01535600 (-4.1%)
2500    40      0,03872900         0,05562100 (43.6%)
8000    12      0,10047000         0,16643000 (65.7%)

vprtybd:
rept    loop    master             patch
8       12500   0,00757700         0,00788100 (4.0%)
25      4000    0,00652500         0,00669600 (2.6%)
100     1000    0,00714400         0,00825400 (15.5%)
500     200     0,01211000         0,01903700 (57.2%)
2500    40      0,03483800         0,07021200 (101.5%)
8000    12      0,09591800         0,21036200 (119.3%)

vprtybq:
rept    loop    master             patch
8       12500   0,00675600         0,00667200 (-1.2%)
25      4000    0,00619400         0,00643200 (3.8%)
100     1000    0,00707100         0,00751100 (6.2%)
500     200     0,01199300         0,01342000 (11.9%)
2500    40      0,03490900         0,04092900 (17.2%)
8000    12      0,09588200         0,11465100 (19.6%)

I wasn't expecting such a performance lost in both VPRTYBD and VPRTYBQ,
I'm not sure if it's worth to move those instructions. Comparing the
assembly of the helper with the TCGop they are pretty similar, so
I'm not sure why vprtybd took so much more time.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-6-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Move VNEG[WD] to decodtree and use gvec

Moved the instructions VNEGW and VNEGD to decodetree and used gvec to
decode it.

vnegw:
rept    loop    master             patch
8       12500   0,01053200         0,00548400 (-47.9%)
25      4000    0,01030500         0,00390000 (-62.2%)
100     1000    0,01096300         0,00395400 (-63.9%)
500     200     0,01472000         0,00712300 (-51.6%)
2500    40      0,03809000         0,02147700 (-43.6%)
8000    12      0,09957100         0,06202100 (-37.7%)

vnegd:
rept    loop    master             patch
8       12500   0,00594600         0,00543800 (-8.5%)
25      4000    0,00575200         0,00396400 (-31.1%)
100     1000    0,00676100         0,00394800 (-41.6%)
500     200     0,01149300         0,00709400 (-38.3%)
2500    40      0,03441500         0,02169600 (-37.0%)
8000    12      0,09516900         0,06337000 (-33.4%)

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-5-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Move V(ADD|SUB)CUW to decodetree and use gvec

This patch moves VADDCUW and VSUBCUW to decodtree with gvec using an
implementation based on the helper, with the main difference being
changing the -1 (aka all bits set to 1) result returned by cmp when
true to +1. It also implemented a .fni4 version of those instructions
and dropped the helper.

vaddcuw:
rept    loop    master             patch
8       12500   0,01008200         0,00612400 (-39.3%)
25      4000    0,01091500         0,00471600 (-56.8%)
100     1000    0,01332500         0,00593700 (-55.4%)
500     200     0,01998500         0,01275700 (-36.2%)
2500    40      0,04704300         0,04364300 (-7.2%)
8000    12      0,10748200         0,11241000 (+4.6%)

vsubcuw:
rept    loop    master             patch
8       12500   0,01226200         0,00571600 (-53.4%)
25      4000    0,01493500         0,00462100 (-69.1%)
100     1000    0,01522700         0,00455100 (-70.1%)
500     200     0,02384600         0,01133500 (-52.5%)
2500    40      0,04935200         0,03178100 (-35.6%)
8000    12      0,09039900         0,09440600 (+4.4%)

Overall there was a gain in performance, but the TCGop code was still
slightly bigger in the new version (it went from 4 to 5).

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-4-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Move VMH[R]ADDSHS instruction to decodetree

This patch moves VMHADDSHS and VMHRADDSHS to decodetree I couldn't find
a satisfactory implementation with TCG inline.

vmhaddshs:
rept    loop    master             patch
8       12500   0,02983400         0,02648500 (-11.2%)
25      4000    0,02946000         0,02518000 (-14.5%)
100     1000    0,03104300         0,02638000 (-15.0%)
500     200     0,04002000         0,03502500 (-12.5%)
2500    40      0,08090100         0,07562200 (-6.5%)
8000    12      0,19242600         0,18626800 (-3.2%)

vmhraddshs:
rept    loop    master             patch
8       12500   0,03078600         0,02851000 (-7.4%)
25      4000    0,02793200         0,02746900 (-1.7%)
100     1000    0,02886000         0,02839900 (-1.6%)
500     200     0,03714700         0,03799200 (+2.3%)
2500    40      0,07948000         0,07852200 (-1.2%)
8000    12      0,19049800         0,18813900 (-1.2%)

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-3-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Moved VMLADDUHM to decodetree and use gvec

This patch moves VMLADDUHM to decodetree a creates a gvec implementation
using mul_vec and add_vec.

rept    loop    master             patch
8       12500   0,01810500         0,00903100 (-50.1%)
25      4000    0,01739400         0,00747700 (-57.0%)
100     1000    0,01843600         0,00901400 (-51.1%)
500     200     0,02574600         0,01971000 (-23.4%)
2500    40      0,05921600         0,07121800 (+20.3%)
8000    12      0,15326700         0,21725200 (+41.7%)

The significant difference in performance when REPT is low and LOOP is
high I think is due to the fact that the new implementation has a higher
translation time, as when using a helper only 5 TCGop are used but with
the patch a total of 10 TCGop are needed (Power lacks a direct mul_vec
equivalent so this instruction is implemented with the help of 5 others,
vmuleu, vmulou, vmrgh, vmrgl and vpkum).

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-2-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: move msgsync to decodetree

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20221006200654.725390-7-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: move msgclrp/msgsndp to decodetree

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20221006200654.725390-6-matheus.ferst@eldorado.org.br>
[danielhb: ppc32 build fix in trans_(MSGCLRP|MSGSNDP)]
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: move msgclr/msgsnd to decodetree

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20221006200654.725390-5-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: fix REQUIRE_HV macro definition

The macro is missing a '{' after the if condition. Any use of REQUIRE_HV
would cause a compilation error.

Fixes: fc34e81acd51 ("target/ppc: add macros to check privilege level")
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221006200654.725390-4-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: fix msgsync insns flags

This instruction was added by Power ISA 3.0, using PPC2_PRCNTL makes it
available for older processors, like de e5500 and e6500.

Fixes: 7af1e7b02264 ("target/ppc: add support for hypervisor doorbells on book3s CPUs")
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221006200654.725390-3-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: fix msgclr/msgsnd insns flags

On Power ISA v2.07, the category for these instructions became
"Embedded.Processor Control" or "Book S".

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20221006200654.725390-2-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

Merge tag 'dump-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging

dump queue

Hi

The "dump" queue, with:
- [PATCH v3/v4 0/9] dump: Cleanup and consolidation
- [PATCH v4 0/4] dump: add 32-bit guest Windows support

# -----BEGIN PGP SIGNATURE-----
#
# iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmNY9gMcHG1hcmNhbmRy
# ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5ZUtD/kByfamsq/8hnS6N/ok
# xs9kXO+HZA1A1Kng19RjYWbTka1LpEAf6y6tPtV27l5rWJZxCgqFp3Q2VKQyzAxl
# Bcf4gvEhUDJI87jHrZ8WBJ0JvPL8pKNjPn4JUPOQO+6kX8A/3XTwAyvH/T3uxlTo
# I+4HLwY0EkJ6NU6Cokud5Uo36Zj7JghKrBxTDrd3NC0qSy8xOoIsB5Pbp2PVKuX2
# F5Zfll3F+NUDsj9zmMR6agP4PBUJUB680TtvMpMZXb2BXumKDLngthCLRtGrgsDh
# ChjYr6xkRS9qlXn0PWIYsUyDucDuRFfqTz/Pa9OcGhQuQfIfQiGOM2IFQUE3UcuN
# OphJEFi44za3E7xEZziAGIFmro+k8zX2fjgN3+mApxpBjUAF/uzoW1VzIIdx65Gh
# H/IguECFu7AwMxPucRUI7PkwexgIcqpufeTRqep2nCFsAwS6bS+obzrAzIMd9kj1
# ApLhj36lkub0Tn77B8bkf1TYJnpBcYbGZpmPCILtOxpBZGlXm++KD1DKAYt6rbnR
# 8rQugZNRzEB92aSRTkLJ6QKsqudnbR9ssGbOdEJP+v1fgVtFzYbgygx5QMezGkRw
# vRLWrNbDLog+uYpI2Kb30ItU7+bsDrads9n/gqiGvTP887T3alCtRdIq+Fb28oor
# tSBhBMqMOtccMy3k+EoXBXX5gw==
# =BUEY
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 26 Oct 2022 04:55:31 EDT
# gpg:                using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg:                issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* tag 'dump-pull-request' of https://gitlab.com/marcandre.lureau/qemu:
  dump/win_dump: limit number of processed PRCBs
  s390x: pv: Add dump support
  s390x: Add KVM PV dump interface
  include/elf.h: add s390x note types
  s390x: Introduce PV query interface
  s390x: Add protected dump cap
  dump: Add architecture section and section string table support
  dump: Reintroduce memory_offset and section_offset
  dump: Reorder struct DumpState
  dump: Write ELF section headers right after ELF header
  dump: Use a buffer for ELF section data and headers

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'pull-tcg-20221026' of https://gitlab.com/rth7680/qemu into staging

Revert incorrect cflags initialization.
Add direct jumps for tcg/loongarch64.
Speed up breakpoint check.
Improve assertions for atomic.h.
Move restore_state_to_opc to TCGCPUOps.
Cleanups to TranslationBlock maintenance.

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmNYlo4dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV9y2wf9EKsCA6VtYI2Qtftf
# q/ujYFmUf8AKTb9eVcA0XX71CT1dEnFR7GQyT8B8X13x0pSbOX7tbEWHPreegTFV
# tESiejvymi6Q9devAB58GVwNoU/zPIQQGhCPxkVUKDmRztJz22MbGUzd7UKPPgU8
# 2nVMkIpLTMBsKeFLxE/D3ZntmdKsgyI/1Dtkl9TxvlDGsCbMjbNcr8lM+TLaG2oX
# GZhFyJHKEVy0cobukvhhb/9rU7AWdG/BnFmZM16JxvHV/YCwJBx3Udhcy9xPePUU
# yIjkGsUAq4aB6H9RFuTWh7GmaY5u6gMbTTi2J7hDos0mzauYJtpgEB/H42LpycGE
# sOhkLQ==
# =DUb8
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 25 Oct 2022 22:08:14 EDT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20221026' of https://gitlab.com/rth7680/qemu: (47 commits)
  accel/tcg: Remove restore_state_to_opc function
  target/xtensa: Convert to tcg_ops restore_state_to_opc
  target/tricore: Convert to tcg_ops restore_state_to_opc
  target/sparc: Convert to tcg_ops restore_state_to_opc
  target/sh4: Convert to tcg_ops restore_state_to_opc
  target/s390x: Convert to tcg_ops restore_state_to_opc
  target/rx: Convert to tcg_ops restore_state_to_opc
  target/riscv: Convert to tcg_ops restore_state_to_opc
  target/ppc: Convert to tcg_ops restore_state_to_opc
  target/openrisc: Convert to tcg_ops restore_state_to_opc
  target/nios2: Convert to tcg_ops restore_state_to_opc
  target/mips: Convert to tcg_ops restore_state_to_opc
  target/microblaze: Convert to tcg_ops restore_state_to_opc
  target/m68k: Convert to tcg_ops restore_state_to_opc
  target/loongarch: Convert to tcg_ops restore_state_to_opc
  target/i386: Convert to tcg_ops restore_state_to_opc
  target/hppa: Convert to tcg_ops restore_state_to_opc
  target/hexagon: Convert to tcg_ops restore_state_to_opc
  target/cris: Convert to tcg_ops restore_state_to_opc
  target/avr: Convert to tcg_ops restore_state_to_opc
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Merge tag 'pull-aspeed-20221025' of https://github.com/legoater/qemu into staging

aspeed queue :

* Performance improvement with Object class caching
* Serial Flash Discovery Parameters support for m25p80 device
* Various small adjustments on intructions and models

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmNX/WEACgkQUaNDx8/7
# 7KFhERAAhrcLcv15ny8RwatHPjzU00ZPQ0PcxGj1VDT66pCVh6M+rIeRPB2scOey
# Pu8jUvIYJ8w7ozjAP6YTQ1MP/WufniVi91Bx+vs/okSiWZa4dP0/G7NQWoc1at0s
# NBlkg57l1GMEeQb5x8vC1DizTQ1Z8Q8J/Ur3uXukXCmYVJAwHYpl/Foob1IPFgh8
# UcJ55LyuRq99lS8ib6HvRftAsC3DOcA/sl3b/TYR2+iKyi1VS2aZoQzxVCavSBcz
# PoTonT9O4OvIQthAgXRwpylW/aMYU3I7FeyOMKlCNLbmJ8LpVbX2v0KN3WBvWBv4
# OWP0DiqPUuoWFHLUGKbiVOgWQrTQXZyoD70SD/ObE1oMTLmeBoD1oFizQDvokHAR
# g2+gMdWnuWcbyaofY7YwuI6qz22gbrgh8JqX6sEWRDnY7HgCUvPhCsmci+bdN5cf
# dGcE8YKi7aD5gzoU9LRziPlhbwaEsgYLpYS7aGfNcmypgeq6lmNG7xKyw911zCTY
# uqDZWOUJy0tUIUTxoz3o1/KtsTFugjuZ+9W1SxELptJR37iwlP1vumf6bduwcx/3
# ba8tzNoXecXO5Icmq5P3lMNVM/abpkDDKS66HA87mABLEd/eCD0ojR9Kfxo0mD74
# kmQK3MFfJPkTu0ddu1cWhCIgTO7EuLuZL7gzj1oxoeXiU3YcVh8=
# =u7pS
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 25 Oct 2022 11:14:41 EDT
# gpg:                using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B  0B60 51A3 43C7 CFFB ECA1

* tag 'pull-aspeed-20221025' of https://github.com/legoater/qemu:
  arm/aspeed: Replace mx25l25635e chip model
  m25p80: Add the w25q01jvq SFPD table
  m25p80: Add the w25q512jv SFPD table
  m25p80: Add the w25q256 SFPD table
  m25p80: Add the mx66l1g45g SFDP table
  m25p80: Add the mx25l25635f SFPD table
  m25p80: Add the mx25l25635e SFPD table
  m25p80: Add erase size for mx25l25635e
  m25p80: Add the n25q256a SFDP table
  m25p80: Add basic support for the SFDP command
  hw/arm/aspeed: increase Bletchley memory size
  ast2600: Drop NEON from the CPU features
  aspeed/smc: Cache AspeedSMCClass
  ssi: cache SSIPeripheralClass to avoid GET_CLASS()
  tests/avocado/machine_aspeed.py: Fix typos on buildroot
  hw/i2c/aspeed: Fix old reg slave receive

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

dump/win_dump: limit number of processed PRCBs

When number of CPUs utilized by guest Windows is less than defined in
QEMU (i.e., desktop versions of Windows severely limits number of CPU
sockets), patch_and_save_context routine accesses non-existent PRCB and
fails. So, limit number of processed PRCBs by NumberProcessors taken
from guest Windows driver.

Signed-off-by: Viktor Prutyanov <viktor.prutyanov@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20221019235948.656411-1-viktor.prutyanov@redhat.com>

s390x: pv: Add dump support

Sometimes dumping a guest from the outside is the only way to get the
data that is needed. This can be the case if a dumping mechanism like
KDUMP hasn't been configured or data needs to be fetched at a specific
point. Dumping a protected guest from the outside without help from
fw/hw doesn't yield sufficient data to be useful. Hence we now
introduce PV dump support.

The PV dump support works by integrating the firmware into the dump
process. New Ultravisor calls are used to initiate the dump process,
dump cpu data, dump memory state and lastly complete the dump process.
The UV calls are exposed by KVM via the new KVM_PV_DUMP command and
its subcommands. The guest's data is fully encrypted and can only be
decrypted by the entity that owns the customer communication key for
the dumped guest. Also dumping needs to be allowed via a flag in the
SE header.

On the QEMU side of things we store the PV dump data in the newly
introduced architecture ELF sections (storage state and completion
data) and the cpu notes (for cpu dump data).

Users can use the zgetdump tool to convert the encrypted QEMU dump to an
unencrypted one.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Message-Id: <20221017083822.43118-11-frankja@linux.ibm.com>

s390x: Add KVM PV dump interface

Let's add a few bits of code which hide the new KVM PV dump API from
us via new functions.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
[ Marc-André: fix up for compilation issue ]
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20221017083822.43118-10-frankja@linux.ibm.com>

include/elf.h: add s390x note types

Adding two s390x note types

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20221017083822.43118-9-frankja@linux.ibm.com>

s390x: Introduce PV query interface

Introduce an interface over which we can get information about UV data.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20221017083822.43118-8-frankja@linux.ibm.com>

s390x: Add protected dump cap

Add a protected dump capability for later feature checking.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Message-Id: <20221017083822.43118-7-frankja@linux.ibm.com>
[ Marc-André - Add missing stubs when !kvm ]
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>

accel/tcg: Remove restore_state_to_opc function

All targets have been updated. Use the tcg_ops target hook
exclusively, which allows the compat code to be removed.

Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/xtensa: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/tricore: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/sparc: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/sh4: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/s390x: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/rx: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/riscv: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/ppc: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/openrisc: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/nios2: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/mips: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/microblaze: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/m68k: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/loongarch: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/i386: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hppa: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/hexagon: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/cris: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/avr: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/arm: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

target/alpha: Convert to tcg_ops restore_state_to_opc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

accel/tcg: Add restore_state_to_opc to TCGCPUOps

Add a tcg_ops hook to replace the restore_state_to_opc
function call. Because these generic hooks cannot depend
on target-specific types, temporarily, copy the current
target_ulong data[] into uint64_t d64[].

Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

accel/tcg: Simplify page_get/alloc_target_data

Since the only user, Arm MTE, always requires allocation,
merge the get and alloc functions to always produce a
non-null result. Also assume that the user has already
checked page validity.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

accel/tcg: Move TARGET_PAGE_DATA_SIZE impl to user-exec.c

Since "target data" is always user-only, move it out of
translate-all.c to user-exec.c.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

accel/tcg: Use tb_invalidate_phys_range in page_set_flags

Flush translation blocks in bulk, rather than page-by-page.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

accel/tcg: Use page_reset_target_data in page_set_flags

Use the existing function for clearing target data.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

accel/tcg: Call tb_invalidate_phys_page for PAGE_RESET

When PAGE_RESET is set, we are replacing pages with new
content, which means that we need to invalidate existing
cached data, such as TranslationBlocks. Perform the
reset invalidate while we're doing other invalidates,
which allows us to remove the separate invalidates from
the user-only mmap/munmap/mprotect routines.

In addition, restrict invalidation to PAGE_EXEC pages.
Since cdf713085131, we have validated PAGE_EXEC is present
before translation, which means we can assume that if the
bit is not present, there are no translations to invalidate.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

accel/tcg: Use tb_invalidate_phys_page in page_set_flags

We do not require detection of overlapping TBs here,
so use the more appropriate function.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>