Matheus Ferst [Fri, 21 Oct 2022 14:21:56 +0000 (11:21 -0300)]
target/ppc: move the p*_interrupt_powersave methods to excp_helper.c
Move the methods to excp_helper.c and make them static.
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <
20221021142156.
4134411-4-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Fri, 21 Oct 2022 14:21:55 +0000 (11:21 -0300)]
target/ppc: unify cpu->has_work based on cs->interrupt_request
Now that cs->interrupt_request indicates if there is any unmasked
interrupt, checking if the CPU has work to do can be simplified to a
single check that works for all CPU models.
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <
20221021142156.
4134411-3-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Fri, 21 Oct 2022 14:21:54 +0000 (11:21 -0300)]
target/ppc: introduce ppc_maybe_interrupt
This new method will check if any pending interrupt was unmasked and
then call cpu_interrupt/cpu_reset_interrupt accordingly. Code that
raises/lowers or masks/unmasks interrupts should call this method to
keep CPU_INTERRUPT_HARD coherent with env->pending_interrupts.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <
20221021142156.
4134411-2-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:26 +0000 (17:48 -0300)]
target/ppc: remove ppc_store_lpcr from CONFIG_USER_ONLY builds
Writes to LPCR are hypervisor privileged.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-27-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:25 +0000 (17:48 -0300)]
target/ppc: add power-saving interrupt masking logic to p7_next_unmasked_interrupt
Export p7_interrupt_powersave and use it in p7_next_unmasked_interrupt.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-26-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:24 +0000 (17:48 -0300)]
target/ppc: move power-saving interrupt masking out of cpu_has_work_POWER7
Move the interrupt masking logic out of cpu_has_work_POWER7 in a new
method, p7_interrupt_powersave, that only returns an interrupt if it can
wake the processor from power-saving mode.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-25-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:23 +0000 (17:48 -0300)]
target/ppc: remove generic architecture checks from p7_deliver_interrupt
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-24-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:22 +0000 (17:48 -0300)]
target/ppc: remove unused interrupts from p7_deliver_interrupt
Remove the following unused interrupts from the POWER7 interrupt
processing method:
- PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p;
- Hypervisor Virtualization: introduced in Power ISA v3.0;
- Hypervisor Doorbell and Event-Based Branch: introduced in
Power ISA v2.07;
- Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined
for embedded CPUs;
- Doorbell and Critical Doorbell Interrupt: processor does not implement
the Embedded.Processor Control category;
- Programmable Interval Timer: 40x-only;
- PPC_INTERRUPT_THERM: only raised for 970 and POWER5p;
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-23-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:21 +0000 (17:48 -0300)]
target/ppc: create an interrupt deliver method for POWER7
The new method is identical to ppc_deliver_interrupt, processor-specific
code will be added/removed in the following patches.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-22-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:20 +0000 (17:48 -0300)]
target/ppc: remove unused interrupts from p7_next_unmasked_interrupt
Remove the following unused interrupts from the POWER7 interrupt masking
method:
- PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p;
- Hypervisor Virtualization: introduced in Power ISA v3.0;
- Hypervisor Doorbell and Event-Based Branch: introduced in
Power ISA v2.07;
- Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined
for embedded CPUs;
- Doorbell and Critical Doorbell Interrupt: processor does not implement
the Embedded.Processor Control category;
- Programmable Interval Timer: 40x-only;
- PPC_INTERRUPT_THERM: only raised for 970 and POWER5p;
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-21-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:19 +0000 (17:48 -0300)]
target/ppc: create an interrupt masking method for POWER7
The new method is identical to ppc_next_unmasked_interrupt_generic,
processor-specific code will be added/removed in the following patches.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-20-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:18 +0000 (17:48 -0300)]
target/ppc: add power-saving interrupt masking logic to p8_next_unmasked_interrupt
Export p8_interrupt_powersave and use it in p8_next_unmasked_interrupt.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-19-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:17 +0000 (17:48 -0300)]
target/ppc: move power-saving interrupt masking out of cpu_has_work_POWER8
Move the interrupt masking logic out of cpu_has_work_POWER8 in a new
method, p8_interrupt_powersave, that only returns an interrupt if it can
wake the processor from power-saving mode.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-18-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:16 +0000 (17:48 -0300)]
target/ppc: remove generic architecture checks from p8_deliver_interrupt
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-17-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:15 +0000 (17:48 -0300)]
target/ppc: remove unused interrupts from p8_deliver_interrupt
Remove the following unused interrupts from the POWER8 interrupt
processing method:
- PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p;
- Debug Interrupt: removed in Power ISA v2.07;
- Hypervisor Virtualization: introduced in Power ISA v3.0;
- Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined
for embedded CPUs;
- Critical Doorbell: processor does not implement the
"Embedded.Processor Control" category;
- Programmable Interval Timer: 40x-only;
- PPC_INTERRUPT_THERM: only raised for 970 and POWER5p;
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-16-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:14 +0000 (17:48 -0300)]
target/ppc: create an interrupt deliver method for POWER8
The new method is identical to ppc_deliver_interrupt, processor-specific
code will be added/removed in the following patches.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-15-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:13 +0000 (17:48 -0300)]
target/ppc: remove unused interrupts from p8_next_unmasked_interrupt
Remove the following unused interrupts from the POWER8 interrupt masking
method:
- PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970, and POWER5p;
- Debug Interrupt: removed in Power ISA v2.07;
- Hypervisor Virtualization: introduced in Power ISA v3.0;
- Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined
for embedded CPUs;
- Critical Doorbell: processor does not implement the "Embedded.Processor
Control" category;
- Programmable Interval Timer: 40x-only;
- PPC_INTERRUPT_THERM: only raised for 970 and POWER5p;
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-14-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:12 +0000 (17:48 -0300)]
target/ppc: create an interrupt masking method for POWER8
The new method is identical to ppc_next_unmasked_interrupt_generic,
processor-specific code will be added/removed in the following patches.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-13-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:11 +0000 (17:48 -0300)]
target/ppc: add power-saving interrupt masking logic to p9_next_unmasked_interrupt
Export p9_interrupt_powersave and use it in p9_next_unmasked_interrupt.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-12-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:10 +0000 (17:48 -0300)]
target/ppc: move power-saving interrupt masking out of cpu_has_work_POWER9
Move the interrupt masking logic out of cpu_has_work_POWER9 in a new
method, p9_interrupt_powersave, that only returns an interrupt if it can
wake the processor from power-saving mode.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-11-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:09 +0000 (17:48 -0300)]
target/ppc: remove generic architecture checks from p9_deliver_interrupt
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-10-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:08 +0000 (17:48 -0300)]
target/ppc: remove unused interrupts from p9_deliver_interrupt
Remove the following unused interrupts from the POWER9 interrupt
processing method:
- PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p;
- Debug Interrupt: removed in Power ISA v2.07;
- Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined
for embedded CPUs;
- Critical Doorbell Interrupt: removed in Power ISA v3.0;
- Programmable Interval Timer: 40x-only.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-9-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:07 +0000 (17:48 -0300)]
target/ppc: create an interrupt deliver method for POWER9/POWER10
The new method is identical to ppc_deliver_interrupt, processor-specific
code will be added/removed in the following patches.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-8-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:06 +0000 (17:48 -0300)]
target/ppc: remove unused interrupts from p9_next_unmasked_interrupt
Remove the following unused interrupts from the POWER9 interrupt masking
method:
- PPC_INTERRUPT_RESET: only raised for 6xx, 7xx, 970 and POWER5p;
- Debug Interrupt: removed in Power ISA v2.07;
- Critical Input, Watchdog Timer, and Fixed Interval Timer: only defined
for embedded CPUs;
- Critical Doorbell Interrupt: removed in Power ISA v3.0;
- Programmable Interval Timer: 40x-only.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-7-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:05 +0000 (17:48 -0300)]
target/ppc: create an interrupt masking method for POWER9/POWER10
The new method is identical to ppc_next_unmasked_interrupt_generic,
processor-specific code will be added/removed in the following patches.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-6-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:04 +0000 (17:48 -0300)]
target/ppc: prepare to split interrupt masking and delivery by excp_model
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-5-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:03 +0000 (17:48 -0300)]
target/ppc: split interrupt masking and delivery from ppc_hw_interrupt
Split ppc_hw_interrupt into an interrupt masking method,
ppc_next_unmasked_interrupt, and an interrupt processing method,
ppc_deliver_interrupt.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221011204829.
1641124-4-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:02 +0000 (17:48 -0300)]
target/ppc: always use ppc_set_irq to set env->pending_interrupts
Use ppc_set_irq to raise/clear interrupts to ensure CPU_INTERRUPT_HARD
will be set/reset accordingly.
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <
20221011204829.
1641124-3-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Tue, 11 Oct 2022 20:48:01 +0000 (17:48 -0300)]
target/ppc: define PPC_INTERRUPT_* values directly
This enum defines the bit positions in env->pending_interrupts for each
interrupt. However, except for the comparison in kvmppc_set_interrupt,
the values are always used as (1 << PPC_INTERRUPT_*). Define them
directly like that to save some clutter. No functional change intended.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <
20221011204829.
1641124-2-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Lucas Mateus Castro (alqotel) [Wed, 19 Oct 2022 12:50:40 +0000 (09:50 -0300)]
target/ppc: Use gvec to decode XVTSTDC[DS]P
Used gvec to translate XVTSTDCSP and XVTSTDCDP.
xvtstdcsp:
rept loop imm master version prev version current version
25 4000 0 0,206200 0,040730 (-80.2%) 0,040740 (-80.2%)
25 4000 1 0,205120 0,053650 (-73.8%) 0,053510 (-73.9%)
25 4000 3 0,206160 0,058630 (-71.6%) 0,058570 (-71.6%)
25 4000 51 0,217110 0,191490 (-11.8%) 0,192320 (-11.4%)
25 4000 127 0,206160 0,191490 (-7.1%) 0,192640 (-6.6%)
8000 12 0 1,234719 0,418833 (-66.1%) 0,386365 (-68.7%)
8000 12 1 1,232417 1,435979 (+16.5%) 1,462792 (+18.7%)
8000 12 3 1,232760 1,766073 (+43.3%) 1,743990 (+41.5%)
8000 12 51 1,239281 1,319562 (+6.5%) 1,423479 (+14.9%)
8000 12 127 1,231708 1,315760 (+6.8%) 1,426667 (+15.8%)
xvtstdcdp:
rept loop imm master version prev version current version
25 4000 0 0,159930 0,040830 (-74.5%) 0,040610 (-74.6%)
25 4000 1 0,160640 0,053670 (-66.6%) 0,053480 (-66.7%)
25 4000 3 0,160020 0,063030 (-60.6%) 0,062960 (-60.7%)
25 4000 51 0,160410 0,128620 (-19.8%) 0,127470 (-20.5%)
25 4000 127 0,160330 0,127670 (-20.4%) 0,128690 (-19.7%)
8000 12 0 1,190365 0,422146 (-64.5%) 0,388417 (-67.4%)
8000 12 1 1,191292 1,445312 (+21.3%) 1,428698 (+19.9%)
8000 12 3 1,188687 1,980656 (+66.6%) 1,975354 (+66.2%)
8000 12 51 1,191250 1,264500 (+6.1%) 1,355083 (+13.8%)
8000 12 127 1,197313 1,266729 (+5.8%) 1,349156 (+12.7%)
Overall, these instructions are the hardest ones to measure performance
as the gvec implementation is affected by the immediate. Above there are
5 different scenarios when it comes to immediate and 2 when it comes to
rept/loop combination. The immediates scenarios are: all bits are 0
therefore the target register should just be changed to 0, with 1 bit
set, with 2 bits set in a combination the new implementation can deal
with using gvec, 4 bits set and the new implementation can't deal with
it using gvec and all bits set. The rept/loop scenarios are high loop
and low rept (so it should spend more time executing it than translating
it) and high rept low loop (so it should spend more time translating it
than executing this code).
These comparisons are between the upstream version, a previous similar
implementation and a one with a cleaner code(this one).
For a comparison with o previous different implementation:
<
20221010191356.83659-13-lucas.araujo@eldorado.org.br>
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20221019125040.48028-13-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Lucas Mateus Castro (alqotel) [Wed, 19 Oct 2022 12:50:39 +0000 (09:50 -0300)]
target/ppc: Moved XSTSTDC[QDS]P to decodetree
Moved XSTSTDCSP, XSTSTDCDP and XSTSTDCQP to decodetree and moved some of
its decoding away from the helper as previously the DCMX, XB and BF were
calculated in the helper with the help of cpu_env, now that part was
moved to the decodetree with the rest.
xvtstdcsp:
rept loop master patch
8 12500 1,
85393600 1,
94683600 (+5.0%)
25 4000 1,
78779800 1,
92479000 (+7.7%)
100 1000 2,
12775000 2,
28895500 (+7.6%)
500 200 2,
99655300 3,
23102900 (+7.8%)
2500 40 6,
89082200 7,
44827500 (+8.1%)
8000 12 17,
50585500 18,
95152100 (+8.3%)
xvtstdcdp:
rept loop master patch
8 12500 1,
39043100 1,
33539800 (-4.0%)
25 4000 1,
35731800 1,
37347800 (+1.2%)
100 1000 1,
51514800 1,
56053000 (+3.0%)
500 200 2,
21014400 2,
47906000 (+12.2%)
2500 40 5,
39488200 6,
68766700 (+24.0%)
8000 12 13,
98623900 18,
17661900 (+30.0%)
xvtstdcdp:
rept loop master patch
8 12500 1,
35123800 1,
34455800 (-0.5%)
25 4000 1,
36441200 1,
36759600 (+0.2%)
100 1000 1,
49763500 1,
54138400 (+2.9%)
500 200 2,
19020200 2,
46196400 (+12.4%)
2500 40 5,
39265700 6,
68147900 (+23.9%)
8000 12 14,
04163600 18,
19669600 (+29.6%)
As some values are now decoded outside the helper and passed to it as an
argument the number of arguments of the helper increased, the number
of TCGop needed to load the arguments increased. I suspect that's why
the slow-down in the tests with a high REPT but low LOOP.
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20221019125040.48028-12-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Lucas Mateus Castro (alqotel) [Wed, 19 Oct 2022 12:50:38 +0000 (09:50 -0300)]
target/ppc: Moved XVTSTDC[DS]P to decodetree
Moved XVTSTDCSP and XVTSTDCDP to decodetree an restructured the helper
to be simpler and do all decoding in the decodetree (so XB, XT and DCMX
are all calculated outside the helper).
Obs: The tests in this one are slightly different, these are the sum of
these instructions with all possible immediate and those instructions
are repeated 10 times.
xvtstdcsp:
rept loop master patch
8 12500 2,
76402100 2,
70699100 (-2.1%)
25 4000 2,
64867100 2,
67884100 (+1.1%)
100 1000 2,
73806300 2,
78701000 (+1.8%)
500 200 3,
44666500 3,
61027600 (+4.7%)
2500 40 5,
85790200 6,
47475500 (+10.5%)
8000 12 15,
22102100 17,
46062900 (+14.7%)
xvtstdcdp:
rept loop master patch
8 12500 2,
11818000 1,
61065300 (-24.0%)
25 4000 2,
04573400 1,
60132200 (-21.7%)
100 1000 2,
13834100 1,
69988100 (-20.5%)
500 200 2,
73977000 2,
48631700 (-9.3%)
2500 40 5,
05067000 5,
25914100 (+4.1%)
8000 12 14,
60507800 15,
93704900 (+9.1%)
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20221019125040.48028-11-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Lucas Mateus Castro (alqotel) [Wed, 19 Oct 2022 12:50:37 +0000 (09:50 -0300)]
target/ppc: Use gvec to decode XVCPSGN[SD]P
Moved XVCPSGNSP and XVCPSGNDP to decodetree and used gvec to translate
them.
xvcpsgnsp:
rept loop master patch
8 12500 0,
00561400 0,
00537900 (-4.2%)
25 4000 0,
00562100 0,
00400000 (-28.8%)
100 1000 0,
00696900 0,
00416300 (-40.3%)
500 200 0,
02211900 0,
00840700 (-62.0%)
2500 40 0,
09328600 0,
02728300 (-70.8%)
8000 12 0,
27295300 0,
06867800 (-74.8%)
xvcpsgndp:
rept loop master patch
8 12500 0,
00556300 0,
00584200 (+5.0%)
25 4000 0,
00482700 0,
00431700 (-10.6%)
100 1000 0,
00585800 0,
00464400 (-20.7%)
500 200 0,
01565300 0,
00839700 (-46.4%)
2500 40 0,
05766500 0,
02430600 (-57.8%)
8000 12 0,
19875300 0,
07947100 (-60.0%)
Like the previous instructions there seemed to be a improvement on
translation time.
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20221019125040.48028-10-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Lucas Mateus Castro (alqotel) [Wed, 19 Oct 2022 12:50:36 +0000 (09:50 -0300)]
target/ppc: Use gvec to decode XV[N]ABS[DS]P/XVNEG[DS]P
Moved XVABSSP, XVABSDP, XVNABSSP,XVNABSDP, XVNEGSP and XVNEGDP to
decodetree and used gvec to translate them.
xvabssp:
rept loop master patch
8 12500 0,
00477900 0,
00476000 (-0.4%)
25 4000 0,
00442800 0,
00353300 (-20.2%)
100 1000 0,
00478700 0,
00366100 (-23.5%)
500 200 0,
00973200 0,
00649400 (-33.3%)
2500 40 0,
03165200 0,
02226700 (-29.7%)
8000 12 0,
09315900 0,
06674900 (-28.3%)
xvabsdp:
rept loop master patch
8 12500 0,
00475000 0,
00474400 (-0.1%)
25 4000 0,
00355600 0,
00367500 (+3.3%)
100 1000 0,
00444200 0,
00366000 (-17.6%)
500 200 0,
00942700 0,
00732400 (-22.3%)
2500 40 0,
02990000 0,
02308500 (-22.8%)
8000 12 0,
08770300 0,
06683800 (-23.8%)
xvnabssp:
rept loop master patch
8 12500 0,
00494500 0,
00492900 (-0.3%)
25 4000 0,
00397700 0,
00338600 (-14.9%)
100 1000 0,
00421400 0,
00353500 (-16.1%)
500 200 0,
01048000 0,
00707100 (-32.5%)
2500 40 0,
03251500 0,
02238300 (-31.2%)
8000 12 0,
08889100 0,
06469800 (-27.2%)
xvnabsdp:
rept loop master patch
8 12500 0,
00511000 0,
00492700 (-3.6%)
25 4000 0,
00398800 0,
00381500 (-4.3%)
100 1000 0,
00390500 0,
00365900 (-6.3%)
500 200 0,
00924800 0,
00784600 (-15.2%)
2500 40 0,
03138900 0,
02391600 (-23.8%)
8000 12 0,
09654200 0,
05684600 (-41.1%)
xvnegsp:
rept loop master patch
8 12500 0,
00493900 0,
00452800 (-8.3%)
25 4000 0,
00369100 0,
00366800 (-0.6%)
100 1000 0,
00371100 0,
00380000 (+2.4%)
500 200 0,
00991100 0,
00652300 (-34.2%)
2500 40 0,
03025800 0,
02422300 (-19.9%)
8000 12 0,
09251100 0,
06457600 (-30.2%)
xvnegdp:
rept loop master patch
8 12500 0,
00474900 0,
00454400 (-4.3%)
25 4000 0,
00353100 0,
00325600 (-7.8%)
100 1000 0,
00398600 0,
00366800 (-8.0%)
500 200 0,
01032300 0,
00702400 (-32.0%)
2500 40 0,
03125000 0,
02422400 (-22.5%)
8000 12 0,
09475100 0,
06173000 (-34.9%)
This one to me seemed the opposite of the previous instructions, as it
looks like there was an improvement in the translation time (itself not
a surprise as operations were done twice before so there was the need to
translate twice as many TCGop)
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20221019125040.48028-9-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Lucas Mateus Castro (alqotel) [Wed, 19 Oct 2022 12:50:35 +0000 (09:50 -0300)]
target/ppc: Move VABSDU[BHW] to decodetree and use gvec
Moved VABSDUB, VABSDUH and VABSDUW to decodetree and use gvec to
translate them.
vabsdub:
rept loop master patch
8 12500 0,
03601600 0,
00688500 (-80.9%)
25 4000 0,
03651000 0,
00532100 (-85.4%)
100 1000 0,
03666900 0,
00595300 (-83.8%)
500 200 0,
04305800 0,
01244600 (-71.1%)
2500 40 0,
06893300 0,
04273700 (-38.0%)
8000 12 0,
14633200 0,
12660300 (-13.5%)
vabsduh:
rept loop master patch
8 12500 0,
02172400 0,
00687500 (-68.4%)
25 4000 0,
02154100 0,
00531500 (-75.3%)
100 1000 0,
02235400 0,
00596300 (-73.3%)
500 200 0,
02827500 0,
01245100 (-56.0%)
2500 40 0,
05638400 0,
04285500 (-24.0%)
8000 12 0,
13166000 0,
12641400 (-4.0%)
vabsduw:
rept loop master patch
8 12500 0,
01646400 0,
00688300 (-58.2%)
25 4000 0,
01454500 0,
00475500 (-67.3%)
100 1000 0,
01545800 0,
00511800 (-66.9%)
500 200 0,
02168200 0,
01114300 (-48.6%)
2500 40 0,
04571300 0,
04138800 (-9.5%)
8000 12 0,
12209500 0,
12178500 (-0.3%)
Same as VADDCUW and VSUBCUW, overall performance gain but it uses more
TCGop (4 before the patch, 6 after).
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20221019125040.48028-8-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Lucas Mateus Castro (alqotel) [Wed, 19 Oct 2022 12:50:34 +0000 (09:50 -0300)]
target/ppc: Move VAVG[SU][BHW] to decodetree and use gvec
Moved the instructions VAVGUB, VAVGUH, VAVGUW, VAVGSB, VAVGSH, VAVGSW,
to decodetree and use gvec with them. For these one the right shift
had to be made before the sum as to avoid an overflow, so add 1 at the
end if any of the entries had 1 in its LSB as to replicate the "+ 1"
before the shift described by the ISA.
vavgub:
rept loop master patch
8 12500 0,
02616600 0,
00754200 (-71.2%)
25 4000 0,
02530000 0,
00637700 (-74.8%)
100 1000 0,
02604600 0,
00790100 (-69.7%)
500 200 0,
03189300 0,
01838400 (-42.4%)
2500 40 0,
06006900 0,
06851000 (+14.1%)
8000 12 0,
13941000 0,
20548500 (+47.4%)
vavguh:
rept loop master patch
8 12500 0,
01818200 0,
00780600 (-57.1%)
25 4000 0,
01789300 0,
00641600 (-64.1%)
100 1000 0,
01899100 0,
00787200 (-58.5%)
500 200 0,
02527200 0,
01828400 (-27.7%)
2500 40 0,
05361800 0,
06773000 (+26.3%)
8000 12 0,
12886600 0,
20291400 (+57.5%)
vavguw:
rept loop master patch
8 12500 0,
01423100 0,
00776600 (-45.4%)
25 4000 0,
01780800 0,
00638600 (-64.1%)
100 1000 0,
02085500 0,
00787000 (-62.3%)
500 200 0,
02737100 0,
01828800 (-33.2%)
2500 40 0,
05572600 0,
06774200 (+21.6%)
8000 12 0,
13101700 0,
20311600 (+55.0%)
vavgsb:
rept loop master patch
8 12500 0,
03006000 0,
00788600 (-73.8%)
25 4000 0,
02882200 0,
00637800 (-77.9%)
100 1000 0,
02958000 0,
00791400 (-73.2%)
500 200 0,
03548800 0,
01860400 (-47.6%)
2500 40 0,
06360000 0,
06850800 (+7.7%)
8000 12 0,
13816500 0,
20550300 (+48.7%)
vavgsh:
rept loop master patch
8 12500 0,
01965900 0,
00776600 (-60.5%)
25 4000 0,
01875400 0,
00638700 (-65.9%)
100 1000 0,
01952200 0,
00786900 (-59.7%)
500 200 0,
02562000 0,
01760300 (-31.3%)
2500 40 0,
05384300 0,
06742800 (+25.2%)
8000 12 0,
13240800 0,
20330000 (+53.5%)
vavgsw:
rept loop master patch
8 12500 0,
01407700 0,
00775600 (-44.9%)
25 4000 0,
01762300 0,
00640000 (-63.7%)
100 1000 0,
02046500 0,
00788500 (-61.5%)
500 200 0,
02745600 0,
01843000 (-32.9%)
2500 40 0,
05375500 0,
06820500 (+26.9%)
8000 12 0,
13068300 0,
20304900 (+55.4%)
These results to me seems to indicate that with gvec the results have a
slower translation but faster execution.
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20221019125040.48028-7-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Lucas Mateus Castro (alqotel) [Wed, 19 Oct 2022 12:50:33 +0000 (09:50 -0300)]
target/ppc: Move VPRTYB[WDQ] to decodetree and use gvec
Moved VPRTYBW and VPRTYBD to use gvec and both of them and VPRTYBQ to
decodetree. VPRTYBW and VPRTYBD now also use .fni4 and .fni8,
respectively.
vprtybw:
rept loop master patch
8 12500 0,
01198900 0,
00703100 (-41.4%)
25 4000 0,
01070100 0,
00571400 (-46.6%)
100 1000 0,
01123300 0,
00678200 (-39.6%)
500 200 0,
01601500 0,
01535600 (-4.1%)
2500 40 0,
03872900 0,
05562100 (43.6%)
8000 12 0,
10047000 0,
16643000 (65.7%)
vprtybd:
rept loop master patch
8 12500 0,
00757700 0,
00788100 (4.0%)
25 4000 0,
00652500 0,
00669600 (2.6%)
100 1000 0,
00714400 0,
00825400 (15.5%)
500 200 0,
01211000 0,
01903700 (57.2%)
2500 40 0,
03483800 0,
07021200 (101.5%)
8000 12 0,
09591800 0,
21036200 (119.3%)
vprtybq:
rept loop master patch
8 12500 0,
00675600 0,
00667200 (-1.2%)
25 4000 0,
00619400 0,
00643200 (3.8%)
100 1000 0,
00707100 0,
00751100 (6.2%)
500 200 0,
01199300 0,
01342000 (11.9%)
2500 40 0,
03490900 0,
04092900 (17.2%)
8000 12 0,
09588200 0,
11465100 (19.6%)
I wasn't expecting such a performance lost in both VPRTYBD and VPRTYBQ,
I'm not sure if it's worth to move those instructions. Comparing the
assembly of the helper with the TCGop they are pretty similar, so
I'm not sure why vprtybd took so much more time.
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20221019125040.48028-6-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Lucas Mateus Castro (alqotel) [Wed, 19 Oct 2022 12:50:32 +0000 (09:50 -0300)]
target/ppc: Move VNEG[WD] to decodtree and use gvec
Moved the instructions VNEGW and VNEGD to decodetree and used gvec to
decode it.
vnegw:
rept loop master patch
8 12500 0,
01053200 0,
00548400 (-47.9%)
25 4000 0,
01030500 0,
00390000 (-62.2%)
100 1000 0,
01096300 0,
00395400 (-63.9%)
500 200 0,
01472000 0,
00712300 (-51.6%)
2500 40 0,
03809000 0,
02147700 (-43.6%)
8000 12 0,
09957100 0,
06202100 (-37.7%)
vnegd:
rept loop master patch
8 12500 0,
00594600 0,
00543800 (-8.5%)
25 4000 0,
00575200 0,
00396400 (-31.1%)
100 1000 0,
00676100 0,
00394800 (-41.6%)
500 200 0,
01149300 0,
00709400 (-38.3%)
2500 40 0,
03441500 0,
02169600 (-37.0%)
8000 12 0,
09516900 0,
06337000 (-33.4%)
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20221019125040.48028-5-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Lucas Mateus Castro (alqotel) [Wed, 19 Oct 2022 12:50:31 +0000 (09:50 -0300)]
target/ppc: Move V(ADD|SUB)CUW to decodetree and use gvec
This patch moves VADDCUW and VSUBCUW to decodtree with gvec using an
implementation based on the helper, with the main difference being
changing the -1 (aka all bits set to 1) result returned by cmp when
true to +1. It also implemented a .fni4 version of those instructions
and dropped the helper.
vaddcuw:
rept loop master patch
8 12500 0,
01008200 0,
00612400 (-39.3%)
25 4000 0,
01091500 0,
00471600 (-56.8%)
100 1000 0,
01332500 0,
00593700 (-55.4%)
500 200 0,
01998500 0,
01275700 (-36.2%)
2500 40 0,
04704300 0,
04364300 (-7.2%)
8000 12 0,
10748200 0,
11241000 (+4.6%)
vsubcuw:
rept loop master patch
8 12500 0,
01226200 0,
00571600 (-53.4%)
25 4000 0,
01493500 0,
00462100 (-69.1%)
100 1000 0,
01522700 0,
00455100 (-70.1%)
500 200 0,
02384600 0,
01133500 (-52.5%)
2500 40 0,
04935200 0,
03178100 (-35.6%)
8000 12 0,
09039900 0,
09440600 (+4.4%)
Overall there was a gain in performance, but the TCGop code was still
slightly bigger in the new version (it went from 4 to 5).
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20221019125040.48028-4-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Lucas Mateus Castro (alqotel) [Wed, 19 Oct 2022 12:50:30 +0000 (09:50 -0300)]
target/ppc: Move VMH[R]ADDSHS instruction to decodetree
This patch moves VMHADDSHS and VMHRADDSHS to decodetree I couldn't find
a satisfactory implementation with TCG inline.
vmhaddshs:
rept loop master patch
8 12500 0,
02983400 0,
02648500 (-11.2%)
25 4000 0,
02946000 0,
02518000 (-14.5%)
100 1000 0,
03104300 0,
02638000 (-15.0%)
500 200 0,
04002000 0,
03502500 (-12.5%)
2500 40 0,
08090100 0,
07562200 (-6.5%)
8000 12 0,
19242600 0,
18626800 (-3.2%)
vmhraddshs:
rept loop master patch
8 12500 0,
03078600 0,
02851000 (-7.4%)
25 4000 0,
02793200 0,
02746900 (-1.7%)
100 1000 0,
02886000 0,
02839900 (-1.6%)
500 200 0,
03714700 0,
03799200 (+2.3%)
2500 40 0,
07948000 0,
07852200 (-1.2%)
8000 12 0,
19049800 0,
18813900 (-1.2%)
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20221019125040.48028-3-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Lucas Mateus Castro (alqotel) [Wed, 19 Oct 2022 12:50:29 +0000 (09:50 -0300)]
target/ppc: Moved VMLADDUHM to decodetree and use gvec
This patch moves VMLADDUHM to decodetree a creates a gvec implementation
using mul_vec and add_vec.
rept loop master patch
8 12500 0,
01810500 0,
00903100 (-50.1%)
25 4000 0,
01739400 0,
00747700 (-57.0%)
100 1000 0,
01843600 0,
00901400 (-51.1%)
500 200 0,
02574600 0,
01971000 (-23.4%)
2500 40 0,
05921600 0,
07121800 (+20.3%)
8000 12 0,
15326700 0,
21725200 (+41.7%)
The significant difference in performance when REPT is low and LOOP is
high I think is due to the fact that the new implementation has a higher
translation time, as when using a helper only 5 TCGop are used but with
the patch a total of 10 TCGop are needed (Power lacks a direct mul_vec
equivalent so this instruction is implemented with the help of 5 others,
vmuleu, vmulou, vmrgh, vmrgl and vpkum).
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20221019125040.48028-2-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Thu, 6 Oct 2022 20:06:54 +0000 (17:06 -0300)]
target/ppc: move msgsync to decodetree
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <
20221006200654.725390-7-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Thu, 6 Oct 2022 20:06:53 +0000 (17:06 -0300)]
target/ppc: move msgclrp/msgsndp to decodetree
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <
20221006200654.725390-6-matheus.ferst@eldorado.org.br>
[danielhb: ppc32 build fix in trans_(MSGCLRP|MSGSNDP)]
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Thu, 6 Oct 2022 20:06:52 +0000 (17:06 -0300)]
target/ppc: move msgclr/msgsnd to decodetree
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <
20221006200654.725390-5-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Thu, 6 Oct 2022 20:06:51 +0000 (17:06 -0300)]
target/ppc: fix REQUIRE_HV macro definition
The macro is missing a '{' after the if condition. Any use of REQUIRE_HV
would cause a compilation error.
Fixes: fc34e81acd51 ("target/ppc: add macros to check privilege level")
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221006200654.725390-4-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Thu, 6 Oct 2022 20:06:50 +0000 (17:06 -0300)]
target/ppc: fix msgsync insns flags
This instruction was added by Power ISA 3.0, using PPC2_PRCNTL makes it
available for older processors, like de e5500 and e6500.
Fixes: 7af1e7b02264 ("target/ppc: add support for hypervisor doorbells on book3s CPUs")
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221006200654.725390-3-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Matheus Ferst [Thu, 6 Oct 2022 20:06:49 +0000 (17:06 -0300)]
target/ppc: fix msgclr/msgsnd insns flags
On Power ISA v2.07, the category for these instructions became
"Embedded.Processor Control" or "Book S".
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <
20221006200654.725390-2-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Stefan Hajnoczi [Wed, 26 Oct 2022 14:53:48 +0000 (10:53 -0400)]
Merge tag 'dump-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging
dump queue
Hi
The "dump" queue, with:
- [PATCH v3/v4 0/9] dump: Cleanup and consolidation
- [PATCH v4 0/4] dump: add 32-bit guest Windows support
# -----BEGIN PGP SIGNATURE-----
#
# iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmNY9gMcHG1hcmNhbmRy
# ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5ZUtD/kByfamsq/8hnS6N/ok
# xs9kXO+HZA1A1Kng19RjYWbTka1LpEAf6y6tPtV27l5rWJZxCgqFp3Q2VKQyzAxl
# Bcf4gvEhUDJI87jHrZ8WBJ0JvPL8pKNjPn4JUPOQO+6kX8A/3XTwAyvH/T3uxlTo
# I+4HLwY0EkJ6NU6Cokud5Uo36Zj7JghKrBxTDrd3NC0qSy8xOoIsB5Pbp2PVKuX2
# F5Zfll3F+NUDsj9zmMR6agP4PBUJUB680TtvMpMZXb2BXumKDLngthCLRtGrgsDh
# ChjYr6xkRS9qlXn0PWIYsUyDucDuRFfqTz/Pa9OcGhQuQfIfQiGOM2IFQUE3UcuN
# OphJEFi44za3E7xEZziAGIFmro+k8zX2fjgN3+mApxpBjUAF/uzoW1VzIIdx65Gh
# H/IguECFu7AwMxPucRUI7PkwexgIcqpufeTRqep2nCFsAwS6bS+obzrAzIMd9kj1
# ApLhj36lkub0Tn77B8bkf1TYJnpBcYbGZpmPCILtOxpBZGlXm++KD1DKAYt6rbnR
# 8rQugZNRzEB92aSRTkLJ6QKsqudnbR9ssGbOdEJP+v1fgVtFzYbgygx5QMezGkRw
# vRLWrNbDLog+uYpI2Kb30ItU7+bsDrads9n/gqiGvTP887T3alCtRdIq+Fb28oor
# tSBhBMqMOtccMy3k+EoXBXX5gw==
# =BUEY
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 26 Oct 2022 04:55:31 EDT
# gpg: using RSA key
87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg: issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* tag 'dump-pull-request' of https://gitlab.com/marcandre.lureau/qemu:
dump/win_dump: limit number of processed PRCBs
s390x: pv: Add dump support
s390x: Add KVM PV dump interface
include/elf.h: add s390x note types
s390x: Introduce PV query interface
s390x: Add protected dump cap
dump: Add architecture section and section string table support
dump: Reintroduce memory_offset and section_offset
dump: Reorder struct DumpState
dump: Write ELF section headers right after ELF header
dump: Use a buffer for ELF section data and headers
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Wed, 26 Oct 2022 14:53:41 +0000 (10:53 -0400)]
Merge tag 'pull-tcg-
20221026' of https://gitlab.com/rth7680/qemu into staging
Revert incorrect cflags initialization.
Add direct jumps for tcg/loongarch64.
Speed up breakpoint check.
Improve assertions for atomic.h.
Move restore_state_to_opc to TCGCPUOps.
Cleanups to TranslationBlock maintenance.
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmNYlo4dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV9y2wf9EKsCA6VtYI2Qtftf
# q/ujYFmUf8AKTb9eVcA0XX71CT1dEnFR7GQyT8B8X13x0pSbOX7tbEWHPreegTFV
# tESiejvymi6Q9devAB58GVwNoU/zPIQQGhCPxkVUKDmRztJz22MbGUzd7UKPPgU8
# 2nVMkIpLTMBsKeFLxE/D3ZntmdKsgyI/1Dtkl9TxvlDGsCbMjbNcr8lM+TLaG2oX
# GZhFyJHKEVy0cobukvhhb/9rU7AWdG/BnFmZM16JxvHV/YCwJBx3Udhcy9xPePUU
# yIjkGsUAq4aB6H9RFuTWh7GmaY5u6gMbTTi2J7hDos0mzauYJtpgEB/H42LpycGE
# sOhkLQ==
# =DUb8
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 25 Oct 2022 22:08:14 EDT
# gpg: using RSA key
7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* tag 'pull-tcg-
20221026' of https://gitlab.com/rth7680/qemu: (47 commits)
accel/tcg: Remove restore_state_to_opc function
target/xtensa: Convert to tcg_ops restore_state_to_opc
target/tricore: Convert to tcg_ops restore_state_to_opc
target/sparc: Convert to tcg_ops restore_state_to_opc
target/sh4: Convert to tcg_ops restore_state_to_opc
target/s390x: Convert to tcg_ops restore_state_to_opc
target/rx: Convert to tcg_ops restore_state_to_opc
target/riscv: Convert to tcg_ops restore_state_to_opc
target/ppc: Convert to tcg_ops restore_state_to_opc
target/openrisc: Convert to tcg_ops restore_state_to_opc
target/nios2: Convert to tcg_ops restore_state_to_opc
target/mips: Convert to tcg_ops restore_state_to_opc
target/microblaze: Convert to tcg_ops restore_state_to_opc
target/m68k: Convert to tcg_ops restore_state_to_opc
target/loongarch: Convert to tcg_ops restore_state_to_opc
target/i386: Convert to tcg_ops restore_state_to_opc
target/hppa: Convert to tcg_ops restore_state_to_opc
target/hexagon: Convert to tcg_ops restore_state_to_opc
target/cris: Convert to tcg_ops restore_state_to_opc
target/avr: Convert to tcg_ops restore_state_to_opc
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Wed, 26 Oct 2022 14:04:05 +0000 (10:04 -0400)]
Merge tag 'pull-aspeed-
20221025' of https://github.com/legoater/qemu into staging
aspeed queue :
* Performance improvement with Object class caching
* Serial Flash Discovery Parameters support for m25p80 device
* Various small adjustments on intructions and models
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmNX/WEACgkQUaNDx8/7
# 7KFhERAAhrcLcv15ny8RwatHPjzU00ZPQ0PcxGj1VDT66pCVh6M+rIeRPB2scOey
# Pu8jUvIYJ8w7ozjAP6YTQ1MP/WufniVi91Bx+vs/okSiWZa4dP0/G7NQWoc1at0s
# NBlkg57l1GMEeQb5x8vC1DizTQ1Z8Q8J/Ur3uXukXCmYVJAwHYpl/Foob1IPFgh8
# UcJ55LyuRq99lS8ib6HvRftAsC3DOcA/sl3b/TYR2+iKyi1VS2aZoQzxVCavSBcz
# PoTonT9O4OvIQthAgXRwpylW/aMYU3I7FeyOMKlCNLbmJ8LpVbX2v0KN3WBvWBv4
# OWP0DiqPUuoWFHLUGKbiVOgWQrTQXZyoD70SD/ObE1oMTLmeBoD1oFizQDvokHAR
# g2+gMdWnuWcbyaofY7YwuI6qz22gbrgh8JqX6sEWRDnY7HgCUvPhCsmci+bdN5cf
# dGcE8YKi7aD5gzoU9LRziPlhbwaEsgYLpYS7aGfNcmypgeq6lmNG7xKyw911zCTY
# uqDZWOUJy0tUIUTxoz3o1/KtsTFugjuZ+9W1SxELptJR37iwlP1vumf6bduwcx/3
# ba8tzNoXecXO5Icmq5P3lMNVM/abpkDDKS66HA87mABLEd/eCD0ojR9Kfxo0mD74
# kmQK3MFfJPkTu0ddu1cWhCIgTO7EuLuZL7gzj1oxoeXiU3YcVh8=
# =u7pS
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 25 Oct 2022 11:14:41 EDT
# gpg: using RSA key
A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-aspeed-
20221025' of https://github.com/legoater/qemu:
arm/aspeed: Replace mx25l25635e chip model
m25p80: Add the w25q01jvq SFPD table
m25p80: Add the w25q512jv SFPD table
m25p80: Add the w25q256 SFPD table
m25p80: Add the mx66l1g45g SFDP table
m25p80: Add the mx25l25635f SFPD table
m25p80: Add the mx25l25635e SFPD table
m25p80: Add erase size for mx25l25635e
m25p80: Add the n25q256a SFDP table
m25p80: Add basic support for the SFDP command
hw/arm/aspeed: increase Bletchley memory size
ast2600: Drop NEON from the CPU features
aspeed/smc: Cache AspeedSMCClass
ssi: cache SSIPeripheralClass to avoid GET_CLASS()
tests/avocado/machine_aspeed.py: Fix typos on buildroot
hw/i2c/aspeed: Fix old reg slave receive
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Viktor Prutyanov [Wed, 19 Oct 2022 23:59:48 +0000 (02:59 +0300)]
dump/win_dump: limit number of processed PRCBs
When number of CPUs utilized by guest Windows is less than defined in
QEMU (i.e., desktop versions of Windows severely limits number of CPU
sockets), patch_and_save_context routine accesses non-existent PRCB and
fails. So, limit number of processed PRCBs by NumberProcessors taken
from guest Windows driver.
Signed-off-by: Viktor Prutyanov <viktor.prutyanov@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <
20221019235948.656411-1-viktor.prutyanov@redhat.com>
Janosch Frank [Mon, 17 Oct 2022 08:38:22 +0000 (08:38 +0000)]
s390x: pv: Add dump support
Sometimes dumping a guest from the outside is the only way to get the
data that is needed. This can be the case if a dumping mechanism like
KDUMP hasn't been configured or data needs to be fetched at a specific
point. Dumping a protected guest from the outside without help from
fw/hw doesn't yield sufficient data to be useful. Hence we now
introduce PV dump support.
The PV dump support works by integrating the firmware into the dump
process. New Ultravisor calls are used to initiate the dump process,
dump cpu data, dump memory state and lastly complete the dump process.
The UV calls are exposed by KVM via the new KVM_PV_DUMP command and
its subcommands. The guest's data is fully encrypted and can only be
decrypted by the entity that owns the customer communication key for
the dumped guest. Also dumping needs to be allowed via a flag in the
SE header.
On the QEMU side of things we store the PV dump data in the newly
introduced architecture ELF sections (storage state and completion
data) and the cpu notes (for cpu dump data).
Users can use the zgetdump tool to convert the encrypted QEMU dump to an
unencrypted one.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Message-Id: <
20221017083822.43118-11-frankja@linux.ibm.com>
Janosch Frank [Mon, 17 Oct 2022 08:38:21 +0000 (08:38 +0000)]
s390x: Add KVM PV dump interface
Let's add a few bits of code which hide the new KVM PV dump API from
us via new functions.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
[ Marc-André: fix up for compilation issue ]
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <
20221017083822.43118-10-frankja@linux.ibm.com>
Janosch Frank [Mon, 17 Oct 2022 08:38:20 +0000 (08:38 +0000)]
include/elf.h: add s390x note types
Adding two s390x note types
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <
20221017083822.43118-9-frankja@linux.ibm.com>
Janosch Frank [Mon, 17 Oct 2022 08:38:19 +0000 (08:38 +0000)]
s390x: Introduce PV query interface
Introduce an interface over which we can get information about UV data.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <
20221017083822.43118-8-frankja@linux.ibm.com>
Janosch Frank [Mon, 17 Oct 2022 08:38:18 +0000 (08:38 +0000)]
s390x: Add protected dump cap
Add a protected dump capability for later feature checking.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Message-Id: <
20221017083822.43118-7-frankja@linux.ibm.com>
[ Marc-André - Add missing stubs when !kvm ]
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Richard Henderson [Mon, 24 Oct 2022 11:17:39 +0000 (21:17 +1000)]
accel/tcg: Remove restore_state_to_opc function
All targets have been updated. Use the tcg_ops target hook
exclusively, which allows the compat code to be removed.
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 11:08:38 +0000 (21:08 +1000)]
target/xtensa: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 11:06:03 +0000 (21:06 +1000)]
target/tricore: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 11:03:29 +0000 (21:03 +1000)]
target/sparc: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:58:40 +0000 (20:58 +1000)]
target/sh4: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:56:41 +0000 (20:56 +1000)]
target/s390x: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:52:08 +0000 (20:52 +1000)]
target/rx: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:49:27 +0000 (20:49 +1000)]
target/riscv: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:44:45 +0000 (20:44 +1000)]
target/ppc: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:40:30 +0000 (20:40 +1000)]
target/openrisc: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:36:57 +0000 (20:36 +1000)]
target/nios2: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:35:06 +0000 (20:35 +1000)]
target/mips: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:29:48 +0000 (20:29 +1000)]
target/microblaze: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:26:33 +0000 (20:26 +1000)]
target/m68k: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:24:10 +0000 (20:24 +1000)]
target/loongarch: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:18:03 +0000 (20:18 +1000)]
target/i386: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:13:57 +0000 (20:13 +1000)]
target/hppa: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:10:03 +0000 (20:10 +1000)]
target/hexagon: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:08:21 +0000 (20:08 +1000)]
target/cris: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 10:05:02 +0000 (20:05 +1000)]
target/avr: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 09:59:18 +0000 (19:59 +1000)]
target/arm: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 09:44:20 +0000 (19:44 +1000)]
target/alpha: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 24 Oct 2022 09:43:40 +0000 (19:43 +1000)]
accel/tcg: Add restore_state_to_opc to TCGCPUOps
Add a tcg_ops hook to replace the restore_state_to_opc
function call. Because these generic hooks cannot depend
on target-specific types, temporarily, copy the current
target_ulong data[] into uint64_t d64[].
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 4 Oct 2022 22:40:22 +0000 (15:40 -0700)]
accel/tcg: Simplify page_get/alloc_target_data
Since the only user, Arm MTE, always requires allocation,
merge the get and alloc functions to always produce a
non-null result. Also assume that the user has already
checked page validity.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 4 Oct 2022 22:24:36 +0000 (15:24 -0700)]
accel/tcg: Move TARGET_PAGE_DATA_SIZE impl to user-exec.c
Since "target data" is always user-only, move it out of
translate-all.c to user-exec.c.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 5 Oct 2022 19:56:46 +0000 (12:56 -0700)]
accel/tcg: Use tb_invalidate_phys_range in page_set_flags
Flush translation blocks in bulk, rather than page-by-page.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 5 Oct 2022 19:56:14 +0000 (12:56 -0700)]
accel/tcg: Use page_reset_target_data in page_set_flags
Use the existing function for clearing target data.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 5 Oct 2022 16:44:52 +0000 (09:44 -0700)]
accel/tcg: Call tb_invalidate_phys_page for PAGE_RESET
When PAGE_RESET is set, we are replacing pages with new
content, which means that we need to invalidate existing
cached data, such as TranslationBlocks. Perform the
reset invalidate while we're doing other invalidates,
which allows us to remove the separate invalidates from
the user-only mmap/munmap/mprotect routines.
In addition, restrict invalidation to PAGE_EXEC pages.
Since
cdf713085131, we have validated PAGE_EXEC is present
before translation, which means we can assume that if the
bit is not present, there are no translations to invalidate.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 5 Oct 2022 16:27:52 +0000 (09:27 -0700)]
accel/tcg: Use tb_invalidate_phys_page in page_set_flags
We do not require detection of overlapping TBs here,
so use the more appropriate function.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 5 Oct 2022 20:50:32 +0000 (13:50 -0700)]
accel/tcg: Unify declarations of tb_invalidate_phys_range
We missed this function when we introduced tb_page_addr_t.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 5 Oct 2022 16:26:26 +0000 (09:26 -0700)]
accel/tcg: Rename tb_invalidate_phys_page_range and drop end parameter
This function is is never called with a real range,
only for a single page. Drop the second parameter
and rename to tb_invalidate_phys_page.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 5 Oct 2022 16:18:39 +0000 (09:18 -0700)]
accel/tcg: Rename tb_invalidate_phys_page
Rename to tb_invalidate_phys_page_unwind to emphasize that
we also detect invalidating the current TB, and also to free
up that name for other usage.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 20 Sep 2022 11:21:40 +0000 (13:21 +0200)]
accel/tcg: Introduce tb_{set_}page_addr{0,1}
This data structure will be replaced for user-only: add accessors.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 20 Sep 2022 11:09:45 +0000 (13:09 +0200)]
accel/tcg: Remove duplicate store to tb->page_addr[]
When we added the fast path, we initialized page_addr[] early.
These stores in and around tb_page_add() are redundant; remove them.
Fixes: 50627f1b7b1 ("accel/tcg: Add fast path for translator_ld*")
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 20 Sep 2022 05:48:43 +0000 (07:48 +0200)]
accel/tcg: Drop cpu_get_tb_cpu_state from TARGET_HAS_PRECISE_SMC
The results of the calls to cpu_get_tb_cpu_state,
current_{pc,cs_base,flags}, are not used.
In tb_invalidate_phys_page, use bool for current_tb_modified.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Wed, 5 Oct 2022 22:08:34 +0000 (15:08 -0700)]
accel/tcg: Move assert_no_pages_locked to internal.h
There are no users outside of accel/tcg; this function
does not need to be defined in exec-all.h.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Tue, 20 Sep 2022 05:17:44 +0000 (07:17 +0200)]
accel/tcg: Split out tb-maint.c
Move all of the TranslationBlock flushing and page linking
code from translate-all.c to tb-maint.c.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Mon, 19 Sep 2022 10:28:15 +0000 (12:28 +0200)]
accel/tcg: Split out PageDesc to internal.h
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Sun, 18 Sep 2022 11:46:21 +0000 (13:46 +0200)]
accel/tcg: Remove disabled debug in translate-all.c
These items printf, and could be replaced with proper
tracepoints if we really cared.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Sat, 17 Sep 2022 12:25:12 +0000 (14:25 +0200)]
accel/tcg: Make page_alloc_target_data allocation constant
Use a constant target data allocation size for all pages.
This will be necessary to reduce overhead of page tracking.
Since TARGET_PAGE_DATA_SIZE is now required, we can use this
to omit data tracking for targets that don't require it.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Sat, 22 Oct 2022 13:04:11 +0000 (23:04 +1000)]
include/qemu/thread: Use qatomic_* functions
Use qatomic_*, which expands to __atomic_* in preference
to the "legacy" __sync_* functions.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Sat, 22 Oct 2022 12:05:16 +0000 (22:05 +1000)]
include/qemu/atomic: Use qemu_build_assert
Change from QEMU_BUILD_BUG_ON, which requires ifdefs to avoid
problematic code, to qemu_build_assert, which can use C ifs.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson [Sat, 22 Oct 2022 11:34:12 +0000 (21:34 +1000)]
include/qemu/osdep: Add qemu_build_assert
This differs from assert, in that with optimization enabled it
triggers at build-time. It differs from QEMU_BUILD_BUG_ON,
aka _Static_assert, in that it is sensitive to control flow
and is subject to dead-code elimination.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Leandro Lupori [Tue, 25 Oct 2022 20:24:22 +0000 (17:24 -0300)]
accel/tcg: Add a quicker check for breakpoints
Profiling QEMU during Fedora 35 for PPC64 boot revealed that a
considerable amount of time was being spent in
check_for_breakpoints() (0.61% of total time on PPC64 and 2.19% on
amd64), even though it was just checking that its queue was empty
and returning, when no breakpoints were set. It turns out this
function is not inlined by the compiler and it's always called by
helper_lookup_tb_ptr(), one of the most called functions.
By leaving only the check for empty queue in
check_for_breakpoints() and moving the remaining code to
check_for_breakpoints_slow(), called only when the queue is not
empty, it's possible to avoid the call overhead. An improvement of
about 3% in total time was measured on POWER9.
Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <
20221025202424.195984-2-leandro.lupori@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>