qemu.git
3 years agotarget/mips/tx79: Introduce PAND/POR/PXOR/PNOR opcodes (parallel logic)
Philippe Mathieu-Daudé [Sat, 13 Feb 2021 11:24:44 +0000 (12:24 +0100)]
target/mips/tx79: Introduce PAND/POR/PXOR/PNOR opcodes (parallel logic)

Introduce the parallel logic opcodes:

 - PAND (Parallel AND)
 - POR  (Parallel OR)
 - PXOR (Parallel XOR)
 - PNOR (Parallel NOR)

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210214175912.732946-16-f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
3 years agohw/pci-host/raven: Add PCI_IO_BASE_ADDR definition
Philippe Mathieu-Daudé [Fri, 16 Apr 2021 16:15:56 +0000 (18:15 +0200)]
hw/pci-host/raven: Add PCI_IO_BASE_ADDR definition

Rather than using the magic 0x80000000 number for the PCI I/O BAR
physical address on the main system bus, use a definition.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210417103028.601124-6-f4bug@amsat.org>

3 years agohw/pci-host: Rename Raven ASIC PCI bridge as raven.c
Philippe Mathieu-Daudé [Fri, 16 Apr 2021 16:18:58 +0000 (18:18 +0200)]
hw/pci-host: Rename Raven ASIC PCI bridge as raven.c

The ASIC PCI bridge chipset from Motorola is named 'Raven'.
This chipset is used in the PowerPC Reference Platform (PReP),
but not restricted to it. Rename it accordingly.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210417103028.601124-5-f4bug@amsat.org>

3 years agoMerge remote-tracking branch 'remotes/cminyard/tags/for-qemu-6.1-2' into staging
Peter Maydell [Sun, 11 Jul 2021 13:32:49 +0000 (14:32 +0100)]
Merge remote-tracking branch 'remotes/cminyard/tags/for-qemu-6.1-2' into staging

Some qemu updates for IPMI and I2C

Move some ADC file to where they belong and move some sensors to a
sensor directory, since with new BMCs coming in lots of different
sensors should be coming in.  Keep from cluttering things up.

Add support for I2C PMBus devices.

Replace the confusing and error-prone i2c_send_recv and i2c_transfer with
specific send and receive functions.  Several errors have already been
made with these, avoid any new errors.

Fix the watchdog_expired field in the IPMI watchdog, it's not a bool,
it's a u8.  After a vmstate transfer, the new value could be wrong.

# gpg: Signature made Fri 09 Jul 2021 17:25:04 BST
# gpg:                using RSA key FD0D5CE67CE0F59A6688268661F38C90919BFF81
# gpg: Good signature from "Corey Minyard <cminyard@mvista.com>" [unknown]
# gpg:                 aka "Corey Minyard <minyard@acm.org>" [unknown]
# gpg:                 aka "Corey Minyard <corey@minyard.net>" [unknown]
# gpg:                 aka "Corey Minyard <minyard@mvista.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: FD0D 5CE6 7CE0 F59A 6688  2686 61F3 8C90 919B FF81

* remotes/cminyard/tags/for-qemu-6.1-2: (24 commits)
  tests/qtest: add tests for MAX34451 device model
  hw/misc: add MAX34451 device
  tests/qtest: add tests for ADM1272 device model
  hw/misc: add ADM1272 device
  hw/i2c: add support for PMBus
  ipmi/sim: fix watchdog_expired data type error in IPMIBmcSim struct
  hw/i2c: Introduce i2c_start_recv() and i2c_start_send()
  hw/i2c: Extract i2c_do_start_transfer() from i2c_start_transfer()
  hw/i2c: Make i2c_start_transfer() direction argument a boolean
  hw/i2c: Rename i2c_set_slave_address() -> i2c_slave_set_address()
  hw/i2c: Remove confusing i2c_send_recv()
  hw/misc/auxbus: Replace i2c_send_recv() by i2c_recv() & i2c_send()
  hw/misc/auxbus: Replace 'is_write' boolean by its value
  hw/misc/auxbus: Explode READ_I2C / WRITE_I2C_MOT cases
  hw/misc/auxbus: Fix MOT/classic I2C mode
  hw/i2c/ppc4xx_i2c: Replace i2c_send_recv() by i2c_recv() & i2c_send()
  hw/i2c/ppc4xx_i2c: Add reference to datasheet
  hw/display/sm501: Replace i2c_send_recv() by i2c_recv() & i2c_send()
  hw/display/sm501: Simplify sm501_i2c_write() logic
  hw/input/lm832x: Define TYPE_LM8323 in public header
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agoMerge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20210709' into...
Peter Maydell [Sun, 11 Jul 2021 12:11:32 +0000 (13:11 +0100)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20210709' into staging

target-arm queue:
 * New machine type: stm32vldiscovery
 * hw/intc/arm_gicv3_cpuif: Fix virtual irq number check in icv_[dir|eoir]_write
 * hw/gpio/pl061: Honour Luminary PL061 PUR and PDR registers
 * virt: Fix implementation of GPIO-based powerdown/shutdown mechanism
 * Correct the encoding of MDCCSR_EL0 and DBGDSCRint
 * hw/intc: Improve formatting of MEMTX_ERROR guest error message

# gpg: Signature made Fri 09 Jul 2021 17:09:10 BST
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20210709:
  hw/intc: Improve formatting of MEMTX_ERROR guest error message
  target/arm: Correct the encoding of MDCCSR_EL0 and DBGDSCRint
  hw/arm/stellaris: Expand comment about handling of OLED chipselect
  hw/gpio/pl061: Document a shortcoming in our implementation
  hw/gpio/pl061: Convert to 3-phase reset and assert GPIO lines correctly on reset
  hw/arm/virt: Make PL061 GPIO lines pulled low, not high
  hw/gpio/pl061: Make pullup/pulldown of outputs configurable
  hw/gpio/pl061: Honour Luminary PL061 PUR and PDR registers
  hw/gpio/pl061: Document the interface of this device
  hw/gpio/pl061: Add tracepoints for register read and write
  hw/gpio/pl061: Clean up read/write offset handling logic
  hw/gpio/pl061: Convert DPRINTF to tracepoints
  hw/intc/arm_gicv3_cpuif: Fix virtual irq number check in icv_[dir|eoir]_write
  tests/boot-serial-test: Add STM32VLDISCOVERY board testcase
  docs/system: arm: Add stm32 boards description
  stm32vldiscovery: Add the STM32VLDISCOVERY Machine
  stm32f100: Add the stm32f100 SoC

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agoMerge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Peter Maydell [Sat, 10 Jul 2021 18:55:20 +0000 (19:55 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

- Make blockdev-reopen stable
- Remove deprecated qemu-img backing file without format
- rbd: Convert to coroutines and add write zeroes support
- rbd: Updated MAINTAINERS
- export/fuse: Allow other users access to the export
- vhost-user: Fix backends without multiqueue support
- Fix drive-backup transaction endless drained section

# gpg: Signature made Fri 09 Jul 2021 13:49:22 BST
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (28 commits)
  block: Make blockdev-reopen stable API
  iotests: Test reopening multiple devices at the same time
  block: Support multiple reopening with x-blockdev-reopen
  block: Acquire AioContexts during bdrv_reopen_multiple()
  block: Add bdrv_reopen_queue_free()
  qcow2: Fix dangling pointer after reopen for 'file'
  qemu-img: Improve error for rebase without backing format
  qemu-img: Require -F with -b backing image
  qcow2: Prohibit backing file changes in 'qemu-img amend'
  blockdev: fix drive-backup transaction endless drained section
  vhost-user: Fix backends without multiqueue support
  MAINTAINERS: add block/rbd.c reviewer
  block/rbd: fix type of task->complete
  iotests/fuse-allow-other: Test allow-other
  iotests/308: Test +w on read-only FUSE exports
  export/fuse: Let permissions be adjustable
  export/fuse: Give SET_ATTR_SIZE its own branch
  export/fuse: Add allow-other option
  export/fuse: Pass default_permissions for mount
  util/uri: do not check argument of uri_free()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agoMerge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.1-20210709' into staging
Peter Maydell [Sat, 10 Jul 2021 15:06:24 +0000 (16:06 +0100)]
Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.1-20210709' into staging

ppc patch queue 2021-07-09

Here's a (probably) final pull request before the qemu-6.1 soft
freeze.  Includes:
  * Implementation of the new H_RPT_INVALIDATE hypercall
  * Virtual Open Firmware for pSeries and pegasos2 machine types.
    This is an experimental minimal Open Firmware implementation which
    works by delegating nearly everything to qemu itself via a special
    hypercall.
  * A number of cleanups to the ppc soft MMU code
  * Fix to handling of two-level radix mode translations for the
    powernv machine type
  * Update the H_GET_CPU_CHARACTERISTICS call with newly defined bits.
    This will allow more flexible handling of possible future CPU
    Spectre-like flaws
  * Correctly treat mtmsrd as an illegal instruction on BookE cpus
  * Firmware update for the ppce500 machine type

# gpg: Signature made Fri 09 Jul 2021 06:16:42 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dg-gitlab/tags/ppc-for-6.1-20210709: (33 commits)
  target/ppc: Support for H_RPT_INVALIDATE hcall
  linux-headers: Update
  spapr: Fix implementation of Open Firmware client interface
  target/ppc: Don't compile ppc_tlb_invalid_all without TCG
  ppc/pegasos2: Implement some RTAS functions with VOF
  ppc/pegasos2: Fix use of && instead of &
  ppc/pegasos2: Use Virtual Open Firmware as firmware replacement
  target/ppc/spapr: Update H_GET_CPU_CHARACTERISTICS L1D cache flush bits
  target/ppc: Allow virtual hypervisor on CPU without HV
  ppc/pegasos2: Introduce Pegasos2MachineState structure
  target/ppc: mtmsrd is an illegal instruction on BookE
  spapr: Implement Open Firmware client interface
  docs/system: ppc: Update ppce500 documentation with eTSEC support
  roms/u-boot: Bump ppce500 u-boot to v2021.07 to add eTSEC support
  target/ppc: change ppc_hash32_xlate to use mmu_idx
  target/ppc: introduce mmu-books.h
  target/ppc: changed ppc_hash64_xlate to use mmu_idx
  target/ppc: fix address translation bug for radix mmus
  target/ppc: Fix compilation with DEBUG_BATS debug option
  target/ppc: Fix compilation with FLUSH_ALL_TLBS debug option
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agoMerge remote-tracking branch 'remotes/ehabkost-gl/tags/machine-next-pull-request...
Peter Maydell [Fri, 9 Jul 2021 16:58:38 +0000 (17:58 +0100)]
Merge remote-tracking branch 'remotes/ehabkost-gl/tags/machine-next-pull-request' into staging

Machine queue, 2021-07-07

Deprecation:
* Deprecate pmem=on with non-DAX capable backend file
  (Igor Mammedov)

Feature:
* virtio-mem: vfio support (David Hildenbrand)

Cleanup:
* vmbus: Don't make QOM property registration conditional
  (Eduardo Habkost)

# gpg: Signature made Thu 08 Jul 2021 20:55:04 BST
# gpg:                using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6
# gpg:                issuer "ehabkost@redhat.com"
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full]
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost-gl/tags/machine-next-pull-request:
  vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus
  virtio-mem: Require only coordinated discards
  softmmu/physmem: Extend ram_block_discard_(require|disable) by two discard types
  softmmu/physmem: Don't use atomic operations in ram_block_discard_(disable|require)
  vfio: Support for RamDiscardManager in the vIOMMU case
  vfio: Sanity check maximum number of DMA mappings with RamDiscardManager
  vfio: Query and store the maximum number of possible DMA mappings
  vfio: Support for RamDiscardManager in the !vIOMMU case
  virtio-mem: Implement RamDiscardManager interface
  virtio-mem: Don't report errors when ram_block_discard_range() fails
  virtio-mem: Factor out traversing unplugged ranges
  memory: Helpers to copy/free a MemoryRegionSection
  memory: Introduce RamDiscardManager for RAM memory regions
  Deprecate pmem=on with non-DAX capable backend file
  vmbus: Don't make QOM property registration conditional

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agohw/intc: Improve formatting of MEMTX_ERROR guest error message
Rebecca Cran [Tue, 6 Jul 2021 21:14:32 +0000 (15:14 -0600)]
hw/intc: Improve formatting of MEMTX_ERROR guest error message

Add a space in the message printed when gicr_read*/gicr_write* returns
MEMTX_ERROR in arm_gicv3_redist.c.

Signed-off-by: Rebecca Cran <rebecca@nuviainc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210706211432.31902-1-rebecca@nuviainc.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agotarget/arm: Correct the encoding of MDCCSR_EL0 and DBGDSCRint
hnick@vmware.com [Tue, 6 Jul 2021 13:44:57 +0000 (14:44 +0100)]
target/arm: Correct the encoding of MDCCSR_EL0 and DBGDSCRint

Signed-off-by: Nick Hudson <hnick@vmware.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agohw/arm/stellaris: Expand comment about handling of OLED chipselect
Peter Maydell [Fri, 2 Jul 2021 10:40:18 +0000 (11:40 +0100)]
hw/arm/stellaris: Expand comment about handling of OLED chipselect

The stellaris board doesn't emulate the handling of the OLED
chipselect line correctly.  Expand the comment describing this,
including a sketch of the theoretical correct way to do it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agohw/gpio/pl061: Document a shortcoming in our implementation
Peter Maydell [Fri, 2 Jul 2021 10:40:17 +0000 (11:40 +0100)]
hw/gpio/pl061: Document a shortcoming in our implementation

The Luminary PL061s in the Stellaris LM3S9695 don't all have the same
reset value for GPIOPUR.  We can get away with not letting the board
configure the PUR reset value because we don't actually wire anything
up to the lines which should reset to pull-up.  Add a comment noting
this omission.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
3 years agohw/gpio/pl061: Convert to 3-phase reset and assert GPIO lines correctly on reset
Peter Maydell [Fri, 2 Jul 2021 10:40:16 +0000 (11:40 +0100)]
hw/gpio/pl061: Convert to 3-phase reset and assert GPIO lines correctly on reset

The PL061 comes out of reset with all its lines configured as input,
which means they might need to be pulled to 0 or 1 depending on the
'pullups' and 'pulldowns' properties.  Currently we do not assert
these lines on reset; they will only be set whenever the guest first
touches a register that triggers a call to pl061_update().

Convert the device to three-phase reset so we have a place where we
can safely call qemu_set_irq() to set the floating lines to their
correct values.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
3 years agohw/arm/virt: Make PL061 GPIO lines pulled low, not high
Peter Maydell [Fri, 2 Jul 2021 10:40:15 +0000 (11:40 +0100)]
hw/arm/virt: Make PL061 GPIO lines pulled low, not high

For the virt board we have two PL061 devices -- one for NonSecure which
is inputs only, and one for Secure which is outputs only. For the former,
we don't care whether its outputs are pulled low or high when the line is
configured as an input, because we don't connect them. For the latter,
we do care, because we wire the lines up to the gpio-pwr device, which
assumes that level 1 means "do the action" and 1 means "do nothing".
For consistency in case we add more outputs in future, configure both
PL061s to pull GPIO lines down to 0.

Reported-by: Maxim Uvarov <maxim.uvarov@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
3 years agohw/gpio/pl061: Make pullup/pulldown of outputs configurable
Peter Maydell [Fri, 2 Jul 2021 10:40:14 +0000 (11:40 +0100)]
hw/gpio/pl061: Make pullup/pulldown of outputs configurable

The PL061 GPIO does not itself include pullup or pulldown resistors
to set the value of a GPIO line treated as an output when it is
configured as an input (ie when the PL061 itself is not driving it).
In real hardware it is up to the board to add suitable pullups or
pulldowns.  Currently our implementation hardwires this to "outputs
pulled high", which is correct for some boards (eg the realview ones:
see figure 3-29 in the "RealView Platform Baseboard for ARM926EJ-S
User Guide" DUI0224I), but wrong for others.

In particular, the wiring in the 'virt' board and the gpio-pwr device
assumes that wires should be pulled low, because otherwise the
pull-to-high will trigger a shutdown or reset action.  (The only
reason this doesn't happen immediately on startup is due to another
bug in the PL061, where we don't assert the GPIOs to the correct
value on reset, but will do so as soon as the guest touches a
register and pl061_update() gets called.)

Add properties to the pl061 so the board can configure whether it
wants GPIO lines to have pullup, pulldown, or neither.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
3 years agohw/gpio/pl061: Honour Luminary PL061 PUR and PDR registers
Peter Maydell [Fri, 2 Jul 2021 10:40:13 +0000 (11:40 +0100)]
hw/gpio/pl061: Honour Luminary PL061 PUR and PDR registers

The Luminary variant of the PL061 has registers GPIOPUR and GPIOPDR
which lets the guest configure whether the GPIO lines are pull-up,
pull-down, or truly floating. Instead of assuming all lines are pulled
high, honour the PUR and PDR registers.

For the plain PL061, continue to assume that lines have an external
pull-up resistor, as we did before.

The stellaris board actually relies on this behaviour -- the CD line
of the ssd0323 display device is connected to GPIO output C7, and it
is only because of a different bug which we're about to fix that we
weren't incorrectly driving this line high on reset and putting the
ssd0323 into data mode.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
3 years agohw/gpio/pl061: Document the interface of this device
Peter Maydell [Fri, 2 Jul 2021 10:40:12 +0000 (11:40 +0100)]
hw/gpio/pl061: Document the interface of this device

Add a comment documenting the "QEMU interface" of this device:
which MMIO regions, IRQ lines, GPIO lines, etc it exposes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
3 years agohw/gpio/pl061: Add tracepoints for register read and write
Peter Maydell [Fri, 2 Jul 2021 10:40:11 +0000 (11:40 +0100)]
hw/gpio/pl061: Add tracepoints for register read and write

Add tracepoints for reads and writes to the PL061 registers. This requires
restructuring pl061_read() to only return after the tracepoint, rather
than having lots of early-returns.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
3 years agohw/gpio/pl061: Clean up read/write offset handling logic
Peter Maydell [Fri, 2 Jul 2021 10:40:10 +0000 (11:40 +0100)]
hw/gpio/pl061: Clean up read/write offset handling logic

Currently the pl061_read() and pl061_write() functions handle offsets
using a combination of three if() statements and a switch().  Clean
this up to use just a switch, using case ranges.

This requires that instead of catching accesses to the luminary-only
registers on a stock PL061 via a check on s->rsvd_start we use
an "is this luminary?" check in the cases for each luminary-only
register.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
3 years agohw/gpio/pl061: Convert DPRINTF to tracepoints
Peter Maydell [Fri, 2 Jul 2021 10:40:09 +0000 (11:40 +0100)]
hw/gpio/pl061: Convert DPRINTF to tracepoints

Convert the use of the DPRINTF debug macro in the PL061 model to
use tracepoints.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
3 years agohw/intc/arm_gicv3_cpuif: Fix virtual irq number check in icv_[dir|eoir]_write
Ricardo Koller [Fri, 2 Jul 2021 23:37:01 +0000 (16:37 -0700)]
hw/intc/arm_gicv3_cpuif: Fix virtual irq number check in icv_[dir|eoir]_write

icv_eoir_write() and icv_dir_write() ignore invalid virtual IRQ numbers
(like LPIs).  The issue is that these functions check against the number
of implemented IRQs (QEMU's default is num_irq=288) which can be lower
than the maximum virtual IRQ number (1020 - 1).  The consequence is that
if a hypervisor creates an LR for an IRQ between 288 and 1020, then the
guest is unable to deactivate the resulting IRQ. Note that other
functions that deal with large IRQ numbers, like icv_iar_read, check
against 1020 and not against num_irq.

Fix the checks by using GICV3_MAXIRQ (1020) instead of the number of
implemented IRQs.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Message-id: 20210702233701.3369-1-ricarkol@google.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agotests/boot-serial-test: Add STM32VLDISCOVERY board testcase
Alexandre Iooss [Thu, 17 Jun 2021 16:56:47 +0000 (18:56 +0200)]
tests/boot-serial-test: Add STM32VLDISCOVERY board testcase

New mini-kernel test for STM32VLDISCOVERY USART1.

Signed-off-by: Alexandre Iooss <erdnaxe@crans.org>
Acked-by: Thomas Huth <thuth@redhat.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210617165647.2575955-5-erdnaxe@crans.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agodocs/system: arm: Add stm32 boards description
Alexandre Iooss [Thu, 17 Jun 2021 16:56:46 +0000 (18:56 +0200)]
docs/system: arm: Add stm32 boards description

This adds the target guide for Netduino 2, Netduino Plus 2 and STM32VLDISCOVERY.

Signed-off-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210617165647.2575955-4-erdnaxe@crans.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agostm32vldiscovery: Add the STM32VLDISCOVERY Machine
Alexandre Iooss [Thu, 17 Jun 2021 16:56:45 +0000 (18:56 +0200)]
stm32vldiscovery: Add the STM32VLDISCOVERY Machine

This is a Cortex-M3 based machine. Information can be found at:
https://www.st.com/en/evaluation-tools/stm32vldiscovery.html

Signed-off-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210617165647.2575955-3-erdnaxe@crans.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agostm32f100: Add the stm32f100 SoC
Alexandre Iooss [Thu, 17 Jun 2021 16:56:44 +0000 (18:56 +0200)]
stm32f100: Add the stm32f100 SoC

This SoC is similar to stm32f205 SoC.
This will be used by the STM32VLDISCOVERY to create a machine.

Signed-off-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210617165647.2575955-2-erdnaxe@crans.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agoMerge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
Peter Maydell [Fri, 9 Jul 2021 13:30:01 +0000 (14:30 +0100)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc,pci,virtio: bugfixes, improvements

vhost-user-rng support.
Fixes all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Wed 07 Jul 2021 14:29:30 BST
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  MAINTAINERS: Add maintainer for vhost-user RNG implementation
  docs: add slot when adding new PCIe root port
  acpi/ged: fix reset cause
  tests: acpi: pc: update expected DSDT blobs
  acpi: pc: revert back to v5.2 PCI slot enumeration
  tests: acpi: prepare for changing DSDT tables
  migration: failover: reset partially_hotplugged
  virtio-pci: Changed return values for "notify", "device" and "isr" read.
  virtio-pci: Added check for virtio device in PCI config cbs.
  virtio-pci: Added check for virtio device presence in mm callbacks.
  hw/pci-host/q35: Ignore write of reserved PCIEXBAR LENGTH field
  virtio: Clarify MR transaction optimization
  virtio: disable ioeventfd for record/replay

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agoblock: Make blockdev-reopen stable API
Alberto Garcia [Thu, 8 Jul 2021 11:47:09 +0000 (13:47 +0200)]
block: Make blockdev-reopen stable API

This patch drops the 'x-' prefix from x-blockdev-reopen.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210708114709.206487-7-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoiotests: Test reopening multiple devices at the same time
Alberto Garcia [Thu, 8 Jul 2021 11:47:08 +0000 (13:47 +0200)]
iotests: Test reopening multiple devices at the same time

This test swaps the images used by two active block devices.

This is now possible thanks to the new ability to run
x-blockdev-reopen on multiple devices at the same time.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210708114709.206487-6-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoblock: Support multiple reopening with x-blockdev-reopen
Alberto Garcia [Thu, 8 Jul 2021 11:47:07 +0000 (13:47 +0200)]
block: Support multiple reopening with x-blockdev-reopen

[ kwolf: Fixed AioContext locking ]

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210708114709.206487-5-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoblock: Acquire AioContexts during bdrv_reopen_multiple()
Kevin Wolf [Thu, 8 Jul 2021 11:47:06 +0000 (13:47 +0200)]
block: Acquire AioContexts during bdrv_reopen_multiple()

As the BlockReopenQueue can contain nodes in multiple AioContexts, only
one of which may be locked when AIO_WAIT_WHILE() can be called, we can't
let the caller lock the right contexts. Instead, individually lock the
AioContext of a single node when iterating the queue.

Reintroduce bdrv_reopen() as a wrapper for reopening a single node that
drains the node and temporarily drops the AioContext lock for
bdrv_reopen_multiple().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210708114709.206487-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoblock: Add bdrv_reopen_queue_free()
Alberto Garcia [Thu, 8 Jul 2021 11:47:05 +0000 (13:47 +0200)]
block: Add bdrv_reopen_queue_free()

Move the code to free a BlockReopenQueue to a separate function.
It will be used in a subsequent patch.

[ kwolf: Also free explicit_options and options, and explicitly
  qobject_ref() the value when it continues to be used. This makes
  future memory leaks less likely. ]

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210708114709.206487-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoqcow2: Fix dangling pointer after reopen for 'file'
Kevin Wolf [Thu, 8 Jul 2021 11:47:04 +0000 (13:47 +0200)]
qcow2: Fix dangling pointer after reopen for 'file'

Without an external data file, s->data_file is a second pointer with the
same value as bs->file. When changing bs->file to a different BdrvChild
and freeing the old BdrvChild, s->data_file must also be updated,
otherwise it points to freed memory and causes crashes.

This problem was caught by iotests case 245.

Fixes: df2b7086f169239ebad5d150efa29c9bb6d4f820
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210708114709.206487-2-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoqemu-img: Improve error for rebase without backing format
Eric Blake [Thu, 8 Jul 2021 15:52:28 +0000 (10:52 -0500)]
qemu-img: Improve error for rebase without backing format

When removeing support for qemu-img being able to create backing
chains without embedded backing formats, we caused a poor error
message as caught by iotest 114.  Improve the situation to inform the
user what went wrong.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210708155228.2666172-1-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoqemu-img: Require -F with -b backing image
Eric Blake [Mon, 3 May 2021 21:36:00 +0000 (14:36 -0700)]
qemu-img: Require -F with -b backing image

Back in commit d9f059aa6c (qemu-img: Deprecate use of -b without -F),
we deprecated the ability to create a file with a backing image that
requires qemu to perform format probing.  Qemu can still probe older
files for backwards compatibility, but it is time to finish off the
ability to create such images, due to the potential security risk they
present.  Update a couple of iotests affected by the change.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210503213600.569128-3-eblake@redhat.com>
Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoqcow2: Prohibit backing file changes in 'qemu-img amend'
Eric Blake [Mon, 3 May 2021 21:35:59 +0000 (14:35 -0700)]
qcow2: Prohibit backing file changes in 'qemu-img amend'

This was deprecated back in bc5ee6da7 (qcow2: Deprecate use of
qemu-img amend to change backing file), and no one in the meantime has
given any reasons why it should be supported.  Time to make change
attempts a hard error (but for convenience, specifying the _same_
backing chain is not forbidden).  Update a couple of iotests to match.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210503213600.569128-2-eblake@redhat.com>
Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoblockdev: fix drive-backup transaction endless drained section
Vladimir Sementsov-Ogievskiy [Tue, 8 Jun 2021 17:18:52 +0000 (10:18 -0700)]
blockdev: fix drive-backup transaction endless drained section

drive_backup_prepare() does bdrv_drained_begin() in hope that
bdrv_drained_end() will be called in drive_backup_clean(). Still we
need to set state->bs for this to work. That's done too late: a lot of
failure paths in drive_backup_prepare() miss setting state->bs. Fix
that.

Fixes: 2288ccfac96281c316db942d10e3f921c1373064
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/399
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210608171852.250775-1-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agovhost-user: Fix backends without multiqueue support
Kevin Wolf [Mon, 5 Jul 2021 17:14:29 +0000 (19:14 +0200)]
vhost-user: Fix backends without multiqueue support

dev->max_queues was never initialised for backends that don't support
VHOST_USER_PROTOCOL_F_MQ, so it would use 0 as the maximum number of
queues to check against and consequently fail for any such backend.

Set it to 1 if the backend doesn't have multiqueue support.

Fixes: c90bd505a3e8210c23d69fecab9ee6f56ec4a161
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210705171429.29286-1-kwolf@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoMAINTAINERS: add block/rbd.c reviewer
Peter Lieven [Wed, 7 Jul 2021 18:04:49 +0000 (20:04 +0200)]
MAINTAINERS: add block/rbd.c reviewer

adding myself as a designated reviewer.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <20210707180449.32665-2-pl@kamp.de>
Acked-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoblock/rbd: fix type of task->complete
Peter Lieven [Wed, 7 Jul 2021 18:04:48 +0000 (20:04 +0200)]
block/rbd: fix type of task->complete

task->complete is a bool not an integer.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <20210707180449.32665-1-pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoiotests/fuse-allow-other: Test allow-other
Max Reitz [Fri, 25 Jun 2021 14:23:17 +0000 (16:23 +0200)]
iotests/fuse-allow-other: Test allow-other

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210625142317.271673-7-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoiotests/308: Test +w on read-only FUSE exports
Max Reitz [Fri, 25 Jun 2021 14:23:16 +0000 (16:23 +0200)]
iotests/308: Test +w on read-only FUSE exports

Test that +w on read-only FUSE exports returns an EROFS error.  u+x on
the other hand should work.  (There is no special reason to choose u+x
here, it simply is like +w another flag that is not set by default.)

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210625142317.271673-6-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoexport/fuse: Let permissions be adjustable
Max Reitz [Fri, 25 Jun 2021 14:23:15 +0000 (16:23 +0200)]
export/fuse: Let permissions be adjustable

Allow changing the file mode, UID, and GID through SETATTR.

Without allow_other, UID and GID are not allowed to be changed, because
it would not make sense.  Also, changing group or others' permissions
is not allowed either.

For read-only exports, +w cannot be set.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210625142317.271673-5-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoexport/fuse: Give SET_ATTR_SIZE its own branch
Max Reitz [Fri, 25 Jun 2021 14:23:14 +0000 (16:23 +0200)]
export/fuse: Give SET_ATTR_SIZE its own branch

In order to support changing other attributes than the file size in
fuse_setattr(), we have to give each its own independent branch.  This
also applies to the only attribute we do support right now.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210625142317.271673-4-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoexport/fuse: Add allow-other option
Max Reitz [Fri, 25 Jun 2021 14:23:13 +0000 (16:23 +0200)]
export/fuse: Add allow-other option

Without the allow_other mount option, no user (not even root) but the
one who started qemu/the storage daemon can access the export.  Allow
users to configure the export such that such accesses are possible.

While allow_other is probably what users want, we cannot make it an
unconditional default, because passing it is only possible (for non-root
users) if the global fuse.conf configuration file allows it.  Thus, the
default is an 'auto' mode, in which we first try with allow_other, and
then fall back to without.

FuseExport.allow_other reports whether allow_other was actually used as
a mount option or not.  Currently, this information is not used, but a
future patch will let this field decide whether e.g. an export's UID and
GID can be changed through chmod.

One notable thing about 'auto' mode is that libfuse may print error
messages directly to stderr, and so may fusermount (which it executes).
Our export code cannot really filter or hide them.  Therefore, if 'auto'
fails its first attempt and has to fall back, fusermount will print an
error message that mounting with allow_other failed.

This behavior necessitates a change to iotest 308, namely we need to
filter out this error message (because if the first attempt at mounting
with allow_other succeeds, there will be no such message).

Furthermore, common.rc's _make_test_img should use allow-other=off for
FUSE exports, because iotests generally do not need to access images
from other users, so allow-other=on or allow-other=auto have no
advantage.  OTOH, allow-other=on will not work on systems where
user_allow_other is disabled, and with allow-other=auto, we get said
error message that we would need to filter out again.  Just disabling
allow-other is simplest.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210625142317.271673-3-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoexport/fuse: Pass default_permissions for mount
Max Reitz [Fri, 25 Jun 2021 14:23:12 +0000 (16:23 +0200)]
export/fuse: Pass default_permissions for mount

We do not do any permission checks in fuse_open(), so let the kernel do
them.  We already let fuse_getattr() report the proper UNIX permissions,
so this should work the way we want.

This causes a change in 308's reference output, because now opening a
non-writable export with O_RDWR fails already, instead of only actually
attempting to write to it.  (That is an improvement.)

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210625142317.271673-2-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoutil/uri: do not check argument of uri_free()
Heinrich Schuchardt [Tue, 29 Jun 2021 06:36:02 +0000 (08:36 +0200)]
util/uri: do not check argument of uri_free()

uri_free() checks if its argument is NULL in uri_clean() and g_free().
There is no need to check the argument before the call.

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Message-Id: <20210629063602.4239-1-xypron.glpk@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoblock/rbd: drop qemu_rbd_refresh_limits
Peter Lieven [Fri, 2 Jul 2021 17:23:56 +0000 (19:23 +0200)]
block/rbd: drop qemu_rbd_refresh_limits

librbd supports 1 byte alignment for all aio operations.

Currently, there is no API call to query limits from the Ceph
ObjectStore backend.  So drop the bdrv_refresh_limits completely
until there is such an API call.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Message-Id: <20210702172356.11574-7-idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoblock/rbd: add write zeroes support
Peter Lieven [Fri, 2 Jul 2021 17:23:55 +0000 (19:23 +0200)]
block/rbd: add write zeroes support

This patch wittingly sets BDRV_REQ_NO_FALLBACK and silently ignores
BDRV_REQ_MAY_UNMAP for older librbd versions.

The rationale for this is as follows (citing Ilya Dryomov current RBD
maintainer):

---8<---
a) remove the BDRV_REQ_MAY_UNMAP check in qemu_rbd_co_pwrite_zeroes()
   and as a consequence always unmap if librbd is too old

   It's not clear what qemu's expectation is but in general Write
   Zeroes is allowed to unmap.  The only guarantee is that subsequent
   reads return zeroes, everything else is a hint.  This is how it is
   specified in the kernel and in the NVMe spec.

   In particular, block/nvme.c implements it as follows:

   if (flags & BDRV_REQ_MAY_UNMAP) {
       cdw12 |= (1 << 25);
   }

   This sets the Deallocate bit.  But if it's not set, the device may
   still deallocate:

   """
   If the Deallocate bit (CDW12.DEAC) is set to '1' in a Write Zeroes
   command, and the namespace supports clearing all bytes to 0h in the
   values read (e.g., bits 2:0 in the DLFEAT field are set to 001b)
   from a deallocated logical block and its metadata (excluding
   protection information), then for each specified logical block, the
   controller:
   - should deallocate that logical block;

   ...

   If the Deallocate bit is cleared to '0' in a Write Zeroes command,
   and the namespace supports clearing all bytes to 0h in the values
   read (e.g., bits 2:0 in the DLFEAT field are set to 001b) from
   a deallocated logical block and its metadata (excluding protection
   information), then, for each specified logical block, the
   controller:
   - may deallocate that logical block;
   """

   https://nvmexpress.org/wp-content/uploads/NVM-Express-NVM-Command-Set-Specification-2021.06.02-Ratified-1.pdf

b) set BDRV_REQ_NO_FALLBACK in supported_zero_flags

   Again, it's not clear what qemu expects here, but without it we end
   up in a ridiculous situation where specifying the "don't allow slow
   fallback" switch immediately fails all efficient zeroing requests on
   a device where Write Zeroes is always efficient:

   $ qemu-io -c 'help write' | grep -- '-[zun]'
    -n, -- with -z, don't allow slow fallback
    -u, -- with -z, allow unmapping
    -z, -- write zeroes using blk_co_pwrite_zeroes

   $ qemu-io -f rbd -c 'write -z -u -n 0 1M' rbd:foo/bar
   write failed: Operation not supported
--->8---

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Message-Id: <20210702172356.11574-6-idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoblock/rbd: migrate from aio to coroutines
Peter Lieven [Fri, 2 Jul 2021 17:23:54 +0000 (19:23 +0200)]
block/rbd: migrate from aio to coroutines

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Message-Id: <20210702172356.11574-5-idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoblock/rbd: update s->image_size in qemu_rbd_getlength
Peter Lieven [Fri, 2 Jul 2021 17:23:53 +0000 (19:23 +0200)]
block/rbd: update s->image_size in qemu_rbd_getlength

While at it just call rbd_get_size and avoid rbd_image_info_t.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Message-Id: <20210702172356.11574-4-idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoblock/rbd: store object_size in BDRVRBDState
Peter Lieven [Fri, 2 Jul 2021 17:23:52 +0000 (19:23 +0200)]
block/rbd: store object_size in BDRVRBDState

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Message-Id: <20210702172356.11574-3-idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoblock/rbd: bump librbd requirement to luminous release
Peter Lieven [Fri, 2 Jul 2021 17:23:51 +0000 (19:23 +0200)]
block/rbd: bump librbd requirement to luminous release

Ceph Luminous (version 12.2.z) is almost 4 years old at this point.
Bump the requirement to get rid of the ifdef'ry in the code.
Qemu 6.1 dropped the support for RHEL-7 which was the last supported
OS that required an older librbd.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Message-Id: <20210702172356.11574-2-idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoblock/rbd: Add support for rbd image encryption
Or Ozeri [Sun, 27 Jun 2021 11:46:35 +0000 (14:46 +0300)]
block/rbd: Add support for rbd image encryption

Starting from ceph Pacific, RBD has built-in support for image-level encryption.
Currently supported formats are LUKS version 1 and 2.

There are 2 new relevant librbd APIs for controlling encryption, both expect an
open image context:

rbd_encryption_format: formats an image (i.e. writes the LUKS header)
rbd_encryption_load: loads encryptor/decryptor to the image IO stack

This commit extends the qemu rbd driver API to support the above.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
Message-Id: <20210627114635.39326-1-oro@il.ibm.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agoMAINTAINERS: update block/rbd.c maintainer
Ilya Dryomov [Wed, 19 May 2021 11:25:13 +0000 (13:25 +0200)]
MAINTAINERS: update block/rbd.c maintainer

Jason has moved on from working on RBD and Ceph.  I'm taking over
his role upstream.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Message-Id: <20210519112513.19694-1-idryomov@gmail.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
3 years agotarget/ppc: Support for H_RPT_INVALIDATE hcall
Bharata B Rao [Tue, 6 Jul 2021 11:24:40 +0000 (16:54 +0530)]
target/ppc: Support for H_RPT_INVALIDATE hcall

If KVM_CAP_RPT_INVALIDATE KVM capability is enabled, then

- indicate the availability of H_RPT_INVALIDATE hcall to the guest via
  ibm,hypertas-functions property.
- Enable the hcall

Both the above are done only if the new sPAPR machine capability
cap-rpt-invalidate is set.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Message-Id: <20210706112440.1449562-3-bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agolinux-headers: Update
Bharata B Rao [Tue, 6 Jul 2021 11:24:39 +0000 (16:54 +0530)]
linux-headers: Update

Update to mainline commit: 79160a603bdb ("Merge tag 'usb-5.14-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb"

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Message-Id: <20210706112440.1449562-2-bharata@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agospapr: Fix implementation of Open Firmware client interface
Alexey Kardashevskiy [Thu, 8 Jul 2021 06:56:25 +0000 (16:56 +1000)]
spapr: Fix implementation of Open Firmware client interface

This addresses the comments from v22.

The functional changes are (the VOF ones need retesting with Pegasos2):

(VOF) setprop will start failing if the machine class callback
did not handle it;
(VOF) unit addresses are lowered in path_offset();
(SPAPR) /chosen/bootargs is initialized from kernel_cmdline if
the client did not change it.

Fixes: 5c991e5d4378 ("spapr: Implement Open Firmware client interface")
Cc: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20210708065625.548396-1-aik@ozlabs.ru>
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Don't compile ppc_tlb_invalid_all without TCG
Lucas Mateus Castro (alqotel) [Thu, 8 Jul 2021 16:49:54 +0000 (13:49 -0300)]
target/ppc: Don't compile ppc_tlb_invalid_all without TCG

The function ppc_tlb_invalid_all is not compiled anymore in a TCG-less
environment, and the call to that function has been disabled in this
situation

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Message-Id: <20210708164957.28096-2-lucas.araujo@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pegasos2: Implement some RTAS functions with VOF
BALATON Zoltan [Thu, 8 Jul 2021 21:46:14 +0000 (23:46 +0200)]
ppc/pegasos2: Implement some RTAS functions with VOF

Linux uses RTAS functions to access PCI devices so we need to provide
these with VOF. Implement some of the most important functions to
allow booting Linux with VOF. With this the board is now usable
without a binary ROM image and we can enable it by default as other
boards.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20210708215113.B3F747456E3@zero.eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pegasos2: Fix use of && instead of &
David Gibson [Thu, 8 Jul 2021 05:40:21 +0000 (15:40 +1000)]
ppc/pegasos2: Fix use of && instead of &

This is obviously intended to be a mask, not a logical operation.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pegasos2: Use Virtual Open Firmware as firmware replacement
BALATON Zoltan [Sun, 27 Jun 2021 16:27:13 +0000 (18:27 +0200)]
ppc/pegasos2: Use Virtual Open Firmware as firmware replacement

The pegasos2 board comes with an Open Firmware compliant ROM based on
SmartFirmware but it has some changes that are not open source
therefore the ROM binary cannot be included in QEMU. Guests running on
the board however depend on services provided by the firmware. The
Virtual Open Firmware recently added to QEMU implements a minimal set
of these services to allow some guests to boot without the original
firmware. This patch adds VOF as the default firmware for pegasos2
which allows booting Linux and MorphOS via -kernel option while a ROM
image can still be used with -bios for guests that don't run with VOF.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <1d6ed6f290c5c1f0b5a1e1c51cf1151452d70d9a.1624811233.git.balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc/spapr: Update H_GET_CPU_CHARACTERISTICS L1D cache flush bits
Nicholas Piggin [Tue, 15 Jun 2021 04:41:07 +0000 (14:41 +1000)]
target/ppc/spapr: Update H_GET_CPU_CHARACTERISTICS L1D cache flush bits

There are several new L1D cache flush bits added to the hcall which reflect
hardware security features for speculative cache access issues.

These behaviours are now being specified as negative in order to simplify
patched kernel compatibility with older firmware (a new problem found in
existing systems would automatically be vulnerable).

[dwg: Technically this changes behaviour for existing machine types.
 After discussion with Nick, we've determined this is safe, because
 the worst that will happen if a guest gets the wrong information due
 to a migration is that it will perform some unnecessary workarounds,
 but will remain correct and secure (well, as secure as it was going
 to be anyway).  In addition the change only affects cap-cfpc=safe
 which is not enabled by default, and in fact is not possible to set
 on any current hardware (though it's expected it will be possible on
 POWER10)]

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20210615044107.1481608-1-npiggin@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Allow virtual hypervisor on CPU without HV
BALATON Zoltan [Sun, 27 Jun 2021 16:27:13 +0000 (18:27 +0200)]
target/ppc: Allow virtual hypervisor on CPU without HV

Change the assert in ppc_store_sdr1() to allow vhyp to be set on CPUs
without HV bit. This allows using the vhyp interface for firmware
emulation on pegasos2.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <21c7745aabbb68fcc50bb2ffaf16b939ba21261c.1624811233.git.balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoppc/pegasos2: Introduce Pegasos2MachineState structure
BALATON Zoltan [Sun, 27 Jun 2021 16:27:13 +0000 (18:27 +0200)]
ppc/pegasos2: Introduce Pegasos2MachineState structure

Add own machine state structure which will be used to store state
needed for firmware emulation.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <7f6d5fbf4f70c64dba001483174a2921dd616ecd.1624811233.git.balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: mtmsrd is an illegal instruction on BookE
Nicholas Piggin [Tue, 6 Jul 2021 05:13:21 +0000 (15:13 +1000)]
target/ppc: mtmsrd is an illegal instruction on BookE

MSR is a 32-bit register in BookE and there is no mtmsrd instruction.

Cc: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20210706051321.609046-1-npiggin@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agospapr: Implement Open Firmware client interface
Alexey Kardashevskiy [Fri, 25 Jun 2021 05:51:55 +0000 (15:51 +1000)]
spapr: Implement Open Firmware client interface

The PAPR platform describes an OS environment that's presented by
a combination of a hypervisor and firmware. The features it specifies
require collaboration between the firmware and the hypervisor.

Since the beginning, the runtime component of the firmware (RTAS) has
been implemented as a 20 byte shim which simply forwards it to
a hypercall implemented in qemu. The boot time firmware component is
SLOF - but a build that's specific to qemu, and has always needed to be
updated in sync with it. Even though we've managed to limit the amount
of runtime communication we need between qemu and SLOF, there's some,
and it has become increasingly awkward to handle as we've implemented
new features.

This implements a boot time OF client interface (CI) which is
enabled by a new "x-vof" pseries machine option (stands for "Virtual Open
Firmware). When enabled, QEMU implements the custom H_OF_CLIENT hcall
which implements Open Firmware Client Interface (OF CI). This allows
using a smaller stateless firmware which does not have to manage
the device tree.

The new "vof.bin" firmware image is included with source code under
pc-bios/. It also includes RTAS blob.

This implements a handful of CI methods just to get -kernel/-initrd
working. In particular, this implements the device tree fetching and
simple memory allocator - "claim" (an OF CI memory allocator) and updates
"/memory@0/available" to report the client about available memory.

This implements changing some device tree properties which we know how
to deal with, the rest is ignored. To allow changes, this skips
fdt_pack() when x-vof=on as not packing the blob leaves some room for
appending.

In absence of SLOF, this assigns phandles to device tree nodes to make
device tree traversing work.

When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.

This adds basic instances support which are managed by a hash map
ihandle -> [phandle].

Before the guest started, the used memory is:
0..e60 - the initial firmware
8000..10000 - stack
400000.. - kernel
3ea0000.. - initramdisk

This OF CI does not implement "interpret".

Unlike SLOF, this does not format uninitialized nvram. Instead, this
includes a disk image with pre-formatted nvram.

With this basic support, this can only boot into kernel directly.
However this is just enough for the petitboot kernel and initradmdisk to
boot from any possible source. Note this requires reasonably recent guest
kernel with:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735

The immediate benefit is much faster booting time which especially
crucial with fully emulated early CPU bring up environments. Also this
may come handy when/if GRUB-in-the-userspace sees light of the day.

This separates VOF and sPAPR in a hope that VOF bits may be reused by
other POWERPC boards which do not support pSeries.

This assumes potential support for booting from QEMU backends
such as blockdev or netdev without devices/drivers used.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20210625055155.2252896-1-aik@ozlabs.ru>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
[dwg: Adjusted some includes which broke compile in some more obscure
 compilation setups]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agodocs/system: ppc: Update ppce500 documentation with eTSEC support
Bin Meng [Tue, 6 Jul 2021 03:19:01 +0000 (11:19 +0800)]
docs/system: ppc: Update ppce500 documentation with eTSEC support

This adds eTSEC support to the PowerPC `ppce500` machine documentation.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoroms/u-boot: Bump ppce500 u-boot to v2021.07 to add eTSEC support
Bin Meng [Tue, 6 Jul 2021 02:46:41 +0000 (10:46 +0800)]
roms/u-boot: Bump ppce500 u-boot to v2021.07 to add eTSEC support

Update the QEMU shipped u-boot.e500 image built from U-Boot mainline
v2021.07 release, which added eTSEC support to the QEMU ppce500 target,
via the following U-Boot series:

  http://patchwork.ozlabs.org/project/uboot/list/?series=233875&state=*

The cross-compilation toolchain used to build the U-Boot image is:
https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/10.1.0/x86_64-gcc-10.1.0-nolibc-powerpc-linux.tar.xz

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: change ppc_hash32_xlate to use mmu_idx
Bruno Larsen (billionai) [Tue, 6 Jul 2021 15:03:16 +0000 (12:03 -0300)]
target/ppc: change ppc_hash32_xlate to use mmu_idx

Changed hash32 address translation to use the supplied mmu_idx, instead
of using what was stored in the msr, for parity purposes (radix64
already uses that) and for conceptual correctness, all the relevant
functions should always use the supplied mmu_idx, as there are no
guarantees that the mmu_idx stored in the CPU variable will not desync.

Signed-off-by: Bruno Larsen (billionai) <bruno.larsen@eldorado.org.br>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210706150316.21005-3-bruno.larsen@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: introduce mmu-books.h
Bruno Larsen (billionai) [Tue, 6 Jul 2021 15:03:15 +0000 (12:03 -0300)]
target/ppc: introduce mmu-books.h

Intrudoce a header common to all BookS MMUs, that can hold code that is
common to hash32 and book3s-v3 MMUs.

Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruno Larsen (billionai) <bruno.larsen@eldorado.org.br>
Message-Id: <20210706150316.21005-2-bruno.larsen@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: changed ppc_hash64_xlate to use mmu_idx
Bruno Larsen (billionai) [Mon, 28 Jun 2021 13:36:10 +0000 (10:36 -0300)]
target/ppc: changed ppc_hash64_xlate to use mmu_idx

Changed hash64 address translation to use the supplied mmu_idx instead
of using the one stored in the msr, for parity purposes (other book3s
MMUs already use it).

Signed-off-by: Bruno Larsen (billionai) <bruno.larsen@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210628133610.1143-4-bruno.larsen@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: fix address translation bug for radix mmus
Bruno Larsen (billionai) [Mon, 28 Jun 2021 13:36:08 +0000 (10:36 -0300)]
target/ppc: fix address translation bug for radix mmus

This commit attempts to fix a technical hiccup first mentioned by Richard
Henderson in
https://lists.nongnu.org/archive/html/qemu-devel/2021-05/msg06247.html

To sumarize the hiccup here, when radix-style mmus are translating an
address, they might need to call a second level of translation, with
hypervisor privileges. However, the way it was being done up until
this point meant that the second level translation had the same
privileges as the first level. It could lead to a bug in address
translation when running KVM inside a TCG guest, but this bug was never
experienced by users, so this isn't as much a bug fix as it is a
correctness cleanup.

This patch attempts that cleanup by making radix64_*_xlate functions
receive the mmu_idx, and passing one with the correct permission for the
second level translation.

The mmuidx macros added by this patch are only correct for non-bookE
mmus, because BookE style set the IS and DS bits inverted and there
might be other subtle differences. However, there doesn't seem to be
BookE cpus that have radix-style mmus, so we left a comment there to
document the issue, in case a machine does have that and was missed.

As part of this cleanup, we now need to send the correct mmmu_idx
when calling get_phys_page_debug, otherwise we might not be able to see the
memory that the CPU could

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bruno Larsen (billionai) <bruno.larsen@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210628133610.1143-2-bruno.larsen@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Fix compilation with DEBUG_BATS debug option
Fabiano Rosas [Fri, 2 Jul 2021 21:52:35 +0000 (18:52 -0300)]
target/ppc: Fix compilation with DEBUG_BATS debug option

../target/ppc/mmu-hash32.c: In function 'ppc_hash32_bat_lookup':
../target/ppc/mmu-hash32.c:204:13: error: 'BATu' undeclared (first use in this function);
  204 |             BATu = &BATut[i];
      |             ^~~~
      |             BATut
../target/ppc/mmu-hash32.c:205:13: error: 'BATl' undeclared (first use in this function);
  205 |             BATl = &BATlt[i];
      |             ^~~~
      |             BATlt
../target/ppc/mmu-hash32.c:206:13: error: 'BEPIu' undeclared (first use in this function)
  206 |             BEPIu = *BATu & BATU32_BEPIU;
      |             ^~~~~
../target/ppc/mmu-hash32.c:206:29: error: 'BATU32_BEPIU' undeclared (first use in this function);
  206 |             BEPIu = *BATu & BATU32_BEPIU;
      |                             ^~~~~~~~~~~~
      |                             BATU32_BEPI
../target/ppc/mmu-hash32.c:207:13: error: 'BEPIl' undeclared (first use in this function)
  207 |             BEPIl = *BATu & BATU32_BEPIL;
      |             ^~~~~
../target/ppc/mmu-hash32.c:207:29: error: 'BATU32_BEPIL' undeclared (first use in this function);
  207 |             BEPIl = *BATu & BATU32_BEPIL;
      |                             ^~~~~~~~~~~~
      |                             BATU32_BEPI
../target/ppc/mmu-hash32.c:208:13: error: 'bl' undeclared (first use in this function)
  208 |             bl = (*BATu & 0x00001FFC) << 15;
      |             ^~

Fixes: 9813279664 ("target-ppc: Disentangle BAT code for 32-bit hash MMUs")
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20210702215235.1941771-4-farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Fix compilation with FLUSH_ALL_TLBS debug option
Fabiano Rosas [Fri, 2 Jul 2021 21:52:34 +0000 (18:52 -0300)]
target/ppc: Fix compilation with FLUSH_ALL_TLBS debug option

../target/ppc/mmu_helper.c: In function 'helper_store_ibatu':
../target/ppc/mmu_helper.c:1802:17: error: unused variable 'cpu' [-Werror=unused-variable]
 1802 |     PowerPCCPU *cpu = env_archcpu(env);
      |                 ^~~
../target/ppc/mmu_helper.c: In function 'helper_store_dbatu':
../target/ppc/mmu_helper.c:1838:17: error: unused variable 'cpu' [-Werror=unused-variable]
 1838 |     PowerPCCPU *cpu = env_archcpu(env);
      |                 ^~~
../target/ppc/mmu_helper.c: In function 'helper_store_601_batu':
../target/ppc/mmu_helper.c:1874:17: error: unused variable 'cpu' [-Werror=unused-variable]
 1874 |     PowerPCCPU *cpu = env_archcpu(env);
      |                 ^~~
../target/ppc/mmu_helper.c: In function 'helper_store_601_batl':
../target/ppc/mmu_helper.c:1919:17: error: unused variable 'cpu' [-Werror=unused-variable]
 1919 |     PowerPCCPU *cpu = env_archcpu(env);

Fixes: db70b31144 ("target/ppc: Use env_cpu, env_archcpu")
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20210702215235.1941771-3-farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Fix compilation with DUMP_PAGE_TABLES debug option
Fabiano Rosas [Fri, 2 Jul 2021 21:52:33 +0000 (18:52 -0300)]
target/ppc: Fix compilation with DUMP_PAGE_TABLES debug option

../target/ppc/mmu_helper.c: In function 'get_segment_6xx_tlb':
../target/ppc/mmu_helper.c:514:46: error: passing argument 1 of
'ppc_hash32_hpt_mask' from incompatible pointer type [-Werror=incompatible-pointer-types]

  514 |                          ppc_hash32_hpt_mask(env) + 0x80);
      |                                              ^~~
      |                                              |
      |                                              CPUPPCState *

Fixes: 36778660d7 ("target/ppc: Eliminate htab_base and htab_mask variables")
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20210702215235.1941771-2-farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Restrict ppc_cpu_tlb_fill to TCG
Richard Henderson [Mon, 21 Jun 2021 12:51:14 +0000 (09:51 -0300)]
target/ppc: Restrict ppc_cpu_tlb_fill to TCG

This function is used by TCGCPUOps, and is thus TCG specific.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210621125115.67717-10-bruno.larsen@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Introduce ppc_xlate
Richard Henderson [Mon, 21 Jun 2021 12:51:13 +0000 (09:51 -0300)]
target/ppc: Introduce ppc_xlate

Create one common dispatch for all of the ppc_*_xlate functions.
Use ppc64_v3_radix to directly dispatch between ppc_radix64_xlate
and ppc_hash64_xlate.

Remove the separate *_handle_mmu_fault and *_get_phys_page_debug
functions, using common code for ppc_cpu_tlb_fill and
ppc_cpu_get_phys_page_debug.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210621125115.67717-9-bruno.larsen@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Split out ppc_jumbo_xlate
Richard Henderson [Mon, 21 Jun 2021 12:51:12 +0000 (09:51 -0300)]
target/ppc: Split out ppc_jumbo_xlate

Mirror the interface of ppc_radix64_xlate (mostly), putting all
of the logic for older mmu translation into a single entry point.
For booke, we need to add mmu_idx to the xlate-style interface.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210621125115.67717-8-bruno.larsen@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Split out ppc_hash32_xlate
Richard Henderson [Mon, 21 Jun 2021 12:51:11 +0000 (09:51 -0300)]
target/ppc: Split out ppc_hash32_xlate

Mirror the interface of ppc_radix64_xlate, putting all of
the logic for hash32 translation into a single entry point.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210621125115.67717-7-bruno.larsen@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Split out ppc_hash64_xlate
Richard Henderson [Mon, 21 Jun 2021 12:51:10 +0000 (09:51 -0300)]
target/ppc: Split out ppc_hash64_xlate

Mirror the interface of ppc_radix64_xlate, putting all of
the logic for hash64 translation into a single function.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210621125115.67717-6-bruno.larsen@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Use bool success for ppc_radix64_xlate
Richard Henderson [Mon, 21 Jun 2021 12:51:09 +0000 (09:51 -0300)]
target/ppc: Use bool success for ppc_radix64_xlate

Instead of returning non-zero for failure, return true for success.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210621125115.67717-5-bruno.larsen@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Push real-mode handling into ppc_radix64_xlate
Richard Henderson [Mon, 21 Jun 2021 12:51:08 +0000 (09:51 -0300)]
target/ppc: Push real-mode handling into ppc_radix64_xlate

This removes some incomplete duplication between
ppc_radix64_handle_mmu_fault and ppc_radix64_get_phys_page_debug.
The former was correct wrt SPR_HRMOR and the latter was not.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210621125115.67717-4-bruno.larsen@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Use MMUAccessType with *_handle_mmu_fault
Richard Henderson [Mon, 21 Jun 2021 12:51:07 +0000 (09:51 -0300)]
target/ppc: Use MMUAccessType with *_handle_mmu_fault

These changes were waiting until we didn't need to match
the function type of PowerPCCPUClass.handle_mmu_fault.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210621125115.67717-3-bruno.larsen@eldorado.org.br>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Remove PowerPCCPUClass.handle_mmu_fault
Richard Henderson [Mon, 21 Jun 2021 12:51:06 +0000 (09:51 -0300)]
target/ppc: Remove PowerPCCPUClass.handle_mmu_fault

Instead, use a switch on env->mmu_model.  This avoids some
replicated information in cpu setup.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210621125115.67717-2-bruno.larsen@eldorado.org.br>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agospapr: tune rtas-size
Alexey Kardashevskiy [Tue, 22 Jun 2021 07:03:36 +0000 (17:03 +1000)]
spapr: tune rtas-size

QEMU reserves space for RTAS via /rtas/rtas-size which tells the client
how much space the RTAS requires to work which includes the RTAS binary
blob implementing RTAS runtime. Because pseries supports FWNMI which
requires plenty of space, QEMU reserves more than 2KB which is
enough for the RTAS blob as it is just 20 bytes (under QEMU).

Since FWNMI reset delivery was added, RTAS_SIZE macro is not used anymore.
This replaces RTAS_SIZE with RTAS_MIN_SIZE and uses it in
the /rtas/rtas-size calculation to account for the RTAS blob.

Fixes: 0e236d347790 ("ppc/spapr: Implement FWNMI System Reset delivery")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20210622070336.1463250-1-aik@ozlabs.ru>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Drop PowerPCCPUClass::interrupts_big_endian()
Greg Kurz [Tue, 22 Jun 2021 14:09:26 +0000 (16:09 +0200)]
target/ppc: Drop PowerPCCPUClass::interrupts_big_endian()

This isn't used anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210622140926.677618-3-groug@kaod.org>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agotarget/ppc: Introduce ppc_interrupts_little_endian()
Greg Kurz [Tue, 22 Jun 2021 14:09:25 +0000 (16:09 +0200)]
target/ppc: Introduce ppc_interrupts_little_endian()

PowerPC CPUs use big endian by default but starting with POWER7,
server grade CPUs use the ILE bit of the LPCR special purpose
register to decide on the endianness to use when handling
interrupts. This gives a clue to QEMU on the endianness the
guest kernel is running, which is needed when generating an
ELF dump of the guest or when delivering an FWNMI machine
check interrupt.

Commit 382d2db62bcb ("target-ppc: Introduce callback for interrupt
endianness") added a class method to PowerPCCPUClass to modelize
this : default implementation returns a fixed "big endian" value,
while POWER7 and newer do the LPCR_ILE check. This is suboptimal
as it forces to implement the method for every new CPU family, and
it is very unlikely that this will ever be different than what we
have today.

We basically only have three cases to consider:
a) CPU doesn't have an LPCR => big endian
b) CPU has an LPCR but doesn't support the ILE bit => big endian
c) CPU has an LPCR and supports the ILE bit => little or big endian

Instead of class methods, introduce an inline helper that checks the
ILE bit in the LPCR_MASK to decide on the outcome. The new helper
words little endian instead of big endian. This allows to drop a !
operator in ppc_cpu_do_fwnmi_machine_check().

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210622140926.677618-2-groug@kaod.org>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
3 years agoMerge remote-tracking branch 'remotes/stefanha-gitlab/tags/block-pull-request' into...
Peter Maydell [Thu, 8 Jul 2021 21:17:28 +0000 (22:17 +0100)]
Merge remote-tracking branch 'remotes/stefanha-gitlab/tags/block-pull-request' into staging

Pull request

# gpg: Signature made Thu 08 Jul 2021 14:11:37 BST
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha-gitlab/tags/block-pull-request:
  block/io: Merge discard request alignments
  block: Add backend_defaults property
  block/file-posix: Optimize for macOS
  util/async: print leaked BH name when AioContext finalizes
  util/async: add a human-readable name to BHs for debugging

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
3 years agovfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus
David Hildenbrand [Tue, 13 Apr 2021 09:55:31 +0000 (11:55 +0200)]
vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus

We support coordinated discarding of RAM using the RamDiscardManager for
the VFIO_TYPE1 iommus. Let's unlock support for coordinated discards,
keeping uncoordinated discards (e.g., via virtio-balloon) disabled if
possible.

This unlocks virtio-mem + vfio on x86-64. Note that vfio used via "nvme://"
by the block layer has to be implemented/unlocked separately. For now,
virtio-mem only supports x86-64; we don't restrict RamDiscardManager to
x86-64, though: arm64 and s390x are supposed to work as well, and we'll
test once unlocking virtio-mem support. The spapr IOMMUs will need special
care, to be tackled later, e.g.., once supporting virtio-mem.

Note: The block size of a virtio-mem device has to be set to sane sizes,
depending on the maximum hotplug size - to not run out of vfio mappings.
The default virtio-mem block size is usually in the range of a couple of
MBs. The maximum number of mapping is 64k, shared with other users.
Assume you want to hotplug 256GB using virtio-mem - the block size would
have to be set to at least 8 MiB (resulting in 32768 separate mappings).

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-14-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
3 years agovirtio-mem: Require only coordinated discards
David Hildenbrand [Tue, 13 Apr 2021 09:55:30 +0000 (11:55 +0200)]
virtio-mem: Require only coordinated discards

We implement the RamDiscardManager interface and only require coordinated
discarding of RAM to work.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-13-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
3 years agosoftmmu/physmem: Extend ram_block_discard_(require|disable) by two discard types
David Hildenbrand [Tue, 13 Apr 2021 09:55:29 +0000 (11:55 +0200)]
softmmu/physmem: Extend ram_block_discard_(require|disable) by two discard types

We want to separate the two cases whereby we discard ram
- uncoordinated: e.g., virito-balloon
- coordinated: e.g., virtio-mem coordinated via the RamDiscardManager

Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-12-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
3 years agosoftmmu/physmem: Don't use atomic operations in ram_block_discard_(disable|require)
David Hildenbrand [Tue, 13 Apr 2021 09:55:28 +0000 (11:55 +0200)]
softmmu/physmem: Don't use atomic operations in ram_block_discard_(disable|require)

We have users in migration context that don't hold the BQL (when
finishing migration). To prepare for further changes, use a dedicated mutex
instead of atomic operations. Keep using qatomic_read ("READ_ONCE") for the
functions that only extract the current state (e.g., used by
virtio-balloon), locking isn't necessary.

While at it, split up the counter into two variables to make it easier
to understand.

Suggested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-11-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
3 years agovfio: Support for RamDiscardManager in the vIOMMU case
David Hildenbrand [Tue, 13 Apr 2021 09:55:27 +0000 (11:55 +0200)]
vfio: Support for RamDiscardManager in the vIOMMU case

vIOMMU support works already with RamDiscardManager as long as guests only
map populated memory. Both, populated and discarded memory is mapped
into &address_space_memory, where vfio_get_xlat_addr() will find that
memory, to create the vfio mapping.

Sane guests will never map discarded memory (e.g., unplugged memory
blocks in virtio-mem) into an IOMMU - or keep it mapped into an IOMMU while
memory is getting discarded. However, there are two cases where a malicious
guests could trigger pinning of more memory than intended.

One case is easy to handle: the guest trying to map discarded memory
into an IOMMU.

The other case is harder to handle: the guest keeping memory mapped in
the IOMMU while it is getting discarded. We would have to walk over all
mappings when discarding memory and identify if any mapping would be a
violation. Let's keep it simple for now and print a warning, indicating
that setting RLIMIT_MEMLOCK can mitigate such attacks.

We have to take care of incoming migration: at the point the
IOMMUs get restored and start creating mappings in vfio, RamDiscardManager
implementations might not be back up and running yet: let's add runstate
priorities to enforce the order when restoring.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-10-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
3 years agovfio: Sanity check maximum number of DMA mappings with RamDiscardManager
David Hildenbrand [Tue, 13 Apr 2021 09:55:26 +0000 (11:55 +0200)]
vfio: Sanity check maximum number of DMA mappings with RamDiscardManager

Although RamDiscardManager can handle running into the maximum number of
DMA mappings by propagating errors when creating a DMA mapping, we want
to sanity check and warn the user early that there is a theoretical setup
issue and that virtio-mem might not be able to provide as much memory
towards a VM as desired.

As suggested by Alex, let's use the number of KVM memory slots to guess
how many other mappings we might see over time.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-9-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
3 years agovfio: Query and store the maximum number of possible DMA mappings
David Hildenbrand [Tue, 13 Apr 2021 09:55:25 +0000 (11:55 +0200)]
vfio: Query and store the maximum number of possible DMA mappings

Let's query the maximum number of possible DMA mappings by querying the
available mappings when creating the container (before any mappings are
created). We'll use this informaton soon to perform some sanity checks
and warn the user.

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-8-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
3 years agovfio: Support for RamDiscardManager in the !vIOMMU case
David Hildenbrand [Tue, 13 Apr 2021 09:55:24 +0000 (11:55 +0200)]
vfio: Support for RamDiscardManager in the !vIOMMU case

Implement support for RamDiscardManager, to prepare for virtio-mem
support. Instead of mapping the whole memory section, we only map
"populated" parts and update the mapping when notified about
discarding/population of memory via the RamDiscardListener. Similarly, when
syncing the dirty bitmaps, sync only the actually mapped (populated) parts
by replaying via the notifier.

Using virtio-mem with vfio is still blocked via
ram_block_discard_disable()/ram_block_discard_require() after this patch.

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-7-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
3 years agovirtio-mem: Implement RamDiscardManager interface
David Hildenbrand [Tue, 13 Apr 2021 09:55:23 +0000 (11:55 +0200)]
virtio-mem: Implement RamDiscardManager interface

Let's properly notify when (un)plugging blocks, after discarding memory
and before allowing the guest to consume memory. Handle errors from
notifiers gracefully (e.g., no remaining VFIO mappings) when plugging,
rolling back the change and telling the guest that the VM is busy.

One special case to take care of is replaying all notifications after
restoring the vmstate. The device starts out with all memory discarded,
so after loading the vmstate, we have to notify about all plugged
blocks.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-6-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
3 years agovirtio-mem: Don't report errors when ram_block_discard_range() fails
David Hildenbrand [Tue, 13 Apr 2021 09:55:22 +0000 (11:55 +0200)]
virtio-mem: Don't report errors when ram_block_discard_range() fails

Any errors are unexpected and ram_block_discard_range() already properly
prints errors. Let's stop manually reporting errors.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-5-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
3 years agovirtio-mem: Factor out traversing unplugged ranges
David Hildenbrand [Tue, 13 Apr 2021 09:55:21 +0000 (11:55 +0200)]
virtio-mem: Factor out traversing unplugged ranges

Let's factor out the core logic, no need to replicate.

Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-4-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
3 years agomemory: Helpers to copy/free a MemoryRegionSection
David Hildenbrand [Tue, 13 Apr 2021 09:55:20 +0000 (11:55 +0200)]
memory: Helpers to copy/free a MemoryRegionSection

In case one wants to create a permanent copy of a MemoryRegionSections,
one needs access to flatview_ref()/flatview_unref(). Instead of exposing
these, let's just add helpers to copy/free a MemoryRegionSection and
properly adjust references.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-3-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>