Zhenzhong Duan [Mon, 4 Nov 2024 12:55:34 +0000 (20:55 +0800)]
intel_iommu: Send IQE event when setting reserved bit in IQT_TAIL
According to VTD spec, Figure 11-22, Invalidation Queue Tail Register,
"When Descriptor Width (DW) field in Invalidation Queue Address Register
(IQA_REG) is Set (256-bit descriptors), hardware treats bit-4 as reserved
and a value of 1 in the bit will result in invalidation queue error."
Current code missed to send IQE event to guest, fix it.
Fixes: c0c1d351849b ("intel_iommu: add 256 bits qi_desc support")
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <
20241104125536.
1236118-2-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Salil Mehta [Sun, 3 Nov 2024 10:24:19 +0000 (10:24 +0000)]
hw/acpi: Update GED with vCPU Hotplug VMSD for migration
The ACPI CPU hotplug states must be migrated along with other vCPU
hotplug states to the destination VM. Update the GED's VM State
Description (VMSD) table subsection to conditionally include the CPU
Hotplug VM State Description (VMSD).
Excerpt of GED VMSD State Dump at Source:
"acpi-ged (16)": {
"ged_state": {
"sel": "0x00000000"
},
[...]
"acpi-ged/cpuhp": {
"cpuhp_state": {
"selector": "0x00000005",
"command": "0x00",
"devs": [
{
"is_inserting": false,
"is_removing": false,
"ost_event": "0x00000000",
"ost_status": "0x00000000"
},
[...]
{
"is_inserting": false,
"is_removing": false,
"ost_event": "0x00000000",
"ost_status": "0x00000000"
}
]
}
}
},
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Message-Id: <
20241103102419.202225-6-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Salil Mehta [Mon, 4 Nov 2024 11:28:23 +0000 (06:28 -0500)]
tests/qtest/bios-tables-test: Update DSDT golden masters for x86/{pc,q35}
Update DSDT golden master files for x86/pc and x86/q35 platforms to
accommodate changes made in the architecture-agnostic CPU AML. These
updates notify the guest OS of vCPU hot-plug and hot-unplug status
using the ACPI `_STA.Enabled` bit.
The following is a diff of the changes in the .dsl file generated with
IASL:
@@ -1480,6 +1480,7 @@
CRMV, 1,
CEJ0, 1,
CEJF, 1,
+ CPRS, 1,
Offset (0x05),
CCMD, 8
}
@@ -1514,9 +1515,16 @@
Acquire (\_SB.PCI0.PRES.CPLK, 0xFFFF)
\_SB.PCI0.PRES.CSEL = Arg0
Local0 = Zero
- If ((\_SB.PCI0.PRES.CPEN == One))
- {
- Local0 = 0x0F
+ If ((\_SB.PCI0.PRES.CPRS == One))
+ {
+ If ((\_SB.PCI0.PRES.CPEN == One))
+ {
+ Local0 = 0x0F
+ }
+ Else
+ {
+ Local0 = 0x0D
+ }
}
Release (\_SB.PCI0.PRES.CPLK)
Reported-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Salil Mehta [Sun, 3 Nov 2024 10:24:17 +0000 (10:24 +0000)]
hw/acpi: Update ACPI `_STA` method with QOM vCPU ACPI Hotplug states
Reflect the QOM vCPUs ACPI CPU hotplug states in the `_STA.Present` and
and `_STA.Enabled` bits when the guest kernel evaluates the ACPI
`_STA` method during initialization, as well as when vCPUs are
hot-plugged or hot-unplugged. If the CPU is present then the its
`enabled` status can be fetched using architecture-specific code [1].
Reference:
[1] Example implementation of architecture-specific hook to fetch CPU
`enabled status
Link: https://github.com/salil-mehta/qemu/commit/c0b416b11e5af6505e558866f0eb6c9f3709173e
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Message-Id: <
20241103102419.202225-4-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Salil Mehta [Sun, 3 Nov 2024 10:24:16 +0000 (10:24 +0000)]
qtest: allow ACPI DSDT Table changes
list changed files in tests/qtest/bios-tables-test-allowed-diff.h
Reported-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Message-Id: <
20241103102419.202225-3-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Salil Mehta [Sun, 3 Nov 2024 10:24:15 +0000 (10:24 +0000)]
hw/acpi: Make CPUs ACPI `presence` conditional during vCPU hot-unplug
On most architectures, during vCPU hot-plug and hot-unplug actions, the
firmware or VMM/QEMU can update the OS on vCPU status by toggling the
ACPI method `_STA.Present` bit. However, certain CPU architectures
prohibit [1] modifications to a CPU’s `presence` status after the kernel
has booted.
This limitation [2][3] exists because many per-CPU components, such as
interrupt controllers and various per-CPU features tightly integrated
with CPUs, may not support reconfiguration once the kernel is
initialized. Often, these components cannot be powered down, as they may
belong to an `always-on` power domain. As a result, some architectures
require all CPUs to remain `_STA.Present` after system initialization.
Therefore, it is essential to mirror the exact QOM vCPU status through
ACPI for the Guest kernel. For this, we should determine—via
architecture-specific code[4]—whether vCPUs must always remain present
and whether the associated `AcpiCpuStatus::cpu` object should remain
valid, even following a vCPU hot-unplug operation.
References:
[1] Check comment 5 in the bugzilla entry
Link: https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
[2] KVMForum 2023 Presentation: Challenges Revisited in Supporting Virt CPU Hotplug on
architectures that don’t Support CPU Hotplug (like ARM64)
Link: https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
b. Qemu Link: https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
[3] KVMForum 2020 Presentation: Challenges in Supporting Virtual CPU Hotplug on
SoC Based Systems (like ARM64)
Link: https://kvmforum2020.sched.com/event/eE4m
[4] Example implementation of architecture-specific CPU persistence hook
Link: https://github.com/salil-mehta/qemu/commit/c0b416b11e5af6505e558866f0eb6c9f3709173e
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Message-Id: <
20241103102419.202225-2-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Roque Arcudia Hernandez [Fri, 1 Nov 2024 21:59:23 +0000 (21:59 +0000)]
hw/pci: Add parenthesis to PCI_BUILD_BDF macro
The bus parameter in the macro PCI_BUILD_BDF is not surrounded by
parenthesis. This can create a compile error when warnings are
treated as errors or can potentially create runtime errors due to the
operator precedence.
For instance:
file.c:x:32: error: suggest parentheses around '-' inside '<<'
[-Werror=parentheses]
171 | uint16_t bdf = PCI_BUILD_BDF(a - b, sdev->devfn);
| ~~^~~
include/hw/pci/pci.h:19:41: note: in definition of macro
'PCI_BUILD_BDF'
19 | #define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn))
| ^~~
cc1: all warnings being treated as errors
Signed-off-by: Roque Arcudia Hernandez <roqueh@google.com>
Reviewed-by: Nabih Estefan <nabihestefan@google.com>
Message-Id: <
20241101215923.
3399311-1-roqueh@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 1 Nov 2024 13:39:17 +0000 (13:39 +0000)]
hw/cxl: Ensure there is enough data to read the input header in cmd_get_physical_port_state()
If len_in is smaller than the header length then the accessing the
number of ports will result in an out of bounds access.
Add a check to avoid this.
Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241101133917.27634-11-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 1 Nov 2024 13:39:16 +0000 (13:39 +0000)]
hw/cxl: Ensure there is enough data for the header in cmd_ccls_set_lsa()
The properties of the requested set command cannot be established if
len_in is less than the size of the header.
Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241101133917.27634-10-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 1 Nov 2024 13:39:15 +0000 (13:39 +0000)]
hw/cxl: Check that writes do not go beyond end of target attributes
In cmd_features_set_feature() the an offset + data size schemed
is used to allow for large features. Ensure this does not write
beyond the end fo the buffers used to accumulate the full feature
attribute set.
Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241101133917.27634-9-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 1 Nov 2024 13:39:14 +0000 (13:39 +0000)]
hw/cxl: Ensuring enough data to read parameters in cmd_tunnel_management_cmd()
If len_in is less than the minimum spec allowed value, then return
CXL_MBOX_INVALID_PAYLOAD_LENGTH
Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241101133917.27634-8-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 1 Nov 2024 13:39:13 +0000 (13:39 +0000)]
hw/cxl: Avoid accesses beyond the end of cel_log.
Add a check that the requested offset + length does not go beyond the end
of the cel_log.
Whilst the cci->cel_log is large enough to include all possible CEL
entries, the guest might still ask for entries beyond the end of it.
Move the comment to this new check rather than before the check on the
type of log requested.
Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241101133917.27634-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 1 Nov 2024 13:39:12 +0000 (13:39 +0000)]
hw/cxl: Check the length of data requested fits in get_log()
Checking offset + length is of no relevance when verifying the CEL
data will fit in the mailbox payload. Only the length is is relevant.
Note that this removes a potential overflow.
Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241101133917.27634-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 1 Nov 2024 13:39:11 +0000 (13:39 +0000)]
hw/cxl: Check enough data in cmd_firmware_update_transfer()
Buggy guest can write a message that advertises more data that
is provided. As QEMU internally duplicates the reported message
size, this may result in an out of bounds access.
Add sanity checks on the size to avoid this.
Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241101133917.27634-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 1 Nov 2024 13:39:10 +0000 (13:39 +0000)]
hw/cxl: Check input length is large enough in cmd_events_clear_records()
Buggy software might write a message that is too short for
either the header, or the header + the event data that is specified
in the header. This may result in accesses beyond the range of the
message allocated as a duplicate of the incoming message buffer.
Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241101133917.27634-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 1 Nov 2024 13:39:09 +0000 (13:39 +0000)]
hw/cxl: Check input includes at least the header in cmd_features_set_feature()
A buggy guest might write an insufficiently large message.
Check the header is present. Whilst zero data after the header is very
odd it will just result in failure to copy any data.
Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241101133917.27634-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Fri, 1 Nov 2024 13:39:08 +0000 (13:39 +0000)]
hw/cxl: Check size of input data to dynamic capacity mailbox commands
cxl_cmd_dcd_release_dyn_cap() and cmd_dcd_add_dyn_cap_rsp() are missing
input message size checks. These must be done in the individual
commands when the command has a variable length input payload.
A buggy or malicious guest might send undersized messages via the mailbox.
As that size is used to take a copy of the mailbox content, each command
must check there is sufficient data. In this case the first check is that
there is enough data to read how many extents there are, and the second
that there is enough for those elements to be accessed.
Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241101133917.27634-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fan Ni [Fri, 1 Nov 2024 13:20:05 +0000 (13:20 +0000)]
hw/cxl/cxl-mailbox-util: Fix output buffer index update when retrieving DC extents
In the function of retrieving DC extents (cmd_dcd_get_dyn_cap_ext_list),
the output buffer index was not correctly updated while iterating the
extent list on the device, leaving the extents returned incorrect except for
the first one.
Fixes: 1c9221f19e62 ("hw/mem/cxl_type3: Add DC extent list representative and get DC extent list mailbox support")
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241101132005.26633-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fan Ni [Fri, 1 Nov 2024 13:20:04 +0000 (13:20 +0000)]
cxl/cxl-mailbox-utils: Fix size check for cmd_firmware_update_get_info
In the function cmd_firmware_update_get_info for handling Get FW info
command (0x0200h), the vmem, pmem and DC capacity size check were
incorrect. The size should be aligned to 256MiB, not smaller than
256MiB.
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241101132005.26633-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Marcin Juszkiewicz [Wed, 23 Oct 2024 11:38:20 +0000 (13:38 +0200)]
pcie: enable Extended tag field support
>From what I read PCI has 32 transactions, PCI Express devices can handle
256 with Extended tag enabled (spec mentions also larger values but I
lack PCIe knowledge).
QEMU leaves 'Extended tag field' with 0 as value:
Capabilities: [e0] Express (v1) Root Complex Integrated Endpoint, IntMsgNum 0
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+ FLReset- TEE-IO-
SBSA ACS has test 824 which checks for PCIe device capabilities. BSA
specification [1] (SBSA is on top of BSA) in section F.3.2 lists
expected values for Device Capabilities Register:
Device Capabilities Register Requirement
Role based error reporting RCEC and RCiEP: Hardwired to 1
Endpoint L0s acceptable latency RCEC and RCiEP: Hardwired to 0
L1 acceptable latency RCEC and RCiEP: Hardwired to 0
Captured slot power limit scale RCEC and RCiEP: Hardwired to 0
Captured slot power limit value RCEC and RCiEP: Hardwired to 0
Max payload size value must be compliant with PCIe spec
Phantom functions RCEC and RCiEP: Recommendation is to
hardwire this bit to 0.
Extended tag field Hardwired to 1
1. https://developer.arm.com/documentation/den0094/c/
This change enables Extended tag field. All versioned platforms should
have it disabled for older versions (tested with Arm/virt).
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-Id: <
20241023113820.486017-1-marcin.juszkiewicz@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zhenzhong Duan [Mon, 28 Oct 2024 02:25:14 +0000 (10:25 +0800)]
intel_iommu: Introduce property "stale-tm" to control Transient Mapping (TM) field
VT-d spec removed Transient Mapping (TM) field from second-level page-tables
and treat the field as Reserved(0) since revision 3.2.
Changing the field as reserved(0) will break backward compatibility, so
introduce a property "stale-tm" to allow user to control the setting.
Use pc_compat_9_1 to handle the compatibility for machines before 9.2 which
allow guest to set the field. Starting from 9.2, this field is reserved(0)
by default to match spec. Of course, user can force it on command line.
This doesn't impact function of vIOMMU as there was no logic to emulate
Transient Mapping.
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <
20241028022514.806657-1-zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Albert Esteve [Tue, 22 Oct 2024 12:46:14 +0000 (14:46 +0200)]
vhost-user: fix shared object return values
VHOST_USER_BACKEND_SHARED_OBJECT_ADD and
VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE state
in the spec that they return 0 for successful
operations, non-zero otherwise. However,
implementation relies on the return types
of the virtio-dmabuf library, with opposite
semantics (true if everything is correct,
false otherwise). Therefore, current
implementation violates the specification.
Revert the logic so that the implementation
of the vhost-user handling methods matches
the specification.
Fixes: 043e127a126bb3ceb5fc753deee27d261fd0c5ce
Fixes: 160947666276c5b7f6bca4d746bcac2966635d79
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Albert Esteve <aesteve@redhat.com>
Message-Id: <
20241022124615.585596-1-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 14 Oct 2024 12:19:02 +0000 (13:19 +0100)]
hw/pci-bridge: Make pxb_dev_realize_common() return if it succeeded
For the CXL PXB there is additional code after pxb_dev_realize_common()
is called. If that realize failed (e.g. due to an out of range numa_node)
we will get a segfault. Return a bool so the caller can check if the
pxb_dev_realize_common() succeeded or not without having to poke around
in the errp.
Fixes: 4f8db8711cbd ("hw/pxb: Allow creation of a CXL PXB (host bridge)")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241014121902.
2146424-8-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 14 Oct 2024 12:19:01 +0000 (13:19 +0100)]
hw/cxl: Fix indent of structure member
Add missing 4 spaces of indent to structure element.
Reported-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241014121902.
2146424-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Shiju Jose [Mon, 14 Oct 2024 12:19:00 +0000 (13:19 +0100)]
hw/cxl/cxl-mailbox-utils: Fix for device DDR5 ECS control feature tables
CXL spec 3.1 section 8.2.9.9.11.2 describes the DDR5 Error Check Scrub (ECS)
control feature.
ECS log capabilities field in following ECS tables, which is common for all
memory media FRUs in a CXL device.
Fix struct CXLMemECSReadAttrs and struct CXLMemECSWriteAttrs to make
log entry type field common.
Fixes: 2d41ce38fb9a ("hw/cxl/cxl-mailbox-utils: Add device DDR5 ECS control feature")
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241014121902.
2146424-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fan Ni [Mon, 14 Oct 2024 12:18:59 +0000 (13:18 +0100)]
hw/mem/cxl_type3: Fix More flag setting for dynamic capacity event records
Per cxl spec r3.1, for multiple dynamic capacity event records grouped via
the More flag, the last record in the sequence should clear the More flag.
Before the change, the More flag of the event record is cleared before
the loop of inserting records into the event log, which will leave the flag
always set once it is set in the loop.
Fixes: d0b9b28a5b9f ("hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents")
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/20240827164304.88876-2-nifan.cxl@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241014121902.
2146424-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Yao Xingtao [Mon, 14 Oct 2024 12:18:58 +0000 (13:18 +0100)]
mem/cxl_type3: Fix overlapping region validation error
When injecting a new poisoned region through qmp_cxl_inject_poison(),
the newly injected region should not overlap with existing poisoned
regions.
The current validation method does not consider the following
overlapping region:
┌───┬───────┬───┐
│a │ b(a) │a │
└───┴───────┴───┘
(a is a newly added region, b is an existing region, and b is a
subregion of a)
Fixes: 9547754f40ee ("hw/cxl: QMP based poison injection support")
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241014121902.
2146424-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ajay Joshi [Mon, 14 Oct 2024 12:18:57 +0000 (13:18 +0100)]
hw/cxl: Fix background completion percentage calculation
The current completion percentage calculation does not account for the
relative time since the start of the background activity, this leads to
showing incorrect start percentage vs what has actually been completed.
This patch calculates the percentage based on the actual elapsed time since
the start of the operation.
Fixes: 221d2cfbdb53 ("hw/cxl/mbox: Add support for background operations")
Signed-off-by: Ajay Joshi <ajay.opensrc@micron.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/20240729102338.22337-1-ajay.opensrc@micron.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241014121902.
2146424-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Dmitry Frolov [Mon, 14 Oct 2024 12:18:56 +0000 (13:18 +0100)]
hw/cxl: Fix uint32 overflow cxl-mailbox-utils.c
The sum offset + length may overflow uint32. Since this sum is
compared with uint64_t return value of get_lsa_size(), it makes
sense to choose uint64_t type for offset and length.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 3ebe676a3463 ("hw/cxl/device: Implement get/set Label Storage Area (LSA)")
Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
Link: https://lore.kernel.org/r/20240917080925.270597-2-frolov@swemel.ru
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20241014121902.
2146424-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
yaozhenguo [Fri, 11 Oct 2024 10:29:13 +0000 (18:29 +0800)]
virtio/vhost-user: fix qemu abort when hotunplug vhost-user-net device
During the hot-unplugging of vhost-user-net type network cards,
the vhost_user_cleanup function may add the same rcu node to
the rcu linked list. The function call in this case is as follows:
vhost_user_cleanup
->vhost_user_host_notifier_remove
->call_rcu(n, vhost_user_host_notifier_free, rcu);
->g_free_rcu(n, rcu);
When this happens, QEMU will abort in try_dequeue:
if (head == &dummy && qatomic_mb_read(&tail) == &dummy.next) {
abort();
}
backtrace is as follows:
0 __pthread_kill_implementation () at /usr/lib64/libc.so.6
1 raise () at /usr/lib64/libc.so.6
2 abort () at /usr/lib64/libc.so.6
3 try_dequeue () at ../util/rcu.c:235
4 call_rcu_thread (0) at ../util/rcu.c:288
5 qemu_thread_start (0) at ../util/qemu-thread-posix.c:541
6 start_thread () at /usr/lib64/libc.so.6
7 clone3 () at /usr/lib64/libc.so.6
The reason for the abort is that adding two identical nodes to
the rcu linked list will cause the rcu linked list to become a ring,
but when the dummy node is added after the two identical nodes,
the ring is opened. But only one node is added to list with
rcu_call_count added twice. This will cause rcu try_dequeue abort.
This happens when n->addr != 0. In some scenarios, this does happen.
For example, this situation will occur when using a 32-queue DPU
vhost-user-net type network card for hot-unplug testing, because
VhostUserHostNotifier->addr will be cleared during the processing of
VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG. However,it is asynchronous,
so we cannot guarantee that VhostUserHostNotifier->addr is zero in
vhost_user_cleanup. Therefore, it is necessary to merge g_free_rcu
and vhost_user_host_notifier_free into one rcu node.
Fixes: 503e355465 ("virtio/vhost-user: dynamically assign VhostUserHostNotifiers")
Signed-off-by: yaozhenguo <yaozhenguo@jd.com>
Message-Id: <
20241011102913.45582-1-yaozhenguo@jd.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Gao Shiyuan [Wed, 30 Oct 2024 13:13:24 +0000 (21:13 +0800)]
virtio-pci: fix memory_region_find for VirtIOPCIRegion's MR
As shown below, if a virtio PCI device is attached under a pci-bridge, the MR
of VirtIOPCIRegion does not belong to any address space. So memory_region_find
cannot be used to search for this MR.
Introduce the virtio-pci and pci_bridge address spaces to solve this problem.
Before:
memory-region: pci_bridge_pci
0000000000000000-
ffffffffffffffff (prio 0, i/o): pci_bridge_pci
00000000fe840000-
00000000fe840fff (prio 1, i/o): virtio-net-pci-msix
00000000fe840000-
00000000fe84003f (prio 0, i/o): msix-table
00000000fe840800-
00000000fe840807 (prio 0, i/o): msix-pba
0000380000000000-
0000380000003fff (prio 1, i/o): virtio-pci
0000380000000000-
0000380000000fff (prio 0, i/o): virtio-pci-common-virtio-net
0000380000001000-
0000380000001fff (prio 0, i/o): virtio-pci-isr-virtio-net
0000380000002000-
0000380000002fff (prio 0, i/o): virtio-pci-device-virtio-net
0000380000003000-
0000380000003fff (prio 0, i/o): virtio-pci-notify-virtio-net
After:
address-space: virtio-pci-cfg-mem-as
0000380000000000-
0000380000003fff (prio 1, i/o): virtio-pci
0000380000000000-
0000380000000fff (prio 0, i/o): virtio-pci-common-virtio-net
0000380000001000-
0000380000001fff (prio 0, i/o): virtio-pci-isr-virtio-net
0000380000002000-
0000380000002fff (prio 0, i/o): virtio-pci-device-virtio-net
0000380000003000-
0000380000003fff (prio 0, i/o): virtio-pci-notify-virtio-net
address-space: pci_bridge_pci_mem
0000000000000000-
ffffffffffffffff (prio 0, i/o): pci_bridge_pci
00000000fe840000-
00000000fe840fff (prio 1, i/o): virtio-net-pci-msix
00000000fe840000-
00000000fe84003f (prio 0, i/o): msix-table
00000000fe840800-
00000000fe840807 (prio 0, i/o): msix-pba
0000380000000000-
0000380000003fff (prio 1, i/o): virtio-pci
0000380000000000-
0000380000000fff (prio 0, i/o): virtio-pci-common-virtio-net
0000380000001000-
0000380000001fff (prio 0, i/o): virtio-pci-isr-virtio-net
0000380000002000-
0000380000002fff (prio 0, i/o): virtio-pci-device-virtio-net
0000380000003000-
0000380000003fff (prio 0, i/o): virtio-pci-notify-virtio-net
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2576
Fixes: ffa8a3e3b2e6 ("virtio-pci: Add lookup subregion of VirtIOPCIRegion MR")
Co-developed-by: Zuo Boqun <zuoboqun@baidu.com>
Signed-off-by: Zuo Boqun <zuoboqun@baidu.com>
Co-developed-by: Wang Liang <wangliang44@baidu.com>
Signed-off-by: Wang Liang <wangliang44@baidu.com>
Signed-off-by: Gao Shiyuan <gaoshiyuan@baidu.com>
Message-Id: <
20241030131324.34144-1-gaoshiyuan@baidu.com>
Tested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Suravee Suthikulpanit [Fri, 27 Sep 2024 17:29:13 +0000 (12:29 -0500)]
amd_iommu: Check APIC ID > 255 for XTSup
The XTSup mode enables x2APIC support for AMD IOMMU, which is needed
to support vcpu w/ APIC ID > 255.
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <
20240927172913.121477-6-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Suravee Suthikulpanit [Fri, 27 Sep 2024 17:29:12 +0000 (12:29 -0500)]
amd_iommu: Send notification when invalidate interrupt entry cache
In order to support AMD IOMMU interrupt remapping emulation with PCI
pass-through devices, QEMU needs to notify VFIO when guest IOMMU driver
updates and invalidate the guest interrupt remapping table (IRT), and
communicate information so that the host IOMMU driver can update
the shadowed interrupt remapping table in the host IOMMU.
Therefore, send notification when guest IOMMU emulates the IRT
invalidation commands.
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <
20240927172913.121477-5-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Suravee Suthikulpanit [Fri, 27 Sep 2024 17:29:11 +0000 (12:29 -0500)]
amd_iommu: Use shared memory region for Interrupt Remapping
Use shared memory region for interrupt remapping which can be
aliased by all devices.
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <
20240927172913.121477-4-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Suravee Suthikulpanit [Fri, 27 Sep 2024 17:29:10 +0000 (12:29 -0500)]
amd_iommu: Add support for pass though mode
Introduce 'nodma' shared memory region to support PT mode
so that for each device, we only create an alias to shared memory
region when DMA-remapping is disabled.
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <
20240927172913.121477-3-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Suravee Suthikulpanit [Fri, 27 Sep 2024 17:29:09 +0000 (12:29 -0500)]
amd_iommu: Rename variable mmio to mr_mmio
Rename the MMIO memory region variable 'mmio' to 'mr_mmio'
so to correctly name align with struct AMDVIState::variable type.
No functional change intended.
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <
20240927172913.121477-2-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ricardo Ribalda [Tue, 24 Sep 2024 13:24:12 +0000 (13:24 +0000)]
tests/acpi: pc: update golden masters for DSDT
Note: since all we did is replace VarPackageOp with PackageOP,
and both are represented by Package() in ASL, the AML is
different but ASL is the same.
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Message-Id: <
20240924132417.739809-4-ribalda@chromium.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Ricardo Ribalda [Tue, 24 Sep 2024 13:24:11 +0000 (13:24 +0000)]
hw/i386/acpi-build: return a non-var package from _PRT()
Windows XP seems to have issues when _PRT() returns a variable package.
We know in advance the size, so we can return a fixed package instead.
https://lore.kernel.org/qemu-devel/
c82d9331-a8ce-4bb0-b51f-
2ee789e27c86@ilande.co.uk/T/#m541190c942676bccf7a7f7fbcb450d94a4e2da53
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 99cb2c6c7b ("hw/i386/acpi-build: Return a pre-computed _PRT table")
Closes: https://lore.kernel.org/all/eb11c984-ebe4-4a09-9d71-1e9db7fe7e6f@ilande.co.uk/
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Message-Id: <
20240924132417.739809-3-ribalda@chromium.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ricardo Ribalda [Tue, 24 Sep 2024 13:24:10 +0000 (13:24 +0000)]
tests/acpi: pc: allow DSDT acpi table changes
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Message-Id: <
20240924132417.739809-2-ribalda@chromium.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Michael S. Tsirkin [Mon, 4 Nov 2024 14:11:46 +0000 (09:11 -0500)]
acpi/disassemle-aml.sh: fix up after dir reorg
We moved expected files around, fix up the disassembler script.
Fixes: 7c08eefcaf ("tests/data/acpi: Move x86 ACPI tables under x86/${machine} path")
Fixes: 7434f90467 ("tests/data/acpi/virt: Move ARM64 ACPI tables under aarch64/${machine} path")
Cc: "Sunil V L" <sunilvl@ventanamicro.com>
Message-ID: <
ce456091058734b7f765617ac5dfeebcb366d4a9.
1730729695.git.mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2024 09:49:36 +0000 (12:49 +0300)]
qapi: introduce device-sync-config
Add command to sync config from vhost-user backend to the device. It
may be helpful when VHOST_USER_SLAVE_CONFIG_CHANGE_MSG failed or not
triggered interrupt to the guest or just not available (not supported
by vhost-user server).
Command result is racy if allow it during migration. Let's not allow
that.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Raphael Norwitz <raphael@enfabrica.net>
Message-Id: <
20240920094936.450987-4-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2024 09:49:35 +0000 (12:49 +0300)]
vhost-user-blk: split vhost_user_blk_sync_config()
Split vhost_user_blk_sync_config() out from
vhost_user_blk_handle_config_change(), to be reused in the following
commit.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: Raphael Norwitz <raphael@enfabrica.net>
Message-Id: <
20240920094936.450987-3-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Vladimir Sementsov-Ogievskiy [Fri, 20 Sep 2024 09:49:34 +0000 (12:49 +0300)]
qdev-monitor: add option to report GenericError from find_device_state
Here we just prepare for the following patch, making possible to report
GenericError as recommended.
This patch doesn't aim to prevent further use of DeviceNotFound by
future interfaces:
- find_device_state() is used in blk_by_qdev_id() and qmp_get_blk()
functions, which may lead to spread of DeviceNotFound anyway
- also, nothing prevent simply copy-pasting find_device_state() calls
with false argument
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Raphael Norwitz <raphael@enfabrica.net>
Message-Id: <
20240920094936.450987-2-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:35:18 +0000 (18:35 +0100)]
hw/pci-bridge/cxl-upstream: Add properties to control link speed and width
To establish performance characteristics of a CXL device when used via a
particular CXL topology (root ports, switches, end points) it is necessary
to set the appropriate link speed and width in the PCI Express capability
structure. Provide x-speed and x-link properties for this.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916173518.
1843023-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:35:17 +0000 (18:35 +0100)]
hw/mem/cxl-type3: Add properties to control link speed and width
To establish performance characteristics of a CXL device when used via a
particular CXL topology (root ports, switches, end points) it is necessary
to set the appropriate link speed and width in the PCI Express capability
structure. Provide x-speed and x-link properties for this.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916173518.
1843023-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:35:16 +0000 (18:35 +0100)]
hw/pcie: Provide a utility function for control of EP / SW USP link
Whilst similar to existing PCIESlot link configuration a few registers
need to be set differently so that the downstream device presents
a 'configured' state that is then used to 'train' the upstream port
on the link. Basically that means setting the status register to
reflect it succeeding in training up to target settings.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916173518.
1843023-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:35:15 +0000 (18:35 +0100)]
hw/pcie: Factor out PCI Express link register filling common to EP.
Whilst not all link related registers are common between RP / Switch DSP
and EP / Switch USP many of them are. Factor that group out to save
on duplication when adding EP / Swtich USP configurability.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916173518.
1843023-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:35:14 +0000 (18:35 +0100)]
hw/pci-bridge/cxl_upstream: Provide x-speed and x-width properties.
Copied from gen_pcie_root_port.c
Drop the previous code that ensured a valid value in s->width, s->speed
as now a default is provided so this will always be set.
Note this changes the default settings but it is unlikely to have a negative
effect on software as will only affect ports with now downstream device.
All other ports will use the settings from that device.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916173518.
1843023-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:35:13 +0000 (18:35 +0100)]
hw/pci-bridge/cxl_root_port: Provide x-speed and x-width properties.
Approach copied from gen_pcie_root_port.c
Previously the link defaulted to a maximum of 2.5GT/s and 1x. Enable setting
it's maximum values. The actual value after 'training' will depend on the
downstream device configuration.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916173518.
1843023-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:43:21 +0000 (18:43 +0100)]
hw/acpi: Generic Initiator - add missing object class property descriptions.
>From review of the Generic Ports support.
These properties had no description set so add one.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916174321.
1843228-1-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:42:37 +0000 (18:42 +0100)]
hw/acpi: Make storage of node id uint32_t to reduce fragility
>From review of generic port introduction.
The value is handled as a uint32_t so store it in that type.
The value cannot in reality exceed MAX_NODES which is currently
128 but if the types are matched there is no need to rely on that
restriction.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916174237.
1843213-1-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:41:22 +0000 (18:41 +0100)]
hw/acpi: Generic Port Affinity Structure support
These are very similar to the recently added Generic Initiators
but instead of representing an initiator of memory traffic they
represent an edge point beyond which may lie either targets or
initiators. Here we add these ports such that they may
be targets of hmat_lb records to describe the latency and
bandwidth from host side initiators to the port. A discoverable
mechanism such as UEFI CDAT read from CXL devices and switches
is used to discover the remainder of the path, and the OS can build
up full latency and bandwidth numbers as need for work and data
placement decisions.
Acked-by: Markus Armbruster <armbru@redhat.com>
Tested-by: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916174122.
1843197-1-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:10:14 +0000 (18:10 +0100)]
hw/pci-host/gpex-acpi: Use acpi_uid property.
Reduce the direct use of PCI internals inside ACPI table creation.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916171017.
1841767-10-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:10:13 +0000 (18:10 +0100)]
hw/i386/acpi: Use TYPE_PXB_BUS property acpi_uid for DSDT
Rather than relying on PCI internals, use the new acpi_property
to obtain the ACPI _UID values. These are still the same
as the PCI Bus numbers so no functional change.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916171017.
1841767-9-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:10:12 +0000 (18:10 +0100)]
hw/pci-bridge: Add acpi_uid property to TYPE_PXB_BUS
Enable ACPI table creation for PCI Expander Bridges to be independent
of PCI internals. Note that the UID is currently the PCI bus number.
This is motivated by the forthcoming ACPI Generic Port SRAT entries
which can be made completely independent of PCI internals.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916171017.
1841767-8-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:10:11 +0000 (18:10 +0100)]
acpi/pci: Move Generic Initiator object handling into acpi/pci.*
Whilst ACPI SRAT Generic Initiator Afinity Structures are able to refer to
both PCI and ACPI Device Handles, the QEMU implementation only implements
the PCI Device Handle case. For now move the code into the existing
hw/acpi/pci.c file and header. If support for ACPI Device Handles is
added in the future, perhaps this will be moved again.
Also push the struct AcpiGenericInitiator down into the c file as not
used outside pci.c.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916171017.
1841767-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:10:10 +0000 (18:10 +0100)]
hw/pci: Add a busnr property to pci_props and use for acpi/gi
Using a property allows us to hide the internal details of the PCI device
from the code to build a SRAT Generic Initiator Affinity Structure with
PCI Device Handle.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916171017.
1841767-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:10:09 +0000 (18:10 +0100)]
hw/acpi: Rename build_all_acpi_generic_initiators() to build_acpi_generic_initiator()
Igor noted that this function only builds one instance, so was rather
misleadingly named. Fix that.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916171017.
1841767-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:10:08 +0000 (18:10 +0100)]
hw/acpi: Move AML building code for Generic Initiators to aml_build.c
Rather than attempting to create a generic function with mess of the two
different device handle types, use a PCI handle specific variant. If the
ACPI handle form is needed then that can be introduced alongside this
with little duplicated code.
Drop the PCIDeviceHandle in favor of just passing the bus, devfn
and segment directly. devfn kept as a single byte because ARI means
that in this case it is just an 8 bit function number.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20240618142333.102be976@imammedo.users.ipa.redhat.com/
Tested-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916171017.
1841767-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:10:07 +0000 (18:10 +0100)]
hw/acpi/GI: Fix trivial parameter alignment issue.
Before making additional modification, tidy up this misleading indentation.
Reviewed-by: Ankit Agrawal <ankita@nvidia.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916171017.
1841767-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jonathan Cameron [Mon, 16 Sep 2024 17:10:06 +0000 (18:10 +0100)]
hw/acpi: Fix ordering of BDF in Generic Initiator PCI Device Handle.
The ordering in ACPI specification [1] has bus number in the lowest byte.
As ACPI tables are little endian this is the reverse of the ordering
used by PCI_BUILD_BDF(). As a minimal fix split the QEMU BDF up
into bus and devfn and write them as single bytes in the correct
order.
[1] ACPI Spec 6.3, Table 5.80
Fixes: 0a5b5acdf2d8 ("hw/acpi: Implement the SRAT GI affinity structure")
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <
20240916171017.
1841767-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
luzhixing12345 [Wed, 11 Sep 2024 06:04:00 +0000 (14:04 +0800)]
docs: fix vhost-user protocol doc
Some editorial tweaks to the doc:
Add a ref link to Memory region description and Multiple Memory region
description.
Descriptions about memory regions are merged into one line.
Add extra type(64 bits) to Log description structure fields
Fix ’s to 's
Signed-off-by: luzhixing12345 <luzhixing12345@gmail.com>
Message-Id: <
20240911060400.3472-1-luzhixing12345@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Mattias Nissler [Tue, 10 Sep 2024 21:35:12 +0000 (14:35 -0700)]
softmmu: Expand comments describing max_bounce_buffer_size
Clarify how the parameter gets configured and how it is used when
servicing DMA mapping requests targeting indirect memory regions.
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Message-Id: <
20240910213512.843130-1-mnissler@rivosinc.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Peter Maydell [Thu, 31 Oct 2024 16:34:25 +0000 (16:34 +0000)]
Merge tag 'pull-riscv-to-apply-
20241031-1' of https://github.com/alistair23/qemu into staging
RISC-V PR for 9.2
* Fix an access to VXSAT
* Expose RV32 cpu to RV64 QEMU
* Don't clear PLIC pending bits on IRQ lowering
* Make PLIC zeroth priority register read-only
* Set vtype.vill on CPU reset
* Check and update APLIC pending when write sourcecfg
* Avoid dropping charecters with HTIF
* Apply FIFO backpressure to guests using SiFive UART
* Support for control flow integrity extensions
* Support for the IOMMU with the virt machine
* set 'aia_mode' to default in error path
* clarify how 'riscv-aia' default works
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmci/tQACgkQr3yVEwxT
# gBNPAQ//dZKjjJm4Sh+UFdUslivBJYtL1rl2UUG2UqiNn/UoYh/vcHoSArljHTjt
# 8riEStnaQqXziOpMIJjIMLJ4KoiIk2SMvjNfFtcmPiPZEDEpjsTxfUxBFsBee+fI
# 4KNQKKFeljq4pa+VzVvXEqzCNJIzCThFXTZhZmer00M91HPA8ZQIHpv2JL1sWlgZ
# /HW24XEDFLGc/JsR55fxpPftlAqP+BfOrqMmbWy7x2Y+G8WI05hM2zTP/W8pnIz3
# z0GCRYSBlADtrp+3RqzTwQfK5pXoFc0iDktWVYlhoXaeEmOwo8IYxTjrvBGhnBq+
# ySX1DzTa23QmOIxSYYvCRuOxyOK9ziNn+EQ9FiFBt1h1o251CYMil1bwmYXMCMNJ
# rZwF1HfUx0g2GQW1ZOqh1eeyLO29JiOdV3hxlDO7X4bbISNgU6il5MXmnvf0/XVW
# Af3YhALeeDbHgHL1iVfjafzaviQc9+YrEX13eX6N2AjcgE5a3F7XNmGfFpFJ+mfQ
# CPgiwVBXat6UpBUGAt14UM+6wzp+crSgQR5IEGth+mKMKdkWoykvo7A2oHdu39zn
# 2cdzsshg2qcLLUPTFy06OOTXX382kCWXuykhHOjZ4uu2SJJ7R0W3PlYV8HSde2Vu
# Rj+89ZlUSICJNXXweQB39r87hNbtRuDIO22V0B9XrApQbJj6/yE=
# =rPaa
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 31 Oct 2024 03:51:48 GMT
# gpg: using RSA key
6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-
20241031-1' of https://github.com/alistair23/qemu: (50 commits)
target/riscv: Fix vcompress with rvv_ta_all_1s
target/riscv/kvm: clarify how 'riscv-aia' default works
target/riscv/kvm: set 'aia_mode' to default in error path
docs/specs: add riscv-iommu
qtest/riscv-iommu-test: add init queues test
hw/riscv/riscv-iommu: add DBG support
hw/riscv/riscv-iommu: add ATS support
hw/riscv/riscv-iommu: add Address Translation Cache (IOATC)
test/qtest: add riscv-iommu-pci tests
hw/riscv/virt.c: support for RISC-V IOMMU PCIDevice hotplug
hw/riscv: add riscv-iommu-pci reference device
pci-ids.rst: add Red Hat pci-id for RISC-V IOMMU device
hw/riscv: add RISC-V IOMMU base emulation
hw/riscv: add riscv-iommu-bits.h
exec/memtxattr: add process identifier to the transaction attributes
target/riscv: Expose zicfiss extension as a cpu property
disas/riscv: enable disassembly for compressed sspush/sspopchk
disas/riscv: enable disassembly for zicfiss instructions
target/riscv: compressed encodings for sspush and sspopchk
target/riscv: implement zicfiss instructions
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Thu, 31 Oct 2024 13:28:57 +0000 (13:28 +0000)]
Merge tag 'pull-target-arm-
20241029' of https://git.linaro.org/people/pmaydell/qemu-arm into staging
target-arm queue:
* arm/kvm: add support for MTE
* docs/system/cpu-hotplug: Update example's socket-id/core-id
* target/arm: Store FPSR cumulative exception bits in env->vfp.fpsr
* target/arm: Don't assert in regime_is_user() for E10 mmuidx values
* hw/sd/omap_mmc: Fix breakage of OMAP MMC controller
* tests/functional: Add functional tests for collie, sx1
* scripts/symlink-install-tree.py: Fix MESONINTROSPECT parsing
* docs/system/arm: Document remaining undocumented boards
* target/arm: Fix arithmetic underflow in SETM instruction
* docs/devel/reset: Fix minor grammatical error
* target/arm: kvm: require KVM_CAP_DEVICE_CTRL
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmcg+oYZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3g/KD/4tzAD2zkWpnIPhY5ht4wBz
# Kioy+pnXJW5I6pAS4ljnI41pOFnPr6Ln1NfGkP+9pTND8lIQNY0Te2a/NjgEiYJc
# rYJ/A6UUuCqQ8+/oWWMPETcbbiKcSS2mzCJ/pNXeIquK5Co0Qk7mzdfObudwZpbw
# o3Cc9YrGZc64XAl2Rb83Oy2UHo1xjmV67wtEmcj+hmWC+tFc7pQpAKwIKcBMgns8
# ZILexX18RYZMDqQZQ5tvwTccJeFmljj9PyScou787RXK93BlF3sL/ypq1xMykRru
# JpMwAI6jD5LG9NO2zNr3FpBef8sJXqNF+O0DcYmhrKBwRkztuEU6DXF6xzdz/HRa
# c14hWK1jHku+HvKBXx3c5wibTbTU71Jv36Gw5VjOBQe/5cdKJAbZw8OH+IK8ozk9
# GwLVQ/JzrIi5m8FwXPwmkOPLX/CY8Wot6IWdJKKGTN8bY+9Cu2gTduFJIvi96HWU
# xkG1ySN61wKUR8Z26mizim2nBvQjybjqKEhrtQ21K548j4pWFVBgXJQX0Menca/v
# ziSLCd84Pmh9+DtElPCUyau/nX/jyUJ1gCScvcJjF5jAMPBREpAh53j/GL9JEgX6
# 9cX2WG6o+9R4Qcrh1O3Vy1bAUcJ27Tr2NitD+g5XObZ+vC6YgqfN2/M53so4rwws
# N4KCRdV6GcU70bQAul3mLQ==
# =KWM2
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 29 Oct 2024 15:08:54 GMT
# gpg: using RSA key
E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# gpg: aka "Peter Maydell <peter@archaic.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* tag 'pull-target-arm-
20241029' of https://git.linaro.org/people/pmaydell/qemu-arm:
target/arm: kvm: require KVM_CAP_DEVICE_CTRL
docs/devel/reset: Fix minor grammatical error
target/arm: Fix arithmetic underflow in SETM instruction
docs/system/target-arm.rst: Remove "many boards are undocumented" note
docs/system/arm: Add placeholder docs for mcimx6ul-evk and mcimx7d-sabre
docs/system/arm: Add placeholder doc for xlnx-zcu102 board
docs/system/arm: Add placeholder doc for exynos4 boards
docs/system/arm: Split fby35 out from aspeed.rst
docs/system/arm: Don't use wildcard '*-bmc' in doc titles
docs/system/arm/stm32: List olimex-stm32-h405 in document title
scripts/symlink-install-tree.py: Fix MESONINTROSPECT parsing
tests/functional: Add a functional test for the sx1 board
tests/functional: Add a functional test for the collie board
hw/sd/omap_mmc: Don't use sd_cmd_type_t
target/arm: Don't assert in regime_is_user() for E10 mmuidx values
target/arm: Store FPSR cumulative exception bits in env->vfp.fpsr
docs/system/cpu-hotplug: Update example's socket-id/core-id
arm/kvm: add support for MTE
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Anton Blanchard [Wed, 30 Oct 2024 04:35:38 +0000 (15:35 +1100)]
target/riscv: Fix vcompress with rvv_ta_all_1s
vcompress packs vl or less fields into vd, so the tail starts after the
last packed field. This could be more clearly expressed in the ISA,
but for now this thread helps to explain it:
https://github.com/riscv/riscv-v-spec/issues/796
Signed-off-by: Anton Blanchard <antonb@tenstorrent.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241030043538.939712-1-antonb@tenstorrent.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Daniel Henrique Barboza [Mon, 28 Oct 2024 18:20:37 +0000 (15:20 -0300)]
target/riscv/kvm: clarify how 'riscv-aia' default works
We do not have control in the default 'riscv-aia' default value. We can
try to set it to a specific value, in this case 'auto', but there's no
guarantee that the host will accept it.
Couple with this we're always doing a 'qemu_log' to inform whether we're
ended up using the host default or if we managed to set the AIA mode to
the QEMU default we wanted to set.
Change the 'riscv-aia' description to better reflect how the option
works, and remove the two informative 'qemu_log' that are now unneeded:
if no message shows, riscv-aia was set to the default or uset-set value.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241028182037.290171-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Daniel Henrique Barboza [Mon, 28 Oct 2024 18:20:36 +0000 (15:20 -0300)]
target/riscv/kvm: set 'aia_mode' to default in error path
When failing to set the selected AIA mode, 'aia_mode' is left untouched.
This means that 'aia_mode' will not reflect the actual AIA mode,
retrieved in 'default_aia_mode',
This is benign for now, but it will impact QMP query commands that will
expose the 'aia_mode' value, retrieving the wrong value.
Set 'aia_mode' to 'default_aia_mode' if we fail to change the AIA mode
in KVM.
While we're at it, rework the log/warning messages to be a bit less
verbose. Instead of:
KVM AIA: default mode is emul
qemu-system-riscv64: warning: KVM AIA: failed to set KVM AIA mode
We can use a single warning message:
qemu-system-riscv64: warning: KVM AIA: failed to set KVM AIA mode 'auto', using default host mode 'emul'
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241028182037.290171-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Daniel Henrique Barboza [Wed, 16 Oct 2024 20:40:36 +0000 (17:40 -0300)]
docs/specs: add riscv-iommu
Add a simple guideline to use the existing RISC-V IOMMU support we just
added.
This doc will be updated once we add the riscv-iommu-sys device.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241016204038.649340-13-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Daniel Henrique Barboza [Wed, 16 Oct 2024 20:40:35 +0000 (17:40 -0300)]
qtest/riscv-iommu-test: add init queues test
Add an additional test to further exercise the IOMMU where we attempt to
initialize the command, fault and page-request queues.
These steps are taken from chapter 6.2 of the RISC-V IOMMU spec,
"Guidelines for initialization". It emulates what we expect from the
software/OS when initializing the IOMMU.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241016204038.649340-12-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Tomasz Jeznach [Wed, 16 Oct 2024 20:40:34 +0000 (17:40 -0300)]
hw/riscv/riscv-iommu: add DBG support
DBG support adds three additional registers: tr_req_iova, tr_req_ctl and
tr_response.
The DBG cap is always enabled. No on/off toggle is provided for it.
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241016204038.649340-11-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Tomasz Jeznach [Wed, 16 Oct 2024 20:40:33 +0000 (17:40 -0300)]
hw/riscv/riscv-iommu: add ATS support
Add PCIe Address Translation Services (ATS) capabilities to the IOMMU.
This will add support for ATS translation requests in Fault/Event
queues, Page-request queue and IOATC invalidations.
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241016204038.649340-10-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Tomasz Jeznach [Wed, 16 Oct 2024 20:40:32 +0000 (17:40 -0300)]
hw/riscv/riscv-iommu: add Address Translation Cache (IOATC)
The RISC-V IOMMU spec predicts that the IOMMU can use translation caches
to hold entries from the DDT. This includes implementation for all cache
commands that are marked as 'not implemented'.
There are some artifacts included in the cache that predicts s-stage and
g-stage elements, although we don't support it yet. We'll introduce them
next.
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241016204038.649340-9-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Daniel Henrique Barboza [Wed, 16 Oct 2024 20:40:31 +0000 (17:40 -0300)]
test/qtest: add riscv-iommu-pci tests
To test the RISC-V IOMMU emulation we'll use its PCI representation.
Create a new 'riscv-iommu-pci' libqos device that will be present with
CONFIG_RISCV_IOMMU. This config is only available for RISC-V, so this
device will only be consumed by the RISC-V libqos machine.
Start with basic tests: a PCI sanity check and a reset state register
test. The reset test was taken from the RISC-V IOMMU spec chapter 5.2,
"Reset behavior".
More tests will be added later.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241016204038.649340-8-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Tomasz Jeznach [Wed, 16 Oct 2024 20:40:30 +0000 (17:40 -0300)]
hw/riscv/virt.c: support for RISC-V IOMMU PCIDevice hotplug
Generate device tree entry for riscv-iommu PCI device, along with
mapping all PCI device identifiers to the single IOMMU device instance.
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241016204038.649340-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Tomasz Jeznach [Wed, 16 Oct 2024 20:40:29 +0000 (17:40 -0300)]
hw/riscv: add riscv-iommu-pci reference device
The RISC-V IOMMU can be modelled as a PCIe device following the
guidelines of the RISC-V IOMMU spec, chapter 7.1, "Integrating an IOMMU
as a PCIe device".
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241016204038.649340-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Daniel Henrique Barboza [Wed, 16 Oct 2024 20:40:28 +0000 (17:40 -0300)]
pci-ids.rst: add Red Hat pci-id for RISC-V IOMMU device
The RISC-V IOMMU PCI device we're going to add next is a reference
implementation of the riscv-iommu spec [1], which predicts that the
IOMMU can be implemented as a PCIe device.
However, RISC-V International (RVI), the entity that ratified the
riscv-iommu spec, didn't bother assigning a PCI ID for this IOMMU PCIe
implementation that the spec predicts. This puts us in an uncommon
situation because we want to add the reference IOMMU PCIe implementation
but we don't have a PCI ID for it.
Given that RVI doesn't provide a PCI ID for it we reached out to Red Hat
and Gerd Hoffman, and they were kind enough to give us a PCI ID for the
RISC-V IOMMU PCI reference device.
Thanks Red Hat and Gerd for this RISC-V IOMMU PCIe device ID.
[1] https://github.com/riscv-non-isa/riscv-iommu/releases/tag/v1.0.0
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <
20241016204038.649340-5-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Tomasz Jeznach [Wed, 16 Oct 2024 20:40:27 +0000 (17:40 -0300)]
hw/riscv: add RISC-V IOMMU base emulation
The RISC-V IOMMU specification is now ratified as-per the RISC-V
international process. The latest frozen specifcation can be found at:
https://github.com/riscv-non-isa/riscv-iommu/releases/download/v1.0/riscv-iommu.pdf
Add the foundation of the device emulation for RISC-V IOMMU. It includes
support for s-stage (sv32, sv39, sv48, sv57 caps) and g-stage (sv32x4,
sv39x4, sv48x4, sv57x4 caps).
Other capabilities like ATS and DBG support will be added incrementally
in the next patches.
Co-developed-by: Sebastien Boeuf <seb@rivosinc.com>
Signed-off-by: Sebastien Boeuf <seb@rivosinc.com>
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Jason Chien <jason.chien@sifive.com>
Message-ID: <
20241016204038.649340-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Tomasz Jeznach [Wed, 16 Oct 2024 20:40:26 +0000 (17:40 -0300)]
hw/riscv: add riscv-iommu-bits.h
This header will be used by the RISC-V IOMMU emulation to be added
in the next patch. Due to its size it's being sent in separate for
an easier review.
One thing to notice is that this header can be replaced by the future
Linux RISC-V IOMMU driver header, which would become a linux-header we
would import instead of keeping our own. The Linux implementation isn't
upstream yet so for now we'll have to manage riscv-iommu-bits.h.
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241016204038.649340-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Tomasz Jeznach [Wed, 16 Oct 2024 20:40:25 +0000 (17:40 -0300)]
exec/memtxattr: add process identifier to the transaction attributes
Extend memory transaction attributes with process identifier to allow
per-request address translation logic to use requester_id / process_id
to identify memory mapping (e.g. enabling IOMMU w/ PASID translations).
Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <
20241016204038.649340-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:50:10 +0000 (15:50 -0700)]
target/riscv: Expose zicfiss extension as a cpu property
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-21-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:50:09 +0000 (15:50 -0700)]
disas/riscv: enable disassembly for compressed sspush/sspopchk
sspush and sspopchk have equivalent compressed encoding taken from zcmop.
cmop.1 is sspush x1 while cmop.5 is sspopchk x5. Due to unusual encoding
for both rs1 and rs2 from space bitfield, this required a new codec.
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-20-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:50:08 +0000 (15:50 -0700)]
disas/riscv: enable disassembly for zicfiss instructions
Enable disassembly for sspush, sspopchk, ssrdp & ssamoswap.
Disasembly is only enabled if zimop and zicfiss ext is set to true.
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-19-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:50:07 +0000 (15:50 -0700)]
target/riscv: compressed encodings for sspush and sspopchk
sspush/sspopchk have compressed encodings carved out of zcmops.
compressed sspush is designated as c.mop.1 while compressed sspopchk
is designated as c.mop.5.
Note that c.sspush x1 exists while c.sspush x5 doesn't. Similarly
c.sspopchk x5 exists while c.sspopchk x1 doesn't.
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Co-developed-by: Jim Shu <jim.shu@sifive.com>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-18-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:50:06 +0000 (15:50 -0700)]
target/riscv: implement zicfiss instructions
zicfiss has following instructions
- sspopchk: pops a value from shadow stack and compares with x1/x5.
If they dont match, reports a sw check exception with tval = 3.
- sspush: pushes value in x1/x5 on shadow stack
- ssrdp: reads current shadow stack
- ssamoswap: swaps contents of shadow stack atomically
sspopchk/sspush/ssrdp default to zimop if zimop implemented and SSE=0
If SSE=0, ssamoswap is illegal instruction exception.
This patch implements shadow stack operations for qemu-user and shadow
stack is not protected.
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Co-developed-by: Jim Shu <jim.shu@sifive.com>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-17-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:50:05 +0000 (15:50 -0700)]
target/riscv: update `decode_save_opc` to store extra word2
Extra word 2 is stored during tcg compile and `decode_save_opc` needs
additional argument in order to pass the value. This will be used during
unwind to get extra information about instruction like how to massage
exceptions. Updated all callsites as well.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/594
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-16-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:50:04 +0000 (15:50 -0700)]
target/riscv: AMO operations always raise store/AMO fault
This patch adds one more word for tcg compile which can be obtained during
unwind time to determine fault type for original operation (example AMO).
Depending on that, fault can be promoted to store/AMO fault.
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-15-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:50:03 +0000 (15:50 -0700)]
target/riscv: mmu changes for zicfiss shadow stack protection
zicfiss protects shadow stack using new page table encodings PTE.W=1,
PTE.R=0 and PTE.X=0. This encoding is reserved if zicfiss is not
implemented or if shadow stack are not enabled.
Loads on shadow stack memory are allowed while stores to shadow stack
memory leads to access faults. Shadow stack accesses to RO memory
leads to store page fault.
To implement special nature of shadow stack memory where only selected
stores (shadow stack stores from sspush) have to be allowed while rest
of regular stores disallowed, new MMU TLB index is created for shadow
stack.
Furthermore, `check_zicbom_access` (`cbo.clean/flush/inval`) may probe
shadow stack memory and must always raise store/AMO access fault because
it has store semantics. For non-shadow stack memory even though
`cbo.clean/flush/inval` have store semantics, it will not fault if read
is allowed (probably to follow `clflush` on x86). Although if read is not
allowed, eventually `probe_write` will do store page (or access) fault (if
permissions don't allow it). cbo operations on shadow stack memory must
always raise store access fault. Thus extending `get_physical_address` to
recieve `probe` parameter as well.
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-14-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:50:02 +0000 (15:50 -0700)]
target/riscv: tb flag for shadow stack instructions
Shadow stack instructions can be decoded as zimop / zcmop or shadow stack
instructions depending on whether shadow stack are enabled at current
privilege. This requires a TB flag so that correct TB generation and correct
TB lookup happens. `DisasContext` gets a field indicating whether bcfi is
enabled or not.
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Co-developed-by: Jim Shu <jim.shu@sifive.com>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-13-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:50:01 +0000 (15:50 -0700)]
target/riscv: introduce ssp and enabling controls for zicfiss
zicfiss introduces a new state ssp ("shadow stack register") in cpu.
ssp is expressed as a new unprivileged csr (CSR_SSP=0x11) and holds
virtual address for shadow stack as programmed by software.
Shadow stack (for each mode) is enabled via bit3 in *envcfg CSRs.
Shadow stack can be enabled for a mode only if it's higher privileged
mode had it enabled for itself. M mode doesn't need enabling control,
it's always available if extension is available on cpu.
This patch also implements helper bcfi function which determines if bcfi
is enabled at current privilege or not.
Adds ssp to migration state as well.
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Co-developed-by: Jim Shu <jim.shu@sifive.com>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-12-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:50:00 +0000 (15:50 -0700)]
target/riscv: Add zicfiss extension
zicfiss [1] riscv cpu extension enables backward control flow integrity.
This patch sets up space for zicfiss extension in cpuconfig. And imple-
ments dependency on A, zicsr, zimop and zcmop extensions.
[1] - https://github.com/riscv/riscv-cfi
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Co-developed-by: Jim Shu <jim.shu@sifive.com>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-11-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:49:59 +0000 (15:49 -0700)]
target/riscv: Expose zicfilp extension as a cpu property
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-10-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:49:58 +0000 (15:49 -0700)]
disas/riscv: enable `lpad` disassembly
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Co-developed-by: Jim Shu <jim.shu@sifive.com>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-9-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:49:57 +0000 (15:49 -0700)]
target/riscv: zicfilp `lpad` impl and branch tracking
Implements setting lp expected when `jalr` is encountered and implements
`lpad` instruction of zicfilp. `lpad` instruction is taken out of
auipc x0, <imm_20>. This is an existing HINTNOP space. If `lpad` is
target of an indirect branch, cpu checks for 20 bit value in x7 upper
with 20 bit value embedded in `lpad`. If they don't match, cpu raises a
sw check exception with tval = 2.
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Co-developed-by: Jim Shu <jim.shu@sifive.com>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-8-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:49:56 +0000 (15:49 -0700)]
target/riscv: tracking indirect branches (fcfi) for zicfilp
zicfilp protects forward control flow (if enabled) by enforcing all
indirect call and jmp must land on a landing pad instruction `lpad`. If
target of an indirect call or jmp is not `lpad` then cpu/hart must raise
a sw check exception with tval = 2.
This patch implements the mechanism using TCG. Target architecture branch
instruction must define the end of a TB. Using this property, during
translation of branch instruction, TB flag = FCFI_LP_EXPECTED can be set.
Translation of target TB can check if FCFI_LP_EXPECTED flag is set and a
flag (fcfi_lp_expected) can be set in DisasContext. If `lpad` gets
translated, fcfi_lp_expected flag in DisasContext can be cleared. Else
it'll fault.
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Co-developed-by: Jim Shu <jim.shu@sifive.com>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-7-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:49:55 +0000 (15:49 -0700)]
target/riscv: additional code information for sw check
sw check exception support was recently added. This patch further augments
sw check exception by providing support for additional code which is
provided in *tval. Adds `sw_check_code` field in cpuarchstate. Whenever
sw check exception is raised *tval gets the value deposited in
`sw_check_code`.
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-6-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:49:54 +0000 (15:49 -0700)]
target/riscv: save and restore elp state on priv transitions
elp state is recorded in *status on trap entry (less privilege to higher
privilege) and restored in elp from *status on trap exit (higher to less
privilege).
Additionally this patch introduces a forward cfi helper function to
determine if current privilege has forward cfi is enabled or not based on
*envcfg (for U, VU, S, VU, HS) or mseccfg csr (for M).
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Co-developed-by: Jim Shu <jim.shu@sifive.com>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-5-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:49:53 +0000 (15:49 -0700)]
target/riscv: Introduce elp state and enabling controls for zicfilp
zicfilp introduces a new state elp ("expected landing pad") in cpu.
During normal execution, elp is idle (NO_LP_EXPECTED) i.e not expecting
landing pad. On an indirect call, elp moves LP_EXPECTED. When elp is
LP_EXPECTED, only a subsquent landing pad instruction can set state back
to NO_LP_EXPECTED. On reset, elp is set to NO_LP_EXPECTED.
zicfilp is enabled via bit2 in *envcfg CSRs. Enabling control for M-mode
is in mseccfg CSR at bit position 10.
On trap, elp state is saved away in *status.
Adds elp to the migration state as well.
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Co-developed-by: Jim Shu <jim.shu@sifive.com>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-4-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:49:52 +0000 (15:49 -0700)]
target/riscv: Add zicfilp extension
zicfilp [1] riscv cpu extension enables forward control flow integrity.
If enabled, all indirect calls must land on a landing pad instruction.
This patch sets up space for zicfilp extension in cpuconfig. zicfilp
is dependend on zicsr.
[1] - https://github.com/riscv/riscv-cfi
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Co-developed-by: Jim Shu <jim.shu@sifive.com>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-3-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Deepak Gupta [Tue, 8 Oct 2024 22:49:51 +0000 (15:49 -0700)]
target/riscv: expose *envcfg csr and priv to qemu-user as well
Execution environment config CSR controlling user env and current
privilege state shouldn't be limited to qemu-system only. *envcfg
CSRs control enabling of features in next lesser mode. In some cases
bits *envcfg CSR can be lit up by kernel as part of kernel policy or
software (user app) can choose to opt-in by issuing a system call
(e.g. prctl). In case of qemu-user, it should be no different because
qemu is providing underlying execution environment facility and thus
either should provide some default value in *envcfg CSRs or react to
system calls (prctls) initiated from application. priv is set to PRV_U
and menvcfg/senvcfg set to 0 for qemu-user on reest.
`henvcfg` has been left for qemu-system only because it is not expected
that someone will use qemu-user where application is expected to have
hypervisor underneath which is controlling its execution environment. If
such a need arises then `henvcfg` could be exposed as well.
Relevant discussion:
https://lore.kernel.org/all/CAKmqyKOTVWPFep2msTQVdUmJErkH+bqCcKEQ4hAnyDFPdWKe0Q@mail.gmail.com/
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <
20241008225010.
1861630-2-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>