The current model of memory mapping at the back-end works fine where a standard call to mmap() (for the respective file descriptor) is enough before the front-end can start accessing the guest memory.
There are other, more complex cases though, where the back-end needs more information and a simple mmap() isn't enough. For example, Xen, a type-1 hypervisor, currently supports memory mapping via two different methods: foreign-mapping (via /dev/privcmd) and grant-dev (via /dev/gntdev). In both cases, the back-end needs to call mmap() and ioctl(), and needs to pass extra information via the ioctl(), such as the Xen domain-id of the guest whose memory we are trying to map.
Add a new protocol feature, 'VHOST_USER_PROTOCOL_F_XEN_MMAP', which lets the back-end know about the additional memory mapping requirements. When this feature is negotiated, the front-end can send the 'VHOST_USER_SET_XEN_MMAP' message type to provide the additional information to the back-end.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
V1->V2:
- Make the custom mmap feature Xen specific, instead of being generic.
- Clearly define which memory regions are impacted by this change.
- Allow VHOST_USER_SET_XEN_MMAP to be called multiple times.
- Additional Bit(2) property in flags.
 docs/interop/vhost-user.rst | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 3f18ab424eb0..8be5f5eae941 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -258,6 +258,24 @@ Inflight description
 
 :queue size: a 16-bit size of virtqueues
 
+Xen mmap description
+^^^^^^^^^^^^^^^^^^^^
+
++-------+-------+
+| flags | domid |
++-------+-------+
+
+:flags: 64-bit bit field
+
+- Bit 0 is set for Xen foreign memory mapping.
+- Bit 1 is set for Xen grant memory mapping.
+- Bit 2 is set if the back-end can directly map additional memory (like
+  descriptor buffers or indirect descriptors, which aren't part of already
+  shared memory regions) without the need of front-end sending an additional
+  memory region first.
+
+:domid: a 64-bit Xen hypervisor specific domain id.
+
 C structure
 -----------
 
@@ -867,6 +885,7 @@ Protocol features
   #define VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS 14
   #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS 15
   #define VHOST_USER_PROTOCOL_F_STATUS 16
+  #define VHOST_USER_PROTOCOL_F_XEN_MMAP 17
 
 Front-end message types
 -----------------------
@@ -1422,6 +1441,23 @@ Front-end message types
   query the back-end for its device status as defined in the Virtio
   specification.
 
+``VHOST_USER_SET_XEN_MMAP``
+  :id: 41
+  :equivalent ioctl: N/A
+  :request payload: Xen mmap description
+  :reply payload: N/A
+
+  When the ``VHOST_USER_PROTOCOL_F_XEN_MMAP`` protocol feature has been
+  successfully negotiated, this message is submitted by the front-end to set the
+  Xen hypervisor specific memory mapping configurations at the back-end. These
+  configurations should be used to mmap memory regions, virtqueues, descriptors
+  and descriptor buffers. The front-end must send this message before any
+  memory-regions are sent to the back-end via ``VHOST_USER_SET_MEM_TABLE`` or
+  ``VHOST_USER_ADD_MEM_REG`` message types. The front-end can send this message
+  multiple times, if different mmap configurations are required for different
+  memory regions, where the most recent ``VHOST_USER_SET_XEN_MMAP`` must be used
+  by the back-end to map any newly shared memory regions.
+
 Back-end message types
 ----------------------
On Mon, Mar 06, 2023 at 04:40:24PM +0530, Viresh Kumar wrote:
diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 3f18ab424eb0..8be5f5eae941 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -258,6 +258,24 @@ Inflight description
 
 :queue size: a 16-bit size of virtqueues
 
+Xen mmap description
+^^^^^^^^^^^^^^^^^^^^
+
++-------+-------+
+| flags | domid |
++-------+-------+
+
+:flags: 64-bit bit field
+
+- Bit 0 is set for Xen foreign memory mapping.
+- Bit 1 is set for Xen grant memory mapping.
+- Bit 2 is set if the back-end can directly map additional memory (like
+  descriptor buffers or indirect descriptors, which aren't part of already
+  shared memory regions) without the need of front-end sending an additional
+  memory region first.
I don't understand what Bit 2 does. Can you rephrase this? It's unclear to me how additional memory can be mapped without a memory region (especially the fd) being sent.
+:domid: a 64-bit Xen hypervisor specific domain id.
+
 C structure
 -----------
 
@@ -867,6 +885,7 @@ Protocol features
   #define VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS 14
   #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS 15
   #define VHOST_USER_PROTOCOL_F_STATUS 16
+  #define VHOST_USER_PROTOCOL_F_XEN_MMAP 17
 
 Front-end message types
 -----------------------
@@ -1422,6 +1441,23 @@ Front-end message types
   query the back-end for its device status as defined in the Virtio
   specification.
 
+``VHOST_USER_SET_XEN_MMAP``
+  :id: 41
+  :equivalent ioctl: N/A
+  :request payload: Xen mmap description
+  :reply payload: N/A
+
+  When the ``VHOST_USER_PROTOCOL_F_XEN_MMAP`` protocol feature has been
+  successfully negotiated, this message is submitted by the front-end to set the
+  Xen hypervisor specific memory mapping configurations at the back-end. These
+  configurations should be used to mmap memory regions, virtqueues, descriptors
+  and descriptor buffers. The front-end must send this message before any
+  memory-regions are sent to the back-end via ``VHOST_USER_SET_MEM_TABLE`` or
+  ``VHOST_USER_ADD_MEM_REG`` message types. The front-end can send this message
+  multiple times, if different mmap configurations are required for different
+  memory regions, where the most recent ``VHOST_USER_SET_XEN_MMAP`` must be used
+  by the back-end to map any newly shared memory regions.
This message modifies the behavior of subsequent VHOST_USER_SET_MEM_TABLE and VHOST_USER_ADD_MEM_REG messages. The memory region structs can be extended and then VHOST_USER_SET_XEN_MMAP isn't needed.
In other words:
When VHOST_USER_PROTOCOL_F_XEN_MMAP is negotiated, each "Memory regions description" and "Single memory region description" has the following additional fields appended:
+----------------+-------+
| xen_mmap_flags | domid |
+----------------+-------+

:xen_mmap_flags: 64-bit bit field
:domid: a 64-bit Xen hypervisor specific domain id.
Stefan
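For readers following along, the suggestion above (appending the Xen fields to each memory region description instead of adding a new message) could look roughly like the C sketch below. This is only an illustration under assumptions: the struct name, the flag macro names, and the helper functions are hypothetical and do not come from the patch or from QEMU. The first four fields mirror the existing vhost-user memory region description (guest address, size, user address, mmap offset), and the last two are the appended fields from the table above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the flag bits described in the patch. */
#define XEN_MMAP_FLAG_FOREIGN (UINT64_C(1) << 0) /* bit 0: foreign mapping */
#define XEN_MMAP_FLAG_GRANT   (UINT64_C(1) << 1) /* bit 1: grant mapping */

/*
 * Hypothetical layout of a memory region description once
 * VHOST_USER_PROTOCOL_F_XEN_MMAP is negotiated: the four existing
 * 64-bit fields, followed by the two appended 64-bit fields.
 */
struct xen_mem_region_desc {
    uint64_t guest_address;
    uint64_t size;
    uint64_t user_address;
    uint64_t mmap_offset;
    uint64_t xen_mmap_flags; /* appended */
    uint64_t domid;          /* appended */
};

/* All members are 64-bit, so no padding is expected on the wire. */
static_assert(sizeof(struct xen_mem_region_desc) == 6 * sizeof(uint64_t),
              "unexpected padding in xen_mem_region_desc");

/* True if the region should be mapped through the foreign-mapping path. */
static inline bool region_uses_foreign(const struct xen_mem_region_desc *r)
{
    return (r->xen_mmap_flags & XEN_MMAP_FLAG_FOREIGN) != 0;
}

/* True if the region should be mapped through the grant-dev path. */
static inline bool region_uses_grant(const struct xen_mem_region_desc *r)
{
    return (r->xen_mmap_flags & XEN_MMAP_FLAG_GRANT) != 0;
}
```

Carrying the flags and domain id per region means the back-end does not have to track which earlier message applies to which region, which is the ordering dependency the review comment points at.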
On 06-03-23, 10:34, Stefan Hajnoczi wrote:
On Mon, Mar 06, 2023 at 04:40:24PM +0530, Viresh Kumar wrote:
+Xen mmap description
+^^^^^^^^^^^^^^^^^^^^
+
++-------+-------+
+| flags | domid |
++-------+-------+
+
+:flags: 64-bit bit field
+
+- Bit 0 is set for Xen foreign memory mapping.
+- Bit 1 is set for Xen grant memory mapping.
+- Bit 2 is set if the back-end can directly map additional memory (like
+  descriptor buffers or indirect descriptors, which aren't part of already
+  shared memory regions) without the need of front-end sending an additional
+  memory region first.
I don't understand what Bit 2 does. Can you rephrase this? It's unclear to me how additional memory can be mapped without a memory region (especially the fd) being sent.
I (somehow) assumed we would be able to use the same file descriptor that was shared for the virtqueue memory regions, and yes, I can see now why that wouldn't work or would create problems.
I need suggestions now on how to make this work.
With Xen grants, the front-end receives grant addresses from the guest kernel. They aren't physical addresses; it's kind of IOMMU stuff.
The back-end initially gets access only to the memory regions of the virtqueues. When the back-end gets a request, it reads the descriptor and finds the buffer address, which isn't part of the already shared regions. The same happens for descriptor addresses if the indirect descriptor feature is negotiated.
At this point I was thinking maybe the back-end can simply call mmap/ioctl to map the memory, using the file descriptor used for the virtqueues.
How else can we make this work? We also need to unmap/remove the memory region as soon as the buffer is processed, as the grant address won't be relevant for any subsequent request.
Should I use VHOST_USER_IOTLB_MSG for this? I did look at it and I wasn't convinced it was an exact fit. For example, it says that a memory address reported with miss/access fail should be part of an already sent memory region, which isn't the case here.
This message modifies the behavior of subsequent VHOST_USER_SET_MEM_TABLE and VHOST_USER_ADD_MEM_REG messages. The memory region structs can be extended and then VHOST_USER_SET_XEN_MMAP isn't needed.
In other words:
When VHOST_USER_PROTOCOL_F_XEN_MMAP is negotiated, each "Memory regions description" and "Single memory region description" has the following additional fields appended:
+----------------+-------+
| xen_mmap_flags | domid |
+----------------+-------+
This looks fine.
On Tue, Mar 07, 2023 at 11:13:36AM +0530, Viresh Kumar wrote:
Should I use VHOST_USER_IOTLB_MSG for this? I did look at it and I wasn't convinced it was an exact fit. For example, it says that a memory address reported with miss/access fail should be part of an already sent memory region, which isn't the case here.
VHOST_USER_IOTLB_MSG probably isn't necessary because address translation is not required. It will also reduce performance by adding extra communication.
Instead, you could change the 1 memory region : 1 mmap relationship that existing non-Xen vhost-user back-end implementations have. In Xen vhost-user back-ends, the memory region details (including the file descriptor and Xen domain id) would be stashed away in the back-end when the front-end adds memory regions. No mmap would be performed upon VHOST_USER_ADD_MEM_REG or VHOST_USER_SET_MEM_TABLE.
Whenever the back-end needs to do DMA, it looks up the memory region and performs the mmap + Xen-specific calls:
- A long-lived mmap of the vring is set up when VHOST_USER_SET_VRING_ENABLE is received.
- Short-lived mmaps of the indirect descriptors and memory pointed to by the descriptors are set up by the virtqueue processing code.
Does this sound workable to you?
Stefan
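The approach described above (stash the region details when they arrive, map only when DMA is needed) might be sketched in C as follows. This is a rough illustration, not code from QEMU or any existing back-end: every name here (`stashed_region`, `region_table`, `stash_region`, `lookup_region`, `with_mapping`, and the `map_fn`/`unmap_fn` hooks) is hypothetical, and the hooks stand in for the real mmap() plus Xen-specific ioctl() calls, which are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 8 /* arbitrary limit for this sketch */

/* Details recorded per region; nothing is mapped at this point. */
struct stashed_region {
    uint64_t guest_address;
    uint64_t size;
    uint64_t mmap_offset;
    uint64_t xen_mmap_flags;
    uint64_t domid;
    int fd;
};

struct region_table {
    struct stashed_region regions[MAX_REGIONS];
    size_t count;
};

/* On VHOST_USER_ADD_MEM_REG / VHOST_USER_SET_MEM_TABLE: record only. */
static inline int stash_region(struct region_table *t,
                               const struct stashed_region *r)
{
    assert(t != NULL && r != NULL);
    if (t->count == MAX_REGIONS)
        return -1;
    t->regions[t->count++] = *r;
    return 0;
}

/* Find the stashed region that fully contains [addr, addr + len). */
static inline const struct stashed_region *
lookup_region(const struct region_table *t, uint64_t addr, uint64_t len)
{
    for (size_t i = 0; i < t->count; i++) {
        const struct stashed_region *r = &t->regions[i];
        if (addr >= r->guest_address && len <= r->size &&
            addr - r->guest_address <= r->size - len)
            return r;
    }
    return NULL;
}

/* Hypothetical hooks wrapping mmap() + the Xen-specific ioctl(). */
typedef void *(*map_fn)(const struct stashed_region *r,
                        uint64_t offset, uint64_t len);
typedef void (*unmap_fn)(void *ptr, uint64_t len);

/* Short-lived mapping: map, process the buffer, unmap right away. */
static inline int with_mapping(const struct region_table *t,
                               uint64_t addr, uint64_t len,
                               map_fn map, unmap_fn unmap,
                               void (*process)(void *buf, uint64_t len))
{
    const struct stashed_region *r = lookup_region(t, addr, len);
    if (r == NULL)
        return -1;
    void *p = map(r, addr - r->guest_address, len);
    if (p == NULL)
        return -1;
    process(p, len);
    unmap(p, len);
    return 0;
}
```

A long-lived vring mapping would call the same `map` hook once when VHOST_USER_SET_VRING_ENABLE arrives and keep the pointer, while descriptor buffers would go through something like `with_mapping` per request.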
On 07-03-23, 11:22, Stefan Hajnoczi wrote:
Does this sound workable to you?
Sounds good. I have sent a proposal (v3) based on that now.
stratos-dev@op-lists.linaro.org