On 20-02-23, 07:13, Juergen Gross wrote:
> There are no permission flags in Xen PV device protocols either. The kind of
> mapping (RO or RW) in the backend is selected via the I/O operation: in case it
> is a write type operation (guest writing data to a device), the related grants
> are mapped as RO in the backend, in all other cases they are mapped as RW.
>
> The same applies to granted pages for virtio: the frontend side will grant the
> page as RO in case the I/O operation is flagged as "DMA_TO_DEVICE", and as RW
> in all other cases. The backend should always know which direction the data is
> flowing, so it should be able to do the mapping with the correct access mode.
Right, so the back-end side actually knows the permission details, but
they get lost during some of the vhost-user operations.
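For reference, the direction-based rule quoted above amounts to roughly the
following on the Linux frontend side. This is a minimal sketch modeled on the
kernel's grant-DMA handling; the grant-table helper is the kernel's real API,
everything around it (ref allocation, bookkeeping) is elided.

#include <linux/dma-direction.h>
#include <xen/grant_table.h>

/* Grant one guest frame to the backend domain, deriving the access
 * mode from the DMA direction as described above. */
static void grant_one_page(grant_ref_t ref, domid_t backend_domid,
                           unsigned long gfn, enum dma_data_direction dir)
{
        /* DMA_TO_DEVICE: the device only reads the buffer, so a
         * read-only grant is enough; all other directions need RW. */
        int readonly = (dir == DMA_TO_DEVICE);

        gnttab_grant_foreign_access_ref(ref, backend_domid, gfn, readonly);
}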
Anyway, I have taken this in a different direction now and suggested a
change to the vhost-user protocol itself. That lets the back-end know
that it is actually running on Xen, so it can do the mapping itself
instead of asking the front-end, which means we don't lose the
permission details.
This also lets us write the backends in a hypervisor-agnostic way; the
hypervisor-specific stuff is now handled in the vhost-user protocol's
implementation.
https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg05946.html
--
viresh
The current model of memory mapping at the back-end works fine with
Qemu, where a standard call to mmap() for the respective file
descriptor, passed from the front-end, is generally all we need to do
before the back-end can start accessing the guest memory.
There are other, more complex cases though, where the back-end needs
more information and must do more than just an mmap() call. For example,
Xen, a type-1 hypervisor, currently supports memory mapping via two
different methods: foreign mapping (via /dev/privcmd) and grant mapping
(via /dev/gntdev). In both cases the back-end needs to call mmap()
followed by an ioctl() (or vice-versa), and needs to pass extra
information via the ioctl(), like the Xen domain-id of the guest whose
memory we are trying to map.
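For illustration, the foreign-mapping flow ends up looking roughly like the
sketch below (the grant case is similar, but does the ioctl() first). This is
a sketch only, assuming the Linux/Xen uapi headers and 4K pages; error
handling is trimmed, the helper name is mine, and 'fd' is an open handle to
/dev/privcmd.

#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <xen/privcmd.h>

/* Map 'num' frames of guest 'domid' into our address space. */
static void *foreign_map(int fd, domid_t domid, const xen_pfn_t *gfns,
                         unsigned int num)
{
    size_t len = (size_t)num << 12;    /* assumes 4K pages */
    int err[num];                      /* per-frame error codes */

    /* Step 1: reserve a virtual address range backed by privcmd. */
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED)
        return NULL;

    /* Step 2: the ioctl carries the extra information mentioned
     * above, most notably the domain-id of the guest. */
    struct privcmd_mmapbatch_v2 batch = {
        .num  = num,
        .dom  = domid,
        .addr = (uint64_t)(uintptr_t)addr,
        .arr  = gfns,
        .err  = err,
    };
    if (ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH_V2, &batch) < 0) {
        munmap(addr, len);
        return NULL;
    }
    return addr;
}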
Add a new protocol feature, 'VHOST_USER_PROTOCOL_F_CUSTOM_MMAP', which
lets the back-end know about the additional memory mapping requirements.
When this feature is negotiated, the front-end can send the
'VHOST_USER_CUSTOM_MMAP' message type to provide the additional
information to the back-end.
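In C terms, the payload of the new message (mirroring the "Custom mmap
description" table added below) is just two 64-bit fields; the structure name
here is illustrative:

#include <stdint.h>

/* Illustrative layout of the "Custom mmap description" payload. */
typedef struct VhostUserCustomMmap {
    uint64_t flags; /* bit 0: Xen foreign mapping, bit 1: Xen grant mapping */
    uint64_t value; /* hypervisor-specific; for Xen, the guest's domain-id */
} VhostUserCustomMmap;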
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
docs/interop/vhost-user.rst | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 3f18ab424eb0..f2b1d705593a 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -258,6 +258,23 @@ Inflight description
:queue size: a 16-bit size of virtqueues
+Custom mmap description
+^^^^^^^^^^^^^^^^^^^^^^^
+
++-------+-------+
+| flags | value |
++-------+-------+
+
+:flags: 64-bit bit field
+
+- Bit 0 is the Xen foreign memory access flag - needs Xen foreign memory mapping.
+- Bit 1 is the Xen grant memory access flag - needs Xen grant memory mapping.
+
+:value: a 64-bit hypervisor-specific value.
+
+- For Xen foreign or grant memory access, this is set to the guest's Xen
+  domain id.
+
C structure
-----------
@@ -867,6 +884,7 @@ Protocol features
#define VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS 14
#define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS 15
#define VHOST_USER_PROTOCOL_F_STATUS 16
+ #define VHOST_USER_PROTOCOL_F_CUSTOM_MMAP 17
Front-end message types
-----------------------
@@ -1422,6 +1440,20 @@ Front-end message types
query the back-end for its device status as defined in the Virtio
specification.
+``VHOST_USER_CUSTOM_MMAP``
+ :id: 41
+ :equivalent ioctl: N/A
+ :request payload: Custom mmap description
+ :reply payload: N/A
+
+ When the ``VHOST_USER_PROTOCOL_F_CUSTOM_MMAP`` protocol feature has been
+ successfully negotiated, this message is submitted by the front-end to
+ notify the back-end of the special memory mapping requirements that the
+ back-end must honour while mapping any memory regions sent by the
+ front-end. The front-end must send this message before any memory regions
+ are sent to the back-end via the ``VHOST_USER_SET_MEM_TABLE`` or
+ ``VHOST_USER_ADD_MEM_REG`` message types.
+
Back-end message types
----------------------
--
2.31.1.272.g89b43f80a514
Hi Oleksandr,
As you already know, I am looking at how we can integrate the Xen
grants work in our implementation of Rust based Xen vhost frontend [1].
The hypervisor-independent vhost-user backends [2] talk to the
xen-vhost-frontend using the standard vhost-user protocol [3]. Every
memory region that the backends get access to is sent to them by the
frontend as a memory region descriptor [4], which contains only address
and size information and lacks any permission flags.
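For reference, the descriptor in question [4] carries just these fields, as
laid out in the vhost-user spec; note there is no access-mode or permission
field anywhere:

#include <stdint.h>

/* vhost-user "Memory region description": addresses and sizes only. */
typedef struct VhostUserMemoryRegion {
    uint64_t guest_phys_addr; /* guest physical address of the region */
    uint64_t memory_size;     /* size of the region */
    uint64_t user_addr;       /* front-end's userspace (mmap) address */
    uint64_t mmap_offset;     /* offset into the mapping of the passed fd */
} VhostUserMemoryRegion;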
I noticed that with Xen grants there are strict memory access
restrictions: a memory region may be marked read-only, and we can't map
it as RW anymore; trying to do so simply fails. Because the standard
vhost-user protocol doesn't have any permission flags, the vhost
libraries (in Rust) can't do anything else but try to map everything
as RW.
I am wondering how to proceed on this, as I am very much stuck here.
--
viresh
[1] https://github.com/vireshk/xen-vhost-frontend
[2] https://github.com/rust-vmm/vhost-device
[3] https://qemu.readthedocs.io/en/latest/interop/vhost-user.html
[4] https://qemu.readthedocs.io/en/latest/interop/vhost-user.html#memory-region…
Hi Oleksandr,
Finally I am getting around to Xen grants, though I don't have a running
setup yet. There are a few questions I have at the moment:
- Xen's libxl_arm.c creates the iommu nodes only if the backend isn't in
  Dom0. Why are we forcing it this way?
  I am not running my backend in a separate domain as of now, as it
  needs to share a unix socket with dom0 (with the vhost-user frontend,
  our virtio-disk counterpart) for the vhost-user protocol, and I am not
  sure how to set that up. Maybe I need to use "channel"? Or something
  else?
- I tried to hack it up to keep the backend in Dom0 only and create the
  iommu nodes unconditionally, and the guest kernel crashes in
  drivers/iommu/iommu.c:332:

      iommu_dev = ops->probe_device(dev);

  since grant_dma_iommu_ops has all of its fields set to NULL (see the
  snippet after this list).
- Anything else you might want to share ?
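For completeness, the snippet referred to above: as far as I can see, the
stub driver in drivers/xen/grant-dma-iommu.c registers an iommu_ops structure
that is deliberately left empty, so any core code path that dereferences one
of its callbacks, like the probe_device() call quoted above, faults:

/* drivers/xen/grant-dma-iommu.c: nothing is really needed here except
 * the exported iommu_ops, so every callback, including probe_device,
 * is left NULL. */
static const struct iommu_ops grant_dma_iommu_ops;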
--
viresh