On Wed, Jun 22, 2022 at 2:49 PM Viresh Kumar viresh.kumar@linaro.org wrote:
On 28-04-22, 16:52, Oleksandr Tyshchenko wrote:
FYI, we are currently working on a feature to restrict memory access using Xen grant mappings, based on the xen-grant DMA-mapping layer for Linux [1].
There is a working PoC on Arm based on an updated virtio-disk. As for libraries, there is a new dependency on the "xengnttab" library. In comparison with the Xen foreign mappings model (xenforeignmemory), the Xen grant mappings model fits well into the Xen security model, as it is a safe mechanism for sharing pages between guests.
Hi Oleksandr,
Hello Viresh
[sorry for the possible format issues]
I started getting this stuff into our work and have a few questions.
- IIUC, with this feature the guest will allow the host to access only certain parts of the guest memory, which is exactly what we want as well. I looked at the updated code in virtio-disk and you currently don't allow grant table mappings along with MAP_IN_ADVANCE; is there any particular reason for that?
MAP_IN_ADVANCE is an optimization which is only applicable if all incoming addresses are guest physical addresses and the backend is allowed to map arbitrary guest pages using foreign mappings. It is an option to demonstrate how a trusted backend (running in dom0, for example) can pre-map guest memory in advance and then just calculate a host address at runtime based on the incoming gpa, which is used as an offset (there are no xenforeignmemory_map/xenforeignmemory_unmap calls for every request). But if the guest uses grant mappings for virtio (CONFIG_XEN_VIRTIO=y), all incoming addresses are grants instead of gpas (even the virtqueue descriptor ring addresses are grants). Leaving aside the fact that restricted virtio memory access in the guest means that not all of guest memory can be accessed, even with guest memory pre-mapped in advance we would not be able to calculate a host pointer, as we don't know which gpa a particular grant belongs to.
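To illustrate the pre-mapping idea, here is a minimal sketch of what MAP_IN_ADVANCE boils down to with foreign mappings; the GUEST_RAM_BASE/GUEST_RAM_SIZE values and function names are hypothetical placeholders, not code taken from virtio-disk:

/*
 * Minimal sketch (not virtio-disk code) of the "map in advance" idea:
 * pre-map the whole of guest RAM once with a foreign mapping, then
 * translate each incoming gpa to a host pointer by offset arithmetic.
 */
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <xenforeignmemory.h>

#define PAGE_SHIFT      12
#define GUEST_RAM_BASE  0x40000000UL       /* hypothetical gpa of guest RAM */
#define GUEST_RAM_SIZE  (256UL << 20)      /* hypothetical guest RAM size   */

static void *ram;                          /* backend mapping of guest RAM  */

static int premap_guest_ram(xenforeignmemory_handle *fmem, uint32_t domid)
{
    size_t pages = GUEST_RAM_SIZE >> PAGE_SHIFT;
    xen_pfn_t *pfns = calloc(pages, sizeof(*pfns));
    int *errs = calloc(pages, sizeof(*errs));

    if (!pfns || !errs) {
        free(pfns);
        free(errs);
        return -1;
    }

    for (size_t i = 0; i < pages; i++)
        pfns[i] = (GUEST_RAM_BASE >> PAGE_SHIFT) + i;

    /* One mapping for all of guest RAM, done once at start-up. */
    ram = xenforeignmemory_map(fmem, domid, PROT_READ | PROT_WRITE,
                               pages, pfns, errs);
    free(pfns);
    free(errs);
    return ram ? 0 : -1;
}

/* At runtime the incoming gpa is just an offset into the pre-mapped area. */
static void *gpa_to_host(uint64_t gpa)
{
    if (gpa < GUEST_RAM_BASE || gpa >= GUEST_RAM_BASE + GUEST_RAM_SIZE)
        return NULL;
    return (uint8_t *)ram + (gpa - GUEST_RAM_BASE);
}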
- I understand that you currently map on the go: the virtqueue descriptor rings first, and then the protocol-specific addresses later on, once virtio requests are received from the guest.
But in our case (vhost-user with a Rust-based hypervisor-agnostic backend), the vhost master side sends a number of memory regions for the slave (backend) to map, and the backend won't try to map anything apart from those (see the sketch after this list). The virtqueue descriptor rings are available at this point and can be sent, but not the protocol-specific addresses, which are available only when a virtio request comes.
- And so we would like to map everything in advance, and access only the parts which we need to, assuming that the guest would just allow those (as the addresses are shared by the guest itself).
- Will that just work with the current stuff?
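For reference, the memory-region model described above looks roughly like this on the wire (a sketch following the vhost-user specification's VHOST_USER_SET_MEM_TABLE payload, not code from the Rust backend in question):

#include <stdint.h>

/* One region of guest memory the master asks the slave to map; a file
 * descriptor for the backing memory is passed alongside the message. */
struct vhost_user_memory_region {
    uint64_t guest_phys_addr;   /* gpa of the region in the guest      */
    uint64_t memory_size;       /* length of the region in bytes       */
    uint64_t userspace_addr;    /* master's (VMM's) mapping address    */
    uint64_t mmap_offset;       /* offset into the fd sent with it     */
};

/* The master sends all regions up front; the slave mmap()s them once
 * and is expected never to touch guest memory outside these regions. */
struct vhost_user_memory {
    uint32_t nregions;
    uint32_t padding;
    struct vhost_user_memory_region regions[8]; /* 8 = base-protocol max */
};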
I am not sure that I understand this use case. Well, let's consider the virtio-disk example; it demonstrates three possible memory mapping modes:
1. All addresses are gpa, map/unmap at runtime using foreign mappings
2. All addresses are gpa, map in advance using foreign mappings
3. All addresses are grants, map/unmap at runtime only, using grant mappings
If you are asking about a #4 that would imply mapping in advance together with using grants, then I think no, this won't work with the current stuff. These are conflicting options: either grants and map at runtime, or gpa and map in advance. If there is a wish to optimize when using grants, then "maybe" it is worth looking into how persistent grants work for the PV block device, for example (feature-persistent in blkif.h).
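For completeness, a rough sketch of what mode #3 amounts to in the backend, using the xengnttab library (illustrative only, not the actual virtio-disk code):

#include <stdint.h>
#include <sys/mman.h>
#include <xengnttab.h>

/* Map the granted pages of one request for as long as it is in flight. */
void *map_request_buffers(xengnttab_handle *xgt, uint32_t domid,
                          uint32_t *refs, uint32_t count)
{
    /* Maps 'count' granted pages contiguously into our address space. */
    return xengnttab_map_domain_grant_refs(xgt, count, domid, refs,
                                           PROT_READ | PROT_WRITE);
}

/* Drop the mapping again once the request has been completed. */
void unmap_request_buffers(xengnttab_handle *xgt, void *addr, uint32_t count)
{
    xengnttab_unmap(xgt, addr, count);
}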
In Linux's drivers/xen/gntdev.c, we have:
static unsigned int limit = 64*1024;
which translates to 256MB I think, i.e. the max amount of memory we can map at once. Will making this 128*1024 allow me to map 512MB, for example, in a single call? Any other changes required?
I am not sure, but I guess the total number is limited by the hypervisor itself. Could you try increasing gnttab_max_frames first?
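For reference, the grant-table size can be raised on the Xen command line and, per domain, in the xl config. The numbers below are only illustrative and assume grant table v1, where each 4KiB frame holds 512 entries (so the ~128k grants needed for 512MB of 4KiB pages would require at least 256 frames):

# Xen boot command line: default grant-table frame limit for domains
gnttab_max_frames=256

# xl domain config of the granting guest: per-domain override
max_grant_frames = 256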
When I tried that, I got a few errors which I am still not able to fix:
The IOCTL_GNTDEV_MAP_GRANT_REF ioctl passed but there were failures after that:
(XEN) common/grant_table.c:1055:d0v2 Bad ref 0x40000 for d1
(XEN) common/grant_table.c:1055:d0v2 Bad ref 0x40001 for d1
...
(XEN) common/grant_table.c:1055:d0v2 Bad ref 0x5fffd for d1
(XEN) common/grant_table.c:1055:d0v2 Bad ref 0x5fffe for d1
(XEN) common/grant_table.c:1055:d0v2 Bad ref 0x5ffff for d1
gnttab: error: mmap failed: Invalid argument
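For context, the user-space flow being exercised here is roughly the following (an illustrative sketch of the gntdev ioctl-plus-mmap sequence, not the actual test code). As far as I understand, the mapping hypercall is only issued at mmap() time, which is why the ioctl succeeds while the "Bad ref" rejections and the mmap EINVAL show up afterwards:

#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <xen/gntdev.h>     /* struct ioctl_gntdev_map_grant_ref */

/* fd is an open /dev/xen/gntdev; refs are grant references from domid. */
void *map_grants(int fd, uint32_t domid, const uint32_t *refs, uint32_t count)
{
    struct ioctl_gntdev_map_grant_ref *op;
    void *addr = NULL;

    op = malloc(sizeof(*op) + (count - 1) * sizeof(op->refs[0]));
    if (!op)
        return NULL;

    op->count = count;
    for (uint32_t i = 0; i < count; i++) {
        op->refs[i].domid = domid;
        op->refs[i].ref   = refs[i];
    }

    /* Registers the refs with gntdev and returns an offset for mmap(). */
    if (ioctl(fd, IOCTL_GNTDEV_MAP_GRANT_REF, op) == 0)
        addr = mmap(NULL, (size_t)count << 12, PROT_READ | PROT_WRITE,
                    MAP_SHARED, fd, op->index);

    free(op);
    return addr == MAP_FAILED ? NULL : addr;
}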
I am working on Linus's origin/master along with the initial patch from Juergen, and I picked your Xen patch for the iommu node.
Yes, this is the correct environment. Please note that Juergen has recently pushed a new version [1].
I am still at the initial stages of properly testing this stuff; I just wanted to share the progress to help save some of the debugging time :)
Thanks.
-- viresh
[1] https://lore.kernel.org/xen-devel/20220622063838.8854-1-jgross@suse.com/