Stefan Hajnoczi stefanha@redhat.com writes:
On Thu, Jan 06, 2022 at 05:03:38PM +0000, Alex Bennée wrote:
Hi,
To start the new year I thought I would dump some of my thoughts on zero-copy between VM domains. For project Stratos we've gamely avoided thinking too hard about this while we've been concentrating on solving more tractable problems. However we can't put it off forever, so let's work through the problem.
Memory Sharing
<snip>
Buffer Allocation
<snip>
Transparent fallback and scaling
<snip>
- what other constraints do we need to take into account?
- can we leverage existing sub-systems to build this support?
I look forward to your thoughts ;-)
(Side note: Shared Virtual Addressing (https://lwn.net/Articles/747230/) is an interesting IOMMU feature. It would be neat to have a CPU equivalent where loads and stores from/to another address space could be done cheaply with CPU support. I don't think this is possible today and that's why software IOMMUs are slow for per-transaction page protection. In other words, a virtio-net TX VM would set up a page table allowing read access only to the TX buffers and vring and the virtual network switch VM would have the capability to access the vring and buffers through the TX VM's dedicated address space.)
Does binding a device to an address space mean that driver allocations will automatically be done from that address space, or do the drivers need modifying to take advantage of it? Jean-Philippe?
Some storage and networking applications use buffered I/O where the guest kernel owns the DMA buffer while others use zero-copy I/O where guest userspace pages are pinned for DMA. I think both cases need to be considered.
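The copy vs. zero-copy distinction can be illustrated in plain userspace terms (this is only the aliasing semantics, not the kernel's page-pinning mechanism): slicing out bytes takes a private copy, while a memoryview aliases the live buffer, so later writes by the application are visible without another copy.

```python
buf = bytearray(b"packet-payload")

copied = bytes(buf[:6])     # buffered-style: a private copy of the data
view = memoryview(buf)[:6]  # zero-copy-style: a window onto buf itself

buf[0:6] = b"PACKET"        # the application rewrites its buffer in place

print(copied)          # b'packet' - the copy went stale
print(view.tobytes())  # b'PACKET' - the view tracks the live buffer
```

The stale-copy case is why the buffered path needs a copy at a well-defined point, and the aliasing case is why the zero-copy path needs the pages pinned for the lifetime of the DMA.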
Are guest userspace-visible API changes allowed (e.g. making the userspace application aware at buffer allocation time)?
I assume you mean enhanced rather than breaking APIs here? I don't see why not. Certainly for the vhost-user backends we are writing we aren't beholden to sticking to an old API.
Ideally the requirement would be that zero-copy must work for unmodified applications, but I'm not sure if that's realistic.
By the way, VIRTIO 1.2 introduces Shared Memory Regions. These are memory regions (e.g. PCI BAR ranges) that the guest driver can map. If the host pages must come from a special pool then Shared Memory Regions would be one way to configure the guest to use this special zero-copy area for data buffers and even vrings. New VIRTIO feature bits and transport-specific information may need to be added to do this.
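If buffers must come from such a special pool, the guest side presumably needs an allocator that hands out chunks of the pre-mapped region rather than arbitrary kernel pages. A trivial bump allocator over an anonymous mapping sketches the shape of that (the class, alignment, and sizes are illustrative assumptions, not anything the VIRTIO spec defines):

```python
import mmap

class RegionAllocator:
    """Carve aligned chunks out of one pre-mapped region."""

    def __init__(self, size):
        # Stand-in for a mapped Shared Memory Region (e.g. a PCI BAR range).
        self.region = mmap.mmap(-1, size)
        self.size = size
        self.next = 0

    def alloc(self, length, align=64):
        # Round up to the alignment a device might require.
        start = (self.next + align - 1) & ~(align - 1)
        if start + length > self.size:
            raise MemoryError("shared region exhausted")
        self.next = start + length
        return start, memoryview(self.region)[start:start + length]

pool = RegionAllocator(1 << 16)
off, vring = pool.alloc(4096)  # e.g. a vring placed in the region
_, data = pool.alloc(1500)     # e.g. one network data buffer
data[:5] = b"hello"
print(off, len(data))          # 0 1500
```

A fixed-size region makes this easy; growing or shrinking it at runtime is exactly where the new feature bits and transport-specific negotiation mentioned above would come in.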
Are these fixed in size, or could we accommodate a growing/shrinking region?
Thanks for the pointers.