On Tue, 12 Jan 2021, Alex Bennée wrote:
Hi,
I wanted to bounce around some ideas about our aim of limited memory sharing.
At the end of last year Wind River presented their shared memory approach for hypervisor-less virtio. We have also discussed QC's iotlb approach. So far there is no proposed draft for the Virtio spec, and there are open questions about how these shared memory approaches fit within the existing Virtio memory model and how they would interact with a Linux guest driver API to minimise the amount of copying needed as data moves from a primary guest to a back-end.
Given the performance requirements for high bandwidth multimedia devices, it feels like we need to get some working code published so we can compare behaviour and implementation details. I think we are still a fair way from being able to propose any updates to the standard until we can see the changes needed across guest APIs and get some measure of performance and bottlenecks.
However, there is a range of devices we are interested in that are less performance sensitive - e.g. SPI, I2C and other "serial" buses. These would also benefit from having a minimal memory profile. Is it worth considering a separate, simpler and less performance-oriented solution for them?
Arnd suggested something that I'm going to call fat VirtQueues. The idea is that both data and descriptors are stored in the same VirtQueue structure. While this would necessitate copying data from guest address space to the queue and back, the copying could be kept to the lower levels of the driver stack without the drivers themselves having to worry too much about the details. With everything contained in the VirtQueue there is only one region of memory to co-ordinate between the primary guest and the service OS, which makes isolation a lot easier.
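To make the idea a bit more concrete, here is a minimal sketch of what a fat descriptor could look like. Nothing here comes from a draft spec: the struct name, the 128-byte inline payload size and the field layout are all illustrative assumptions, loosely modelled on the existing split-ring descriptor.

    /* Hypothetical fat descriptor: the payload is inlined with the
     * descriptor itself, so the device side never needs to dereference
     * guest physical addresses outside the shared queue region.
     * All names and sizes below are illustrative assumptions. */
    #include <stdint.h>

    #define FAT_VQ_INLINE_LEN 128   /* assumed per-descriptor payload size */

    struct fat_virtq_desc {
        uint32_t len;                      /* bytes of payload actually used */
        uint16_t flags;                    /* e.g. NEXT/WRITE, as in split rings */
        uint16_t next;                     /* descriptor chaining, as in split rings */
        uint8_t  data[FAT_VQ_INLINE_LEN];  /* payload stored in the queue itself */
    };

The trade-off is visible in the layout: the ring gets larger and every transfer costs a copy, but the only memory the two sides ever need to share is the ring itself.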
Of course this doesn't solve the problem for the more performance sensitive applications, but it would be a workable demonstration of memory isolation across VMs and a useful suggestion in its own right.
What do people think?
I think it is a good idea: everyone will agree that the first step is to implement a solution that relies on memcpys. Anything smarter is best done as a second step and probably requires new hypervisor interfaces.
From a performance perspective, whether we use a separate pre-shared buffer or "fat VirtQueues" as Arnd suggested, the results should be very similar. So fat VirtQueues seem like a good way forward to me.
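As a rough illustration of the memcpy-based first step described above, a transport-level send might look like the following. This reuses the hypothetical fat_virtq_desc sketched earlier; the function name and error handling are likewise assumptions, not proposed API.

    #include <string.h>

    /* Hypothetical enqueue path: copy the payload from private driver
     * memory into the descriptor's inline buffer, so that only the
     * queue region is ever shared between the two VMs. */
    static int fat_virtq_send(struct fat_virtq_desc *ring, uint16_t head,
                              const void *buf, uint32_t len)
    {
        if (len > FAT_VQ_INLINE_LEN)
            return -1;                     /* would need descriptor chaining */

        memcpy(ring[head].data, buf, len); /* the copy accepted as step one */
        ring[head].len   = len;
        ring[head].flags = 0;
        /* ...publish 'head' to the available ring and notify the device... */
        return 0;
    }

Anything smarter than this copy (e.g. mapping buffers on demand) is exactly the part that would require new hypervisor interfaces, which is why it makes sense to defer it to a second step.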