On Wed, Feb 17, 2021 at 01:46:28PM +0000, Alex Bennée wrote:
Hi Gerd,
I was in a discussion with the AGL folks today talking about approaches to achieving zero-copy when running VirGL virtio guests. AIUI (which is probably not very much) copies can be needed for a number of reasons:
- the GPA not being mapped to a HPA that is accessible to the final HW
- the guest allocation of a buffer not meeting stride/alignment requirements
- data needing to be transformed for consumption by the real hardware?
With the current qemu code base each resource has both a guest and a host buffer, and the data is copied over when the guest asks for it.
virtio-gpu got a new feature (VIRTIO_GPU_F_RESOURCE_BLOB) to improve that. For blob resources we have stride/alignment negotiation, and they can also be allocated by the host and mapped into the guest address space instead of living in guest ram.
linux guest support is there in the kernel and mesa, and the host side is supported by crosvm. qemu doesn't support blob resources though.
I'm curious if it's possible to measure the effect of these extra copies, and where they occur. Do all resources get copied from the guest buffer to the host, or does this only occur when there is a mismatch in the buffer requirements?
Without blob resources a copy is required whenever the guest cpu wants access to the resource (e.g. glWritePixels / glReadPixels + similar). For resources which are a gpu render target and never touched by the cpu this is not needed. For these you wouldn't even need guest ram backing storage (VIRTIO_GPU_CMD_RESOURCE_ATTACH_BACKING); linux doesn't implement that optimization though.
Are there any functions where I could add trace points to measure this? If this occurs in the kernel I wonder if I could use an eBPF probe to count the number of bytes copied?
Copy happens in qemu or virglrenderer, in response to VIRTIO_GPU_CMD_TRANSFER_* commands from the guest.
There are tracepoints already in qemu (trace_virtio_gpu_cmd_res_xfer_*), but they log only the resource id, not the amount of data transferred.
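
To make the "amount of data" point concrete, here is a rough sketch of how a byte count for a 2D transfer could be estimated from the rectangle carried by the transfer command. The rectangle fields (x, y, width, height) follow the virtio-gpu spec; everything named xfer_* is hypothetical and not existing qemu code, and the 4 bytes per pixel assumes the 32-bit 2D formats. The real copy may also move stride padding, so treat the result as an estimate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Transfer rectangle, same fields as the rect in the virtio-gpu spec.
 * Values are assumed to be already converted from little-endian. */
struct xfer_rect {
    uint32_t x, y, width, height;
};

/* Assumption: 32-bit pixel formats only (e.g. B8G8R8A8). */
#define XFER_BYTES_PER_PIXEL 4u

/* Hypothetical accumulator a tracepoint or debug counter could feed. */
struct xfer_stats {
    uint64_t calls;  /* number of transfer commands seen */
    uint64_t bytes;  /* estimated payload bytes copied */
};

/* Estimated payload of one 2D transfer, ignoring stride padding. */
static uint64_t xfer_2d_bytes(const struct xfer_rect *r)
{
    assert(r != NULL);
    /* Widen before multiplying so large rectangles don't wrap at 32 bits. */
    return (uint64_t)r->width * r->height * XFER_BYTES_PER_PIXEL;
}

/* Account one transfer command in the running totals. */
static void xfer_stats_account(struct xfer_stats *s, const struct xfer_rect *r)
{
    assert(s != NULL);
    s->calls++;
    s->bytes += xfer_2d_bytes(r);
}
```

Calling xfer_stats_account() from the place where the existing transfer tracepoints fire would give a per-run total; passing the xfer_2d_bytes() value as an extra tracepoint argument would be the other obvious option.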
Tracing on the guest side by adding trace points to the kernel shouldn't be hard either.
take care, Gerd