On Fri, Oct 2, 2020 at 5:26 PM Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:
On Fri, Oct 02, 2020 at 04:26:10PM +0200, Arnd Bergmann wrote:
On Fri, Oct 2, 2020 at 3:44 PM Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:
Maybe the partial-page sharing could be part of the iommu dma_ops? There are many scenarios where we already rely on the iommu for isolating the kernel from malicious or accidental DMA, but I have not seen anyone do this on a sub-page granularity. Using bounce buffers for partial page dma_map_* could be something we can do separately from the rest, as both aspects seem useful regardless of one another.
It is about to be added to the dma-iommu module, as part of the consolidation of the IOMMU dma_ops: https://lore.kernel.org/linux-iommu/20200912032200.11489-4-baolu.lu@linux.in...
That will enforce bounce buffers for any device marked "untrusted" (external devices such as Thunderbolt devices, which could be malicious). I believe it would be a good thing to enable in our case as well.
Ah, nice!
Since it depends on the device, I guess we'll need a survey of memory access patterns by the different virtio devices that we're considering. In the end a mix of both solutions might be necessary.
Can you describe a case in which the iommu would clearly be inferior? IOW, what stops us from always using an iommu here?
If the virtio device only transfers small sub-page payloads, for example small packets:
- with static regions we copy each buffer to/from the static region,
- with an IOMMU we copy each buffer to/from a safe page *and* send requests to map+unmap that page. Even if we didn't use a bounce buffer in this case, map+unmap generally has a very high cost due to context switching and could easily be much slower than copying a small buffer.
Ok, I see. So the question is mainly if any of the devices that do have sub-page buffers actually care about performance, right?
For this case I see a possible optimization, keeping the bounce buffers mapped for some time so subsequent transfers can reuse them, but it's not implemented at the moment (and I wonder if it opens a vulnerability, though I can't see one right now).
Right, the bounce buffers could essentially use the coherent mapping here, assuming we can find an upper bound on the size. I don't see any vulnerability with that either.
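To make the reuse idea concrete, here is a rough user-space sketch of a bounce pool that is set up once and then reused, so each transfer costs only a copy and no map+unmap request. Everything in it is hypothetical: the names (bounce_pool, bounce_get, bounce_put), the 256-byte upper bound and the 16-slot pool are assumptions for illustration, and the array simply stands in for a long-lived mapping. Zeroing a slot on release is a cautious choice in this sketch, not something the discussion above calls for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 256 /* assumed upper bound on a sub-page payload */
#define NUM_SLOTS 16  /* 16 * 256 = 4096 bytes, i.e. one page mapped once */

struct bounce_pool {
	/* Stands in for memory that stays mapped for the device. */
	unsigned char mem[NUM_SLOTS][SLOT_SIZE];
	unsigned char in_use[NUM_SLOTS];
};

/*
 * Copy a payload into a free slot.
 * Returns the slot index, or -1 if the payload exceeds the upper bound
 * or the pool is full (a caller would then fall back to map+unmap).
 */
static int bounce_get(struct bounce_pool *p, const void *src, size_t len)
{
	if (len > SLOT_SIZE)
		return -1;

	for (int i = 0; i < NUM_SLOTS; i++) {
		if (!p->in_use[i]) {
			p->in_use[i] = 1;
			memcpy(p->mem[i], src, len);
			return i;
		}
	}
	return -1;
}

/*
 * Copy the slot contents back to the caller (if dst is non-NULL),
 * scrub the slot so the next user does not expose stale data,
 * and release it.
 */
static void bounce_put(struct bounce_pool *p, int slot, void *dst, size_t len)
{
	assert(slot >= 0 && slot < NUM_SLOTS);
	assert(p->in_use[slot] && len <= SLOT_SIZE);

	if (dst)
		memcpy(dst, p->mem[slot], len);
	memset(p->mem[slot], 0, SLOT_SIZE);
	p->in_use[slot] = 0;
}
```

The upper bound matters here: anything larger than SLOT_SIZE is rejected, which is where a dynamic mapping would still be needed.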
There is a similar issue with vq->indirect virtqueues, where we also need to map the vring descriptors dynamically, and these are not typically page aligned.
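As an illustration of the alignment problem, a minimal sketch of the kind of check that decides whether a buffer can be mapped directly or would expose the rest of a shared page. The helper name needs_bounce and the 4096-byte granule are assumptions made for this example, not taken from the dma-iommu patch referenced earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed IOMMU mapping granule */

/*
 * Returns true if [addr, addr + len) does not both start on a page
 * boundary and cover a whole number of pages. Mapping such a buffer
 * directly would make unrelated bytes of the same page visible to
 * the device, so it would have to go through a bounce buffer.
 */
static bool needs_bounce(uintptr_t addr, size_t len)
{
	assert(len > 0);
	return ((addr | len) & (SKETCH_PAGE_SIZE - 1)) != 0;
}
```

For example, an indirect table of 8 split-ring descriptors is 8 * 16 = 128 bytes, so needs_bounce() reports true for it even when the table happens to start on a page boundary.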
Arnd