Hi,
The following is a breakdown (as best I can figure) of the work needed
to demonstrate VirtIO backends in Rust on the Xen hypervisor. It
requires work across a number of projects, most notably core Rust and
VirtIO enabling in the Xen project (building on the work EPAM has
already done) and the start of enabling the rust-vmm crates to work with Xen.
The first demo is a fairly simple toy to exercise the direct hypercall
approach for a unikernel backend. On its own it isn't super impressive
but hopefully serves as a proof of concept for the idea of having
backends running in a single exception level where latency will be
important.
The second is a much more ambitious bridge between Xen and vhost-user to
allow for re-use of the existing vhost-user backends with the bridge
acting as a proxy for what would usually be a full VMM in the type-2
hypervisor case. With that in mind the rust-vmm work is only aimed at
doing the device emulation and doesn't address the larger question of
how type-1 hypervisors can be integrated into the rust-vmm hypervisor
model.
A quick note about the estimates. They are exceedingly rough guesses
plucked out of the air and I would be grateful for feedback from the
appropriate domain experts on whether I'm being overly optimistic or
pessimistic.
The links to the Stratos JIRA should be at least read-accessible to all,
although they contain the same information as the attached document
(albeit with nicer PNG renderings of my ASCII art ;-). There is a
Stratos sync-up call next Thursday:
https://calendar.google.com/event?action=TEMPLATE&tmeid=MWpidm5lbzM5NjlydnA…
and I'm sure there will also be discussion in the various projects
(hence the wide CC list). The Stratos calls are open to anyone who wants
to attend and we welcome feedback from all who are interested.
So on with the work breakdown:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
STRATOS PLANNING FOR 21 TO 22
Alex Bennée
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Table of Contents
─────────────────
1. Xen Rust Bindings ([STR-51])
.. 1. Upstream an "official" rust crate for Xen ([STR-52])
.. 2. Basic Hypervisor Interactions hypercalls ([STR-53])
.. 3. [#10] Access to XenStore service ([STR-54])
.. 4. VirtIO support hypercalls ([STR-55])
2. Xen Hypervisor Support for Stratos ([STR-56])
.. 1. Stable ABI for foreignmemory mapping to non-dom0 ([STR-57])
.. 2. Tweaks to tooling to launch VirtIO guests
3. rust-vmm support for Xen VirtIO ([STR-59])
.. 1. Make vm-memory Xen aware ([STR-60])
.. 2. Xen IO notification and IRQ injections ([STR-61])
4. Stratos Demos
.. 1. Rust based stubdomain monitor ([STR-62])
.. 2. Xen aware vhost-user master ([STR-63])
1 Xen Rust Bindings ([STR-51])
══════════════════════════════
There exists a [placeholder repository] with the start of a set of
x86_64 bindings for Xen and a very basic hello world unikernel
example. This forms the basis of the initial Xen Rust work and will be
available as a [xen-sys crate] via cargo.
[STR-51] <https://linaro.atlassian.net/browse/STR-51>
[placeholder repository] <https://gitlab.com/cardoe/oxerun.git>
[xen-sys crate] <https://crates.io/crates/xen-sys>
1.1 Upstream an "official" rust crate for Xen ([STR-52])
────────────────────────────────────────────────────────
To start with we will want an upstream location for future work to be
based upon. The intention is that the crate is independent of the
version of Xen it runs on (above the chosen baseline version). This will
entail:
• ☐ agreeing with upstream the name/location for the source
• ☐ documenting the rules for the "stable" hypercall ABI
• ☐ establish an internal interface to switch between ioctl-mediated
and direct hypercalls
• ☐ ensure the crate is multi-arch and has feature parity for arm64
As such we expect the implementation to be standalone, i.e. not
wrapping the existing Xen libraries for mediation. There should be a
close (1-to-1) mapping between the interfaces in the crate and the
eventual hypercall made to the hypervisor.
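As a sketch of what the internal interface mentioned above could look
like (every name here is invented for illustration and is not existing
xen-sys API), a small transport trait would let the same call sites work
over either the privcmd ioctl path from userspace or a direct hypercall
from a unikernel:
#+begin_src rust
// Hypothetical sketch only - neither the trait nor the types below
// are existing xen-sys API.
#[derive(Debug)]
pub struct Error(pub i64);

pub trait HypercallTransport {
    /// Issue hypercall `op` with up to five arguments and return the
    /// raw value from the hypervisor.
    fn hypercall(&self, op: u64, args: &[u64; 5]) -> Result<u64, Error>;
}

/// dom0/domU userspace: mediate the call via the privcmd ioctl.
pub struct IoctlTransport {
    pub privcmd_fd: std::os::unix::io::RawFd,
}

/// Unikernel/bare-metal: issue the hypercall instruction directly.
pub struct DirectTransport;
#+end_src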
Estimate: 4w (elapsed likely longer due to discussion)
[STR-52] <https://linaro.atlassian.net/browse/STR-52>
1.2 Basic Hypervisor Interactions hypercalls ([STR-53])
───────────────────────────────────────────────────────
These are the bare minimum hypercalls implemented as both ioctl and
direct calls. These allow for a very basic binary to:
• ☐ console_io - output IO via the Xen console
• ☐ domctl stub - basic stub for domain control (different API?)
• ☐ sysctl stub - basic stub for system control (different API?)
The idea would be that this provides enough of the hypercall interface
to query the list of domains and output their status via the Xen
console. There is an open question about whether the domctl and sysctl
hypercalls are the way to go.
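As an entirely illustrative example of the shape of these wrappers,
console output over the hypothetical transport sketched in 1.1 might
look like the following; the hypercall and sub-op numbers are the ones
from xen/include/public/xen.h, everything else is invented:
#+begin_src rust
// Sketch only: builds on the hypothetical HypercallTransport above.
const HYPERVISOR_CONSOLE_IO: u64 = 18; // __HYPERVISOR_console_io
const CONSOLEIO_WRITE: u64 = 0;

pub fn console_write<T: HypercallTransport>(xen: &T, msg: &str) -> Result<(), Error> {
    let args = [
        CONSOLEIO_WRITE,
        msg.len() as u64,
        msg.as_ptr() as u64, // guest virtual address of the buffer
        0,
        0,
    ];
    xen.hypercall(HYPERVISOR_CONSOLE_IO, &args).map(|_| ())
}
#+end_src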
Estimate: 6w
[STR-53] <https://linaro.atlassian.net/browse/STR-53>
1.3 [#10] Access to XenStore service ([STR-54])
───────────────────────────────────────────────
This is a shared configuration storage space accessed either via Unix
sockets (on dom0) or via the Xenbus. It is used to access
configuration information for the domain.
Is this needed for a backend though? Can everything just be passed
directly on the command line?
Estimate: 4w
[STR-54] <https://linaro.atlassian.net/browse/STR-54>
1.4 VirtIO support hypercalls ([STR-55])
────────────────────────────────────────
These are the hypercalls that need to be implemented to support a
VirtIO backend. This includes the ability to map another guest's memory
into the current domain's address space, register to receive IOREQ
events when the guest knocks on the doorbell, and inject kicks into the
guest. The hypercalls we need to support would be:
• ☐ dmop - device model ops (*_ioreq_server, setirq, nr_vcpus)
• ☐ foreignmemory - map and unmap guest memory
The DMOP space is larger than what we need for an IOREQ backend so
I've based it just on what arch/arm/dm.c exports, which is the subset
introduced for EPAM's virtio work.
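A hypothetical Rust shape for that subset (names invented here, reusing
the Error type from the sketch in 1.1) might be:
#+begin_src rust
// Invented sketch of the DMOP/foreignmemory subset an IOREQ backend
// needs; not actual xen-sys API.
pub struct IoreqServerId(pub u16);

pub trait VirtioBackendOps {
    /// XEN_DMOP_create_ioreq_server and friends.
    fn create_ioreq_server(&self, domid: u16) -> Result<IoreqServerId, Error>;
    fn map_io_range(&self, srv: &IoreqServerId, start: u64, end: u64)
        -> Result<(), Error>;
    fn set_ioreq_server_state(&self, srv: &IoreqServerId, enabled: bool)
        -> Result<(), Error>;
    /// Inject/clear a guest interrupt (set_irq_level style).
    fn set_irq_level(&self, domid: u16, irq: u32, level: bool)
        -> Result<(), Error>;
    /// foreignmemory: map/unmap FE guest pages into our address space.
    fn map_foreign_pages(&self, domid: u16, gfns: &[u64])
        -> Result<*mut u8, Error>;
    fn unmap_foreign_pages(&self, ptr: *mut u8, npages: usize)
        -> Result<(), Error>;
}
#+end_src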
Estimate: 12w
[STR-55] <https://linaro.atlassian.net/browse/STR-55>
2 Xen Hypervisor Support for Stratos ([STR-56])
═══════════════════════════════════════════════
These tasks cover the work needed to support the various different
deployments of Stratos components on Xen.
[STR-56] <https://linaro.atlassian.net/browse/STR-56>
2.1 Stable ABI for foreignmemory mapping to non-dom0 ([STR-57])
───────────────────────────────────────────────────────────────
Currently the foreign memory mapping support only works for dom0 due
to reference counting issues. If we are to support backends running in
their own domains this will need to get fixed.
Estimate: 8w
[STR-57] <https://linaro.atlassian.net/browse/STR-57>
2.2 Tweaks to tooling to launch VirtIO guests
─────────────────────────────────────────────
There might not be too much to do here. The EPAM work already did
something similar for their PoC for virtio-block. Essentially we need
to ensure:
• ☐ DT bindings are passed to the guest for virtio-mmio device
discovery
• ☐ Our rust backend can be instantiated before the domU is launched
This currently assumes the tools and the backend are running in dom0.
Estimate: 4w
3 rust-vmm support for Xen VirtIO ([STR-59])
════════════════════════════════════════════
This encompasses the tasks required to get a vhost-user server up and
running while interfacing to the Xen hypervisor. This will require the
xen-sys crate for the actual interface to the hypervisor.
We need to work out how a Xen configuration option would be passed to
the various bits of rust-vmm when something is being built.
[STR-59] <https://linaro.atlassian.net/browse/STR-59>
3.1 Make vm-memory Xen aware ([STR-60])
───────────────────────────────────────
The vm-memory crate is the root crate for abstracting access to the
guest's memory. It currently has multiple configuration builds to
handle the differences between mmap on Windows and Unix. Although mmap
isn't directly exposed, the public interfaces support an mmap-like
interface. We would need to:
• ☐ work out how to expose foreign memory via the vm-memory mechanism
I'm not sure if this just means implementing the GuestMemory trait for
a GuestMemoryXen type or if we need to present an mmap-like interface.
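As a purely illustrative sketch of the first interpretation (nothing
here is real vm-memory API, which has many more required methods), the
Xen flavour might hold foreignmemory-mapped regions and translate guest
physical addresses locally:
#+begin_src rust
// Invented sketch: regions of the FE guest's memory mapped via Xen
// foreignmemory rather than a host mmap. Whether this hides behind
// vm-memory's GuestMemory trait or a new one is the open question.
pub struct XenForeignRegion {
    pub guest_base: u64,    // FE guest physical address of the region
    pub host_addr: *mut u8, // where foreignmemory mapped it for us
    pub size: usize,
}

pub struct GuestMemoryXen {
    pub regions: Vec<XenForeignRegion>,
}

impl GuestMemoryXen {
    /// Translate an FE guest physical address into a local pointer,
    /// which is the core service vm-memory users rely on.
    pub fn get_host_address(&self, gpa: u64) -> Option<*mut u8> {
        self.regions.iter().find_map(|r| {
            let off = gpa.checked_sub(r.guest_base)?;
            if (off as usize) < r.size {
                Some(unsafe { r.host_addr.add(off as usize) })
            } else {
                None
            }
        })
    }
}
#+end_src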
Estimate: 8w
[STR-60] <https://linaro.atlassian.net/browse/STR-60>
3.2 Xen IO notification and IRQ injections ([STR-61])
─────────────────────────────────────────────────────
The KVM world provides for ioeventfd (notifications) and irqfd
(injection) to signal asynchronously between the guest and the
backend. As far as I can tell this is currently handled inside the
various VMMs, which assume a KVM backend.
While the vhost-user slave code doesn't see the
register_ioevent/register_irqfd events, it does deal with EventFDs
throughout the code. Perhaps the best approach here would be to create
an IOREQ crate that can create EventFD descriptors which can then be
passed to the slaves to use for notification and injection.
Otherwise there might be an argument for a new crate that can
encapsulate this behaviour for both KVM/ioeventfd and Xen/IOREQ setups?
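A rough sketch of the EventFD idea (the EventFd type is the one from
the rust-vmm vmm-sys-util crate; the Xen handle trait and everything
else is invented here):
#+begin_src rust
// Sketch of the "IOREQ crate" idea: the slave only ever sees plain
// EventFds, a small bridge translates them to and from Xen IOREQ
// events. XenIoreqHandle and its methods are invented.
use vmm_sys_util::eventfd::EventFd;

pub trait XenIoreqHandle {
    fn raise_guest_irq(&self) -> std::io::Result<()>;
}

pub struct IoreqEventBridge {
    /// signalled towards the slave when the guest kicks a queue
    pub ioeventfd: EventFd,
    /// written by the slave when it wants an interrupt injected
    pub irqfd: EventFd,
}

impl IoreqEventBridge {
    pub fn new() -> std::io::Result<Self> {
        Ok(Self {
            ioeventfd: EventFd::new(0)?,
            irqfd: EventFd::new(0)?,
        })
    }

    /// Called when Xen delivers an IOREQ for one of our registered
    /// doorbell ranges: kick the slave.
    pub fn notify_slave(&self) -> std::io::Result<()> {
        self.ioeventfd.write(1)
    }

    /// Called after polling shows irqfd readable: consume the slave's
    /// signal and inject the interrupt via the Xen handle.
    pub fn inject_irq(&self, xen: &impl XenIoreqHandle) -> std::io::Result<()> {
        self.irqfd.read()?;
        xen.raise_guest_irq()
    }
}
#+end_src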
Estimate: 8w?
[STR-61] <https://linaro.atlassian.net/browse/STR-61>
4 Stratos Demos
═══════════════
These tasks cover the creation of demos that bring together all the
previous bits of work to demonstrate a new area of capability that has
been opened up by Stratos work.
4.1 Rust based stubdomain monitor ([STR-62])
────────────────────────────────────────────
This is a basic demo that is a proof of concept for a unikernel style
backend written in pure Rust. This work would be a useful precursor
for things such as the RTOS Dom0 on a safety island ([STR-11]) or as a
carrier for the virtio-scmi backend.
The monitor program will periodically poll the state of the other
domains and echo their status to the Xen console.
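In terms of the earlier sketches, the core loop is little more than the
following (again hypothetical; MonitorOps stands in for whatever the
domctl stub from 1.2 ends up exposing, and console_write is the wrapper
sketched there):
#+begin_src rust
// Hypothetical monitor loop built on the sketches from sections 1.1-1.2.
pub struct DomainInfo {
    pub domid: u16,
    pub running: bool,
}

pub trait MonitorOps: HypercallTransport {
    fn list_domains(&self) -> Result<Vec<DomainInfo>, Error>;
    fn wait_for_timer(&self) -> Result<(), Error>;
}

fn monitor_loop<T: MonitorOps>(xen: &T) -> Result<(), Error> {
    loop {
        for dom in xen.list_domains()? {
            let line = format!("dom{}: running={}\n", dom.domid, dom.running);
            console_write(xen, &line)?;
        }
        // unikernel equivalent of sleeping for the poll interval
        xen.wait_for_timer()?;
    }
}
#+end_src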
Estimate: 4w
#+name: stub-domain-example
#+begin_src ditaa :cmdline -o :file stub_domain_example.png
Dom0 | DomU | DomStub
| |
: /-------------\ :
| |cPNK | |
| | | |
| | | |
/------------------------------------\ | | GuestOS | |
|cPNK | | | | |
EL0 | Dom0 Userspace (xl tools, QEMU) | | | | | /---------------\
| | | | | | |cYEL |
\------------------------------------/ | | | | | |
+------------------------------------+ | | | | | Rust Monitor |
EL1 |cA1B Dom0 Kernel | | | | | | |
+------------------------------------+ | \-------------/ | \---------------/
-------------------------------------------------------------------------------=------------------
+-------------------------------------------------------------------------------------+
EL2 |cC02 Xen Hypervisor |
+-------------------------------------------------------------------------------------+
#+end_src
[STR-62] <https://linaro.atlassian.net/browse/STR-62>
[STR-11] <https://linaro.atlassian.net/browse/STR-11>
4.2 Xen aware vhost-user master ([STR-63])
──────────────────────────────────────────
Usually the master side of a vhost-user system is embedded directly in
the VMM itself. However in a Xen deployment there is no overarching
VMM but a series of utility programs that query the hypervisor
directly. The Xen tooling is also responsible for setting up any
support processes that are responsible for emulating HW for the guest.
The task aims to bridge the gap between Xen's normal HW emulation path
(ioreq) and VirtIO's userspace device emulation (vhost-user). The
process would be started with some information on where the
virtio-mmio address space is and what the slave binary will be. It will
then (a rough sketch follows the list):
• map the guest into Dom0 userspace and attach to a MemFD
• register the appropriate memory regions as IOREQ regions with Xen
• create EventFD channels for the virtio kick notifications (one each
way)
• spawn the vhost-user slave process and mediate the notifications and
kicks between the slave and Xen itself
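A very rough outline of that flow, expressed as a hypothetical trait
(every name is invented; real code would sit on the xen-sys crate and
the rust-vmm vhost crates, and the messages named in the comments are
the standard vhost-user protocol requests):
#+begin_src rust
// Hypothetical outline of the Xen-aware vhost-user master's job.
use vmm_sys_util::eventfd::EventFd;

pub trait XenVhostMaster {
    /// 1. map the FE guest into Dom0 userspace, backed by a memfd
    ///    that can be handed to the slave over the vhost-user socket
    fn map_guest_memory(&mut self, domid: u16) -> std::io::Result<()>;
    /// 2. register the virtio-mmio window as an IOREQ region with Xen
    fn register_ioreq_region(&mut self, base: u64, size: u64)
        -> std::io::Result<()>;
    /// 3. create the kick/call EventFd pair shared with the slave
    fn create_event_channels(&mut self)
        -> std::io::Result<(EventFd, EventFd)>;
    /// 4. spawn the existing vhost-user slave and run the handshake
    ///    (VHOST_USER_SET_MEM_TABLE, SET_VRING_KICK/CALL, ...)
    fn spawn_and_configure_slave(&mut self, cmd: &str)
        -> std::io::Result<()>;
    /// 5. main loop: turn guest IOREQ accesses into kick writes and
    ///    config-space replies, and call events into Xen interrupts
    fn mediate(&mut self) -> std::io::Result<()>;
}
#+end_src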
#+name: xen-vhost-user-master
#+begin_src ditaa :cmdline -o :file xen_vhost_user_master.png
Dom0 DomU
|
|
|
|
|
|
+-------------------+ +-------------------+ |
| |----------->| | |
| vhost-user | vhost-user | vhost-user | : /------------------------------------\
| slave | protocol | master | | | |
| (existing) |<-----------| (rust) | | | |
+-------------------+ +-------------------+ | | |
^ ^ | ^ | | Guest Userspace |
| | | | | | |
| | | IOREQ | | | |
| | | | | | |
v v V | | \------------------------------------/
+---------------------------------------------------+ | +------------------------------------+
| ^ ^ | ioctl ^ | | | |
| | iofd/irqfd eventFD | | | | | | Guest Kernel |
| +---------------------------+ | | | | | +-------------+ |
| | | | | | | virtio-dev | |
| Host Kernel V | | | | +-------------+ |
+---------------------------------------------------+ | +------------------------------------+
| ^ | | ^
| hyper | | |
----------------------=------------- | -=--- | ----=------ | -----=- | --------=------------------
| call | Trap | | IRQ
V | V |
+-------------------------------------------------------------------------------------+
| | ^ | ^ |
| | +-------------+ | |
EL2 | Xen Hypervisor | | |
| +-------------------------------+ |
| |
+-------------------------------------------------------------------------------------+
#+end_src
[STR-63] <https://linaro.atlassian.net/browse/STR-63>
--
Alex Bennée
Hi,
One of the goals of Project Stratos is to enable hypervisor agnostic
backends so we can enable as much re-use of code as possible and avoid
repeating ourselves. This is the flip side of the front end where
multiple front-end implementations are required - one per OS, assuming
you don't just want Linux guests. The resultant guests are trivially
movable between hypervisors modulo any abstracted paravirt type
interfaces.
In my original thumbnail sketch of a solution I envisioned vhost-user
daemons running in a broadly POSIX like environment. The interface to
the daemon is fairly simple requiring only some mapped memory and some
sort of signalling for events (on Linux this is eventfd). The idea was a
stub binary would be responsible for any hypervisor specific setup and
then launch a common binary to deal with the actual virtqueue requests
themselves.
Since that original sketch we've seen an expansion in the sort of ways
backends could be created. There is interest in encapsulating backends
in RTOSes or unikernels for solutions like SCMI. The interest in Rust
has prompted ideas of using the trait interface to abstract differences
away as well as the idea of bare-metal Rust backends.
We have a card (STR-12) called "Hypercall Standardisation" which
calls for a description of the APIs needed from the hypervisor side to
support VirtIO guests and their backends. However we are some way off
from that at the moment as I think we need to at least demonstrate one
portable backend before we start codifying requirements. To that end I
want to think about what we need for a backend to function.
Configuration
=============
In the type-2 setup this is typically fairly simple because the host
system can orchestrate the various modules that make up the complete
system. In the type-1 case (or even type-2 with delegated service VMs)
we need some sort of mechanism to inform the backend VM about key
details about the system:
- where virt queue memory is in its address space
- how it's going to receive (interrupt) and trigger (kick) events
- what (if any) resources the backend needs to connect to
Obviously you can elide configuration issues by having static
configurations and baking the assumptions into your guest images;
however this isn't scalable in the long term. The obvious solution seems to be
extending a subset of Device Tree data to user space but perhaps there
are other approaches?
Before any virtio transactions can take place the appropriate memory
mappings need to be made between the FE guest and the BE guest.
Currently the whole of the FE guest's address space needs to be visible
to whatever is serving the virtio requests. I can envision 3 approaches:
* BE guest boots with memory already mapped
This would entail the guest OS knowing which parts of its Guest Physical
Address space are already taken up and avoiding clashes. I would assume
in this case you would want a standard interface to userspace to then
make that address space visible to the backend daemon.
* BE guest boots with a hypervisor handle to memory
The BE guest is then free to map the FE's memory to where it wants in
the BE's guest physical address space. To activate the mapping will
require some sort of hypercall to the hypervisor. I can see two options
at this point:
- expose the handle to userspace for daemon/helper to trigger the
mapping via existing hypercall interfaces. If using a helper you
would have a hypervisor specific one to avoid the daemon having to
care too much about the details or push that complexity into a
compile time option for the daemon which would result in different
binaries although a common source base.
- expose a new kernel ABI to abstract the hypercall differences away
in the guest kernel. In this case the userspace would essentially
ask for an abstract "map guest N memory to userspace ptr" and let
the kernel deal with the different hypercall interfaces. This of
course assumes the majority of BE guests would be Linux kernels and
leaves the bare-metal/unikernel approaches to their own devices (a
rough sketch of the userspace side of this option follows).
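To make that second option a bit more concrete, the userspace side might
reduce to something like the sketch below. Everything here - the device
node, the ioctl number, the struct layout - is made up purely to show
the shape such an ABI could take (no such interface exists today), and it
leans on the libc crate for the raw ioctl call.

// Purely illustrative: a possible userspace view of an abstract
// "map guest N memory to userspace ptr" kernel ABI.
use std::fs::OpenOptions;
use std::os::unix::io::AsRawFd;

#[repr(C)]
struct MapGuestReq {
    guest_id: u64, // hypervisor-agnostic handle for the FE guest
    gpa: u64,      // start of the FE region to map
    len: u64,
    addr: u64,     // filled in by the kernel: userspace address
}

const HYPOTHETICAL_MAP_GUEST: u64 = 0xc020_5600; // made-up ioctl number

fn map_guest(guest_id: u64, gpa: u64, len: u64) -> std::io::Result<*mut u8> {
    // hypothetical device node exposed by the abstraction layer
    let f = OpenOptions::new().read(true).write(true).open("/dev/virtio-be")?;
    let mut req = MapGuestReq { guest_id, gpa, len, addr: 0 };
    // SAFETY: sketch only; a real ABI would have a generated ioctl
    // number and the kernel would validate the request.
    let ret = unsafe {
        libc::ioctl(f.as_raw_fd(), HYPOTHETICAL_MAP_GUEST as _, &mut req)
    };
    if ret < 0 {
        return Err(std::io::Error::last_os_error());
    }
    Ok(req.addr as *mut u8)
}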
Operation
=========
The core of the operation of VirtIO is fairly simple. Once the
vhost-user feature negotiation is done it's a case of receiving update
events and parsing the resultant virt queue for data. The vhost-user
specification handles a bunch of setup before that point, mostly to
detail where the virt queues are and to set up FDs for memory and event
communication. This is where the envisioned stub process would be
responsible for getting the daemon up and ready to run. This is
currently done inside a big VMM like QEMU but I suspect a modern
approach would be to use the rust-vmm vhost crate. It would then either
communicate with the kernel's abstracted ABI or be re-targeted as a
build option for the various hypervisors.
One question is how to best handle notification and kicks. The existing
vhost-user framework uses eventfd to signal the daemon (although QEMU
is quite capable of simulating them when you use TCG). Xen has its own
IOREQ mechanism. However latency is an important factor and having
events go through the stub would add quite a lot.
Could we consider the kernel internally converting IOREQ messages from
the Xen hypervisor to eventfd events? Would this scale with other kernel
hypercall interfaces?
So any thoughts on what directions are worth experimenting with?
--
Alex Bennée
The I2C protocol allows zero-length requests with no data, like the
SMBus Quick command, where the command is inferred based on the
read/write flag itself.
In order to allow such a request, allocate another bit,
VIRTIO_I2C_FLAGS_M_RD(1), in the flags to pass the request type, as read
or write. This was earlier done using the read/write permission to the
buffer itself.
This still won't work well if multiple buffers are passed for the same
request, i.e. the write-read requests, as the VIRTIO_I2C_FLAGS_M_RD flag
can only be used with a single buffer.
Coming back to it, there is no need to send multiple buffers with a
single request. All we need is a way to group several requests
together, which we can already do based on the
VIRTIO_I2C_FLAGS_FAIL_NEXT flag.
Remove support for multiple buffers within a single request.
Since we are at a very early stage of development, we can make these
modifications without adding new features or versioning the
protocol.
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
V2->V3:
- Add conformance clauses that require that the flag is consistent with the
buffer.
V1->V2:
- Name the buffer-less request as zero-length request.
Hi Guys,
I did try to follow the discussion you guys had during V4, where we added
support for multiple buffers for the same request, which I think is unnecessary
now, after introduction of the VIRTIO_I2C_FLAGS_FAIL_NEXT flag.
https://lists.oasis-open.org/archives/virtio-comment/202011/msg00005.html
And so starting this discussion again, because we need to support stuff
like: i2cdetect -q <i2c-bus-number>, which issues a zero-length SMBus
Quick command.
---
virtio-i2c.tex | 66 +++++++++++++++++++++++++++-----------------------
1 file changed, 36 insertions(+), 30 deletions(-)
diff --git a/virtio-i2c.tex b/virtio-i2c.tex
index 949d75f44158..c7335372a8bb 100644
--- a/virtio-i2c.tex
+++ b/virtio-i2c.tex
@@ -54,8 +54,7 @@ \subsubsection{Device Operation: Request Queue}\label{sec:Device Types / I2C Ada
\begin{lstlisting}
struct virtio_i2c_req {
struct virtio_i2c_out_hdr out_hdr;
- u8 write_buf[];
- u8 read_buf[];
+ u8 buf[];
struct virtio_i2c_in_hdr in_hdr;
};
\end{lstlisting}
@@ -84,16 +83,16 @@ \subsubsection{Device Operation: Request Queue}\label{sec:Device Types / I2C Ada
and sets it on the other requests. If this bit is set and a device fails
to process the current request, it needs to fail the next request instead
of attempting to execute it.
+
+\item[VIRTIO_I2C_FLAGS_M_RD(1)] is used to mark the request as READ or WRITE.
\end{description}
Other bits of \field{flags} are currently reserved as zero for future feature
extensibility.
-The \field{write_buf} of the request contains one segment of an I2C transaction
-being written to the device.
-
-The \field{read_buf} of the request contains one segment of an I2C transaction
-being read from the device.
+The \field{buf} of the request is optional and contains one segment of an I2C
+transaction being read from or written to the device, based on the value of the
+\field{VIRTIO_I2C_FLAGS_M_RD} bit in the \field{flags} field.
The final \field{status} byte of the request is written by the device: either
VIRTIO_I2C_MSG_OK for success or VIRTIO_I2C_MSG_ERR for error.
@@ -103,27 +102,27 @@ \subsubsection{Device Operation: Request Queue}\label{sec:Device Types / I2C Ada
#define VIRTIO_I2C_MSG_ERR 1
\end{lstlisting}
-If ``length of \field{read_buf}''=0 and ``length of \field{write_buf}''>0,
-the request is called write request.
+If \field{VIRTIO_I2C_FLAGS_M_RD} bit is set in the \field{flags}, then the
+request is called a read request.
-If ``length of \field{read_buf}''>0 and ``length of \field{write_buf}''=0,
-the request is called read request.
+If \field{VIRTIO_I2C_FLAGS_M_RD} bit is unset in the \field{flags}, then the
+request is called a write request.
-If ``length of \field{read_buf}''>0 and ``length of \field{write_buf}''>0,
-the request is called write-read request. It means an I2C write segment followed
-by a read segment. Usually, the write segment provides the number of an I2C
-controlled device register to be read.
+The \field{buf} is optional and will not be present for a zero-length request,
+like SMBus Quick.
-The case when ``length of \field{write_buf}''=0, and at the same time,
-``length of \field{read_buf}''=0 doesn't make any sense.
+The virtio I2C protocol supports write-read requests, i.e. an I2C write segment
+followed by a read segment (usually, the write segment provides the number of an
+I2C controlled device register to be read), by grouping a list of requests
+together using the \field{VIRTIO_I2C_FLAGS_FAIL_NEXT} flag.
\subsubsection{Device Operation: Operation Status}\label{sec:Device Types / I2C Adapter Device / Device Operation: Operation Status}
-\field{addr}, \field{flags}, ``length of \field{write_buf}'' and ``length of \field{read_buf}''
-are determined by the driver, while \field{status} is determined by the processing
-of the device. A driver puts the data written to the device into \field{write_buf}, while
-a device puts the data of the corresponding length into \field{read_buf} according to the
-request of the driver.
+\field{addr}, \field{flags}, and ``length of \field{buf}'' are determined by the
+driver, while \field{status} is determined by the processing of the device. A
+driver, for a write request, puts the data to be written to the device into the
+\field{buf}, while a device, for a read request, puts the data read from device
+into the \field{buf} according to the request from the driver.
A driver may send one request or multiple requests to the device at a time.
The requests in the virtqueue are both queued and processed in order.
@@ -141,11 +140,16 @@ \subsubsection{Device Operation: Operation Status}\label{sec:Device Types / I2C
A driver MUST set the reserved bits of \field{flags} to be zero.
-The driver MUST NOT send a request with ``length of \field{write_buf}''=0 and
-``length of \field{read_buf}''=0 at the same time.
+A driver MUST NOT send the \field{buf}, for a zero-length request.
+
+A driver MUST NOT use \field{buf}, for a read request, if the final
+\field{status} returned from the device is VIRTIO_I2C_MSG_ERR.
-A driver MUST NOT use \field{read_buf} if the final \field{status} returned
-from the device is VIRTIO_I2C_MSG_ERR.
+A driver MUST set the \field{VIRTIO_I2C_FLAGS_M_RD} flag for a read operation,
+where the buffer is write-only for the device.
+
+A driver MUST NOT set the \field{VIRTIO_I2C_FLAGS_M_RD} flag for a write
+operation, where the buffer is read-only for the device.
A driver MUST queue the requests in order if multiple requests are going to
be sent at a time.
@@ -160,11 +164,13 @@ \subsubsection{Device Operation: Operation Status}\label{sec:Device Types / I2C
A device SHOULD keep consistent behaviors with the hardware as described in
\hyperref[intro:I2C]{I2C}.
-A device MUST NOT change the value of \field{addr}, reserved bits of \field{flags}
-and \field{write_buf}.
+A device MUST NOT change the value of \field{addr}, and reserved bits of
+\field{flags}.
+
+A device MUST not change the value of the \field{buf} for a write request.
-A device MUST place one I2C segment of the corresponding length into \field{read_buf}
-according the driver's request.
+A device MUST place one I2C segment of the ``length of \field{buf}'', for the
+read request, into the \field{buf} according the driver's request.
A device MUST guarantee the requests in the virtqueue being processed in order
if multiple requests are received at a time.
--
2.31.1.272.g89b43f80a514
Hi,
As we consider the next cycle for Project Stratos I would like to make
some more progress on hypervisor agnosticism for our virtio backends.
While we have implemented a number of virtio vhost-user backends using C,
we've rapidly switched to using rust-vmm based ones for virtio-i2c,
virtio-rng and virtio-gpio. Given the interest in Rust for implementing
backends, does it make sense to do some enabling work in rust-vmm to
support Xen?
There are two chunks of work I can think of:
1. Enough of libxl/hypervisor interface to implement an IOREQ end point.
This would require supporting enough of the hypervisor interface to
support the implementation of an IOREQ server. We would also need to
think about how we would map the IOREQ view of the world into the
existing vhost-user interface so we can re-use the current vhost-user
backends code base. The two approaches I can think of are:
a) implement a vhost-user master that speaks IOREQ to the hypervisor
and vhost-user to the vhost-user slave. In this case the bridge
would be standing in for something like QEMU.
b) implement some variants of the vhost-user slave traits that can
talk directly to the hypervisor to get/send the equivalent
kick/notify events. I don't know if this might be too complex, as the
impedance mismatch between the two interfaces might be too great.
This assumes most of the setup is done by the existing toolstack, so
the existing libxl tools are used to create, connect and configure the
domains before the backend is launched.
which leads to:
2. The rest of the libxl/hypervisor interface.
This would be the rest of the interface to allow rust-vmm tools to be
written that could create, configure and manage Xen domains with pure
rust tools. My main concern about this is how rust-vmm's current model
(which is very much KVM influenced) will be able to handle the
differences for a type-1 hypervisor. Wei's pointed me to the Linux
support that was added to expose a Hyper-V control interface via the
Linux kernel. While I can see support has been merged in other Rust
based projects, I think the rust-vmm crate is still outstanding:
https://github.com/rust-vmm/community/issues/50
and I guess this would need revisiting for Xen to see if the proposed
abstraction would scale across other hypervisors.
Finally there is the question of how/if any of this would relate to the
concept of bare-metal rust backends? We've talked about bare metal
backends before but I wonder if the programming model for them is going
to be outside the scope of rust-vmm? Would the program just be hardwired
to IRQs and be presented a doorbell port to kick or would we want to
have at least some of the higher level rust-vmm abstractions for dealing
with navigating the virtqueues and responding and filling in data?
Thoughts?
--
Alex Bennée
Hello,
This patchset adds vhost-user-i2c device support in Qemu. Initially I tried to
add the backend implementation as well into Qemu, but as I was looking for a
hypervisor agnostic backend implementation, I decided to keep it outside of
Qemu. Eventually I implemented it in Rust and it works very well with this
patchset, and it is under review [1] to be merged in common rust vhost devices
crate.
The kernel virtio I2C driver [2] is fully reviewed and is ready to be merged soon.
V1->V2:
- Dropped the backend support from qemu and minor cleanups.
I2C Testing:
------------
I didn't have access to real hardware where I could play with an I2C
client device (like RTC, eeprom, etc) to verify the working of the
backend daemon, so I decided to test it on my x86 box itself with
hierarchy of two ARM64 guests.
The first ARM64 guest was passed "-device ds1338,address=0x20" option,
so it could emulate a ds1338 RTC device, which connects to an I2C bus.
Once the guest came up, ds1338 device instance was created within the
guest kernel by doing:
echo ds1338 0x20 > /sys/bus/i2c/devices/i2c-0/new_device
[
Note that this may end up binding the ds1338 device to its driver,
which won't let our i2c daemon talk to the device. For that we need to
manually unbind the device from the driver:
echo 0-0020 > /sys/bus/i2c/devices/0-0020/driver/unbind
]
After this is done, you will get /dev/rtc1. This is the device we wanted
to emulate, which will be accessed by the vhost-user-i2c backend daemon
via the /dev/i2c-0 file present in the guest VM.
At this point we need to start the backend daemon and give it a
socket-path to talk to from qemu (you can pass -v to it to get more
detailed messages):
vhost-user-i2c --socket-path=vi2c.sock -l 0:32
[ Here, 0:32 is the bus/device mapping, 0 for /dev/i2c-0 and 32 (i.e.
0x20) is client address of ds1338 that we used while creating the
device. ]
Now we need to start the second level ARM64 guest (from within the first
guest) to get the i2c-virtio.c Linux driver up. The second level guest
is passed the following options to connect to the same socket:
-chardev socket,path=vi2c.sock0,id=vi2c \
-device vhost-user-i2c-pci,chardev=vi2c,id=i2c
Once the second level guest boots up, we will see the i2c-virtio bus at
/sys/bus/i2c/devices/i2c-X/. From there we can now make it emulate the
ds1338 device again by doing:
echo ds1338 0x20 > /sys/bus/i2c/devices/i2c-0/new_device
[ This time we want ds1338's driver to be bound to the device, so it
should be enabled in the kernel as well. ]
And we will get /dev/rtc1 device again here in the second level guest.
Now we can play with the rtc device with help of hwclock utility and we
can see the following sequence of transfers happening if we try to
update rtc's time from system time.
hwclock -w -f /dev/rtc1 (in guest2) ->
Reaches i2c-virtio.c (Linux bus driver in guest2) ->
transfer over virtio ->
Reaches the qemu's vhost-i2c device emulation (running over guest1) ->
Reaches the backend daemon vhost-user-i2c started earlier (in guest1) ->
ioctl(/dev/i2c-0, I2C_RDWR, ..); (in guest1) ->
reaches qemu's hw/rtc/ds1338.c (running over host)
SMBUS Testing:
--------------
I didn't need such a tedious setup for testing with SMBus
devices. I was able to emulate an SMBus device on my x86 machine
using the i2c-stub driver.
$ modprobe i2c-stub chip_addr=0x20
//Boot the arm64 guest now with i2c-virtio driver and then do:
$ echo al3320a 0x20 > /sys/class/i2c-adapter/i2c-0/new_device
$ cat /sys/bus/iio/devices/iio:device0/in_illuminance_raw
That's it.
I hope I was able to give a clear picture of my test setup here :)
--
Viresh
Viresh Kumar (3):
hw/virtio: add boilerplate for vhost-user-i2c device
hw/virtio: add vhost-user-i2c-pci boilerplate
MAINTAINERS: Add entry for virtio-i2c
MAINTAINERS | 7 +
hw/virtio/Kconfig | 5 +
hw/virtio/meson.build | 2 +
hw/virtio/vhost-user-i2c-pci.c | 69 +++++++
hw/virtio/vhost-user-i2c.c | 288 +++++++++++++++++++++++++++++
include/hw/virtio/vhost-user-i2c.h | 28 +++
6 files changed, 399 insertions(+)
create mode 100644 hw/virtio/vhost-user-i2c-pci.c
create mode 100644 hw/virtio/vhost-user-i2c.c
create mode 100644 include/hw/virtio/vhost-user-i2c.h
--
2.31.1.272.g89b43f80a514
Hi
I believe there is a hidden problem in the virtio implementation of Qemu
(up to 6.1.0) in calculating the offset of the "used" split vring, and the
spec needs some clarification. Should anyone decide it deserves
upstream/spec changes, feel free to do so.
Cheers
FF
According to the specification in 2.6.2:
#define ALIGN(x) (((x) + qalign) & ~qalign)
static inline unsigned virtq_size(unsigned int qsz)
{
return ALIGN(sizeof(struct virtq_desc)*qsz + sizeof(u16)*(3 + qsz))
}
And more specifically, "used" starts after "avail" as defined in 2.6.6
struct virtq_avail {
#define VIRTQ_AVAIL_F_NO_INTERRUPT 1
le16 flags;
le16 idx;
le16 ring[ /* Queue Size */ ];
le16 used_event; /* Only if VIRTIO_F_EVENT_IDX */
};
Linux and kvmtool calculate the offset with the formula:
LINUX: vring_init @ include/uapi/linux/virtio_ring.h
vr->avail = (struct vring_avail *)((char *)p + num * sizeof(struct
vring_desc));
vr->used = (void *)(((uintptr_t)&vr->avail->ring[num] + sizeof(__virtio16)
+ align-1) & ~(align - 1));
The "+ sizeof(__virtio16)" properly accounts for the "used_event" in struct
virtue_avail.
Hypervisor ACRN uses a similar scheme: virtio_vq_init @
/devicemodel/hw/pci/virtio/virtio.c
vq->avail = (struct vring_avail *)vb;
vb += (2 + vq->qsize + 1) * sizeof(uint16_t);
/* Then it's rounded up to the next page... */
vb = (char *)roundup2((uintptr_t)vb, VIRTIO_PCI_VRING_ALIGN);
/* ... and the last page(s) are the used ring. */
vq->used = (struct vring_used *)vb;
But Qemu uses: QEMU: virtio_queue_update_rings @ hw/virtio/virtio.c
vring->used = vring_align(vring->avail +
offsetof(VRingAvail, ring[vring->num]),
vring->align);
Linux alignment policies end up having vring->align values either 4096 (for
MMIO) or 64 (PCI), and thus there are no visible issues.
If you use a different OS that chooses an alignment of 4 (valid as per
section 2.6) then Qemu does not calculate the same location for the used
ring and virtio does not work anymore. The OS actually works fine with the alignment
of 4 with kvmtool and ACRN.
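To make the divergence concrete, here is a small stand-alone Rust check of
the two calculations (offsets measured from the start of the avail ring;
desc entries are 16 bytes, the avail header 4 bytes, each ring entry 2
bytes and used_event 2 bytes, per the spec):

fn align_up(x: usize, align: usize) -> usize {
    (x + align - 1) & !(align - 1)
}

// Linux/kvmtool: &avail->ring[qsz] + sizeof(__virtio16), then align.
fn used_offset_linux(qsz: usize, align: usize) -> usize {
    align_up(4 + 2 * qsz + 2, align)
}

// Qemu: offsetof(VRingAvail, ring[qsz]), then align - no used_event.
fn used_offset_qemu(qsz: usize, align: usize) -> usize {
    align_up(4 + 2 * qsz, align)
}

fn main() {
    let qsz = 256;
    for align in [4, 64, 4096] {
        println!("align {:4}: linux {} qemu {}", align,
                 used_offset_linux(qsz, align),
                 used_offset_qemu(qsz, align));
    }
    // align    4: linux 520 qemu 516   <- the two sides disagree
    // align   64: linux 576 qemu 576
    // align 4096: linux 4096 qemu 4096
}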
There are two other problems:
In the spec, the comment "/* Only if VIRTIO_F_EVENT_IDX */" on the avail
structure is not clear on whether:
- the field is always there but its contents are only valid if...
- the field may be absent altogether.
Inferring from the calculation formulae the field is always present, but some
language would help clarify this.
On 2.6.2, the alignment formula "#define ALIGN(x) (((x) + qalign) &
~qalign)" is only correct if "qalign" is actually a mask, whereas in many
parts of the spec align is referred to as a power of 2. It may be good to
change the text to something like: #define ALIGN(x) (((x) + (qalign - 1)) &
~(qalign - 1)) /* where qalign is a power of 2 */
--
François-Frédéric Ozog | *Director Business Development*
T: +33.67221.6485
francois.ozog(a)linaro.org | Skype: ffozog
Hi
I was asked by AGL Virtualization AG if there was any implementation of the
virtio-video driver: could anyone let me know what I should answer?
Cheers
FF
--
François-Frédéric Ozog | *Director Business Development*
T: +33.67221.6485
francois.ozog(a)linaro.org | Skype: ffozog
I top post as I find it difficult to identify where to make the comments.
1) BE acceleration
Network and storage backends may actually be executed in SmartNICs. As
virtio 1.1 is hardware friendly, there may be SmartNICs with virtio 1.1 PCI
VFs. Is it a valid use case for the generic BE framework to be used in this
context?
DPDK is used in some BEs to significantly accelerate switching. DPDK is also
sometimes used in guests. In that case, there is no event injection, just a
high-performance shared-memory scheme. Is this considered a use case?
2) Virtio as OS HAL
Panasonic's CTO has been calling for a virtio-based HAL and, based on the
teachings of Google GKI, an internal HAL seems inevitable in the long term.
Virtio is then a contender to the Google-promoted Android HAL. Could the
framework be used in that context?
On Wed, 11 Aug 2021 at 08:28, AKASHI Takahiro via Stratos-dev <
stratos-dev(a)op-lists.linaro.org> wrote:
> On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> > CCing people working on Xen+VirtIO and IOREQs. Not trimming the original
> > email to let them read the full context.
> >
> > My comments below are related to a potential Xen implementation, not
> > because it is the only implementation that matters, but because it is
> > the one I know best.
>
> Please note that my proposal (and hence the working prototype)[1]
> is based on Xen's virtio implementation (i.e. IOREQ) and particularly
> EPAM's virtio-disk application (backend server).
> It has been, I believe, well generalized but is still a bit biased
> toward this original design.
>
> So I hope you like my approach :)
>
> [1]
> https://op-lists.linaro.org/pipermail/stratos-dev/2021-August/000546.html
>
> Let me take this opportunity to explain a bit more about my approach below.
>
> > Also, please see this relevant email thread:
> > https://marc.info/?l=xen-devel&m=162373754705233&w=2
> >
> >
> > On Wed, 4 Aug 2021, Alex Bennée wrote:
> > > Hi,
> > >
> > > One of the goals of Project Stratos is to enable hypervisor agnostic
> > > backends so we can enable as much re-use of code as possible and avoid
> > > repeating ourselves. This is the flip side of the front end where
> > > multiple front-end implementations are required - one per OS, assuming
> > > you don't just want Linux guests. The resultant guests are trivially
> > > movable between hypervisors modulo any abstracted paravirt type
> > > interfaces.
> > >
> > > In my original thumb nail sketch of a solution I envisioned vhost-user
> > > daemons running in a broadly POSIX like environment. The interface to
> > > the daemon is fairly simple requiring only some mapped memory and some
> > > sort of signalling for events (on Linux this is eventfd). The idea was
> a
> > > stub binary would be responsible for any hypervisor specific setup and
> > > then launch a common binary to deal with the actual virtqueue requests
> > > themselves.
> > >
> > > Since that original sketch we've seen an expansion in the sort of ways
> > > backends could be created. There is interest in encapsulating backends
> > > in RTOSes or unikernels for solutions like SCMI. There interest in Rust
> > > has prompted ideas of using the trait interface to abstract differences
> > > away as well as the idea of bare-metal Rust backends.
> > >
> > > We have a card (STR-12) called "Hypercall Standardisation" which
> > > calls for a description of the APIs needed from the hypervisor side to
> > > support VirtIO guests and their backends. However we are some way off
> > > from that at the moment as I think we need to at least demonstrate one
> > > portable backend before we start codifying requirements. To that end I
> > > want to think about what we need for a backend to function.
> > >
> > > Configuration
> > > =============
> > >
> > > In the type-2 setup this is typically fairly simple because the host
> > > system can orchestrate the various modules that make up the complete
> > > system. In the type-1 case (or even type-2 with delegated service VMs)
> > > we need some sort of mechanism to inform the backend VM about key
> > > details about the system:
> > >
> > > - where virt queue memory is in it's address space
> > > - how it's going to receive (interrupt) and trigger (kick) events
> > > - what (if any) resources the backend needs to connect to
> > >
> > > Obviously you can elide over configuration issues by having static
> > > configurations and baking the assumptions into your guest images
> however
> > > this isn't scalable in the long term. The obvious solution seems to be
> > > extending a subset of Device Tree data to user space but perhaps there
> > > are other approaches?
> > >
> > > Before any virtio transactions can take place the appropriate memory
> > > mappings need to be made between the FE guest and the BE guest.
> >
> > > Currently the whole of the FE guests address space needs to be visible
> > > to whatever is serving the virtio requests. I can envision 3
> approaches:
> > >
> > > * BE guest boots with memory already mapped
> > >
> > > This would entail the guest OS knowing where in it's Guest Physical
> > > Address space is already taken up and avoiding clashing. I would
> assume
> > > in this case you would want a standard interface to userspace to then
> > > make that address space visible to the backend daemon.
>
> Yet another way here is that we would have well known "shared memory"
> between
> VMs. I think that Jailhouse's ivshmem gives us good insights on this matter
> and that it can even be an alternative for hypervisor-agnostic solution.
>
> (Please note memory regions in ivshmem appear as a PCI device and can be
> mapped locally.)
>
> I want to add this shared memory aspect to my virtio-proxy, but
> the resultant solution would eventually look similar to ivshmem.
>
> > > * BE guests boots with a hypervisor handle to memory
> > >
> > > The BE guest is then free to map the FE's memory to where it wants in
> > > the BE's guest physical address space.
> >
> > I cannot see how this could work for Xen. There is no "handle" to give
> > to the backend if the backend is not running in dom0. So for Xen I think
> > the memory has to be already mapped
>
> In Xen's IOREQ solution (virtio-blk), the following information is expected
> to be exposed to BE via Xenstore:
> (I know that this is a tentative approach though.)
> - the start address of configuration space
> - interrupt number
> - file path for backing storage
> - read-only flag
> And the BE server have to call a particular hypervisor interface to
> map the configuration space.
>
> In my approach (virtio-proxy), all those Xen (or hypervisor)-specific
> stuffs are contained in virtio-proxy, yet another VM, to hide all details.
>
> # My point is that a "handle" is not mandatory for executing mapping.
>
> > and the mapping probably done by the
> > toolstack (also see below.) Or we would have to invent a new Xen
> > hypervisor interface and Xen virtual machine privileges to allow this
> > kind of mapping.
>
> > If we run the backend in Dom0 that we have no problems of course.
>
> One of difficulties on Xen that I found in my approach is that calling
> such hypervisor intefaces (registering IOREQ, mapping memory) is only
> allowed on BE servers themselvies and so we will have to extend those
> interfaces.
> This, however, will raise some concern on security and privilege
> distribution
> as Stefan suggested.
> >
> >
> > > To activate the mapping will
> > > require some sort of hypercall to the hypervisor. I can see two
> options
> > > at this point:
> > >
> > > - expose the handle to userspace for daemon/helper to trigger the
> > > mapping via existing hypercall interfaces. If using a helper you
> > > would have a hypervisor specific one to avoid the daemon having to
> > > care too much about the details or push that complexity into a
> > > compile time option for the daemon which would result in different
> > > binaries although a common source base.
> > >
> > > - expose a new kernel ABI to abstract the hypercall differences away
> > > in the guest kernel. In this case the userspace would essentially
> > > ask for an abstract "map guest N memory to userspace ptr" and let
> > > the kernel deal with the different hypercall interfaces. This of
> > > course assumes the majority of BE guests would be Linux kernels and
> > > leaves the bare-metal/unikernel approaches to their own devices.
> > >
> > > Operation
> > > =========
> > >
> > > The core of the operation of VirtIO is fairly simple. Once the
> > > vhost-user feature negotiation is done it's a case of receiving update
> > > events and parsing the resultant virt queue for data. The vhost-user
> > > specification handles a bunch of setup before that point, mostly to
> > > detail where the virt queues are set up FD's for memory and event
> > > communication. This is where the envisioned stub process would be
> > > responsible for getting the daemon up and ready to run. This is
> > > currently done inside a big VMM like QEMU but I suspect a modern
> > > approach would be to use the rust-vmm vhost crate. It would then either
> > > communicate with the kernel's abstracted ABI or be re-targeted as a
> > > build option for the various hypervisors.
> >
> > One thing I mentioned before to Alex is that Xen doesn't have VMMs the
> > way they are typically envisioned and described in other environments.
> > Instead, Xen has IOREQ servers. Each of them connects independently to
> > Xen via the IOREQ interface. E.g. today multiple QEMUs could be used as
> > emulators for a single Xen VM, each of them connecting to Xen
> > independently via the IOREQ interface.
> >
> > The component responsible for starting a daemon and/or setting up shared
> > interfaces is the toolstack: the xl command and the libxl/libxc
> > libraries.
>
> I think that VM configuration management (or orchestration in Startos
> jargon?) is a subject to debate in parallel.
> Otherwise, is there any good assumption to avoid it right now?
>
> > Oleksandr and others I CCed have been working on ways for the toolstack
> > to create virtio backends and setup memory mappings. They might be able
> > to provide more info on the subject. I do think we miss a way to provide
> > the configuration to the backend and anything else that the backend
> > might require to start doing its job.
> >
> >
> > > One question is how to best handle notification and kicks. The existing
> > > vhost-user framework uses eventfd to signal the daemon (although QEMU
> > > is quite capable of simulating them when you use TCG). Xen has it's own
> > > IOREQ mechanism. However latency is an important factor and having
> > > events go through the stub would add quite a lot.
> >
> > Yeah I think, regardless of anything else, we want the backends to
> > connect directly to the Xen hypervisor.
>
> In my approach,
> a) BE -> FE: interrupts triggered by BE calling a hypervisor interface
> via virtio-proxy
> b) FE -> BE: MMIO to config raises events (in event channels), which is
> converted to a callback to BE via virtio-proxy
> (Xen's event channel is internnally implemented by
> interrupts.)
>
> I don't know what "connect directly" means here, but sending interrupts
> to the opposite side would be best efficient.
> Ivshmem, I suppose, takes this approach by utilizing PCI's msi-x mechanism.
>
> >
> > > Could we consider the kernel internally converting IOREQ messages from
> > > the Xen hypervisor to eventfd events? Would this scale with other
> kernel
> > > hypercall interfaces?
> > >
> > > So any thoughts on what directions are worth experimenting with?
> >
> > One option we should consider is for each backend to connect to Xen via
> > the IOREQ interface. We could generalize the IOREQ interface and make it
> > hypervisor agnostic. The interface is really trivial and easy to add.
>
> As I said above, my proposal does the same thing that you mentioned here :)
> The difference is that I do call hypervisor interfaces via virtio-proxy.
>
> > The only Xen-specific part is the notification mechanism, which is an
> > event channel. If we replaced the event channel with something else the
> > interface would be generic. See:
> >
> https://gitlab.com/xen-project/xen/-/blob/staging/xen/include/public/hvm/io…
> >
> > I don't think that translating IOREQs to eventfd in the kernel is a
> > good idea: if feels like it would be extra complexity and that the
> > kernel shouldn't be involved as this is a backend-hypervisor interface.
>
> Given that we may want to implement BE as a bare-metal application
> as I did on Zephyr, I don't think that the translation would not be
> a big issue, especially on RTOS's.
> It will be some kind of abstraction layer of interrupt handling
> (or nothing but a callback mechanism).
>
> > Also, eventfd is very Linux-centric and we are trying to design an
> > interface that could work well for RTOSes too. If we want to do
> > something different, both OS-agnostic and hypervisor-agnostic, perhaps
> > we could design a new interface. One that could be implementable in the
> > Xen hypervisor itself (like IOREQ) and of course any other hypervisor
> > too.
> >
> >
> > There is also another problem. IOREQ is probably not be the only
> > interface needed. Have a look at
> > https://marc.info/?l=xen-devel&m=162373754705233&w=2. Don't we also need
> > an interface for the backend to inject interrupts into the frontend? And
> > if the backend requires dynamic memory mappings of frontend pages, then
> > we would also need an interface to map/unmap domU pages.
>
> My proposal document might help here; All the interfaces required for
> virtio-proxy (or hypervisor-related interfaces) are listed as
> RPC protocols :)
>
> > These interfaces are a lot more problematic than IOREQ: IOREQ is tiny
> > and self-contained. It is easy to add anywhere. A new interface to
> > inject interrupts or map pages is more difficult to manage because it
> > would require changes scattered across the various emulators.
>
> Exactly. I have no confident yet that my approach will also apply
> to other hypervisors than Xen.
> Technically, yes, but whether people can accept it or not is a different
> matter.
>
> Thanks,
> -Takahiro Akashi
>
> --
> Stratos-dev mailing list
> Stratos-dev(a)op-lists.linaro.org
> https://op-lists.linaro.org/mailman/listinfo/stratos-dev
>
--
François-Frédéric Ozog | *Director Business Development*
T: +33.67221.6485
francois.ozog(a)linaro.org | Skype: ffozog
The I2C protocol allows zero-length requests with no data, like the
SMBus Quick command, where the command is inferred based on the
read/write flag itself.
In order to allow such a request, allocate another bit,
VIRTIO_I2C_FLAGS_M_RD(1), in the flags to pass the request type, as read
or write. This was earlier done using the read/write permission to the
buffer itself.
This still won't work well if multiple buffers are passed for the same
request, i.e. the write-read requests, as the VIRTIO_I2C_FLAGS_M_RD flag
can only be used with a single buffer.
Coming back to it, there is no need to send multiple buffers with a
single request. All we need is a way to group several requests
together, which we can already do based on the
VIRTIO_I2C_FLAGS_FAIL_NEXT flag.
Remove support for multiple buffers within a single request.
Since we are at a very early stage of development, we can make these
modifications without adding new features or versioning the
protocol.
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
V1->V2:
- Name the buffer-less request as zero-length request.
Hi Guys,
I did try to follow the discussion you guys had during V4, where we added
support for multiple buffers for the same request, which I think is unnecessary
now, after introduction of the VIRTIO_I2C_FLAGS_FAIL_NEXT flag.
https://lists.oasis-open.org/archives/virtio-comment/202011/msg00005.html
And so starting this discussion again, because we need to support stuff
like: i2cdetect -q <i2c-bus-number>, which issues a zero-length SMBus
Quick command.
---
virtio-i2c.tex | 60 +++++++++++++++++++++++++-------------------------
1 file changed, 30 insertions(+), 30 deletions(-)
diff --git a/virtio-i2c.tex b/virtio-i2c.tex
index 949d75f44158..ae344b2bc822 100644
--- a/virtio-i2c.tex
+++ b/virtio-i2c.tex
@@ -54,8 +54,7 @@ \subsubsection{Device Operation: Request Queue}\label{sec:Device Types / I2C Ada
\begin{lstlisting}
struct virtio_i2c_req {
struct virtio_i2c_out_hdr out_hdr;
- u8 write_buf[];
- u8 read_buf[];
+ u8 buf[];
struct virtio_i2c_in_hdr in_hdr;
};
\end{lstlisting}
@@ -84,16 +83,16 @@ \subsubsection{Device Operation: Request Queue}\label{sec:Device Types / I2C Ada
and sets it on the other requests. If this bit is set and a device fails
to process the current request, it needs to fail the next request instead
of attempting to execute it.
+
+\item[VIRTIO_I2C_FLAGS_M_RD(1)] is used to mark the request as READ or WRITE.
\end{description}
Other bits of \field{flags} are currently reserved as zero for future feature
extensibility.
-The \field{write_buf} of the request contains one segment of an I2C transaction
-being written to the device.
-
-The \field{read_buf} of the request contains one segment of an I2C transaction
-being read from the device.
+The \field{buf} of the request is optional and contains one segment of an I2C
+transaction being read from or written to the device, based on the value of the
+\field{VIRTIO_I2C_FLAGS_M_RD} bit in the \field{flags} field.
The final \field{status} byte of the request is written by the device: either
VIRTIO_I2C_MSG_OK for success or VIRTIO_I2C_MSG_ERR for error.
@@ -103,27 +102,27 @@ \subsubsection{Device Operation: Request Queue}\label{sec:Device Types / I2C Ada
#define VIRTIO_I2C_MSG_ERR 1
\end{lstlisting}
-If ``length of \field{read_buf}''=0 and ``length of \field{write_buf}''>0,
-the request is called write request.
+If \field{VIRTIO_I2C_FLAGS_M_RD} bit is set in the \field{flags}, then the
+request is called a read request.
-If ``length of \field{read_buf}''>0 and ``length of \field{write_buf}''=0,
-the request is called read request.
+If \field{VIRTIO_I2C_FLAGS_M_RD} bit is unset in the \field{flags}, then the
+request is called a write request.
-If ``length of \field{read_buf}''>0 and ``length of \field{write_buf}''>0,
-the request is called write-read request. It means an I2C write segment followed
-by a read segment. Usually, the write segment provides the number of an I2C
-controlled device register to be read.
+The \field{buf} is optional and will not be present for a zero-length request,
+like SMBus Quick.
-The case when ``length of \field{write_buf}''=0, and at the same time,
-``length of \field{read_buf}''=0 doesn't make any sense.
+The virtio I2C protocol supports write-read requests, i.e. an I2C write segment
+followed by a read segment (usually, the write segment provides the number of an
+I2C controlled device register to be read), by grouping a list of requests
+together using the \field{VIRTIO_I2C_FLAGS_FAIL_NEXT} flag.
\subsubsection{Device Operation: Operation Status}\label{sec:Device Types / I2C Adapter Device / Device Operation: Operation Status}
-\field{addr}, \field{flags}, ``length of \field{write_buf}'' and ``length of \field{read_buf}''
-are determined by the driver, while \field{status} is determined by the processing
-of the device. A driver puts the data written to the device into \field{write_buf}, while
-a device puts the data of the corresponding length into \field{read_buf} according to the
-request of the driver.
+\field{addr}, \field{flags}, and ``length of \field{buf}'' are determined by the
+driver, while \field{status} is determined by the processing of the device. A
+driver, for a write request, puts the data to be written to the device into the
+\field{buf}, while a device, for a read request, puts the data read from device
+into the \field{buf} according to the request from the driver.
A driver may send one request or multiple requests to the device at a time.
The requests in the virtqueue are both queued and processed in order.
@@ -141,11 +140,10 @@ \subsubsection{Device Operation: Operation Status}\label{sec:Device Types / I2C
A driver MUST set the reserved bits of \field{flags} to be zero.
-The driver MUST NOT send a request with ``length of \field{write_buf}''=0 and
-``length of \field{read_buf}''=0 at the same time.
+A driver MUST NOT send the \field{buf}, for a zero-length request.
-A driver MUST NOT use \field{read_buf} if the final \field{status} returned
-from the device is VIRTIO_I2C_MSG_ERR.
+A driver MUST NOT use \field{buf}, for a read request, if the final
+\field{status} returned from the device is VIRTIO_I2C_MSG_ERR.
A driver MUST queue the requests in order if multiple requests are going to
be sent at a time.
@@ -160,11 +158,13 @@ \subsubsection{Device Operation: Operation Status}\label{sec:Device Types / I2C
A device SHOULD keep consistent behaviors with the hardware as described in
\hyperref[intro:I2C]{I2C}.
-A device MUST NOT change the value of \field{addr}, reserved bits of \field{flags}
-and \field{write_buf}.
+A device MUST NOT change the value of \field{addr}, and reserved bits of
+\field{flags}.
+
+A device MUST not change the value of the \field{buf} for a write request.
-A device MUST place one I2C segment of the corresponding length into \field{read_buf}
-according the driver's request.
+A device MUST place one I2C segment of the ``length of \field{buf}'', for the
+read request, into the \field{buf} according the driver's request.
A device MUST guarantee the requests in the virtqueue being processed in order
if multiple requests are received at a time.
--
2.31.1.272.g89b43f80a514