Hello,
We earlier verified our hypervisor-agnostic, Rust-based vhost-user backends with a QEMU-based setup, and there was growing concern about whether they were truly hypervisor-agnostic.
In order to prove that, we decided to give them a try with Xen, a type-1 bare-metal hypervisor.
We are happy to announce that we were able to make progress on that front and now have a working setup where we can test our existing Rust-based backends, such as I2C, GPIO and RNG (though only I2C has been tested so far), over Xen.
Key components:
--------------
- Xen: https://github.com/vireshk/xen
Xen requires MMIO and device-specific support in order to populate the required devices in the guest. This tree contains four patches on top of mainline Xen, two from Oleksandr (MMIO/disk) and two from me (I2C).
- libxen-sys: https://github.com/vireshk/libxen-sys
We currently depend on the userspace tools/libraries provided by Xen, such as xendevicemodel, xenevtchn, xenforeignmemory, etc. This crate provides Rust wrappers over those calls, generated automatically with the help of the bindgen utility, which allow us to use the installed Xen libraries. We plan to replace this with the Rust-based "oxerun" (see below) in the longer run.
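As an illustration only, here is a build-script sketch of how such bindgen-generated wrappers are typically produced; the header name ("wrapper.h") and the exact library list are assumptions for the example, not the actual layout of the libxen-sys crate:

// build.rs: an illustrative sketch only; the real libxen-sys crate may be
// organised differently, and "wrapper.h" is a hypothetical header name.
fn main() {
    // Link against the installed Xen userspace libraries mentioned above.
    println!("cargo:rustc-link-lib=xendevicemodel");
    println!("cargo:rustc-link-lib=xenevtchn");
    println!("cargo:rustc-link-lib=xenforeignmemory");

    // Generate Rust declarations from a small C header that includes the
    // Xen tools headers.
    let bindings = bindgen::Builder::default()
        .header("wrapper.h")
        .generate()
        .expect("failed to generate Xen bindings");

    let out = std::path::PathBuf::from(std::env::var("OUT_DIR").unwrap());
    bindings
        .write_to_file(out.join("bindings.rs"))
        .expect("failed to write bindings");
}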
- oxerun (WIP): https://gitlab.com/mathieupoirier/oxerun/-/tree/xen-ioctls
This is a Rust-based implementation of the ioctls and hypercalls to Xen. It is a work in progress and should eventually replace the "libxen-sys" crate entirely (which wraps the C implementations of the same).
- vhost-device: https://github.com/vireshk/vhost-device
These are Rust-based vhost-user backends, maintained inside the rust-vmm project. They already contain support for I2C and RNG, while GPIO is under review. They do not need to be modified per hypervisor and are truly hypervisor-agnostic.
Ideally the backends are hypervisor-agnostic, as explained earlier, but because of the way Xen currently maps guest memory, we need a minor update for the backends to work. Xen maps the memory via a kernel file, /dev/xen/privcmd, which needs a call to mmap() followed by an ioctl() to make it work. For this, a hack has been added to one of the rust-vmm crates, vm-memory, which is used by vhost-user.
https://github.com/vireshk/vm-memory/commit/54b56c4dd7293428edbd7731c4dbe573...
The update to vm-memory is responsible for doing the ioctl() after the already-present mmap().
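For readers unfamiliar with the privcmd flow, here is a minimal, heavily simplified Rust sketch of the mmap()-then-ioctl() pattern described above. The ioctl request number and argument layout are placeholders, not the real privcmd ABI (which lives in the Xen/Linux headers), and this is not the actual vm-memory change:

use std::fs::OpenOptions;
use std::os::unix::io::AsRawFd;

// Placeholder request number and argument layout; the real privcmd ABI is
// defined by the Xen/Linux headers and is not reproduced here.
const PRIVCMD_MMAPBATCH_PLACEHOLDER: libc::c_ulong = 0;

#[repr(C)]
struct MmapBatchArg {
    num: u32,   // number of pages to map
    dom: u16,   // guest domain id
    addr: u64,  // VA previously returned by mmap()
    // frame list and error array omitted in this sketch
}

fn map_guest_pages(dom: u16, num_pages: u32) -> std::io::Result<*mut libc::c_void> {
    let f = OpenOptions::new().read(true).write(true).open("/dev/xen/privcmd")?;
    let len = num_pages as usize * 4096;

    // Step 1: mmap() reserves a VA range backed by the privcmd file.
    let addr = unsafe {
        libc::mmap(std::ptr::null_mut(), len,
                   libc::PROT_READ | libc::PROT_WRITE,
                   libc::MAP_SHARED, f.as_raw_fd(), 0)
    };
    if addr == libc::MAP_FAILED {
        return Err(std::io::Error::last_os_error());
    }

    // Step 2: the extra ioctl() asks the kernel/hypervisor to back that range
    // with the guest's pages; this is the call the vm-memory hack adds.
    let mut arg = MmapBatchArg { num: num_pages, dom, addr: addr as u64 };
    let ret = unsafe { libc::ioctl(f.as_raw_fd(), PRIVCMD_MMAPBATCH_PLACEHOLDER, &mut arg) };
    if ret < 0 {
        return Err(std::io::Error::last_os_error());
    }
    Ok(addr)
}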
- vhost-user-master (WIP): https://github.com/vireshk/vhost-user-master
This implements the master-side interface of the vhost protocol and is analogous to the vhost-user-backend (https://github.com/rust-vmm/vhost-user-backend) crate maintained inside the rust-vmm project, which provides similar infrastructure for the backends to use. This shall be hypervisor-independent and provide APIs for the hypervisor-specific implementations. It will eventually be maintained inside the rust-vmm project and used by all Rust-based hypervisors.
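A purely hypothetical sketch of the kind of API split this implies; the real vhost-user-master crate is WIP and may look nothing like this. The point is only that the generic crate drives the vhost-user protocol while the hypervisor-specific part supplies memory-mapping and guest-notification hooks:

// Purely hypothetical API shape; the real crate is WIP and may differ entirely.
pub trait HypervisorHooks {
    /// Map a guest memory region so it can be shared with the vhost-user backend.
    fn map_region(&mut self, guest_addr: u64, size: u64) -> std::io::Result<*mut u8>;
    /// Notify the guest (e.g. inject the virtio interrupt) once a request is done.
    fn notify_guest(&self, queue_index: u16) -> std::io::Result<()>;
}

pub struct VhostUserMaster<H: HypervisorHooks> {
    hooks: H,
    socket_path: String, // vhost-user Unix socket, e.g. /root/vi2c.sock0
}

impl<H: HypervisorHooks> VhostUserMaster<H> {
    pub fn new(hooks: H, socket_path: &str) -> Self {
        Self { hooks, socket_path: socket_path.to_owned() }
    }
    // The vhost-user protocol handling (SET_MEM_TABLE, SET_VRING_*, ...) would
    // live here and call into the HypervisorHooks implementation as needed.
}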
- xen-vhost-master (WIP): https://github.com/vireshk/xen-vhost-master
This is the Xen-specific implementation and uses the APIs provided by the "vhost-user-master", "oxerun" and "libxen-sys" crates for its functioning.
It is designed based on EPAM's "virtio-disk" repository (https://github.com/xen-troops/virtio-disk/) and is quite similar to it.
One can see the analogy as:
Virtio-disk == "Xen-vhost-master" + "vhost-user-master" + "oxerun" + "libxen-sys" + "vhost-device".
Test setup:
----------
1. Build Xen:
$ ./configure --libdir=/usr/lib --build=x86_64-unknown-linux-gnu --host=aarch64-linux-gnu \
    --disable-docs --disable-golang --disable-ocamltools \
    --with-system-qemu=/root/qemu/build/i386-softmmu/qemu-system-i386
$ make -j9 debball CROSS_COMPILE=aarch64-linux-gnu- XEN_TARGET_ARCH=arm64
2. Run Xen via QEMU on an x86 machine:
$ qemu-system-aarch64 -machine virt,virtualization=on -cpu cortex-a57 -serial mon:stdio \
    -device virtio-net-pci,netdev=net0 -netdev user,id=net0,hostfwd=tcp::8022-:22 \
    -device virtio-scsi-pci -drive file=/home/vireshk/virtio/debian-bullseye-arm64.qcow2,index=0,id=hd0,if=none,format=qcow2 -device scsi-hd,drive=hd0 \
    -display none -m 8192 -smp 8 -kernel /home/vireshk/virtio/xen/xen \
    -append "dom0_mem=5G,max:5G dom0_max_vcpus=7 loglvl=all guest_loglvl=all" \
    -device guest-loader,addr=0x46000000,kernel=/home/vireshk/kernel/barm64/arch/arm64/boot/Image,bootargs="root=/dev/sda2 console=hvc0 earlyprintk=xen" \
    -device ds1338,address=0x20   # Required to create a virtual I2C-based RTC device on Dom0.
This should get Dom0 up and running.
3. Build the Rust crates:
$ cd /root/
$ git clone https://github.com/vireshk/xen-vhost-master
$ cd xen-vhost-master
$ cargo build

$ cd ../
$ git clone https://github.com/vireshk/vhost-device
$ cd vhost-device
$ cargo build
4. Set up the I2C-based RTC device
$ echo ds1338 0x20 > /sys/bus/i2c/devices/i2c-0/new_device; echo 0-0020 > /sys/bus/i2c/devices/0-0020/driver/unbind
5. Let's run everything now
# Start the I2C backend in one terminal (open a new terminal with
# "ssh root@localhost -p8022"). This tells the I2C backend to hook up to the
# "/root/vi2c.sock0" socket and wait for the master to start transacting.
$ /root/vhost-device/target/debug/vhost-device-i2c -s /root/vi2c.sock -c 1 -l 0:32

# Start the xen-vhost-master in another terminal. This provides the path of
# the socket to the master side and the device to look for from Xen, which is
# I2C here.
$ /root/xen-vhost-master/target/debug/xen-vhost-master --socket-path /root/vi2c.sock0 --name i2c

# Start the guest in another terminal; i2c_domu.conf is attached. The guest
# kernel should have the virtio-related config options enabled, along with the
# i2c-virtio driver.
$ xl create -c i2c_domu.conf

# The guest should boot fine now. Once the guest is up, you can create the I2C
# RTC device and use it. The following will create /dev/rtc0 in the guest,
# which you can configure with the 'hwclock' utility.
$ echo ds1338 0x20 > /sys/bus/i2c/devices/i2c-0/new_device
Hope this helps.
+xen-devel
On 14-04-22, 14:45, Viresh Kumar wrote:
Hello,
We verified our hypervisor-agnostic Rust based vhost-user backends with Qemu based setup earlier, and there was growing concern if they were truly hypervisor-agnostic.
In order to prove that, we decided to give it a try with Xen, a type-1 bare-metal hypervisor.
We are happy to announce that we were able to make progress on that front and have a working setup where we can test our existing Rust based backends, like I2C, GPIO, RNG (though only I2C is tested as of now) over Xen.
Key components:
Xen: https://github.com/vireshk/xen
Xen requires MMIO and device specific support in order to populate the required devices at the guest. This tree contains four patches on the top of mainline Xen, two from Oleksandr (mmio/disk) and two from me (I2C).
libxen-sys: https://github.com/vireshk/libxen-sys
We currently depend on the userspace tools/libraries provided by Xen, like xendevicemodel, xenevtchn, xenforeignmemory, etc. This crates provides Rust wrappers over those calls, generated automatically with help of bindgen utility in Rust, that allow us to use the installed Xen libraries. Though we plan to replace this with Rust based "oxerun" (find below) in longer run.
oxerun (WIP): https://gitlab.com/mathieupoirier/oxerun/-/tree/xen-ioctls
This is Rust based implementations for Ioctl and hypercalls to Xen. This is WIP and should eventually replace "libxen-sys" crate entirely (which are C based implementation of the same).
vhost-device: https://github.com/vireshk/vhost-device
These are Rust based vhost-user backends, maintained inside the rust-vmm project. This already contain support for I2C and RNG, while GPIO is under review. These are not required to be modified based on hypervisor and are truly hypervisor-agnostic.
Ideally the backends are hypervisor agnostic, as explained earlier, but because of the way Xen maps the guest memory currently, we need a minor update for the backends to work. Xen maps the memory via a kernel file /dev/xen/privcmd, which needs calls to mmap() followed by an ioctl() to make it work. For this a hack has been added to one of the rust-vmm crates, vm-virtio, which is used by vhost-user.
https://github.com/vireshk/vm-memory/commit/54b56c4dd7293428edbd7731c4dbe573...
The update to vm-memory is responsible to do ioctl() after the already present mmap().
vhost-user-master (WIP): https://github.com/vireshk/vhost-user-master
This implements the master side interface of the vhost protocol, and is like the vhost-user-backend (https://github.com/rust-vmm/vhost-user-backend) crate maintained inside the rust-vmm project, which provides similar infrastructure for the backends to use. This shall be hypervisor independent and provide APIs for the hypervisor specific implementations. This will eventually be maintained inside the rust-vmm project and used by all Rust based hypervisors.
xen-vhost-master (WIP): https://github.com/vireshk/xen-vhost-master
This is the Xen specific implementation and uses the APIs provided by "vhost-user-master", "oxerun" and "libxen-sys" crates for its functioning.
This is designed based on the EPAM's "virtio-disk" repository (https://github.com/xen-troops/virtio-disk/) and is pretty much similar to it.
One can see the analogy as:
Virtio-disk == "Xen-vhost-master" + "vhost-user-master" + "oxerun" + "libxen-sys" + "vhost-device".
Test setup:
- Build Xen:
$ ./configure --libdir=/usr/lib --build=x86_64-unknown-linux-gnu --host=aarch64-linux-gnu --disable-docs --disable-golang --disable-ocamltools --with-system-qemu=/root/qemu/build/i386-softmmu/qemu-system-i386; $ make -j9 debball CROSS_COMPILE=aarch64-linux-gnu- XEN_TARGET_ARCH=arm64
- Run Xen via Qemu on X86 machine:
$ qemu-system-aarch64 -machine virt,virtualization=on -cpu cortex-a57 -serial mon:stdio \ -device virtio-net-pci,netdev=net0 -netdev user,id=net0,hostfwd=tcp::8022-:22 \ -device virtio-scsi-pci -drive file=/home/vireshk/virtio/debian-bullseye-arm64.qcow2,index=0,id=hd0,if=none,format=qcow2 -device scsi-hd,drive=hd0 \ -display none -m 8192 -smp 8 -kernel /home/vireshk/virtio/xen/xen \ -append "dom0_mem=5G,max:5G dom0_max_vcpus=7 loglvl=all guest_loglvl=all" \ -device guest-loader,addr=0x46000000,kernel=/home/vireshk/kernel/barm64/arch/arm64/boot/Image,bootargs="root=/dev/sda2 console=hvc0 earlyprintk=xen" \ -device ds1338,address=0x20 # This is required to create a virtual I2C based RTC device on Dom0.
This should get Dom0 up and running.
- Build rust crates:
$ cd /root/ $ git clone https://github.com/vireshk/xen-vhost-master $ cd xen-vhost-master $ cargo build
$ cd ../ $ git clone https://github.com/vireshk/vhost-device $ cd vhost-device $ cargo build
- Setup I2C based RTC device
$ echo ds1338 0x20 > /sys/bus/i2c/devices/i2c-0/new_device; echo 0-0020 > /sys/bus/i2c/devices/0-0020/driver/unbind
- Lets run everything now
# Start the I2C backend in one terminal (open new terminal with "ssh # root@localhost -p8022"). This tells the I2C backend to hook up to # "/root/vi2c.sock0" socket and wait for the master to start transacting. $ /root/vhost-device/target/debug/vhost-device-i2c -s /root/vi2c.sock -c 1 -l 0:32
# Start the xen-vhost-master in another terminal. This provides the path of # the socket to the master side and the device to look from Xen, which is I2C # here. $ /root/xen-vhost-master/target/debug/xen-vhost-master --socket-path /root/vi2c.sock0 --name i2c
# Start guest in another terminal, i2c_domu.conf is attached. The guest kernel # should have Virtio related config options enabled, along with i2c-virtio # driver. $ xl create -c i2c_domu.conf
# The guest should boot fine now. Once the guest is up, you can create the I2C # RTC device and use it. Following will create /dev/rtc0 in the guest, which # you can configure with 'hwclock' utility.
$ echo ds1338 0x20 > /sys/bus/i2c/devices/i2c-0/new_device
Hope this helps.
-- viresh
i2c_domu.conf
kernel="/root/Image"
memory=512
vcpus=2
command="console=hvc0 earlycon=xenboot"
name="domu"
i2c = [ "virtio=true, irq=1, base=1" ]
Hi Viresh
This is very cool.
On Thu, Apr 14, 2022 at 02:53:58PM +0530, Viresh Kumar wrote:
+xen-devel
On 14-04-22, 14:45, Viresh Kumar wrote:
Hello,
We verified our hypervisor-agnostic Rust based vhost-user backends with Qemu based setup earlier, and there was growing concern if they were truly hypervisor-agnostic.
In order to prove that, we decided to give it a try with Xen, a type-1 bare-metal hypervisor.
We are happy to announce that we were able to make progress on that front and have a working setup where we can test our existing Rust based backends, like I2C, GPIO, RNG (though only I2C is tested as of now) over Xen.
Key components:
Xen: https://github.com/vireshk/xen
Xen requires MMIO and device specific support in order to populate the required devices at the guest. This tree contains four patches on the top of mainline Xen, two from Oleksandr (mmio/disk) and two from me (I2C).
libxen-sys: https://github.com/vireshk/libxen-sys
We currently depend on the userspace tools/libraries provided by Xen, like xendevicemodel, xenevtchn, xenforeignmemory, etc. This crates provides Rust wrappers over those calls, generated automatically with help of bindgen utility in Rust, that allow us to use the installed Xen libraries. Though we plan to replace this with Rust based "oxerun" (find below) in longer run.
oxerun (WIP): https://gitlab.com/mathieupoirier/oxerun/-/tree/xen-ioctls
This is Rust based implementations for Ioctl and hypercalls to Xen. This is WIP and should eventually replace "libxen-sys" crate entirely (which are C based implementation of the same).
I'm curious to learn why there is a need to replace libxen-sys with the pure Rust implementation. Those libraries (xendevicemodel, xenevtchn, xenforeignmemory) are very stable and battle tested. Their interfaces are stable.
Thanks, Wei.
On 14/04/2022 12:45, Wei Liu wrote:
Hi Viresh
This is very cool.
On Thu, Apr 14, 2022 at 02:53:58PM +0530, Viresh Kumar wrote:
+xen-devel
On 14-04-22, 14:45, Viresh Kumar wrote:
Hello,
We verified our hypervisor-agnostic Rust based vhost-user backends with Qemu based setup earlier, and there was growing concern if they were truly hypervisor-agnostic.
In order to prove that, we decided to give it a try with Xen, a type-1 bare-metal hypervisor.
We are happy to announce that we were able to make progress on that front and have a working setup where we can test our existing Rust based backends, like I2C, GPIO, RNG (though only I2C is tested as of now) over Xen.
Key components:
Xen: https://github.com/vireshk/xen
Xen requires MMIO and device specific support in order to populate the required devices at the guest. This tree contains four patches on the top of mainline Xen, two from Oleksandr (mmio/disk) and two from me (I2C).
libxen-sys: https://github.com/vireshk/libxen-sys
We currently depend on the userspace tools/libraries provided by Xen, like xendevicemodel, xenevtchn, xenforeignmemory, etc. This crates provides Rust wrappers over those calls, generated automatically with help of bindgen utility in Rust, that allow us to use the installed Xen libraries. Though we plan to replace this with Rust based "oxerun" (find below) in longer run.
oxerun (WIP): https://gitlab.com/mathieupoirier/oxerun/-/tree/xen-ioctls
This is Rust based implementations for Ioctl and hypercalls to Xen. This is WIP and should eventually replace "libxen-sys" crate entirely (which are C based implementation of the same).
I'm curious to learn why there is a need to replace libxen-sys with the pure Rust implementation. Those libraries (xendevicemodel, xenevtchn, xenforeignmemory) are very stable and battle tested. Their interfaces are stable.
Very easy. The library APIs are a mess even if they are technically stable, and they violate various commonly agreed rules of being a library, such as not messing with stdout/stderr behind the application's back; everything gets simpler when you remove an unnecessary level of C indirection.
~Andrew
On Thu, Apr 14, 2022 at 12:07:10PM +0000, Andrew Cooper wrote:
On 14/04/2022 12:45, Wei Liu wrote:
Hi Viresh
This is very cool.
On Thu, Apr 14, 2022 at 02:53:58PM +0530, Viresh Kumar wrote:
+xen-devel
On 14-04-22, 14:45, Viresh Kumar wrote:
Hello,
We verified our hypervisor-agnostic Rust based vhost-user backends with Qemu based setup earlier, and there was growing concern if they were truly hypervisor-agnostic.
In order to prove that, we decided to give it a try with Xen, a type-1 bare-metal hypervisor.
We are happy to announce that we were able to make progress on that front and have a working setup where we can test our existing Rust based backends, like I2C, GPIO, RNG (though only I2C is tested as of now) over Xen.
Key components:
Xen: https://github.com/vireshk/xen
Xen requires MMIO and device specific support in order to populate the required devices at the guest. This tree contains four patches on the top of mainline Xen, two from Oleksandr (mmio/disk) and two from me (I2C).
libxen-sys: https://github.com/vireshk/libxen-sys
We currently depend on the userspace tools/libraries provided by Xen, like xendevicemodel, xenevtchn, xenforeignmemory, etc. This crates provides Rust wrappers over those calls, generated automatically with help of bindgen utility in Rust, that allow us to use the installed Xen libraries. Though we plan to replace this with Rust based "oxerun" (find below) in longer run.
oxerun (WIP): https://gitlab.com/mathieupoirier/oxerun/-/tree/xen-ioctls
This is Rust based implementations for Ioctl and hypercalls to Xen. This is WIP and should eventually replace "libxen-sys" crate entirely (which are C based implementation of the same).
I'm curious to learn why there is a need to replace libxen-sys with the pure Rust implementation. Those libraries (xendevicemodel, xenevtchn, xenforeignmemory) are very stable and battle tested. Their interfaces are stable.
Very easy. The library APIs are mess even if they are technically stable, and violate various commonly-agreed rules of being a libary such as not messing with stdout/stderr behind the applications back, and everything gets more simple when you remove an unnecessary level of C indirection.
You don't have to use the stdio logger FWIW. I don't disagree things can be simpler though.
Wei.
~Andrew
Wei Liu wl@xen.org writes:
On Thu, Apr 14, 2022 at 12:07:10PM +0000, Andrew Cooper wrote:
On 14/04/2022 12:45, Wei Liu wrote:
Hi Viresh
This is very cool.
On Thu, Apr 14, 2022 at 02:53:58PM +0530, Viresh Kumar wrote:
+xen-devel
On 14-04-22, 14:45, Viresh Kumar wrote:
Hello,
We verified our hypervisor-agnostic Rust based vhost-user backends with Qemu based setup earlier, and there was growing concern if they were truly hypervisor-agnostic.
In order to prove that, we decided to give it a try with Xen, a type-1 bare-metal hypervisor.
We are happy to announce that we were able to make progress on that front and have a working setup where we can test our existing Rust based backends, like I2C, GPIO, RNG (though only I2C is tested as of now) over Xen.
Key components:
Xen: https://github.com/vireshk/xen
Xen requires MMIO and device specific support in order to populate the required devices at the guest. This tree contains four patches on the top of mainline Xen, two from Oleksandr (mmio/disk) and two from me (I2C).
libxen-sys: https://github.com/vireshk/libxen-sys
We currently depend on the userspace tools/libraries provided by Xen, like xendevicemodel, xenevtchn, xenforeignmemory, etc. This crates provides Rust wrappers over those calls, generated automatically with help of bindgen utility in Rust, that allow us to use the installed Xen libraries. Though we plan to replace this with Rust based "oxerun" (find below) in longer run.
oxerun (WIP): https://gitlab.com/mathieupoirier/oxerun/-/tree/xen-ioctls
This is Rust based implementations for Ioctl and hypercalls to Xen. This is WIP and should eventually replace "libxen-sys" crate entirely (which are C based implementation of the same).
I'm curious to learn why there is a need to replace libxen-sys with the pure Rust implementation. Those libraries (xendevicemodel, xenevtchn, xenforeignmemory) are very stable and battle tested. Their interfaces are stable.
Very easy. The library APIs are mess even if they are technically stable, and violate various commonly-agreed rules of being a libary such as not messing with stdout/stderr behind the applications back, and everything gets more simple when you remove an unnecessary level of C indirection.
You don't have to use the stdio logger FWIW. I don't disagree things can be simpler though.
Not directly related to this use case, but the Rust API can also be built to make direct HYP calls, which will be useful for building Rust-based unikernels that need to interact with Xen; for example, a dom0less system running a very minimal heartbeat/healthcheck monitor written in pure Rust.
We would also like to explore unikernel virtio backends, but I suspect the rest of the rust-vmm virtio bits currently assume a degree of POSIX-like userspace to set things up.
On Thu, Apr 14, 2022 at 02:36:12PM +0100, Alex Bennée wrote:
Wei Liu wl@xen.org writes:
On Thu, Apr 14, 2022 at 12:07:10PM +0000, Andrew Cooper wrote:
On 14/04/2022 12:45, Wei Liu wrote:
Hi Viresh
This is very cool.
On Thu, Apr 14, 2022 at 02:53:58PM +0530, Viresh Kumar wrote:
+xen-devel
On 14-04-22, 14:45, Viresh Kumar wrote:
Hello,
We verified our hypervisor-agnostic Rust based vhost-user backends with Qemu based setup earlier, and there was growing concern if they were truly hypervisor-agnostic.
In order to prove that, we decided to give it a try with Xen, a type-1 bare-metal hypervisor.
We are happy to announce that we were able to make progress on that front and have a working setup where we can test our existing Rust based backends, like I2C, GPIO, RNG (though only I2C is tested as of now) over Xen.
Key components:
Xen: https://github.com/vireshk/xen
Xen requires MMIO and device specific support in order to populate the required devices at the guest. This tree contains four patches on the top of mainline Xen, two from Oleksandr (mmio/disk) and two from me (I2C).
libxen-sys: https://github.com/vireshk/libxen-sys
We currently depend on the userspace tools/libraries provided by Xen, like xendevicemodel, xenevtchn, xenforeignmemory, etc. This crates provides Rust wrappers over those calls, generated automatically with help of bindgen utility in Rust, that allow us to use the installed Xen libraries. Though we plan to replace this with Rust based "oxerun" (find below) in longer run.
oxerun (WIP): https://gitlab.com/mathieupoirier/oxerun/-/tree/xen-ioctls
This is Rust based implementations for Ioctl and hypercalls to Xen. This is WIP and should eventually replace "libxen-sys" crate entirely (which are C based implementation of the same).
I'm curious to learn why there is a need to replace libxen-sys with the pure Rust implementation. Those libraries (xendevicemodel, xenevtchn, xenforeignmemory) are very stable and battle tested. Their interfaces are stable.
Very easy. The library APIs are mess even if they are technically stable, and violate various commonly-agreed rules of being a libary such as not messing with stdout/stderr behind the applications back, and everything gets more simple when you remove an unnecessary level of C indirection.
You don't have to use the stdio logger FWIW. I don't disagree things can be simpler though.
Not directly related to this use case but the Rust API can also be built to make direct HYP calls which will be useful for building Rust based unikernels that need to interact with Xen. For example for a dom0less system running a very minimal heartbeat/healthcheck monitor written in pure rust.
I think this is a strong reason for not using existing C libraries. It would be nice if the APIs can work with no_std.
We would also like to explore unikernel virtio backends but I suspect currently the rest of the rust-vmm virtio bits assume a degree of POSIX like userspace to set things up.
Indeed.
Thanks, Wei.
-- Alex Bennée
On Apr 14, 2022, at 9:10 AM, Wei Liu wl@xen.org wrote:
On Thu, Apr 14, 2022 at 02:36:12PM +0100, Alex Bennée wrote:
Wei Liu wl@xen.org writes:
On Thu, Apr 14, 2022 at 12:07:10PM +0000, Andrew Cooper wrote:
On 14/04/2022 12:45, Wei Liu wrote:
Hi Viresh
This is very cool.
On Thu, Apr 14, 2022 at 02:53:58PM +0530, Viresh Kumar wrote:
+xen-devel
I'm curious to learn why there is a need to replace libxen-sys with the pure Rust implementation. Those libraries (xendevicemodel, xenevtchn, xenforeignmemory) are very stable and battle tested. Their interfaces are stable.
Very easy. The library APIs are mess even if they are technically stable, and violate various commonly-agreed rules of being a libary such as not messing with stdout/stderr behind the applications back, and everything gets more simple when you remove an unnecessary level of C indirection.
You don't have to use the stdio logger FWIW. I don't disagree things can be simpler though.
Not directly related to this use case but the Rust API can also be built to make direct HYP calls which will be useful for building Rust based unikernels that need to interact with Xen. For example for a dom0less system running a very minimal heartbeat/healthcheck monitor written in pure rust.
I think this is a strong reason for not using existing C libraries. It would be nice if the APIs can work with no_std.
This was the goal I had with the way I structured the xen-sys crate.
We would also like to explore unikernel virtio backends but I suspect currently the rest of the rust-vmm virtio bits assume a degree of POSIX like userspace to set things up.
This is an area I had an interest in as well. I played with a xenstore implementation in a unikernel too; some of the code was published, but unfortunately the actual functional bits were not.
— Doug
+rust-vmm@lists.opendev.org
On Thu, 14 Apr 2022 at 14:54, Viresh Kumar viresh.kumar@linaro.org wrote:
+xen-devel
On 14-04-22, 14:45, Viresh Kumar wrote:
Hello,
We verified our hypervisor-agnostic Rust based vhost-user backends with Qemu based setup earlier, and there was growing concern if they were truly hypervisor-agnostic.
In order to prove that, we decided to give it a try with Xen, a type-1 bare-metal hypervisor.
We are happy to announce that we were able to make progress on that front and have a working setup where we can test our existing Rust based backends, like I2C, GPIO, RNG (though only I2C is tested as of now) over Xen.
Key components:
Xen: https://github.com/vireshk/xen
Xen requires MMIO and device specific support in order to populate the required devices at the guest. This tree contains four patches on the top of mainline Xen, two from Oleksandr (mmio/disk) and two from me (I2C).
libxen-sys: https://github.com/vireshk/libxen-sys
We currently depend on the userspace tools/libraries provided by Xen, like xendevicemodel, xenevtchn, xenforeignmemory, etc. This crates provides Rust wrappers over those calls, generated automatically with help of bindgen utility in Rust, that allow us to use the installed Xen libraries. Though we plan to replace this with Rust based "oxerun" (find below) in longer run.
oxerun (WIP): https://gitlab.com/mathieupoirier/oxerun/-/tree/xen-ioctls
This is Rust based implementations for Ioctl and hypercalls to Xen. This is WIP and should eventually replace "libxen-sys" crate entirely (which are C based implementation of the same).
vhost-device: https://github.com/vireshk/vhost-device
These are Rust based vhost-user backends, maintained inside the rust-vmm project. This already contain support for I2C and RNG, while GPIO is under review. These are not required to be modified based on hypervisor and are truly hypervisor-agnostic.
Ideally the backends are hypervisor agnostic, as explained earlier, but because of the way Xen maps the guest memory currently, we need a minor update for the backends to work. Xen maps the memory via a kernel file /dev/xen/privcmd, which needs calls to mmap() followed by an ioctl() to make it work. For this a hack has been added to one of the rust-vmm crates, vm-virtio, which is used by vhost-user.
https://github.com/vireshk/vm-memory/commit/54b56c4dd7293428edbd7731c4dbe573...
The update to vm-memory is responsible to do ioctl() after the already present mmap().
vhost-user-master (WIP): https://github.com/vireshk/vhost-user-master
This implements the master side interface of the vhost protocol, and is like the vhost-user-backend (https://github.com/rust-vmm/vhost-user-backend) crate maintained inside the rust-vmm project, which provides similar infrastructure for the backends to use. This shall be hypervisor independent and provide APIs for the hypervisor specific implementations. This will eventually be maintained inside the rust-vmm project and used by all Rust based hypervisors.
xen-vhost-master (WIP): https://github.com/vireshk/xen-vhost-master
This is the Xen specific implementation and uses the APIs provided by "vhost-user-master", "oxerun" and "libxen-sys" crates for its functioning.
This is designed based on the EPAM's "virtio-disk" repository (https://github.com/xen-troops/virtio-disk/) and is pretty much similar to it.
One can see the analogy as:
Virtio-disk == "Xen-vhost-master" + "vhost-user-master" + "oxerun" + "libxen-sys" + "vhost-device".
Test setup:
- Build Xen:
$ ./configure --libdir=/usr/lib --build=x86_64-unknown-linux-gnu --host=aarch64-linux-gnu --disable-docs --disable-golang --disable-ocamltools --with-system-qemu=/root/qemu/build/i386-softmmu/qemu-system-i386; $ make -j9 debball CROSS_COMPILE=aarch64-linux-gnu- XEN_TARGET_ARCH=arm64
- Run Xen via Qemu on X86 machine:
$ qemu-system-aarch64 -machine virt,virtualization=on -cpu cortex-a57 -serial mon:stdio \ -device virtio-net-pci,netdev=net0 -netdev user,id=net0,hostfwd=tcp::8022-:22 \ -device virtio-scsi-pci -drive file=/home/vireshk/virtio/debian-bullseye-arm64.qcow2,index=0,id=hd0,if=none,format=qcow2 -device scsi-hd,drive=hd0 \ -display none -m 8192 -smp 8 -kernel /home/vireshk/virtio/xen/xen \ -append "dom0_mem=5G,max:5G dom0_max_vcpus=7 loglvl=all guest_loglvl=all" \ -device guest-loader,addr=0x46000000,kernel=/home/vireshk/kernel/barm64/arch/arm64/boot/Image,bootargs="root=/dev/sda2 console=hvc0 earlyprintk=xen" \ -device ds1338,address=0x20 # This is required to create a virtual I2C based RTC device on Dom0.
This should get Dom0 up and running.
- Build rust crates:
$ cd /root/ $ git clone https://github.com/vireshk/xen-vhost-master $ cd xen-vhost-master $ cargo build
$ cd ../ $ git clone https://github.com/vireshk/vhost-device $ cd vhost-device $ cargo build
- Setup I2C based RTC device
$ echo ds1338 0x20 > /sys/bus/i2c/devices/i2c-0/new_device; echo 0-0020 > /sys/bus/i2c/devices/0-0020/driver/unbind
- Lets run everything now
# Start the I2C backend in one terminal (open new terminal with "ssh # root@localhost -p8022"). This tells the I2C backend to hook up to # "/root/vi2c.sock0" socket and wait for the master to start transacting. $ /root/vhost-device/target/debug/vhost-device-i2c -s /root/vi2c.sock -c 1 -l 0:32
# Start the xen-vhost-master in another terminal. This provides the path of # the socket to the master side and the device to look from Xen, which is I2C # here. $ /root/xen-vhost-master/target/debug/xen-vhost-master --socket-path /root/vi2c.sock0 --name i2c
# Start guest in another terminal, i2c_domu.conf is attached. The guest kernel # should have Virtio related config options enabled, along with i2c-virtio # driver. $ xl create -c i2c_domu.conf
# The guest should boot fine now. Once the guest is up, you can create the I2C # RTC device and use it. Following will create /dev/rtc0 in the guest, which # you can configure with 'hwclock' utility.
$ echo ds1338 0x20 > /sys/bus/i2c/devices/i2c-0/new_device
Hope this helps.
-- viresh
i2c_domu.conf
kernel="/root/Image" memory=512 vcpus=2 command="console=hvc0 earlycon=xenboot" name="domu" i2c = [ "virtio=true, irq=1, base=1" ]
-- viresh
On Thu, Apr 14, 2022 at 12:15 PM Viresh Kumar viresh.kumar@linaro.org wrote:
Hello,
Hello Viresh
[Cc Juergen and Julien]
[sorry for the possible format issues and for the late response]
We verified our hypervisor-agnostic Rust based vhost-user backends with Qemu based setup earlier, and there was growing concern if they were truly hypervisor-agnostic.
In order to prove that, we decided to give it a try with Xen, a type-1 bare-metal hypervisor.
We are happy to announce that we were able to make progress on that front and have a working setup where we can test our existing Rust based backends, like I2C, GPIO, RNG (though only I2C is tested as of now) over Xen.
Great work!
Key components:
Xen: https://github.com/vireshk/xen
Xen requires MMIO and device specific support in order to populate the required devices at the guest. This tree contains four patches on the
top of mainline Xen, two from Oleksandr (mmio/disk) and two from me (I2C).
I skimmed through your toolstack patches; awesome, you created a completely new virtual device, "I2C". FYI, I have since updated "Virtio support for toolstack on Arm" [1] (to make it more generic); V7 is now available and I plan to push V8 soon.
libxen-sys: https://github.com/vireshk/libxen-sys
We currently depend on the userspace tools/libraries provided by Xen,
like xendevicemodel, xenevtchn, xenforeignmemory, etc. This crates provides Rust wrappers over those calls, generated automatically with help of bindgen utility in Rust, that allow us to use the installed Xen libraries. Though we plan to replace this with Rust based "oxerun" (find below) in longer run.
oxerun (WIP): https://gitlab.com/mathieupoirier/oxerun/-/tree/xen-ioctls
This is Rust based implementations for Ioctl and hypercalls to Xen. This
is WIP and should eventually replace "libxen-sys" crate entirely (which are C based implementation of the same).
FYI, we are currently working on a feature to restrict memory access using Xen grant mappings, based on the xen-grant DMA-mapping layer for Linux [2]. There is a working PoC on Arm based on an updated virtio-disk. As for libraries, there is a new dependency on the "xengnttab" library. In comparison with the Xen foreign mappings model (xenforeignmemory), the Xen grant mappings model is a good fit for the Xen security model; it is a safe mechanism to share pages between guests.
vhost-device: https://github.com/vireshk/vhost-device
These are Rust based vhost-user backends, maintained inside the rust-vmm project. This already contain support for I2C and RNG, while GPIO is
under review. These are not required to be modified based on hypervisor and are truly hypervisor-agnostic.
Ideally the backends are hypervisor agnostic, as explained earlier, but because of the way Xen maps the guest memory currently, we need a minor update for the backends to work. Xen maps the memory via a kernel file /dev/xen/privcmd, which needs calls to mmap() followed by an ioctl() to make it work. For this a hack has been added to one of the rust-vmm crates, vm-virtio, which is used by vhost-user.
https://github.com/vireshk/vm-memory/commit/54b56c4dd7293428edbd7731c4dbe573...
The update to vm-memory is responsible to do ioctl() after the already present mmap().
With Xen grant mappings, if I am not mistaken, it is going to be almost the same: mmap() then ioctl(). But the file will be "/dev/xen/gntdev".
vhost-user-master (WIP): https://github.com/vireshk/vhost-user-master
This implements the master side interface of the vhost protocol, and is
like the vhost-user-backend (https://github.com/rust-vmm/vhost-user-backend) crate maintained inside the rust-vmm project, which provides similar infrastructure for the backends to use. This shall be hypervisor independent and provide APIs for the hypervisor specific implementations. This will eventually be maintained inside the rust-vmm project and used by all Rust based hypervisors.
xen-vhost-master (WIP): https://github.com/vireshk/xen-vhost-master
This is the Xen specific implementation and uses the APIs provided by "vhost-user-master", "oxerun" and "libxen-sys" crates for its
functioning.
This is designed based on the EPAM's "virtio-disk" repository (https://github.com/xen-troops/virtio-disk/) and is pretty much similar to it.
FYI, the new branch "virtio_grant", besides supporting Xen grant mappings, also supports the modern virtio-mmio transport.
One can see the analogy as:
Virtio-disk == "Xen-vhost-master" + "vhost-user-master" + "oxerun" + "libxen-sys" + "vhost-device".
Test setup:
- Build Xen:
$ ./configure --libdir=/usr/lib --build=x86_64-unknown-linux-gnu --host=aarch64-linux-gnu --disable-docs --disable-golang --disable-ocamltools --with-system-qemu=/root/qemu/build/i386-softmmu/qemu-system-i386; $ make -j9 debball CROSS_COMPILE=aarch64-linux-gnu- XEN_TARGET_ARCH=arm64
- Run Xen via Qemu on X86 machine:
$ qemu-system-aarch64 -machine virt,virtualization=on -cpu cortex-a57 -serial mon:stdio \ -device virtio-net-pci,netdev=net0 -netdev user,id=net0,hostfwd=tcp::8022-:22 \ -device virtio-scsi-pci -drive file=/home/vireshk/virtio/debian-bullseye-arm64.qcow2,index=0,id=hd0,if=none,format=qcow2 -device scsi-hd,drive=hd0 \ -display none -m 8192 -smp 8 -kernel /home/vireshk/virtio/xen/xen \ -append "dom0_mem=5G,max:5G dom0_max_vcpus=7 loglvl=all guest_loglvl=all" \ -device guest-loader,addr=0x46000000,kernel=/home/vireshk/kernel/barm64/arch/arm64/boot/Image,bootargs="root=/dev/sda2 console=hvc0 earlyprintk=xen" \ -device ds1338,address=0x20 # This is required to create a virtual I2C based RTC device on Dom0.
This should get Dom0 up and running.
- Build rust crates:
$ cd /root/ $ git clone https://github.com/vireshk/xen-vhost-master $ cd xen-vhost-master $ cargo build
$ cd ../ $ git clone https://github.com/vireshk/vhost-device $ cd vhost-device $ cargo build
- Setup I2C based RTC device
$ echo ds1338 0x20 > /sys/bus/i2c/devices/i2c-0/new_device; echo 0-0020
/sys/bus/i2c/devices/0-0020/driver/unbind
- Lets run everything now
# Start the I2C backend in one terminal (open new terminal with "ssh # root@localhost -p8022"). This tells the I2C backend to hook up to # "/root/vi2c.sock0" socket and wait for the master to start transacting. $ /root/vhost-device/target/debug/vhost-device-i2c -s /root/vi2c.sock -c 1 -l 0:32
# Start the xen-vhost-master in another terminal. This provides the path of # the socket to the master side and the device to look from Xen, which is I2C # here. $ /root/xen-vhost-master/target/debug/xen-vhost-master --socket-path /root/vi2c.sock0 --name i2c
# Start guest in another terminal, i2c_domu.conf is attached. The guest kernel # should have Virtio related config options enabled, along with i2c-virtio # driver. $ xl create -c i2c_domu.conf
# The guest should boot fine now. Once the guest is up, you can create the I2C # RTC device and use it. Following will create /dev/rtc0 in the guest, which # you can configure with 'hwclock' utility.
$ echo ds1338 0x20 > /sys/bus/i2c/devices/i2c-0/new_device
Thanks for the detailed instruction.
Hope this helps.
-- viresh
[1] https://lore.kernel.org/xen-devel/1649442065-8332-1-git-send-email-olekstysh...
[2] https://lore.kernel.org/xen-devel/1650646263-22047-1-git-send-email-olekstys...
On 28-04-22, 16:52, Oleksandr Tyshchenko wrote:
Great work!
Thanks Oleksandr.
I skimmed through your toolstack patches, awesome, you created a completely new virtual device "I2C".
I have also created GPIO now :)
What should I do about these patches? Send them to the xen list? I can at least send the stuff which doesn't depend on your series?
FYI, I have updated "Virtio support for toolstack on Arm" [1] since (to make it more generic), now V7 is available and I have a plan to push V8 soon.
I will surely have a look, thanks.
FYI, currently we are working on one feature to restrict memory access using Xen grant mappings based on xen-grant DMA-mapping layer for Linux [1]. And there is a working PoC on Arm based on an updated virtio-disk. As for libraries, there is a new dependency on "xengnttab" library. In comparison with Xen foreign mappings model (xenforeignmemory), the Xen grant mappings model is a good fit into the Xen security model, this is a safe mechanism to share pages between guests.
Right, I was aware of this work but haven't dived into it yet. We will surely need to do that eventually; let's see when I will be able to get to it. The current focus is to get the solution a bit more robust (so it can be used with any device) and upstream it to the rust-vmm space on GitHub.
With Xen grant mappings, if I am not mistaken, it is going to be almost the same: mmap() then ioctl(). But the file will be "/dev/xen/gntdev".
Okay, the problem (for us) still exists then :)
FYI, new branch "virtio_grant" besides supporting Xen grant mappings also supports virtio-mmio modern transport.
Somehow the timing of your emails has been spot on.
Last time, when you told me about the "dev" branch, I had already started to reinvent the wheel and your branch really helped.
Now, it was just yesterday that I started looking into the modern MMIO stuff, as the GPIO device needs it, and you sent me working code showing how to do it as well. You saved at least 1-2 days of my time :)
Thanks Oleksandr.
On 29-04-22, 09:18, Viresh Kumar wrote:
Now, it was just yesterday that I started looking into MMIO modern stuff as the GPIO device needs it and you sent me working code to look how to do it as well. You saved at least 1-2 days of my time :)
One question though: do we need to support legacy mode at all in the work we are doing?
On 29.04.22 06:59, Viresh Kumar wrote:
Hello Viresh
On 29-04-22, 09:18, Viresh Kumar wrote:
Now, it was just yesterday that I started looking into MMIO modern stuff as the GPIO device needs it and you sent me working code to look how to do it as well. You saved at least 1-2 days of my time :)
One question though, do we need to support Legacy mode at all in the work we are doing ?
I am not 100% sure I can answer precisely here. The virtio-disk backend worked perfectly fine in legacy virtio-mmio transport mode with the latest vanilla Linux. For the "restricted memory access using Xen grant mappings" feature to work, I had to switch it to use the modern virtio-mmio transport. CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS requires the virtio devices to support VIRTIO_F_VERSION_1. In addition, we do need 64-bit addresses in the virtqueue.
BTW, the virtio-iommu also requires VIRTIO_F_VERSION_1.
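A minimal sketch of the feature-bit check implied above; VIRTIO_F_VERSION_1 is feature bit 32 per the virtio specification, and a backend must offer it to be usable with the grant-restricted guests described here:

// VIRTIO_F_VERSION_1 is feature bit 32 in the virtio specification.
const VIRTIO_F_VERSION_1: u64 = 1 << 32;

fn device_is_modern(device_features: u64) -> bool {
    device_features & VIRTIO_F_VERSION_1 != 0
}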
On 29.04.22 06:48, Viresh Kumar wrote:
Hello Viresh
On 28-04-22, 16:52, Oleksandr Tyshchenko wrote:
Great work!
Thanks Oleksandr.
I skimmed through your toolstack patches, awesome, you created a completely new virtual device "I2C".
I have also created GPIO now :)
Awesome!
What should I do about these patches ? Send them to xen list ? I can at least send the stuff which doesn't depend on your series ?
Below is my understanding, which might be wrong.
I think the best-case scenario is to try to get these features upstreamed. I expect possible interest in virtualized I2C/GPIO devices on Xen, especially in embedded environments where passthrough of a dedicated I2C/GPIO controller to the guest is not possible for some reason (clocks, pins, power domains, etc.). But I do understand that it will most likely take some time. If upstreaming this stuff is not your primary target, then I think such a patch series deserves to be sent to the Xen mailing list anyway, for someone who is interested in the topic to give it a try. For example, you can send an RFC version, saying in the cover letter that it depends on not-yet-upstreamed stuff, to start the discussion.
FYI, I have updated "Virtio support for toolstack on Arm" [1] since (to make it more generic), now V7 is available and I have a plan to push V8 soon.
I will surely have a look, thanks.
FYI, currently we are working on one feature to restrict memory access using Xen grant mappings based on xen-grant DMA-mapping layer for Linux [1]. And there is a working PoC on Arm based on an updated virtio-disk. As for libraries, there is a new dependency on "xengnttab" library. In comparison with Xen foreign mappings model (xenforeignmemory), the Xen grant mappings model is a good fit into the Xen security model, this is a safe mechanism to share pages between guests.
Right, I was aware of this work but didn't dive into it yet. We will surely need to do that eventually, lets see when I will be able to get to that. The current focus is the get the solution a bit more robust (so it can be used with any device) and upstream it to rust-vmm space on github.
OK, I see. I understand your point: your primary target is hypervisor-agnostic, Rust-based backend(s) applicable to any device.
With Xen grant mappings, if I am not mistaken, it is going to be almost the same: mmap() then ioctl(). But the file will be "/dev/xen/gntdev".
Okay, the problem (for us) still exists then :)
It seems, yes.
FYI, new branch "virtio_grant" besides supporting Xen grant mappings also supports virtio-mmio modern transport.
Somehow the timing of your emails have been spot on.
Last time, when you told me about the "dev" branch, I have already started to reinvent the wheel and your branch really helped.
Now, it was just yesterday that I started looking into MMIO modern stuff as the GPIO device needs it and you sent me working code to look how to do it as well. You saved at least 1-2 days of my time :)
Great, I'm glad to hear it.
Thanks Oleksandr.
On 29-04-22, 13:44, Oleksandr wrote:
On 29.04.22 06:48, Viresh Kumar wrote:
What should I do about these patches ? Send them to xen list ? I can at least send the stuff which doesn't depend on your series ?
Below my understanding, which might be wrong)
I think, the best case scenario - is to try to get these features upstreamed. I expect a possible interest to virtulized I2C/GPIO devices on Xen, especially in embedded environment where the passthrough of dedicated I2C/GPIO controller to the guest is not possible for some reason (clocks, pins, power domains, etc). But I do understand it most likely takes some time. If upsteaming this stuff is not your primary target, then I think, such patch series deserves to be sent to the Xen mailing list anyway for someone who is interested in the topic to give it a try. For example, you can send RFC version saying in cover letter that it depends on non-upsteamed yet stuff to start discussion.
I have sent the patchset to the xen list. Thanks.
On 28-04-22, 16:52, Oleksandr Tyshchenko wrote:
FYI, currently we are working on one feature to restrict memory access using Xen grant mappings based on xen-grant DMA-mapping layer for Linux [1]. And there is a working PoC on Arm based on an updated virtio-disk. As for libraries, there is a new dependency on "xengnttab" library. In comparison with Xen foreign mappings model (xenforeignmemory), the Xen grant mappings model is a good fit into the Xen security model, this is a safe mechanism to share pages between guests.
Hi Oleksandr,
I started getting this stuff into our work and have few questions.
- IIUC, with this feature the guest will allow the host to access only certain parts of the guest memory, which is exactly what we want as well. I looked at the updated code in virtio-disk and you currently don't allow grant table mappings along with MAP_IN_ADVANCE; is there any particular reason for that?
- I understand that you currently map on the go: first the virtqueue descriptor rings, and then the protocol-specific addresses later on, once virtio requests are received from the guest.
But in our case, vhost-user with a Rust-based, hypervisor-agnostic backend, the vhost master side can send a number of memory regions for the slave (backend) to map, and the backend won't try to map anything apart from those. The virtqueue descriptor rings are available at this point and can be sent, but not the protocol-specific addresses, which are available only when a virtio request comes.
- And so we would like to map everything in advance, and access only the parts which we need to, assuming that the guest would just allow those (as the addresses are shared by the guest itself).
- Will that just work with the current stuff?
- In Linux's drivers/xen/gntdev.c, we have:
static unsigned int limit = 64*1024;
which translates to 256 MB I think, i.e. the maximum amount of memory we can map at once (see the back-of-the-envelope sketch at the end of this mail). Will making this 128*1024 allow me to map, for example, 512 MB in a single call? Any other changes required?
- When I tried that, I got a few errors which I am still not able to fix:
The IOCTL_GNTDEV_MAP_GRANT_REF ioctl passed but there were failures after that:
(XEN) common/grant_table.c:1055:d0v2 Bad ref 0x40000 for d1
(XEN) common/grant_table.c:1055:d0v2 Bad ref 0x40001 for d1
...
(XEN) common/grant_table.c:1055:d0v2 Bad ref 0x5fffd for d1
(XEN) common/grant_table.c:1055:d0v2 Bad ref 0x5fffe for d1
(XEN) common/grant_table.c:1055:d0v2 Bad ref 0x5ffff for d1
gnttab: error: mmap failed: Invalid argument
I am working on Linus's origin/master along with the initial patch from Juergen, and picked your Xen patch for the iommu node.
I am still at the initial stages of properly testing this stuff; I just wanted to share the progress to help myself save some of the time spent debugging this :)
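A back-of-the-envelope sketch of the limit arithmetic from the gntdev question above, assuming 4 KiB pages (one grant per page); the constant names are made up for the example:

// Assumes 4 KiB pages, i.e. one grant per page.
const PAGE_SIZE: u64 = 4096;
const DEFAULT_LIMIT_PAGES: u64 = 64 * 1024;                       // gntdev default
const DEFAULT_LIMIT_BYTES: u64 = DEFAULT_LIMIT_PAGES * PAGE_SIZE; // 268_435_456 = 256 MiB
const DOUBLED_LIMIT_BYTES: u64 = 2 * DEFAULT_LIMIT_BYTES;         // 512 MiB with 128 * 1024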
Thanks.
On Wed, Jun 22, 2022 at 2:49 PM Viresh Kumar viresh.kumar@linaro.org wrote:
On 28-04-22, 16:52, Oleksandr Tyshchenko wrote:
FYI, currently we are working on one feature to restrict memory access using Xen grant mappings based on xen-grant DMA-mapping layer for Linux
[1].
And there is a working PoC on Arm based on an updated virtio-disk. As for libraries, there is a new dependency on "xengnttab" library. In
comparison
with Xen foreign mappings model (xenforeignmemory), the Xen grant mappings model is a good fit into the Xen security model, this is a safe mechanism to share pages between guests.
Hi Oleksandr,
Hello Viresh
[sorry for the possible format issues]
I started getting this stuff into our work and have few questions.
- IIUC, with this feature the guest will allow the host to access only
certain parts of the guest memory, which is exactly what we want as well. I looked at the updated code in virtio-disk and you currently don't allow the grant table mappings along with MAP_IN_ADVANCE, is there any particular reason for that ?
MAP_IN_ADVANCE is an optimization which is only applicable if all incoming addresses are guest physical addresses and the backend is allowed to map arbitrary guest pages using foreign mappings. It is an option to demonstrate how a trusted backend (running in dom0, for example) can pre-map guest memory in advance and just calculate a host address at runtime based on the incoming gpa, which is used as an offset (there are no xenforeignmemory_map/xenforeignmemory_unmap calls for every request). But if the guest uses grant mappings for virtio (CONFIG_XEN_VIRTIO=y), all incoming addresses are grants instead of gpas (even the virtqueue descriptor ring addresses are grants). Leaving aside the fact that restricted virtio memory access in the guest means that not all of guest memory can be accessed, even with guest memory pre-mapped in advance we are not able to calculate a host pointer, as we don't know which gpa a particular grant belongs to.
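To make the offset arithmetic above concrete, a small illustrative sketch; the type and field names are invented for the example, not taken from virtio-disk:

// Invented names, for illustration only.
struct PremappedGuestRam {
    host_base: *mut u8, // VA where guest RAM was pre-mapped in the backend
    guest_base: u64,    // gpa at which guest RAM starts
    size: u64,          // size of the pre-mapped region in bytes
}

impl PremappedGuestRam {
    // Translate an incoming gpa into a host pointer by plain offset arithmetic;
    // no per-request xenforeignmemory_map/unmap calls are needed.
    fn host_ptr(&self, gpa: u64) -> Option<*mut u8> {
        let offset = gpa.checked_sub(self.guest_base)?;
        if offset < self.size {
            Some(unsafe { self.host_base.add(offset as usize) })
        } else {
            None // gpa outside the pre-mapped region
        }
    }
}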
- I understand that you currently map on the go, the virqueue descriptor
rings and then the protocol specific addresses later on, once virtio requests are received from the guest.
But in our case, Vhost user with Rust based hypervisor agnostic backend, the vhost master side can send a number of memory regions for the slave (backend) to map and the backend won't try to map anything apart from that. The virtqueue descriptor rings are available at this point and can be sent, but not the protocol specific addresses, which are available only when a virtio request comes.
- And so we would like to map everything in advance, and access only the
parts which we need to, assuming that the guest would just allow those (as the addresses are shared by the guest itself).
- Will that just work with the current stuff ?
I am not sure that I understand this use case. Well, let's consider the virtio-disk example; it demonstrates three possible memory mapping modes:
1. All addresses are gpa, map/unmap at runtime using foreign mappings
2. All addresses are gpa, map in advance using foreign mappings
3. All addresses are grants, map/unmap only at runtime using grant mappings
If you are asking about #4, which would imply mapping in advance together with using grants, then I think no, this won't work with the current stuff. These are conflicting options: either grants and map at runtime, or gpa and map in advance. If there is a wish to optimize when using grants then "maybe" it is worth looking into how persistent grants work for the PV block device, for example (feature-persistent in blkif.h).
In Linux's drivers/xen/gntdev.c, we have:
static unsigned int limit = 64*1024;
which translates to 256MB I think, i.e. the max amount of memory we can
map at once. Will making this 128*1024 allow me to map 512 MB for example in a single call ? Any other changes required ?
I am not sure, but I guess the total number is limited by the hypervisor itself. Could you try to increase gnttab_max_frames in the first place?
When I tried that, I got few errors which I am still not able to fix:
The IOCTL_GNTDEV_MAP_GRANT_REF ioctl passed but there were failures after that:
(XEN) common/grant_table.c:1055:d0v2 Bad ref 0x40000 for d1 (XEN) common/grant_table.c:1055:d0v2 Bad ref 0x40001 for d1
...
(XEN) common/grant_table.c:1055:d0v2 Bad ref 0x5fffd for d1 (XEN) common/grant_table.c:1055:d0v2 Bad ref 0x5fffe for d1 (XEN) common/grant_table.c:1055:d0v2 Bad ref 0x5ffff for d1 gnttab: error: mmap failed: Invalid argument
I am working on Linus's origin/master along with the initial patch from Juergen, picked your Xen patch for iommu node.
Yes, this is the correct environment. Please note that Juergen has recently pushed a new version [1].
I am still at initial stages to properly test this stuff, just wanted to share the progress to help myself save some of the time debugging this :)
Thanks.
-- viresh
[1] https://lore.kernel.org/xen-devel/20220622063838.8854-1-jgross@suse.com/
On 22-06-22, 18:05, Oleksandr Tyshchenko wrote:
Even leaving aside the fact that restricted virtio memory access in the guest means that not all of guest memory can be accessed, so even having pre-maped guest memory in advance, we are not able to calculate a host pointer as we don't know which gpa the particular grant belongs to.
Ahh, I clearly missed that as well. We can't simply convert the address here on the requests :(
I am not sure that I understand this use-case. Well, let's consider the virtio-disk example, it demonstrates three possible memory mapping modes:
- All addresses are gpa, map/unmap at runtime using foreign mappings
- All addresses are gpa, map in advance using foreign mappings
- All addresses are grants, only map/unmap at runtime using grants mappings
If you are asking about #4 which would imply map in advance together with using grants then I think, no. This won't work with the current stuff. These are conflicting opinions, either grants and map at runtime or gpa and map in advance. If there is a wish to optimize when using grants then "maybe" it is worth looking into how persistent grants work for PV block device for example (feature-persistent in blkif.h).
I though #4 may make it work for our setup, but it isn't what we need necessarily.
The deal is that we want hypervisor-agnostic backends; they won't and shouldn't know what hypervisor they are running against. So ideally, no special handling.
To make it work, the simplest of the solutions can be to map all that we need in advance, when the vhost negotiations happen and memory regions are passed to the backend. It doesn't necessarily mean mapping the entire guest, but just the regions we need.
With what I have understood about grants until now, I don't think it will work straight away.
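For illustration, a minimal sketch of that map-in-advance idea, using hypothetical names rather than the actual vhost-user-master API: map each region once when the frontend's memory table arrives and keep the mappings around.

use std::io;

// Hypothetical types for illustration; the real crate's region type and the
// Xen-specific mapping call (a foreign mapping of the region's frames) differ.
struct MemRegion { guest_addr: u64, size: u64 }
struct MappedRegion { guest_addr: u64, size: u64, host_addr: *mut u8 }

trait GuestMemMapper {
    // For Xen this would be a foreign mapping of the whole region; for a
    // grant-based setup there is nothing meaningful to pre-map here.
    fn map_region(&self, region: &MemRegion) -> io::Result<MappedRegion>;
}

// Called once, when the vhost-user memory table is received during negotiation.
fn premap_all(mapper: &dyn GuestMemMapper, regions: &[MemRegion])
    -> io::Result<Vec<MappedRegion>> {
    regions.iter().map(|r| mapper.map_region(r)).collect()
}

The point is only that the mapping happens once, at negotiation time, and per-request processing then works on already-valid host pointers.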
Yes, this is the correct environment. Please note that Juergen has recently pushed a new version [1]
Yeah, I am following them up, will test the one you all agree on :)
Thanks.
Hello Viresh
[sorry for the possible format issues]
On Thu, Jun 23, 2022 at 8:48 AM Viresh Kumar viresh.kumar@linaro.org wrote:
On 22-06-22, 18:05, Oleksandr Tyshchenko wrote:
Even leaving aside the fact that restricted virtio memory access in the guest means that not all of guest memory can be accessed: even having pre-mapped guest memory in advance, we are not able to calculate a host pointer, as we don't know which gpa the particular grant belongs to.
Ahh, I clearly missed that as well. We can't simply convert the address here on the requests :(
Exactly, the grant represents the granted guest page, but the backend doesn't know the guest physical address of that page, and it shouldn't know it; that is the point. So the backend can only map granted pages, for which the guest explicitly calls dma_map_*(). Moreover, currently the backend shouldn't keep them mapped longer than necessary, for example to cache mappings. Otherwise, when calling dma_unmap_*() the guest will notice that the grant is still in use by the backend and complain.
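To make that discipline concrete, a rough Rust sketch of the per-request map/unmap pattern; the extern declarations mirror the usual libxengnttab calls (in the real code they would come through the bindgen-generated libxen-sys crate), while the wrapper function itself is invented for illustration:

use std::os::raw::{c_int, c_uint, c_void};

// Opaque handle type; the declarations below mirror the libxengnttab C API.
#[allow(non_camel_case_types)]
#[repr(C)]
struct xengnttab_handle { _private: [u8; 0] }

extern "C" {
    // Opened once at backend startup (logger may be NULL).
    fn xengnttab_open(logger: *mut c_void, open_flags: c_uint) -> *mut xengnttab_handle;
    // Maps a single page granted by `domid`; returns NULL on failure.
    fn xengnttab_map_grant_ref(xgt: *mut xengnttab_handle, domid: u32,
                               gref: u32, prot: c_int) -> *mut c_void;
    // Unmaps `count` pages previously mapped at `start_address`.
    fn xengnttab_unmap(xgt: *mut xengnttab_handle, start_address: *mut c_void,
                       count: u32) -> c_int;
}

const PROT_READ: c_int = 1;  // Linux values
const PROT_WRITE: c_int = 2;

// Invented helper: map one granted page only for the duration of a single
// virtio request, then unmap it immediately so the guest can revoke the grant.
unsafe fn with_granted_page<R>(xgt: *mut xengnttab_handle, domid: u32, gref: u32,
                               f: impl FnOnce(*mut c_void) -> R) -> Option<R> {
    let addr = xengnttab_map_grant_ref(xgt, domid, gref, PROT_READ | PROT_WRITE);
    if addr.is_null() {
        return None;
    }
    let result = f(addr);
    xengnttab_unmap(xgt, addr, 1);
    Some(result)
}

An RAII guard around the same two calls would be the more idiomatic shape, but the point is simply that the mapping never outlives the request.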
I am not sure that I understand this use-case. Well, let's consider the virtio-disk example, it demonstrates three possible memory mapping modes:
- All addresses are gpa, map/unmap at runtime using foreign mappings
- All addresses are gpa, map in advance using foreign mappings
- All addresses are grants, only map/unmap at runtime using grant mappings
If you are asking about #4, which would imply map in advance together with using grants, then I think, no. This won't work with the current stuff. These are conflicting options: either grants and map at runtime, or gpa and map in advance. If there is a wish to optimize when using grants then "maybe" it is worth looking into how persistent grants work for the PV block device, for example (feature-persistent in blkif.h).
I thought #4 might make it work for our setup, but it isn't what we necessarily need.
The deal is that we want hypervisor-agnostic backends; they won't and shouldn't know what hypervisor they are running against. So ideally, no special handling.
I see and agree
To make it work, the simplest of the solutions can be to map all that we need in advance, when the vhost negotiations happen and memory regions are passed to the backend. It doesn't necessarily mean mapping the entire guest, but just the regions we need.
With what I have understood about grants until now, I don't think it will work straight away.
yes
Below is my understanding, which might be wrong.
I am not sure about x86, there are some moments with its modes, for example PV guests should always use grants for virtio, but on Arm (where the guest type is HVM):
1. If you run backend(s) in dom0, which is trusted by default, you don't necessarily need to use grants for virtio, so you will be able to map what you need in advance using foreign mappings.
2. If you run backend(s) in another domain *which you trust* and you don't want to use grants for virtio, I think you will also be able to map in advance using foreign mappings, but for that you will need a security policy to allow your backend's domain to map arbitrary guest pages.
3. If you run backend(s) in a non-trusted domain, you will have to use grants for virtio, so there is no way to map in advance, only to map at runtime what was previously granted by the guest and unmap right after using it.
There is another method to restrict the backend without modifying the guest, which is CONFIG_DMA_RESTRICTED_POOL in Linux, but this involves a memcpy in the guest and requires some support in the toolstack to make it work. I wouldn't suggest it, as using grants for virtio is better (and already upstream).
Regarding your previous attempt to map 512MB by using grants, what I understand from the error message is that Xen complains that the passed grant ref is bigger than the current number of grant table entries. Now I am wondering where these 0x40000 - 0x5ffff grant refs (which the backend tries to map in a single call) come from: were they really previously granted by the guest and passed to the backend in a single request? If the answer is yes, then what does gnttab_usage_print_all() say (key 'g' in the Xen console)? I would expect a lot of Xen messages like "common/grant_table.c:1882:d2v3 Expanding d2 grant table from 28 to 29 frames". Do you see them?
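For a rough idea of the numbers involved, a sketch assuming grant table v1, i.e. 8-byte entries in 4 KiB frames:

// Rough arithmetic only, assuming grant table v1 (8-byte entries, 4 KiB frames):
// refs up to 0x5ffff would require a table of several hundred frames, far more
// than the usual gnttab_max_frames default, which is consistent with Xen
// rejecting them as "Bad ref".
fn main() {
    let entries_per_frame: u64 = 4096 / 8; // grant_entry_v1 is 8 bytes
    let total_refs: u64 = 0x5ffff + 1;
    let frames_needed = (total_refs + entries_per_frame - 1) / entries_per_frame;
    println!("{} refs need {} grant-table frames", total_refs, frames_needed); // 768
}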
Yes, this is the correct environment. Please note that Juergen has recently pushed a new version [1]
Yeah, I am following them up, will test the one you all agree on :)
Thanks.
-- viresh
On 23-06-22, 15:47, Oleksandr Tyshchenko wrote:
Below is my understanding, which might be wrong.
I am not sure about x86, there are some moments with its modes, for example PV guests should always use grants for virtio, but on Arm (where the guest type is HVM):
1. If you run backend(s) in dom0, which is trusted by default, you don't necessarily need to use grants for virtio, so you will be able to map what you need in advance using foreign mappings.
2. If you run backend(s) in another domain *which you trust* and you don't want to use grants for virtio, I think you will also be able to map in advance using foreign mappings, but for that you will need a security policy to allow your backend's domain to map arbitrary guest pages.
3. If you run backend(s) in a non-trusted domain, you will have to use grants for virtio, so there is no way to map in advance, only to map at runtime what was previously granted by the guest and unmap right after using it.
There is another method to restrict the backend without modifying the guest, which is CONFIG_DMA_RESTRICTED_POOL in Linux, but this involves a memcpy in the guest and requires some support in the toolstack to make it work. I wouldn't suggest it, as using grants for virtio is better (and already upstream).
Yeah, above looks okay.
Regarding your previous attempt to map 512MB by using grants, what I understand from the error message is that Xen complains that the passed grant ref is bigger than the current number of grant table entries. Now I am wondering where these 0x40000 - 0x5ffff grant refs (which the backend tries to map in a single call) come from: were they really previously granted by the guest and passed to the backend in a single request?
I just tried to map everything in one go, just like map in advance. Yeah, the whole idea is faulty :)
The guest never agreed to it.
If the answer is yes, then what does gnttab_usage_print_all() say (key 'g' in the Xen console)? I would expect a lot of Xen messages like "common/grant_table.c:1882:d2v3 Expanding d2 grant table from 28 to 29 frames". Do you see them?
I am not sure if there were other messages, but anyway this doesn't bother me now as the whole thing was wrong to begin with. :)
On 14-04-22, 14:45, Viresh Kumar wrote:
Hello,
We verified our hypervisor-agnostic Rust based vhost-user backends with Qemu based setup earlier, and there was growing concern if they were truly hypervisor-agnostic.
In order to prove that, we decided to give it a try with Xen, a type-1 bare-metal hypervisor.
We are happy to announce that we were able to make progress on that front and have a working setup where we can test our existing Rust based backends, like I2C, GPIO, RNG (though only I2C is tested as of now) over Xen.
An update to this: I have now successfully tested the GPIO backend as well with this setup and pushed out everything.
- GPIO required two virtqueues, instead of the single one in the I2C case.
- GPIO requires a config-space exchange as well, while I2C didn't.
- The latest code supports MMIO v2 (modern).
- The Xen vhost master is fully device-independent now, and the device type can be chosen from the command line itself. It would be simple to test RNG or other backends with this now; we just need to update "enum VirtioDeviceType" in the vhost-user-master crate with the device-specific information (a rough sketch of what that might look like follows below). Of course, we need to emulate the device in Xen too.
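A hypothetical sketch of what such an entry could carry (the actual enum in vhost-user-master may well differ; the device IDs are the standard virtio ones, and the queue counts match what is described above):

// Hypothetical shape only; the real `VirtioDeviceType` may differ.
#[derive(Clone, Copy)]
enum VirtioDeviceType {
    Rng,
    I2c,
    Gpio,
}

impl VirtioDeviceType {
    // Standard virtio device ID, exposed via the emulated MMIO device in Xen.
    fn device_id(self) -> u32 {
        match self {
            Self::Rng => 4,   // entropy device
            Self::I2c => 34,  // virtio-i2c
            Self::Gpio => 41, // virtio-gpio
        }
    }

    // Number of virtqueues the backend expects.
    fn num_queues(self) -> usize {
        match self {
            Self::Rng | Self::I2c => 1,
            Self::Gpio => 2, // request queue plus interrupt/event queue
        }
    }
}

fn main() {
    let dev = VirtioDeviceType::Gpio;
    println!("id = {}, queues = {}", dev.device_id(), dev.num_queues());
}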
Hope this helps.