Hi
Miguel got in touch with a suggested fix for the QEMU TCG support for virtual-cpuhotplug/online-policy.
I recall we talked about this last week and didn't intend to support TCG. It didn't work for me on x86 when I tried it, and Jonathan suggested we don't know what it's needed for.
Miguel - over to you!
Thanks,
James
Hello
On 7 Feb 2023, at 16:50, James Morse <james.morse@arm.com> wrote:
Hi
Miguel got in touch with a suggested fix for the QEMU TCG support for virtual-cpuhotplug/online-policy.
I recall we talked about this last week and didn't intend to support TCG. It didn't work for me on x86 when I tried it, and Jonathan suggested we don't know what it's needed for.
Miguel - over to you!
Thanks, James.
I've been testing the virtual CPU hotplug/unplug feature with an aarch64 guest running on an x86_64 host, and noticed an issue when using QEMU's multi-threaded TCG (MTTCG) acceleration: an assertion on the current number of TCG threads fails after a certain number of vCPU hotplugs.
QEMU can run in MTTCG mode and allows vCPU hotplug by creating a TCGContext thread every time a vCPU is hotplugged. It keeps track of the number of TCGContext threads in tcg_cur_ctxs, which must not exceed tcg_max_ctxs, the maximum number of vCPUs allowed on the QEMU instance.
When a vCPU is hotplugged, a new thread is created and tcg_cur_ctxs is incremented. However, when a vCPU is unplugged, tcg_cur_ctxs is not decremented, so it reaches tcg_max_ctxs after a certain number of hotplugs and the next hotplug trips the assertion that tcg_cur_ctxs < tcg_max_ctxs.
This scenario was tested on an x86_64 host running an aarch64 guest with -smp cpus=4,maxcpus=6. This setup creates four TCGContext threads at startup and allows up to six, meaning one can only hotplug a vCPU twice. Any further hotplug trips the assertion, as shown below:
(qemu) device_add driver=max-arm-cpu,core-id=4,id=core4
(qemu) device_del core4
(qemu) device_add driver=max-arm-cpu,core-id=4,id=core4
(qemu) device_del core4
(qemu) device_add driver=max-arm-cpu,core-id=4,id=core4
**
ERROR:../tcg/tcg.c:482:tcg_register_thread: assertion failed: (n < tcg_max_ctxs)
Aborted (core dumped)
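The arithmetic behind this failure can be modelled in a few lines of C. This is only a toy model under stated assumptions, not QEMU code: struct toy_tcg, toy_register_thread, toy_unplug_leaky and toy_cycles_until_failure are hypothetical names, and the two counters merely mirror the roles of tcg_cur_ctxs and tcg_max_ctxs described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT QEMU code: the fields mirror the roles of
 * tcg_cur_ctxs and tcg_max_ctxs as described in this mail. */
struct toy_tcg {
    unsigned int cur_ctxs;  /* contexts registered so far */
    unsigned int max_ctxs;  /* upper bound, i.e. maxcpus */
};

/* Registration succeeds only while a free slot remains. Returning
 * false stands in for the point where the real assertion fires. */
static bool toy_register_thread(struct toy_tcg *t)
{
    if (t->cur_ctxs >= t->max_ctxs) {
        return false;
    }
    t->cur_ctxs++;
    return true;
}

/* Unplug with no bookkeeping at all: the slot is never given back. */
static void toy_unplug_leaky(struct toy_tcg *t)
{
    (void)t;
}

/* Count the hotplug/unplug cycles that succeed before registration fails. */
static unsigned int toy_cycles_until_failure(unsigned int cpus,
                                             unsigned int maxcpus)
{
    struct toy_tcg t = { .cur_ctxs = cpus, .max_ctxs = maxcpus };
    unsigned int ok = 0;

    assert(cpus <= maxcpus);
    while (toy_register_thread(&t)) {
        toy_unplug_leaky(&t);
        ok++;
    }
    return ok;
}
```

In this model, toy_cycles_until_failure(4, 6) returns 2, matching the two successful device_add commands in the session above before the third one aborts.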
In order to fix this issue, TCG threads need to be able to unregister from tcg_ctxs[] and decrement tcg_cur_ctxs on teardown.
I tested this fix and it overcomes the problem.
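To make the idea concrete, here is a minimal single-threaded sketch of register/unregister bookkeeping. It is an illustration under assumptions, not the actual patch and not QEMU's implementation: every toy_* name is hypothetical, the slot table is a plain bool array standing in for tcg_ctxs[], and the atomics or locking that real multi-threaded code would need are left out.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_CTXS 6  /* stands in for tcg_max_ctxs (maxcpus=6) */

/* Hypothetical stand-ins for tcg_ctxs[] and tcg_cur_ctxs. */
static bool toy_ctxs[TOY_MAX_CTXS];
static unsigned int toy_cur_ctxs;

/* Claim the first free slot and return its index. Searching for a
 * free slot (rather than using the counter as an index) lets slots
 * released by earlier unplugs be reused. */
static unsigned int toy_register_thread(void)
{
    unsigned int i;

    assert(toy_cur_ctxs < TOY_MAX_CTXS);
    for (i = 0; i < TOY_MAX_CTXS; i++) {
        if (!toy_ctxs[i]) {
            toy_ctxs[i] = true;
            toy_cur_ctxs++;
            return i;
        }
    }
    assert(0 && "counter and slot table disagree");
    return TOY_MAX_CTXS;
}

/* Teardown path: release the slot and decrement the counter. */
static void toy_unregister_thread(unsigned int slot)
{
    assert(slot < TOY_MAX_CTXS && toy_ctxs[slot]);
    toy_ctxs[slot] = false;
    toy_cur_ctxs--;
}

/* Reset the model, bring up `cpus` threads, then run `cycles`
 * hotplug/unplug rounds. Returns the final thread count.
 * Requires cpus < TOY_MAX_CTXS whenever cycles > 0. */
static unsigned int toy_run(unsigned int cpus, unsigned int cycles)
{
    unsigned int i;

    for (i = 0; i < TOY_MAX_CTXS; i++) {
        toy_ctxs[i] = false;
    }
    toy_cur_ctxs = 0;

    for (i = 0; i < cpus; i++) {
        toy_register_thread();
    }
    for (i = 0; i < cycles; i++) {
        toy_unregister_thread(toy_register_thread());
    }
    return toy_cur_ctxs;
}
```

With the unregister step in place, toy_run(4, 100) completes and returns 4: the count stays at the boot-time value however many hotplug/unplug rounds are performed, so the bound is never hit.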
I wonder if I could have your opinions.
Thanks in advance, Miguel
linaro-open-discussions@op-lists.linaro.org