[Linaro-open-discussions] Re: Enabling/Disabling vCPUs and Qemu 'coldplug'

4 Nov 2022


Hi James,
...
From: James Morse [mailto:james.morse@arm.com]
Sent: Thursday, November 3, 2022 11:47 AM
To: Salil Mehta salil.mehta@huawei.com; salil.mehta@opnsrc.net;
mehta.salil.lnk@gmail.com
Cc: Russell King linux@armlinux.org.uk; Jonathan Cameron
jonathan.cameron@huawei.com; Lorenzo Pieralisi
lorenzo.pieralisi@linaro.org; Jean-Philippe Brucker
jean-philippe@linaro.org; linaro-open-discussions@op-lists.linaro.org
Subject: Enabling/Disabling vCPUs and Qemu 'coldplug'
Hi Salil(s),
You mentioned 'cold plug behaves differently'.
Yes, that was because cold-plugged vCPUs will have the GICC.flags.Enabled bit set,
and you mentioned the following in one of the documentation patches:
<<excerpt from the patch[1]>>
[...]
+CPUs described as ``enabled`` in the static table, should not have their _STA
+modified dynamically by firmware. Soft-restart features such as kexec will
+re-read the static properties of the system from these static tables, and
+may malfunction if these no longer describe the running system.
[...]
[1] [RFC PATCH v0.1 25/25] arm64: document virtual CPU hotplug's expectations
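
To make the rule in the excerpt concrete, here is a minimal sketch of how the static MADT GICC flags could decide whether firmware may change a CPU's _STA at runtime. The bit positions are the GICC Flags bits as I understand them from the ACPI specification (Online Capable was added in ACPI 6.5). The helper name `sta_may_change` is hypothetical, and treating "Online Capable" as the only case where _STA may change is an assumption for illustration, not something the patch states.

```python
# Bit positions in the ACPI MADT GICC "Flags" field.
GICC_ENABLED = 1 << 0          # CPU is described as enabled in the static table
GICC_ONLINE_CAPABLE = 1 << 3   # CPU may be brought online later (ACPI 6.5)


def sta_may_change(gicc_flags: int) -> bool:
    """Return True if firmware may modify this CPU's _STA dynamically.

    Hypothetical helper. A CPU marked Enabled in the static table is
    treated as fixed, following the documentation excerpt. Allowing the
    change only for Online Capable CPUs is an assumption of this sketch.
    """
    if gicc_flags & GICC_ENABLED:
        return False
    return bool(gicc_flags & GICC_ONLINE_CAPABLE)
```

Under this model a cold-plugged vCPU, which ends up with the Enabled bit set, falls into the "must not change" case.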
...
From the last call you described 'coldplug' as starting Qemu with '-S' so it doesn't
actually run the guest, then adding vCPUs before releasing Qemu to run the guest.
You said CPUs added this way can't be disabled.
(am I right so far?)
Yes, that is one of the ways cold-plugged vCPUs could be added. I think on x86
there is a way to distinguish cold-booted vCPUs which are managed by applications
(i.e. will have an 'Id') from the ones which are not (so the 'Id' gets assigned
automatically).
The 'Ids' right now do not have very helpful names, so for the initial
patches I used a very naïve way of allocating them, which needs to
be changed eventually. Please check[2].
[2] https://lore.kernel.org/qemu-devel/20200604115430.029c488a@redhat.com/
...
This turns out to be a bit murkier than that. You can disable these vCPUs, but the first
call will fail. The reason is very simple: Qemu is sending a device-check for the first
call, not an eject-request.
Ok, it looks odd. I need to retest this case. I have not properly tested it.
...
Linux prints a warning for the spurious device-check because the CPU already exists and is
even online.
An example flow, with the below debug[0], is:
# qemu -S -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1 ${REST_OF_OPTIONS}
On the Qemu monitor:
| device_add driver=host-arm-cpu,core-id=1,id=cpu1
| cont
[Qemu boots the guest]
acpi_processor_add() is called twice during boot, once for vCPU0 and once for vCPU1,
the vCPU that was 'coldplugged'.
On the Qemu monitor:
| device_del cpu1
[   56.427089] ACPI: XYZZY:ACPI_NOTIFY_DEVICE_CHECK on ACPI0007:1
[   56.428239] ACPI: XYZZY: acpi_scan_device_check() 1 | 1
[   56.429335] CPU: 1 PID: 105 Comm: kworker/u6:2 Not tainted 6.1.0-rc2-00028-g6eaecb5ffd26-dirty #14644
[   56.431043] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[   56.432431] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[   56.433520] Call trace:
[   56.434015]  dump_backtrace.part.0+0xe0/0xf0
[   56.434875]  show_stack+0x18/0x40
[   56.435546]  dump_stack_lvl+0x68/0x84
[   56.436308]  dump_stack+0x18/0x34
[   56.436983]  acpi_device_hotplug+0x234/0x4e0
[   56.437847]  acpi_hotplug_work_fn+0x24/0x40
[   56.438695]  process_one_work+0x1d0/0x320
[   56.439515]  worker_thread+0x14c/0x444
[   56.440283]  kthread+0x10c/0x110
[   56.440935]  ret_from_fork+0x10/0x20
[   56.441680] acpi ACPI0007:01: Already enumerated
[ This is because Qemu is adding a CPU that already exists ]
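
The log above can be read against a toy model of how a guest reacts to the two notifications involved. This is only a sketch: the notify values (0x01 for device check, 0x03 for eject request) are the ones defined by the ACPI specification, but `GuestCpu` and `handle_notify` are hypothetical names and the logic is a simplification, not the kernel's actual code path.

```python
# ACPI notification values as defined by the ACPI specification.
ACPI_NOTIFY_DEVICE_CHECK = 0x01
ACPI_NOTIFY_EJECT_REQUEST = 0x03


class GuestCpu:
    """Toy stand-in for the guest's view of one ACPI0007 processor device."""

    def __init__(self, enumerated: bool):
        self.enumerated = enumerated


def handle_notify(cpu: GuestCpu, event: int) -> str:
    """Return, roughly, what the guest does with one notification."""
    if event == ACPI_NOTIFY_DEVICE_CHECK:
        if cpu.enumerated:
            # The CPU is already known (and possibly online): spurious event.
            return "warn: already enumerated"
        cpu.enumerated = True
        return "added"
    if event == ACPI_NOTIFY_EJECT_REQUEST:
        if not cpu.enumerated:
            return "ignored"
        cpu.enumerated = False
        return "ejected"
    return "unknown event"
```

A device-check for a vCPU that was already enumerated at boot takes the warning branch and leaves the CPU in place, which matches the "Already enumerated" line in the trace.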
[Out of quick speculation]
Not sure why, but from a quick look it could also happen because the 'Ids'
in Qemu are conflicting and the check to verify whether an 'Id' already
exists is missing. AFAICR, this was one of the pending items in Qemu.
Please check the earlier discussion[2] on this with Igor Mammedov.
The suggestion was to use the library below for generating Ids. Maybe this change
could be common to both x86 and ARM eventually.
Patch: util - add automated ID generation utility
File: https://github.com/qemu/qemu/blob/master/util/id.c
Commit-id: https://github.com/qemu/qemu/commit/a0f1913637e6
...Will debug later today and get back to you with the confirmation.
[2] https://lore.kernel.org/qemu-devel/20200604115430.029c488a@redhat.com/
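
The kind of check speculated about here could look roughly like the sketch below: a registry that refuses a user-supplied 'Id' that is already taken and hands out automatic ones that cannot collide. This is purely illustrative. `IdRegistry` and its methods are hypothetical and do not reflect the API of Qemu's util/id.c; the '#' prefix for generated ids is an assumption of the sketch.

```python
import itertools


class IdRegistry:
    """Hypothetical registry of device ids (not Qemu's real implementation)."""

    def __init__(self, prefix: str = "cpu"):
        self._prefix = prefix
        self._used = set()
        self._counter = itertools.count()

    def register(self, dev_id: str) -> str:
        """Record a user-supplied id, rejecting duplicates."""
        if dev_id in self._used:
            raise ValueError(f"duplicate id: {dev_id}")
        self._used.add(dev_id)
        return dev_id

    def generate(self) -> str:
        """Return a fresh automatic id that is not already in use."""
        for n in self._counter:
            candidate = f"#{self._prefix}{n}"
            if candidate not in self._used:
                return self.register(candidate)
```

Keeping generated ids in a separate namespace (here, the '#' prefix) means an automatically assigned id can never clash with one chosen on the monitor.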
...
A definition of madness is doing the same thing and expecting a different result.
On the Qemu monitor:
| device_del cpu1
[   67.723708] ACPI: XYZZY:ACPI_NOTIFY_EJECT_REQUEST on ACPI0007:1
[   67.771014] psci: CPU1 killed (polled 0 ms)
[   67.773437] XYZZY: acpi_processor_post_eject()
It looks like Qemu creates the device-check when you cold-plug the vCPU but doesn't
deliver it at that point; it then delivers it _instead_ of the next notification.
That’s odd. Will debug this.
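
The suspected behaviour can be modelled in a few lines. This is a sketch of the hypothesis only, not of Qemu's real hotplug code: `CpuHotplugModel` and its methods are hypothetical names, and the single "pending" slot is an assumption chosen to reproduce the observed sequence (device-check on the first device_del, eject-request on the second).

```python
from typing import Optional


class CpuHotplugModel:
    """Hypothetical VMM-side state for one vCPU slot (not Qemu's real code)."""

    def __init__(self) -> None:
        self.pending: Optional[str] = None
        self.guest_running = False

    def device_add(self) -> None:
        # Cold plug: the event is recorded, but no guest is running to take it.
        self.pending = "device-check"

    def cont(self) -> None:
        # Suspected bug: the stale event is not cleared when the guest starts,
        # even though the guest enumerates the vCPU from the tables at boot.
        self.guest_running = True

    def device_del(self) -> str:
        """Return the notification the guest actually receives."""
        if self.pending is not None:
            # The leftover event is delivered in place of the new request.
            stale, self.pending = self.pending, None
            return stale
        return "eject-request"
```

With this model, `device_add()`, `cont()`, then two `device_del()` calls yield a device-check followed by an eject-request, which is the order seen in the logs.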
...
Qemu v7.1.0 doesn't do this for x86, nor does it deliver the spurious
ACPI_NOTIFY_DEVICE_CHECK early.
I'd suggest the arm64 changes are generating an ACPI_NOTIFY_DEVICE_CHECK when
they shouldn't.
ok. Point taken. Will verify this.
Thanks,
Salil
