Hi Salil(s),
You mentioned 'cold plug behaves differently'.
From the last call you described 'coldplug' as starting Qemu with '-S' so it doesn't actually run the guest, then adding vCPUs before releasing Qemu to run the guest. You said CPUs added this way can't be disabled.
(am I right so far?)
This turns out to be a bit murkier than that. You can disable these vCPUs, but the first call will fail. The reason is very simple: Qemu is sending a device-check for the first call, not an eject-request.
Linux prints a warning for the spurious device-check because the CPU already exists and is even online.
An example flow, with the below debug[0], is:
# qemu -S -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1 ${REST_OF_OPTIONS}
On the Qemu monitor:
| device_add driver=host-arm-cpu,core-id=1,id=cpu1
| cont
[Qemu boots the guest]
acpi_processor_add() is called twice during boot: once for vCPU0 and once for the 'coldplugged' vCPU, vCPU1.
On the Qemu monitor:
| device_del cpu1
[ 56.427089] ACPI: XYZZY:ACPI_NOTIFY_DEVICE_CHECK on ACPI0007:1
[ 56.428239] ACPI: XYZZY: acpi_scan_device_check() 1 | 1
[ 56.429335] CPU: 1 PID: 105 Comm: kworker/u6:2 Not tainted 6.1.0-rc2-00028-g6eaecb5ffd26-dirty #14644
[ 56.431043] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 56.432431] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[ 56.433520] Call trace:
[ 56.434015] dump_backtrace.part.0+0xe0/0xf0
[ 56.434875] show_stack+0x18/0x40
[ 56.435546] dump_stack_lvl+0x68/0x84
[ 56.436308] dump_stack+0x18/0x34
[ 56.436983] acpi_device_hotplug+0x234/0x4e0
[ 56.437847] acpi_hotplug_work_fn+0x24/0x40
[ 56.438695] process_one_work+0x1d0/0x320
[ 56.439515] worker_thread+0x14c/0x444
[ 56.440283] kthread+0x10c/0x110
[ 56.440935] ret_from_fork+0x10/0x20
[ 56.441680] acpi ACPI0007:01: Already enumerated
[ This is because Qemu is adding a CPU that already exists ]
A definition of madness is doing the same thing and expecting a different result.
On the Qemu monitor:
| device_del cpu1
[ 67.723708] ACPI: XYZZY:ACPI_NOTIFY_EJECT_REQUEST on ACPI0007:1
[ 67.771014] psci: CPU1 killed (polled 0 ms)
[ 67.773437] XYZZY: acpi_processor_post_eject()
It looks like Qemu creates the device-check when you cold-plug the vCPU, but doesn't deliver it at that point; it delivers it _instead of_ the next notification.
Qemu v7.1.0 doesn't do this for x86, nor does it deliver the spurious ACPI_NOTIFY_DEVICE_CHECK early.
I'd suggest the arm64 changes are generating an ACPI_NOTIFY_DEVICE_CHECK when they shouldn't.
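To make the suspected sequencing concrete, here is a small standalone C sketch (hypothetical names, not Qemu's actual hotplug code) of what a stale per-vCPU 'insert pending' latch left over from the cold-plug would produce: the first device_del drains the stale latch as a device-check, and only the second surfaces as an eject-request.

#include <stdbool.h>
#include <stdio.h>

/* ACPI Notify values, from the ACPI spec */
#define NOTIFY_DEVICE_CHECK  0x01
#define NOTIFY_EJECT_REQUEST 0x03

/* Hypothetical per-vCPU hotplug event latch */
struct vcpu_hp_state {
    bool insert_pending;   /* set when the vCPU is (cold-)plugged */
    bool remove_pending;   /* set by device_del */
};

/* What the next scan would report for this vCPU: a stale insert wins */
static int next_notify(struct vcpu_hp_state *s)
{
    if (s->insert_pending) {
        s->insert_pending = false;
        return NOTIFY_DEVICE_CHECK;
    }
    if (s->remove_pending) {
        s->remove_pending = false;
        return NOTIFY_EJECT_REQUEST;
    }
    return -1;
}

int main(void)
{
    struct vcpu_hp_state cpu1 = { 0 };

    cpu1.insert_pending = true;  /* cold-plug via the monitor before 'cont' */

    cpu1.remove_pending = true;  /* first device_del cpu1 */
    printf("first device_del  -> 0x%02x\n", next_notify(&cpu1)); /* 0x01: device-check */

    cpu1.remove_pending = true;  /* second device_del cpu1 */
    printf("second device_del -> 0x%02x\n", next_notify(&cpu1)); /* 0x03: eject-request */
    return 0;
}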
Thanks,
James
[0] Debug

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index a4d58e0d1452..ec72f2d2a5fb 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -492,6 +496,8 @@ static void acpi_processor_post_eject(struct acpi_device *device)
 	unsigned long long sta;
 	acpi_status status;
 
+	pr_err("XYZZY: acpi_processor_post_eject()\n");
+
 	if (!device)
 		return;
 
diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index d466c8195314..f9f4c7707886 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -467,43 +467,55 @@ static void acpi_bus_notify(acpi_handle handle, u32 type, void *data)
 	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE;
 	bool hotplug_event = false;
 
+	adev = acpi_get_acpi_dev(handle);
+	if (!adev)
+		goto err;
+
 	switch (type) {
 	case ACPI_NOTIFY_BUS_CHECK:
 		acpi_handle_debug(handle, "ACPI_NOTIFY_BUS_CHECK event\n");
+		pr_err("XYZZY:ACPI_NOTIFY_BUS_CHECK on %s:%s\n", acpi_device_hid(adev), acpi_device_uid(adev));
 		hotplug_event = true;
 		break;
 
 	case ACPI_NOTIFY_DEVICE_CHECK:
 		acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_CHECK event\n");
+		pr_err("XYZZY:ACPI_NOTIFY_DEVICE_CHECK on %s:%s\n", acpi_device_hid(adev), acpi_device_uid(adev));
 		hotplug_event = true;
 		break;
 
 	case ACPI_NOTIFY_DEVICE_WAKE:
 		acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_WAKE event\n");
+		pr_err("XYZZY:ACPI_NOTIFY_DEVICE_WAKE on %s:%s\n", acpi_device_hid(adev), acpi_device_uid(adev));
 		break;
 
 	case ACPI_NOTIFY_EJECT_REQUEST:
 		acpi_handle_debug(handle, "ACPI_NOTIFY_EJECT_REQUEST event\n");
+		pr_err("XYZZY:ACPI_NOTIFY_EJECT_REQUEST on %s:%s\n", acpi_device_hid(adev), acpi_device_uid(adev));
 		hotplug_event = true;
 		break;
 
 	case ACPI_NOTIFY_DEVICE_CHECK_LIGHT:
 		acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_CHECK_LIGHT event\n");
+		pr_err("XYZZY:ACPI_NOTIFY_DEVICE_CHECK_LIGHT on %s:%s\n", acpi_device_hid(adev), acpi_device_uid(adev));
 		/* TBD: Exactly what does 'light' mean? */
 		break;
 
 	case ACPI_NOTIFY_FREQUENCY_MISMATCH:
 		acpi_handle_err(handle, "Device cannot be configured due "
 				"to a frequency mismatch\n");
+		pr_err("XYZZY:ACPI_NOTIFY_FREQUENCY_MISMATCH on %s:%s\n", acpi_device_hid(adev), acpi_device_uid(adev));
 		break;
 
 	case ACPI_NOTIFY_BUS_MODE_MISMATCH:
 		acpi_handle_err(handle, "Device cannot be configured due "
 				"to a bus mode mismatch\n");
+		pr_err("XYZZY:ACPI_NOTIFY_BUS_MODE_MISMATCH on %s:%s\n", acpi_device_hid(adev), acpi_device_uid(adev));
 		break;
 
 	case ACPI_NOTIFY_POWER_FAULT:
 		acpi_handle_err(handle, "Device has suffered a power fault\n");
+		pr_err("XYZZY:ACPI_NOTIFY_POWER_FAULT on %s:%s\n", acpi_device_hid(adev), acpi_device_uid(adev));
 		break;
 
 	default:
@@ -511,10 +523,6 @@ static void acpi_bus_notify(acpi_handle handle, u32 type, void *data)
 		break;
 	}
 
-	adev = acpi_get_acpi_dev(handle);
-	if (!adev)
-		goto err;
-
 	if (adev->dev.driver) {
 		struct acpi_driver *driver = to_acpi_driver(adev->dev.driver);
 
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 2d1a82aa1607..b2bc1c611a83 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -380,6 +380,8 @@ static int acpi_scan_device_check(struct acpi_device *adev)
 {
 	int error;
 
+	pr_err("XYZZY: acpi_scan_device_check() %u | %u\n", adev->status.present, adev->status.functional);
+
 	acpi_bus_get_status(adev);
 	if (adev->status.present || adev->status.functional) {
 		/*
@@ -391,6 +393,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
 		 * again).
 		 */
 		if (adev->handler) {
+			dump_stack();
 			dev_warn(&adev->dev, "Already enumerated\n");
 			return -EALREADY;
 		}
Hi James,
From: James Morse [mailto:james.morse@arm.com]
Sent: Thursday, November 3, 2022 11:47 AM
To: Salil Mehta salil.mehta@huawei.com; salil.mehta@opnsrc.net; mehta.salil.lnk@gmail.com
Cc: Russell King linux@armlinux.org.uk; Jonathan Cameron jonathan.cameron@huawei.com; Lorenzo Pieralisi lorenzo.pieralisi@linaro.org; Jean-Philippe Brucker jean-philippe@linaro.org; linaro-open-discussions@op-lists.linaro.org
Subject: Enabling/Disabling vCPUs and Qemu 'coldplug'
Hi Salil(s),
You mentioned 'cold plug behaves differently'.
Yes, it was because cold-plugged vCPUs will have the GICC.flags.Enabled bit set, and you mentioned the below in one of the documentation patches:
<<excerpt from the patch[1]>>
[...]
+CPUs described as ``enabled`` in the static table, should not have their _STA
+modified dynamically by firmware. Soft-restart features such as kexec will
+re-read the static properties of the system from these static tables, and
+may malfunction if these no longer describe the running system.
[...]
[1] [RFC PATCH v0.1 25/25] arm64: document virtual CPU hotplug's expectations
From the last call you described 'coldplug' as starting Qemu with '-S' so it doesn't actually run the guest, then adding vCPUs before releasing Qemu to run the guest. You said CPUs added this way can't be disabled.
(am I right so far?)
Yes, that is one of the ways cold-plugged vCPUs can be added. I think on x86 there is a way to distinguish cold-booted vCPUs that are managed by applications (i.e. they will have an 'id') from the ones that are not (for which an 'id' gets assigned automatically).
The 'id's right now do not have very helpful names, so for the initial patches I used a fairly naïve way of allocating them, which will eventually need to change. Please check [2].
[2] https://lore.kernel.org/qemu-devel/20200604115430.029c488a@redhat.com/
This turns out to be a bit murkier than that. You can disable these vCPUs, but the first call will fail. The reason is very simple: Qemu is sending a device-check for the first call, not an eject-request.
Ok, that looks odd. I need to retest this case; I have not tested it properly.
Linux prints a warning for the spurious device-check because the CPU already exists and is even online.
An example flow, with the below debug[0], is:
# qemu -S -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1 ${REST_OF_OPTIONS}
On the Qemu monitor:
| device_add driver=host-arm-cpu,core-id=1,id=cpu1
| cont
[Qemu boots the guest]
acpi_processor_add() is called twice during boot: once for vCPU0 and once for the 'coldplugged' vCPU, vCPU1.
On the Qemu monitor:
| device_del cpu1
[ 56.427089] ACPI: XYZZY:ACPI_NOTIFY_DEVICE_CHECK on ACPI0007:1
[ 56.428239] ACPI: XYZZY: acpi_scan_device_check() 1 | 1
[ 56.429335] CPU: 1 PID: 105 Comm: kworker/u6:2 Not tainted 6.1.0-rc2-00028-g6eaecb5ffd26-dirty #14644
[ 56.431043] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 56.432431] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[ 56.433520] Call trace:
[ 56.434015] dump_backtrace.part.0+0xe0/0xf0
[ 56.434875] show_stack+0x18/0x40
[ 56.435546] dump_stack_lvl+0x68/0x84
[ 56.436308] dump_stack+0x18/0x34
[ 56.436983] acpi_device_hotplug+0x234/0x4e0
[ 56.437847] acpi_hotplug_work_fn+0x24/0x40
[ 56.438695] process_one_work+0x1d0/0x320
[ 56.439515] worker_thread+0x14c/0x444
[ 56.440283] kthread+0x10c/0x110
[ 56.440935] ret_from_fork+0x10/0x20
[ 56.441680] acpi ACPI0007:01: Already enumerated
[ This is because Qemu is adding a CPU that already exists ]
[Quick speculation] I'm not sure why, but going through it quickly it looks like this could also happen because 'id's in Qemu are conflicting and the check to verify whether an 'id' already exists is missing. AFAICR, this was one of the pending items in Qemu. Please check the earlier discussion [2] on this with Igor Mammedov.
The suggestion was to use the below library for generating ids. Maybe this change could eventually be common to both x86 and ARM.
Patch: util - add automated ID generation utility
File: https://github.com/qemu/qemu/blob/master/util/id.c
Commit-id: https://github.com/qemu/qemu/commit/a0f1913637e6
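For reference, a minimal standalone C sketch of the idea behind that utility (loosely modelled on util/id.c's id_generate(); the names below are illustrative, and a real hookup would presumably need a new subsystem entry in Qemu's IdSubSystems enum rather than this toy code):

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

/* Subsystems that hand out automatic ids; names are illustrative */
enum id_subsys { ID_SUBSYS_CPU, ID_SUBSYS_MAX };

static const char *const subsys_str[ID_SUBSYS_MAX] = {
    [ID_SUBSYS_CPU] = "cpu",
};

/*
 * Auto-generated ids get a '#' prefix plus a per-subsystem counter, so
 * they can never collide with user-chosen ids (which must start with a
 * letter), and repeated device_add calls never reuse an id.
 */
static void generate_id(enum id_subsys s, char *buf, size_t len)
{
    static uint64_t counters[ID_SUBSYS_MAX];

    snprintf(buf, len, "#%s%" PRIu64, subsys_str[s], counters[s]++);
}

int main(void)
{
    char id[32];

    generate_id(ID_SUBSYS_CPU, id, sizeof(id));
    printf("auto id for a cold-plugged vCPU: %s\n", id);   /* "#cpu0" */
    return 0;
}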
...Will debug later today and get back to you with the confirmation.
A definition of madness is doing the same thing and expecting a different result.
On the Qemu monitor:
| device_del cpu1
[ 67.723708] ACPI: XYZZY:ACPI_NOTIFY_EJECT_REQUEST on ACPI0007:1
[ 67.771014] psci: CPU1 killed (polled 0 ms)
[ 67.773437] XYZZY: acpi_processor_post_eject()
It looks like Qemu creates the device-check when you cold-plug the vCPU, but doesn't deliver it at that point; it delivers it _instead of_ the next notification.
That’s odd. Will debug this.
Qemu v7.1.0 doesn't do this for x86, nor does it deliver the spurious ACPI_NOTIFY_DEVICE_CHECK early.
I'd suggest the arm64 changes are generating an ACPI_NOTIFY_DEVICE_CHECK when they shouldn't.
OK, point taken. Will verify this.
Thanks, Salil
Hi Salil,
On 03/11/2022 15:30, Salil Mehta wrote:
Hi James,
From: James Morse [mailto:james.morse@arm.com]
Sent: Thursday, November 3, 2022 11:47 AM
To: Salil Mehta salil.mehta@huawei.com; salil.mehta@opnsrc.net; mehta.salil.lnk@gmail.com
Cc: Russell King linux@armlinux.org.uk; Jonathan Cameron jonathan.cameron@huawei.com; Lorenzo Pieralisi lorenzo.pieralisi@linaro.org; Jean-Philippe Brucker jean-philippe@linaro.org; linaro-open-discussions@op-lists.linaro.org
Subject: Enabling/Disabling vCPUs and Qemu 'coldplug'
You mentioned 'cold plug behaves differently'.
Yes, it was because cold-plugged vCPUs will have the GICC.flags.Enabled bit set, and you mentioned the below in one of the documentation patches:
<<excerpt from the patch[1]>>
[...]
+CPUs described as ``enabled`` in the static table, should not have their _STA
+modified dynamically by firmware. Soft-restart features such as kexec will
+re-read the static properties of the system from these static tables, and
+may malfunction if these no longer describe the running system.
[...]
This would be a bug in Qemu: if those CPUs are meant to be removable, they should be 'not-enabled and online-capable' in the MADT:GICC. Just because they were present at boot doesn't affect whether they are removable.
Linux will try to bring them online during boot regardless, and will poke _STA to register the CPUs.
The problem is if you kexec an older kernel that doesn't understand PSCI_DENIED, you'll get a bunch of warnings printed at boot. If you kexec that other operating system (I don't know why you would), it will choke on enabled CPUs that it can't online.
Linux may be changed in the future to track online-capable and use it to enforce what firmware is allowed to do. I'm not a fan of this sort of thing, but it's worth making sure Qemu's firmware descriptions are correct.
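To spell out the flag encoding I mean, here is a toy C sketch (not Qemu's virt-acpi-build.c; the GICC flag bit positions are my reading of ACPI 6.5 and should be double-checked against the spec):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * MADT GICC flag bits, per my reading of ACPI 6.5:
 * bit 0 = Enabled, bit 3 = Online Capable. Treat the positions as an
 * assumption.
 */
#define GICC_ENABLED        (1u << 0)
#define GICC_ONLINE_CAPABLE (1u << 3)

/*
 * Removability, not presence at boot, decides the description: fixed
 * vCPUs are Enabled; removable vCPUs are not-Enabled + Online Capable,
 * even if they were cold-plugged before the guest started running.
 */
static uint32_t gicc_flags_for_vcpu(bool removable)
{
    return removable ? GICC_ONLINE_CAPABLE : GICC_ENABLED;
}

int main(void)
{
    printf("fixed vCPU flags:     0x%x\n", gicc_flags_for_vcpu(false)); /* 0x1 */
    printf("removable vCPU flags: 0x%x\n", gicc_flags_for_vcpu(true));  /* 0x8 */
    return 0;
}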
[1] [RFC PATCH v0.1 25/25] arm64: document virtual CPU hotplug's expectations
From the last call you described 'coldplug' as starting Qemu with '-S' so it doesn't actually run the guest, then adding vCPUs before releasing Qemu to run the guest. You said CPUs added this way can't be disabled.
(am I right so far?)
Yes, that is one of the ways cold-plugged vCPUs can be added. I think on x86 there is a way to distinguish cold-booted vCPUs that are managed by applications (i.e. they will have an 'id') from the ones that are not (for which an 'id' gets assigned automatically).
The 'id's right now do not have very helpful names, so for the initial patches I used a fairly naïve way of allocating them, which will eventually need to change. Please check [2].
I'll file this under qemu-internals!
This turns out to be a bit murkier than that. You can disable these vCPUs, but the first call will fail. The reason is very simple: Qemu is sending a device-check for the first call, not an eject-request.
Ok, that looks odd. I need to retest this case; I have not tested it properly.
Linux prints a warning for the spurious device-check because the CPU already exists and is even online.
An example flow, with the below debug[0], is:
# qemu -S -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1 ${REST_OF_OPTIONS}
On the Qemu monitor:
| device_add driver=host-arm-cpu,core-id=1,id=cpu1
| cont
[Qemu boots the guest]
acpi_processor_add() is called twice during boot: once for vCPU0 and once for the 'coldplugged' vCPU, vCPU1.
On the Qemu monitor:
| device_del cpu1
[ 56.427089] ACPI: XYZZY:ACPI_NOTIFY_DEVICE_CHECK on ACPI0007:1
[ 56.428239] ACPI: XYZZY: acpi_scan_device_check() 1 | 1
[ 56.429335] CPU: 1 PID: 105 Comm: kworker/u6:2 Not tainted 6.1.0-rc2-00028-g6eaecb5ffd26-dirty #14644
[ 56.431043] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 56.432431] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[ 56.433520] Call trace:
[ 56.434015] dump_backtrace.part.0+0xe0/0xf0
[ 56.434875] show_stack+0x18/0x40
[ 56.435546] dump_stack_lvl+0x68/0x84
[ 56.436308] dump_stack+0x18/0x34
[ 56.436983] acpi_device_hotplug+0x234/0x4e0
[ 56.437847] acpi_hotplug_work_fn+0x24/0x40
[ 56.438695] process_one_work+0x1d0/0x320
[ 56.439515] worker_thread+0x14c/0x444
[ 56.440283] kthread+0x10c/0x110
[ 56.440935] ret_from_fork+0x10/0x20
[ 56.441680] acpi ACPI0007:01: Already enumerated
[ This is because Qemu is adding a CPU that already exists ]
[Quick speculation] I'm not sure why, but going through it quickly it looks like this could also happen because 'id's in Qemu are conflicting and the check to verify whether an 'id' already exists is missing. AFAICR, this was one of the pending items in Qemu. Please check the earlier discussion [2] on this with Igor Mammedov.
The suggestion was to use the below library for generating ids. Maybe this change could eventually be common to both x86 and ARM.
Patch: util - add automated ID generation utility
File: https://github.com/qemu/qemu/blob/master/util/id.c
Commit-id: https://github.com/qemu/qemu/commit/a0f1913637e6
...Will debug later today and get back to you with the confirmation.
Thanks. No need for that to be today; I don't think it's any more urgent than the other bits (PSCI support in Qemu for PSCI_DENIED, and _STA describing CPUs as present).
Thanks,
James