Hi James/Salil,
On 31-08-2022 16:39, James Morse wrote:
acpi_processor_get_info() registers all present CPUs. Registering a CPU is what creates the sysfs entries and triggers the udev notifications.
arm64 virtual machines that support 'virtual cpu hotplug' use the enabled bit to indicate whether the CPU can be brought online, as the existing ACPI tables require all hardware to be described and present.
If firmware describes a CPU as present, but disabled, skip the registration. Such CPUs are present, but can't be brought online for whatever reason. (e.g. firmware/hypervisor policy).
Once firmware sets the enabled bit, the CPU can be registered and brought online by user-space. Online CPUs, or CPUs that are missing an _STA method must always be registered.
Signed-off-by: James Morse <james.morse(a)arm.com>
drivers/acpi/acpi_processor.c | 31 ++++++++++++++++++++++++++++++- 1 file changed, 30 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c index 1bd6e4b8ab66..42521d89c378 100644 --- a/drivers/acpi/acpi_processor.c +++ b/drivers/acpi/acpi_processor.c @@ -194,6 +194,32 @@ static int acpi_processor_make_present(struct acpi_processor *pr) return ret; } +static int acpi_processor_make_enabled(struct acpi_processor *pr) +{
- unsigned long long sta;
- acpi_status status;
- bool present, enabled;
- if (!acpi_has_method(pr->handle, "_STA"))
return arch_register_cpu(pr->id);
- status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
- if (ACPI_FAILURE(status))
return -ENODEV;
- present = sta & ACPI_STA_DEVICE_PRESENT;
- enabled = sta & ACPI_STA_DEVICE_ENABLED;
- if (cpu_online(pr->id) && (!present || !enabled)) {
pr_err_once(FW_BUG "CPU %u is online, but described as not present or disabled!\n", pr->id);
add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
- } else if (!present || !enabled) {
return -ENODEV;
- }
- return arch_register_cpu(pr->id);
+}
- static int acpi_processor_get_info(struct acpi_device *device) { union acpi_object object = { 0 };
@@ -271,7 +297,7 @@ static int acpi_processor_get_info(struct acpi_device *device) } if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id)) {
int ret = arch_register_cpu(pr->id);
int ret = acpi_processor_make_enabled(pr);
if (ret) return ret; @@ -479,6 +505,9 @@ static void acpi_processor_remove(struct acpi_device *device) acpi_processor_make_not_present(device); return; }
- if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_ENABLED))
}arch_unregister_cpu(pr->id);
#ifdef CONFIG_X86
Enumeration failure while trying to hot plug vcpus after guest reboot
Following sequence of steps causes the issue
1. Boot VM with max 80 vcpus and current 72 vcpus(For example) 2. hot plug 3 more vcpus # virsh setvcpus vm2 75 --live 3. hot unplug 3 vcpus ( which were added earlier ) # virsh setvcpus vm2 72 --live 4. reboot VM # reboot 5. Again try to hot plug 3 vcpus -- ERROR # virsh setvcpus vm2 75 --live [ 47.367823] acpi ACPI0007:48: Enumeration failure [ 47.565173] acpi ACPI0007:49: Enumeration failure [ 47.703423] acpi ACPI0007:4a: Enumeration failure
I have tested the above scenario with following git repos
https://github.com/salil-mehta/qemu/tree/virt-cpuhp-armv8/rfc-v1-port2909202...
https://github.com/salil-mehta/linux/tree/virt-cpuhp-arm64/rfc-v2/jmorse-var... https://gitlab.arm.com/linux-arm/linux-jm/-/tree/virtual_cpu_hotplug/rfc/v/0...
I observed the following sequence while doing hotplug and unplug
When we do hotplug, making vcpu as *present*(_STA.Pres=1) and *enabled*(_STA.Enabled=1) When we do hot unplug, we just *disabled*(_STA.Enabled=0) vcpu but not *removed*(_STA.Pres=1)
In such a case CPU shall still be present at the firmware/VMM level and could be kept disabled i.e. _STA.Enable=0 and _STA.Present=1
After reboot, system is trying to enumerate vcpu(as *present*(_STA.Pres=1)) and it is marking flags flags.initialized = true and flags.visited = true
as the vcpu *disabled*(_STA.Enabled=0), vcpu has been made not present and flags flags.initialized and flags.visited were not cleared
And following fix resolves the issue
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c index 558664d169fc..e93b933cb8a4 100644 --- a/drivers/acpi/scan.c +++ b/drivers/acpi/scan.c @@ -2158,6 +2158,9 @@ static int acpi_scan_attach_handler(struct acpi_device *device) break;
device->handler = NULL; + device->flags.initialized = false; + acpi_device_clear_enumerated(device); + device->flags.power_manageable = 0; if (ret < 0) break; }
_Regards_, -Vishnu.