Hi Jianyong,
On 03/11/2022 11:40, jianyong.wu--- via Linaro-open-discussions wrote:
> Thanks James. I learned a lot from your explanation here. But if the online capable bit in the GICC can't ensure that the related CPU can be brought up, as it can't ensure that the GICR exists, what benefit do we get from this bit?
Linux doesn't gain anything from this bit. It's needed for another operating system that can't tolerate errors being returned by PSCI.
Linux has to read the bit from the MADT:GICC so that firmware can have one MADT that works for Linux and this other operating system.
Clearing the Enabled bit in the MADT:GICC created a new problem: it is no longer obvious whether the redistributor described by MADT:GICC:GICR is accessible during boot. A side effect of the ACPI changes is that we now need to use the MADT:GICR entries to describe the redistributors.
This isn't a problem for a virtual machine as the redistributor is emulated by the hypervisor, and really is always present and always-on.
> It seems the only way to enable a disabled CPU is to add a GICR entry to the MADT, not to set this bit. And how should we understand the meaning of this bit, which is supposed to indicate that the related CPU can be enabled during OS runtime?
Describing the redistributors with an MADT:GICR entry is a prerequisite, yes.
We have a hard requirement from the irqchip maintainer that he won't allow code that brings a redistributor online after boot - unless there is physical hardware that is doing that.
This is because Linux accesses all the redistributors during boot to find the common set of features.
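To illustrate why a late-arriving redistributor is awkward, here is a minimal sketch of the "common set of features" idea. It is not the kernel's code: the feature names and the `common_features` helper are hypothetical, and the real feature bits come from the redistributor registers.

```python
from functools import reduce

def common_features(per_redistributor_features):
    """Intersect the feature sets reported by every redistributor seen at boot."""
    return reduce(lambda a, b: a & b, per_redistributor_features)

# Hypothetical feature names, one set per redistributor probed during boot.
boot_time = [{"lpis", "vlpis", "direct_lpi"}, {"lpis", "vlpis"}]
common = common_features(boot_time)

# A redistributor that only shows up after boot may lack a feature the
# kernel has already decided every redistributor has.
late_arrival = {"lpis"}
still_valid = common <= late_arrival
```

In this example `still_valid` is false: the boot-time decision relied on a feature the late redistributor does not have, which is the situation the boot-time scan is meant to rule out.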
We don't know whether the GICR in a MADT:GICC:GICR is accessible if the CPU isn't marked as enabled and isn't online. On real hardware that has CPUs that aren't present, it won't be accessible. If Linux accessed the redistributor on this hardware, it would either crash due to an external abort, or lock up. This would be very upsetting for users of stable kernels on such hardware, so this series makes those CPUs impossible.
This isn't a problem for virtual machines, as the kernel emulates the redistributors, and they really are always present and always on. You can use the MADT:GICR to describe them.
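The policy described above can be modelled roughly as follows. This is a simplified sketch, not the code from the series: `classify_cpu`, `CpuState` and the `redist_in_madt_gicr` parameter are made-up names, and the flag positions (Enabled as bit 0, Online Capable as bit 3 of the GICC flags) are stated from memory of the ACPI spec and should be checked against it.

```python
from enum import Enum

# Assumed bit positions in the MADT GICC flags field; verify against the spec.
MADT_GICC_ENABLED = 1 << 0
MADT_GICC_ONLINE_CAPABLE = 1 << 3

class CpuState(Enum):
    PRESENT = "present"            # usable at boot
    HOTPLUGGABLE = "hotpluggable"  # may be brought online later
    IMPOSSIBLE = "impossible"      # never touched by the OS

def classify_cpu(gicc_flags, redist_in_madt_gicr):
    """Classify one MADT:GICC entry.

    redist_in_madt_gicr: True if the CPU's redistributor is covered by a
    MADT:GICR entry rather than only by the GICC's own GICR address.
    """
    if gicc_flags & MADT_GICC_ENABLED:
        return CpuState.PRESENT
    if (gicc_flags & MADT_GICC_ONLINE_CAPABLE) and redist_in_madt_gicr:
        # The redistributor is known to be accessible at boot, so it can
        # take part in the boot-time feature scan.
        return CpuState.HOTPLUGGABLE
    # Not enabled, and nothing says the redistributor is safe to access.
    return CpuState.IMPOSSIBLE
```

The point of the last branch is that the online capable bit alone is not enough: without a MADT:GICR entry there is no way to know the redistributor can be touched safely.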
Future physical hardware that does this won't be able to add CPUs that also need a redistributor until two things happen: the ACPI spec is cleaned up to say whether the MADT:GICC:GICR is accessible, and Linux supports bringing a redistributor online after boot, which it doesn't today.
As to what the 'online capable' bit means: those CPUs could return PSCI_DENIED. The distinction is only needed for operating systems that choke on errors from PSCI. Linux doesn't care (you just get some weird messages during boot), hence this bit isn't relevant outside the irqchip code.
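A rough sketch of what "doesn't care" means for an OS that tolerates PSCI errors: a refused CPU_ON is recorded and boot carries on. The helper names and the fake firmware below are hypothetical; the return values follow the PSCI convention of 0 for success and -3 for DENIED.

```python
PSCI_RET_SUCCESS = 0
PSCI_RET_DENIED = -3

def bring_up_cpus(cpu_on, cpu_ids):
    """Try to start each CPU; collect refusals instead of treating them as fatal."""
    online, refused = [], []
    for cpu in cpu_ids:
        ret = cpu_on(cpu)
        if ret == PSCI_RET_SUCCESS:
            online.append(cpu)
        else:
            # A tolerant OS just notes the failure (a boot message) and moves on.
            refused.append((cpu, ret))
    return online, refused

def fake_cpu_on(cpu):
    """Stand-in for firmware: CPUs 2 and 3 are not currently allowed to start."""
    return PSCI_RET_DENIED if cpu in {2, 3} else PSCI_RET_SUCCESS

online, refused = bring_up_cpus(fake_cpu_on, range(4))
```

An operating system that cannot handle the refused case needs to know in advance which CPUs might be denied, which is the information the online capable bit provides.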
(does this answer your question!?)
Thanks,
James