 
            Hi Russell,
From: Russell King linux@armlinux.org.uk Sent: Friday, October 27, 2023 8:26 PM To: Salil Mehta salil.mehta@huawei.com
On Fri, Oct 27, 2023 at 03:02:25PM +0000, Salil Mehta wrote:
From: Russell King linux@armlinux.org.uk Sent: Friday, October 27, 2023 2:47 PM To: Salil Mehta salil.mehta@huawei.com
On Fri, Oct 20, 2023 at 04:56:20PM +0000, Salil Mehta wrote:
From: Russell King linux@armlinux.org.uk Sent: Friday, October 20, 2023 5:48 PM To: Salil Mehta salil.mehta@huawei.com Cc: James Morse james.morse@arm.com; linaro-open-discussions@op- lists.linaro.org; Joyce Qi joyce.qi@linaro.org; Lorenzo Pieralisi lorenzo.pieralisi@linaro.org; Jonathan Cameron jonathan.cameron@huawei.com; Shameerali Kolothum Thodi shameerali.kolothum.thodi@huawei.com; karl.heubaum@oracle.com; darren@os.amperecomputing.com; ilkka@os.amperecomputing.com; miguel.luis@oracle.com; vishnu@os.amperecomputing.com; Linuxarm linuxarm@huawei.com; salil.mehta@opnsrc.net Subject: Re: [Request] Regarding non-RFC patch-set of Virtual CPU Hotplug kernel support
On Fri, Oct 20, 2023 at 04:40:59PM +0000, Salil Mehta wrote:
> I'm afraid it isn't going to happen - going through the review comments > there's some ambiguities there (at least to me) that I can't solve > without input from the reviewers. I haven't even managed to get half > way through the patches yet.
Possible to share the link of the specific review comments and the patches you are referring to here?
I'm preferring to the comments on the RFC v2 posting on the 13th September by James. I think you already provided a link.
https://lore.kernel.org/linux-arm-kernel/20230913163823.7880-1-james.morse@a...
IICRC, you mentioned that this patch-set was ready for submission?
In case you missed it, Rafael Wysocki isn't happy with various patches in this series - we have his comments on one of the patches which sound like the approach with the "enabled" bit won't work - at the very least in an arch-independent context. He also says he has other concerns, but hasn't set out what those are, and says he will do so after the merge window.
Can you please share the link of the specific comments of Rafael you are worried about?
He summarises it as "and it doesn't look good" which rings alarm bells. I am hoping this doesn't mean it's back to the drawing board for this feature.
If there is anything other than 'enabled' and 'online-capable' bit discussion then I would suggest to discuss this as part of the mailing-list so that more eyes can be privy to it.
AFAICS, there are only 2 comments from Rafael on this patch-set. Maybe I am failing to see his point. It would be helpful if you can elaborate more.
You were copied on Rafael's comments on the series that I posted, and I was hoping that you have read them, but it seems you haven't. I won't bother answering each of your questions given that everything you're asking is covered in Rafael's comments (which I'll provide at the bottom.)
Aah, got it. I think you are referring to RFC V3 which you have recently posted on 24th October 2023 and my earlier reply was in context to RFC V2 where Rafael commented as well but there his comments were in no way blocker to the progress of the patch-set.
https://lore.kernel.org/all/ZTffkAdOqL2pI2la@shell.armlinux.org.uk/
Apologies, I did not see that. I have been off earlier last week due to medical condition and was not able to sit much in front of laptop either. Surprisingly, patches did not land in my outlook this time (trying to figure out why...but looks like they were held by server).
I'm also wondering why it's taken Rafael this long to state this concern - it is not like this has changed between RFC v2 and the series I posted, and Rafael did comment on the RFC v2 series. So I don't really understand why it's taken so long to bring up this concern.
I guess you are referring to some discussion which I am not privy of. What is that concern and where and when it has been discussed?
Hopefully we will find out more after the merge window is over in about two weeks time.
BTW, what it has in relation to merge window?
Salil - I suspect this means that the qemu patches need to be held as well, because if the kernel side approach has to change it may impact the qemu side as well.
I really do not understand all of this. Stance is vacillating and in fact has been contradictory to what you commented in the mailing-list.
Err, what (to all of the above)?
Anyway, I'm not going to argue with you.
This is Rafael's reply that covers everything that you've questioned me about above. It was also copied not only to you, but almost all the appropriate kernel mailing lists. Everything is out in the open. There is nothing being hidden. I've no idea where that paranoia is coming from.
Well, I assumed you were referring to RFC V2 comments because you did not mention the clear context (and Rafael did commented in RFC V2 as well) things got mixed up in my reply.
For the overall series, Rafael said:
"I've gone through the series and there is at least one thing in it that concerns me a lot and some others that at least appear to be really questionable.
"I need more time to send comments which I'm not going to do before the 6.7 merge window (sorry), but from what I can say right now, this is not looking good."
As you can see, Rafael says, "I'm not going to do this before the 6.7 merge window" where "this" is providing further feedback on the concerns he has with the series. This answers your question above "BTW, what it has in relation to merge window?"
As you can also see, Rafael says, "this is not looking good" which at least to me as a native english speaker means that there are serious issues that make this unmergable. The patch that makes use of the enabled bit received this reply from Rafael:
"TBH, I am expecting problems to be there.
"If something has been interpreted in a specific way for several years, then changing that interpretation is just incompatible with the entire installed base, at least potentially.
"It is not even possible to estimate the potential adverse impact of this change, as it causes a firmware-provided bit that has never been taken into account so far to become meaningful and it does so for every device in the system.
"It will be very hard to convince me that this change is a good idea."
Thanks for pointing. Yes I can see them now.
https://lore.kernel.org/all/CAJZ5v0hhEeyDEMHnVQEiXzaKK07TSnE6GJhuTW97-XEb9Co...
https://lore.kernel.org/all/CAJZ5v0j-73_+9U3ngDAf9w1ADDhBTKctJdWboqUk-okH2TQ...
AFAICS, what Rafael is pointing is nothing new. We roughly knew these issues from last year when we introduced 'enabled' concept and were discussed as well. Last year, I had also proposed a different kernel solution to mitigate these problems (by faking not-present in the cpumask) which solved some of these but had other problems and James was uncomfortable (because it changed architecture code).
That is quite damning. Rafael said, "very hard to convince me that this change is a good idea"... so using the enabled bit to control whether a CPU is enumerated doesn't sound like a patch that will be accepted into mainline even if all the others are.
Do these responses from Rafael sound to you like this is a patch series that is going to get imminently merged, or does it sound like the patch series needs some major rework and potentially needing a different approach?
Sure, they don't. Your cover letter said something similar :|
Excerpt from RFC V3 cover-letter: " I'm posting this as RFC v3 because there's still some unaddressed comments and it's clearly not ready for merging. Even if it was ready to be merged, it is too late in this development cycle to be taking this change in, so there would be little point posting it non-RFC. Also James stated that he's waiting for confirmation from the Kubernetes/Kata folk - I have no idea what the status is there."
@James, can you please update about what you are waiting for?
Regarding Qemu, we now have architecture agnostic and architecture dependent patch-set. We are aiming to get the former patch-set merged in this cycle as there are other companies vouching for it. Architecture dependent patch-set has been intentionally floated as a RFC because of the very reason you stated above i.e. the ambiguities within kernel patch-set and its acceptance.
This patch-set will keep on getting revised as a RFC. The problems being referred in the kernel will only affect the ACPI interface part and Qemu has lot of other aspects which can be reviewed but of course its eventual acceptance will depend upon kernel.
Many thanks!
Best regards Salil