Hello James, Salil,
I'm the guy you find that work on kata containers. Sorry to have not tested for your patch set for the past long time (I'm just waiting for the patch set a little stable, maybe I miss it). I will start the test it soon. Can you point me the latest code (including Kernel and Qemu) in case that I test based on the old one.
Thanks Jianyong Wu
-----Original Message----- From: James Morse via Linaro-open-discussions linaro-open-discussions@op-lists.linaro.org Sent: 2023年11月2日 22:31 To: Salil Mehta salil.mehta@huawei.com; Russell King linux@armlinux.org.uk Cc: linaro-open-discussions@op-lists.linaro.org; Joyce Qi joyce.qi@linaro.org; Lorenzo Pieralisi lorenzo.pieralisi@linaro.org; karl.heubaum@oracle.com; darren@os.amperecomputing.com; ilkka@os.amperecomputing.com; miguel.luis@oracle.com; vishnu@os.amperecomputing.com; Linuxarm linuxarm@huawei.com; salil.mehta@opnsrc.net Subject: [Linaro-open-discussions] Re: [Request] Regarding non-RFC patch-set of Virtual CPU Hotplug kernel support
Hi Salil,
On 30/10/2023 10:45, Salil Mehta wrote:
From: Russell King linux@armlinux.org.uk Sent: Friday, October 27, 2023 8:26 PM To: Salil Mehta salil.mehta@huawei.com
On Fri, Oct 27, 2023 at 03:02:25PM +0000, Salil Mehta wrote:
From: Russell King linux@armlinux.org.uk Sent: Friday, October 27, 2023 2:47 PM To: Salil Mehta salil.mehta@huawei.com
On Fri, Oct 20, 2023 at 04:56:20PM +0000, Salil Mehta wrote:
> From: Russell King linux@armlinux.org.uk > Sent: Friday, October 20, 2023 5:48 PM > To: Salil Mehta salil.mehta@huawei.com > Cc: James Morse james.morse@arm.com; > linaro-open-discussions@op- lists.linaro.org; Joyce Qi > joyce.qi@linaro.org; Lorenzo Pieralisi > lorenzo.pieralisi@linaro.org; Jonathan Cameron > jonathan.cameron@huawei.com; Shameerali Kolothum Thodi > shameerali.kolothum.thodi@huawei.com;
karl.heubaum@oracle.com;
> darren@os.amperecomputing.com; ilkka@os.amperecomputing.com; > miguel.luis@oracle.com; vishnu@os.amperecomputing.com; Linuxarm > linuxarm@huawei.com; salil.mehta@opnsrc.net > Subject: Re: [Request] Regarding non-RFC patch-set of Virtual CPU > Hotplug kernel support > > On Fri, Oct 20, 2023 at 04:40:59PM +0000, Salil Mehta wrote: >>> I'm afraid it isn't going to happen - going through the review >>> comments there's some ambiguities there (at least to me) that I >>> can't solve without input from the reviewers. I haven't even >>> managed to get half way through the patches yet. >> >> Possible to share the link of the specific review comments and >> the patches you are referring to here? > > I'm preferring to the comments on the RFC v2 posting on the 13th > September by James. I think you already provided a link.
https://lore.kernel.org/linux-arm-kernel/20230913163823.7880-1-jam es.morse@arm.com/
IICRC, you mentioned that this patch-set was ready for submission?
In case you missed it, Rafael Wysocki isn't happy with various patches in this series - we have his comments on one of the patches which sound like the approach with the "enabled" bit won't work - at the very least in an arch-independent context. He also says he has other concerns, but hasn't set out what those are, and says he will do so after the merge window.
Can you please share the link of the specific comments of Rafael you are worried about?
He summarises it as "and it doesn't look good" which rings alarm bells. I am hoping this doesn't mean it's back to the drawing board for this feature.
If there is anything other than 'enabled' and 'online-capable' bit discussion then I would suggest to discuss this as part of the mailing-list so that more eyes can be privy to it.
AFAICS, there are only 2 comments from Rafael on this patch-set. Maybe I am failing to see his point. It would be helpful if you can elaborate more.
You were copied on Rafael's comments on the series that I posted, and I was hoping that you have read them, but it seems you haven't. I won't bother answering each of your questions given that everything you're asking is covered in Rafael's comments (which I'll provide at the bottom.)
Aah, got it. I think you are referring to RFC V3 which you have recently posted on 24th October 2023 and my earlier reply was in context to RFC V2 where Rafael commented as well but there his comments were in no way blocker to the progress of the patch-set.
https://lore.kernel.org/all/ZTffkAdOqL2pI2la@shell.armlinux.org.uk/
Apologies, I did not see that. I have been off earlier last week due to medical condition and was not able to sit much in front of laptop either. Surprisingly, patches did not land in my outlook this time (trying to figure out why...but looks like they were held by server).
I'm also wondering why it's taken Rafael this long to state this concern - it is not like this has changed between RFC v2 and the series I posted, and Rafael did comment on the RFC v2 series. So I don't really understand why it's taken so long to bring up this concern.
I guess you are referring to some discussion which I am not privy of. What is that concern and where and when it has been discussed?
Hopefully we will find out more after the merge window is over in about two weeks time.
BTW, what it has in relation to merge window?
Salil - I suspect this means that the qemu patches need to be held as well, because if the kernel side approach has to change it may impact the qemu side as well.
I really do not understand all of this. Stance is vacillating and in fact has been contradictory to what you commented in the mailing-list.
Err, what (to all of the above)?
Anyway, I'm not going to argue with you.
This is Rafael's reply that covers everything that you've questioned me about above. It was also copied not only to you, but almost all the appropriate kernel mailing lists. Everything is out in the open. There is nothing being hidden. I've no idea where that paranoia is coming from.
Well, I assumed you were referring to RFC V2 comments because you did not mention the clear context (and Rafael did commented in RFC V2 as well) things got mixed up in my reply.
For the overall series, Rafael said:
"I've gone through the series and there is at least one thing in it that concerns me a lot and some others that at least appear to be really questionable.
"I need more time to send comments which I'm not going to do before the 6.7 merge window (sorry), but from what I can say right now, this is not looking good."
As you can see, Rafael says, "I'm not going to do this before the 6.7 merge window" where "this" is providing further feedback on the concerns he has with the series. This answers your question above "BTW, what it has in relation to merge window?"
As you can also see, Rafael says, "this is not looking good" which at least to me as a native english speaker means that there are serious issues that make this unmergable. The patch that makes use of the enabled bit received this reply from Rafael:
"TBH, I am expecting problems to be there.
"If something has been interpreted in a specific way for several years, then changing that interpretation is just incompatible with the entire installed base, at least potentially.
"It is not even possible to estimate the potential adverse impact of this change, as it causes a firmware-provided bit that has never been taken into account so far to become meaningful and it does so for every device in the system.
"It will be very hard to convince me that this change is a good idea."
Thanks for pointing. Yes I can see them now.
https://lore.kernel.org/all/CAJZ5v0hhEeyDEMHnVQEiXzaKK07TSnE6GJhuTW97-
XEb9CoSHQ@mail.gmail.com/
https://lore.kernel.org/all/CAJZ5v0j-73_+9U3ngDAf9w1ADDhBTKctJdWboqUk-
okH2TQGyg@mail.gmail.com/
AFAICS, what Rafael is pointing is nothing new. We roughly knew these issues from last year when we introduced 'enabled' concept and were discussed as well. Last year, I had also proposed a different kernel solution to mitigate these problems (by faking not-present in the cpumask) which solved some of these but had other problems and James was uncomfortable (because it changed architecture code).
James received a hard 'never' on any interaction with the present bit and the GIC unless there is physical hardware that does this.
If we know there is _never_ going to be physical hardware that does this, we can abuse the present bit. But we don't know that.
Faking it will have consequences for the kernel and user-space once hardware exists, and we will have painted ourselves into an ABI corner.
That is quite damning. Rafael said, "very hard to convince me that this change is a good idea"... so using the enabled bit to control whether a CPU is enumerated doesn't sound like a patch that will be accepted into mainline even if all the others are.
Do these responses from Rafael sound to you like this is a patch series that is going to get imminently merged, or does it sound like the patch series needs some major rework and potentially needing a different approach?
Sure, they don't. Your cover letter said something similar :|
Excerpt from RFC V3 cover-letter: " I'm posting this as RFC v3 because there's still some unaddressed comments and it's clearly not ready for merging. Even if it was ready to be merged, it is too late in this development cycle to be taking this change in, so there would be little point posting it non-RFC. Also James stated that he's waiting for confirmation from the Kubernetes/Kata folk - I have no idea what the status is there."
@James, can you please update about what you are waiting for?
(the @ is for twitter right?)
Some confirmation from the people that work on those projects that this works for them. Especially around the way CPUs are visible as present (because they are), and this is visible if you stick your nose in the kernel's sysfs business.
I've not had anything in private. I've not had time to catch up with this RFC until the merge window. (and next week is some internal conference where anything could happen - and the week after is plumbers)
Thanks,
James
Linaro-open-discussions mailing list -- linaro-open-discussions@op-lists.linaro.org https://collaborate.linaro.org/display/LOD/Linaro+Open+Discussions+Home
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.