Linaro Open Discussions monthly meeting Wednesday 5 Oct 2022 ⋅ 19:00 – 20:00 Hong Kong Standard Time
Location https://linaro-org.zoom.us/j/95682500341 https://www.google.com/url?q=https%3A%2F%2Flinaro-org.zoom.us%2Fj%2F95682500...
Joyce QI is inviting you to a scheduled Zoom meeting.
Join Zoom Meeting https://linaro-org.zoom.us/j/95682500341
Meeting ID: 956 8250 0341 One tap mobile +16699009128,,95682500341# US (San Jose) +13462487799,,95682500341# US (Houston)
Dial by your location +1 669 900 9128 US (San Jose) +1 346 248 7799 US (Houston) +1 253 215 8782 US (Tacoma) +1 646 558 8656 US (New York) +1 301 715 8592 US (Washington DC) +1 312 626 6799 US (Chicago) 888 788 0099 US Toll-free 877 853 5247 US Toll-free Meeting ID: 956 8250 0341 Find your local number: https://linaro-org.zoom.us/u/ady2J9Zn7t
Guests linaro-open-discussions@op-lists.linaro.org james.morse@arm.com jonathan.cameron@huawei.com lorenzo.pieralisi@linaro.org ilkka@os.amperecomputing.com View all guest info https://calendar.google.com/calendar/event?action=VIEW&eid=Mjg1MjNrNWtiM...
Reply for linaro-open-discussions@op-lists.linaro.org and view more details https://calendar.google.com/calendar/event?action=VIEW&eid=Mjg1MjNrNWtiM... Your attendance is optional.
~~//~~ Invitation from Google Calendar: https://calendar.google.com/calendar/
You are receiving this email because you are an attendee of the event. To stop receiving future updates for this event, decline this event.
Forwarding this invitation could allow any recipient to send a response to the organiser, be added to the guest list, invite others regardless of their own invitation status or modify your RSVP.
Learn more https://support.google.com/calendar/answer/37135#forwarding
Hi Joyce, The meeting time is not clear. Yesterday the meeting time on the LOD site was shown as 22:00 HKT. Now it is shown as 11:00 (BST?).
Link: https://linaro.atlassian.net/wiki/spaces/LOD/overview
---------------------------------------------------------
Linaro Open Discussions monthly meeting
When: Wednesday, 5 Oct, 11:00 – 12:00 (>>>>>>> IS THIS BST?)
Where: https://linaro-org.zoom.us/j/95682500341 (map)
11:00 Linaro Open Discussions monthly meeting
Where: Join Zoom Meeting https://linaro-org.zoom.us/j/95682500341
Meeting ID: 956 8250 0341 One tap mobile +16699009128,,97315461353# US (San Jose) +13462487799,,97315461353# US (Houston)
---------------------------------------------------------
The invitation below gives the meeting time as 19:00 - 20:00 HKT, which should be 12:00 PM BST?
Could you please confirm the time of the meeting?
Many thanks, Salil
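The HKT to BST conversion asked about here can be checked mechanically. This is a minimal sketch, assuming fixed offsets of UTC+8 for HKT and UTC+1 for BST (British Summer Time was still in effect on 5 Oct 2022); the helper name `hkt_to_bst` is illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets, valid for 5 Oct 2022.
HKT = timezone(timedelta(hours=8), "HKT")  # Hong Kong Time, UTC+8
BST = timezone(timedelta(hours=1), "BST")  # British Summer Time, UTC+1


def hkt_to_bst(year, month, day, hour, minute=0):
    """Convert a wall-clock time in HKT to the same instant in BST."""
    return datetime(year, month, day, hour, minute, tzinfo=HKT).astimezone(BST)


start = hkt_to_bst(2022, 10, 5, 19)
end = hkt_to_bst(2022, 10, 5, 20)
print(start.strftime("%H:%M %Z"), "-", end.strftime("%H:%M %Z"))
```

Under these assumptions, 19:00 - 20:00 HKT corresponds to 12:00 - 13:00 BST.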
From: Google Calendar [mailto:calendar-notification@google.com] On Behalf Of Joyce Qi via Linaro-open-discussions Sent: Wednesday, October 5, 2022 9:31 AM To: linaro-open-discussions@op-lists.linaro.org; james.morse@arm.com; Jonathan Cameron jonathan.cameron@huawei.com; lorenzo.pieralisi@linaro.org; ilkka@os.amperecomputing.com Subject: [Linaro-open-discussions] Invitation: Linaro Open Discussions monthly meeting @ Wed 5 Oct 2022 19:00 - 20:00 (HKT) (linaro-open-discussions@op-lists.linaro.org)
Linaro-open-discussions mailing list -- linaro-open-discussions@op-lists.linaro.org https://collaborate.linaro.org/display/LOD/Linaro+Open+Discussions+Home
Hi Salil,
Our regular meeting is at 11:00 BST (7:00 PM HKT). If US colleagues are attending, I will switch to 22:00 HKT. I am OK with switching to 22:00 HKT for future meetings.
Thanks :) Joyce
On 5 Oct 2022, at 5:38 PM, Salil Mehta salil.mehta@huawei.com wrote:
[..]
Hi James, As discussed in the meeting, below are the repositories that you might want to look at.
[1] Forward-ported QEMU with some fixes (shared by Salil) https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v1-port29092022
[2] James's approach with online-capable and present==possible (some fixes) https://github.com/salil-mehta/linux.git virt-cpuhp-arm64/rfc-v2/jmorse-pres-eq-poss-cpu
[3] Variant of James's approach with online-capable and conditionally present CPUs https://github.com/salil-mehta/linux.git virt-cpuhp-arm64/rfc-v2/jmorse-variant-with-cond-present-cpu
I have also shared the LOD presentation with Joyce for uploading to the site.
Attachment: [1] Linaro Open Discussion Meeting Update - 05102022 - Salil_Mehta-fixed.pdf
Many thanks, Salil
Hi Salil,
Thanks for this, I've fixed a couple of bugs in the tree.
On the qemu side: Using KVM and your Qemu tree, PSCI returns success, when it should return denied for CPUs that are forbidden due to firmware policy:
[ 0.027597] smp: Bringing up secondary CPUs ...
[ 5.108204] CPU1: failed to come online
[ 5.109198] CPU1: failed in unknown state : 0x0
[ 5.110384] smp: Brought up 1 node, 1 CPU
This is mostly harmless for linux, but using PSCI_DENIED squashes the error, and avoids the timeout waiting for the secondary to turn up. Mostly harmless as this may cause 'cpus stuck in kernel' to get set, which would prevent kexec.
This probably means you're not using the HVC_TO_USER support for KVM that gets added as part of this series. You can then use the PSCI_TO_USER cap to disable the kernel's PSCI handler, which lets the VMM manage this directly.
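To illustrate the behaviour being asked for, here is a minimal sketch of a VMM-side CPU_ON handler that reports a policy-disabled CPU as DENIED rather than claiming success. The return values are those of the Arm PSCI specification; the `handle_cpu_on` function and the `vcpus` dictionary layout are hypothetical and are not QEMU or KVM code.

```python
from enum import IntEnum


class Psci(IntEnum):
    """Subset of the return codes defined by the Arm PSCI specification."""
    SUCCESS = 0
    INVALID_PARAMETERS = -2
    DENIED = -3
    ALREADY_ON = -4


def handle_cpu_on(vcpus, target):
    """Hypothetical VMM-side CPU_ON handler.

    vcpus maps a CPU id to {"allowed": bool, "on": bool}.
    """
    cpu = vcpus.get(target)
    if cpu is None:
        return Psci.INVALID_PARAMETERS
    if not cpu["allowed"]:
        # Forbidden by policy: say so instead of returning success.
        return Psci.DENIED
    if cpu["on"]:
        return Psci.ALREADY_ON
    cpu["on"] = True
    return Psci.SUCCESS


vcpus = {
    0: {"allowed": True, "on": True},
    1: {"allowed": False, "on": False},  # disabled by (assumed) firmware policy
    2: {"allowed": True, "on": False},
}
print(handle_cpu_on(vcpus, 1).name)
print(handle_cpu_on(vcpus, 2).name)
```

With a DENIED result the guest can stop immediately instead of waiting for a secondary that will never appear.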
Still using KVM, firmware should clear the enabled bit from _STA for any call after the eject-request. I see this is still 0xF before _EJx is called. This means the cpu remains registered, and user-space doesn't get notified that the CPU is no longer available. User-space can still try to enable the CPU, which will fail.
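For reference, the 0xF value mentioned here decodes as follows. This is a small sketch using the low four _STA result bits from the ACPI specification; the `decode_sta` helper is illustrative only.

```python
# Low _STA result bits as defined by the ACPI specification.
STA_PRESENT = 1 << 0
STA_ENABLED = 1 << 1
STA_SHOW_IN_UI = 1 << 2
STA_FUNCTIONING = 1 << 3


def decode_sta(value):
    """Return the meaning of the low four _STA bits as a dict of booleans."""
    return {
        "present": bool(value & STA_PRESENT),
        "enabled": bool(value & STA_ENABLED),
        "show_in_ui": bool(value & STA_SHOW_IN_UI),
        "functioning": bool(value & STA_FUNCTIONING),
    }


before = 0xF                   # value observed before _EJx is called
after = before & ~STA_ENABLED  # same value with only the enabled bit cleared
print(hex(before), decode_sta(before))
print(hex(after), decode_sta(after))
```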
I've not tried TCG mode yet.
Thanks,
James
On 17/10/2022 17:12, Salil Mehta via Linaro-open-discussions wrote:
[..]
Hi James,
From: James Morse [mailto:james.morse@arm.com] Sent: Friday, October 28, 2022 5:17 PM To: Salil Mehta salil.mehta@huawei.com; joyce.qi@linaro.org Cc: linaro-open-discussions@op-lists.linaro.org; lorenzo.pieralisi@linaro.org; ilkka@os.amperecomputing.com; Jean-Philippe Brucker jean-philippe.brucker@arm.com; salil.mehta@opnsrc.net Subject: Re: [Linaro-open-discussions] Re: Invitation: Linaro Open Discussions monthly meeting @ Wed 5 Oct 2022 19:00 - 20:00 (HKT) (linaro-open-discussions@op-lists.linaro.org)
Hi Salil,
Thanks for this, I've fixed a couple of bugs in the tree.
On the qemu side: Using KVM and your Qemu tree, PSCI returns success, when it should return denied for CPUs that are forbidden due to firmware policy:
[ 0.027597] smp: Bringing up secondary CPUs ...
[ 5.108204] CPU1: failed to come online
[ 5.109198] CPU1: failed in unknown state : 0x0
[ 5.110384] smp: Brought up 1 node, 1 CPU
Did you try the branch below (with some suggested fixes), which I shared with you?
[1] James's approach with online-capable and present==possible (some fixes) https://github.com/salil-mehta/linux.git virt-cpuhp-arm64/rfc-v2/jmorse-pres-eq-poss-cpu
This is mostly harmless for linux, but using PSCI_DENIED squashes the error, and avoids the timeout waiting for the secondary to turn up. Mostly harmless as this may cause 'cpus stuck in kernel' to get set, which would prevent kexec.
This probably means you're not using the HVC_TO_USER support for KVM that gets added as part of this series. You can then use the PSCI_TO_USER cap to disable the kernel's PSCI handler, which lets the VMM manage this directly.
Yes, that is true. I intentionally removed the PSCI-to-userspace patches because I was wondering: if the CPU devices for the disabled CPUs are not available in the guest kernel, how can those CPUs be brought online?
Still using KVM, firmware should clear the enabled bit from _STA for any call after the eject-request. I see this is still 0xF before _EJx is called.
This is how the existing handshake protocol works. Firmware notifies the OS of the ejection request, and the OS confirms its status as ejection-in-progress. Firmware/VMM then sends the ACPI event to remove the CPU from the kernel (which effectively means trying to offline the CPU and making it not-present). Until this point _STA.ENABLED=1, which is the correct behavior, since firmware/VMM cannot disable the CPU before the OS has taken it offline (can we snatch the physical resources before the OS has completed its housekeeping?). Once the offlining/removal is done, the OS informs the firmware/VMM of completion using the _EJ0 method, which effectively gives the firmware/VMM the green light to go ahead and remove the physical resources.
Firmware then removes the device, and any further evaluation of _STA would show _STA.ENABLED=0 and ideally should also show the device as not present (but this is something I have not changed in QEMU; maybe we need to, or does it even matter at the QEMU level?).
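The handshake described above can be sketched as a toy model that records the _STA value at each step. The class and function names are hypothetical (this is not QEMU or kernel code), and the sketch assumes firmware clears only the enabled bit once _EJ0 has been called.

```python
STA_ENABLED = 1 << 1  # _STA enabled bit (ACPI specification)


class ToyFirmware:
    """Hypothetical stand-in for the firmware/VMM side of the handshake."""

    def __init__(self):
        self.sta = 0xF  # present, enabled, shown in UI, functioning
        self.eject_requested = False

    def request_eject(self):
        # Step 1: firmware/VMM asks the OS to give the CPU up.
        self.eject_requested = True

    def ej0(self, cpu_online):
        # Step 3: the OS signals completion; only now are resources taken.
        if not self.eject_requested or cpu_online:
            raise RuntimeError("_EJ0 called before the CPU was offlined")
        self.sta &= ~STA_ENABLED  # assumption: only the enabled bit is cleared
        self.eject_requested = False


def run_handshake(fw):
    """Walk through the three steps and record _STA after each one."""
    trace = []
    fw.request_eject()
    trace.append(("eject-request", fw.sta))
    cpu_online = False  # Step 2: the OS offlines the CPU.
    trace.append(("cpu-offline", fw.sta))
    fw.ej0(cpu_online)
    trace.append(("after-_EJ0", fw.sta))
    return trace


for step, sta in run_handshake(ToyFirmware()):
    print(f"{step:14} _STA={sta:#x}")
```

In this model _STA stays at 0xF until _EJ0 runs, which is the window discussed in this thread.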
This means the cpu remains registered, and user-space doesn't get notified that the CPU is no longer available. User-space can still try to enable the CPU, which will fail.
That is true, and to fix this we could use the online-capable bit instead of the _STA.ENABLED bit, as discussed in the 5 Oct LOD meeting. I have made those changes in the above branch [1].
Thanks, Salil
I've not tried TCG mode yet.
Thanks,
James
Hi Salil,
On 31/10/2022 12:40, Salil Mehta wrote:
On the qemu side: Using KVM and your Qemu tree, PSCI returns success, when it should return denied for CPUs that are forbidden due to firmware policy:
[ 0.027597] smp: Bringing up secondary CPUs ...
[ 5.108204] CPU1: failed to come online
[ 5.109198] CPU1: failed in unknown state : 0x0
[ 5.110384] smp: Brought up 1 node, 1 CPU
Did you try the branch below (with some suggested fixes), which I shared with you?
[1] James's approach with online-capable and present==possible (some fixes) https://github.com/salil-mehta/linux.git virt-cpuhp-arm64/rfc-v2/jmorse-pres-eq-poss-cpu
No, it's the qemu changes I'm after.
Returning "success" to CPU_ON, but not actually bringing the CPU online is an abomination. I can't imagine what the maintainer will throw at you if you suggest it.
This is mostly harmless for linux, but using PSCI_DENIED squashes the error, and avoids the timeout waiting for the secondary to turn up. Mostly harmless as this may cause 'cpus stuck in kernel' to get set, which would prevent kexec.
This probably means you're not using the HVC_TO_USER support for KVM that gets added as part of this series. You can then use the PSCI_TO_USER cap to disable the kernel's PSCI handler, which lets the VMM manage this directly.
Yes, that is true. I intentionally removed the PSCI-to-userspace patches because I was wondering: if the CPU devices for the disabled CPUs are not available in the guest kernel, how can those CPUs be brought online?
... and if your guest kernel is malicious?
PSCI has to be able to return an error if you try to online a disabled CPU, (as without enforcement the feature is pointless). So there is no need to track which CPUs are disabled in the MADT - just online the lot, and handle the error gracefully.
Doing this then lines up nicely with the ACPI changes, where if firmware isn't doing any enforcement, then the CPUs get registered and a nasty firmware-bug message is printed. This is what makes those changes safe on x86 where the _STA enabled bit may be wrong on mass-produced laptops. If their 'firmware' isn't enforcing any policy, then the _STA bit gets ignored.
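The "online the lot, and handle the error gracefully" approach can be sketched from the guest side as follows. The two return codes are from the Arm PSCI specification; `bring_up_secondaries` and `fake_cpu_on` are hypothetical stand-ins, not kernel functions.

```python
PSCI_SUCCESS = 0
PSCI_DENIED = -3


def bring_up_secondaries(cpu_ids, cpu_on):
    """Try CPU_ON on every secondary and sort the results.

    cpu_on is a callable standing in for the PSCI CPU_ON call.
    """
    online, denied, failed = [], [], []
    for cpu in cpu_ids:
        ret = cpu_on(cpu)
        if ret == PSCI_SUCCESS:
            online.append(cpu)
        elif ret == PSCI_DENIED:
            # Firmware enforces the policy: move on, no timeout needed.
            denied.append(cpu)
        else:
            failed.append((cpu, ret))
    return online, denied, failed


def fake_cpu_on(cpu):
    """Assumed firmware policy for the example: CPUs 2 and 3 are disabled."""
    return PSCI_DENIED if cpu in {2, 3} else PSCI_SUCCESS


online, denied, failed = bring_up_secondaries(range(1, 4), fake_cpu_on)
print("online:", online, "denied:", denied, "failed:", failed)
```

The guest does not need to know in advance which CPUs are disabled; the enforcement lives entirely behind the CPU_ON call.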
Still using KVM, firmware should clear the enabled bit from _STA for any call after the eject-request. I see this is still 0xF before _EJx is called.
This is how the existing handshake protocol works. Firmware notifies the OS of the ejection request, and the OS confirms its status as ejection-in-progress. Firmware/VMM then sends the ACPI event to remove the CPU from the kernel (which effectively means trying to offline the CPU and making it not-present). Until this point _STA.ENABLED=1, which is the correct behavior, since firmware/VMM cannot disable the CPU before the OS has taken it offline (can we snatch the physical resources before the OS has completed its housekeeping?). Once the offlining/removal is done, the OS informs the firmware/VMM of completion using the _EJ0 method, which effectively gives the firmware/VMM the green light to go ahead and remove the physical resources.
Okay, the scope is narrower than after the eject-request... The problem is acpi_processor's remove method calls _STA, and finds "no change", so it does nothing. But it appears this is what Qemu does on x86 too. Once _EJ0 has been called, I'd expect _STA to reflect the conclusion of the eject-request.
Linux needs to see _STA change, as otherwise it can't know which bits in _STA have changed.
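As a small illustration of detecting which _STA bits changed, two evaluations can be compared with XOR. The bit positions are from the ACPI specification; the `sta_changes` helper is hypothetical.

```python
# Low _STA result bits (ACPI specification), keyed by bit position.
STA_BITS = {0: "present", 1: "enabled", 2: "shown in UI", 3: "functioning"}


def sta_changes(old, new):
    """Return the names of the _STA bits that differ between two evaluations."""
    diff = old ^ new
    return [name for bit, name in STA_BITS.items() if diff & (1 << bit)]


print(sta_changes(0xF, 0xF))
print(sta_changes(0xF, 0xD))
```

If firmware keeps returning 0xF, the difference is empty and the OS has nothing to act on.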
Firmware then removes the device, and any further evaluation of _STA would show _STA.ENABLED=0 and ideally should also show the device as not present (but this is something I have not changed in QEMU; maybe we need to, or does it even matter at the QEMU level?).
No, it looks like linux is calling these in the wrong order. The ACPI processor driver needs to have its remove call made once the CPU is offline and _EJ0 has been called,...
(I thought this is what qemu was doing on x86, but now that I test it again...!)
[..]
It looks like a post_eject callback on the scan handler is the right way to do this.
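The ordering problem can be shown with a toy model. It assumes firmware only updates _STA once _EJ0 has run, and that the handler unregisters the CPU only when it sees the enabled bit cleared. All names are hypothetical; this is not the kernel's scan-handler API.

```python
STA_ENABLED = 1 << 1  # _STA enabled bit (ACPI specification)


class ToyCpuDevice:
    """Hypothetical CPU device whose _STA changes only at _EJ0."""

    def __init__(self):
        self.sta = 0xF
        self.registered = True

    def ej0(self):
        # Assumption: firmware clears the enabled bit when _EJ0 is called.
        self.sta &= ~STA_ENABLED

    def sync_with_sta(self):
        # Hypothetical handler step: unregister only if _STA says disabled.
        if not self.sta & STA_ENABLED:
            self.registered = False


def remove_then_eject():
    """Handler runs before _EJ0: it sees 0xF and does nothing."""
    dev = ToyCpuDevice()
    dev.sync_with_sta()
    dev.ej0()
    return dev.registered


def eject_then_post_eject():
    """Handler runs after _EJ0: it sees the cleared bit and unregisters."""
    dev = ToyCpuDevice()
    dev.ej0()
    dev.sync_with_sta()
    return dev.registered


print("remove before _EJ0, still registered:", remove_then_eject())
print("post_eject after _EJ0, still registered:", eject_then_post_eject())
```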
This means the cpu remains registered, and user-space doesn't get notified that the CPU is no longer available. User-space can still try to enable the CPU, which will fail.
That is true, and to fix this we could use the online-capable bit instead of the _STA.ENABLED bit, as discussed in the 5 Oct LOD meeting. I have made those changes in the above branch [1].
But the MADT online-capable bit is static, it only ever has one value. We need something that can be changed by firmware, (and the OS _knows_ is being changed by firmware). This is what _STA is for.
You can't guess from an eject-request that the CPU is being disabled, what happens when we need to add support for physical hot-remove?
Thanks,
James
Hi Salil,
On 31/10/2022 16:48, James Morse via Linaro-open-discussions wrote:
On 31/10/2022 12:40, Salil Mehta wrote:
From: James Morse [mailto:james.morse@arm.com] Sent: Friday, October 28, 2022 5:17 PM To: Salil Mehta salil.mehta@huawei.com; joyce.qi@linaro.org Cc: linaro-open-discussions@op-lists.linaro.org; lorenzo.pieralisi@linaro.org; ilkka@os.amperecomputing.com; Jean-Philippe Brucker jean-philippe.brucker@arm.com; salil.mehta@opnsrc.net Subject: Re: [Linaro-open-discussions] Re: Invitation: Linaro Open Discussions monthly meeting @ Wed 5 Oct 2022 19:00 - 20:00 (HKT) (linaro-open-discussions@op-lists.linaro.org)
Still using KVM, firmware should clear the enabled bit from _STA for any call after the eject-request. I see this is still 0xF before _EJx is called.
This is how the existing handshake protocol works. Firmware intimates the OS about ejection request which confirms its status about ejection-in-progress. Firmware/VMM then sends the ACPI event to remove the CPU from the kernel(which effectively is try-to-offline process and making not-present), Until this point _STA.ENABLED=1 (which is a correct behavior since firmware/VMM cannot disable the CPU before it has been made offline by the OS - can we snatch the physical resources before completing the housekeeping at OS?). Once offline'ing/removal is done, OS then informs the firmware/VMM about this completion using _EJ0 method (Which is effective way of giving green signal to the firmware/VMM to go ahead and remove the physical resources).
Okay, the scope is narrower than after the eject-request... The problem is that acpi_processor's remove method calls _STA, and finds "no change", so it does nothing. But it appears this is what Qemu does on x86 too. Once _EJ0 has been called, I'd expect _STA to reflect the conclusion of the eject-request.
Linux needs to see _STA change, as otherwise it can't know which bits in _STA have changed.
Firmware then removes the device and any further evaluation of _STA would show _STA.ENABLED=0 and ideally should show is not present(but this is something which I have not changed in the QEMU - maybe we need to do or does it even matter at QEMU level?).
No, it looks like linux is calling these in the wrong order. The ACPI processor driver needs to have its remove call made once the CPU is offline and _EJ0 has been called,...
(I thought this is what qemu was doing on x86, but now that I test it again...!)
[..]
It looks like a post_eject callback on the scan handler is the right way to do this.
This works - but now it looks like you are returning 0 from _STA after _EJ0. Please don't do this, the present bit needs to remain set as while the CPU has been disabled, the redistributor, any RAS ERR nodes, and anything else that hangs from the CPU is still present.
Clearing that bit takes extra work, which has not been defined for the firmware or the kernel. The only way to win is not to play!
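To make the distinction concrete, here is a minimal C sketch (not taken from any patch in this thread) that classifies the _STA change seen across an eject-request. The bit positions are the standard ACPI _STA bits; the enum, its labels and classify_eject() are hypothetical names invented purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* _STA bit positions as defined by the ACPI specification. */
#define STA_PRESENT      (1u << 0)
#define STA_ENABLED      (1u << 1)
#define STA_SHOW_IN_UI   (1u << 2)
#define STA_FUNCTIONING  (1u << 3)

/* Hypothetical outcome labels, used only in this illustration. */
enum eject_outcome {
    EJECT_NO_CHANGE,     /* _STA identical before and after: nothing to act on */
    EJECT_CPU_DISABLED,  /* enabled cleared, present still set */
    EJECT_CPU_REMOVED,   /* present cleared: looks like physical hot-remove */
    EJECT_UNEXPECTED     /* some other bit changed */
};

static enum eject_outcome classify_eject(uint32_t sta_before, uint32_t sta_after)
{
    /* The sketch only covers a CPU that was present and enabled beforehand. */
    assert((sta_before & (STA_PRESENT | STA_ENABLED)) ==
           (STA_PRESENT | STA_ENABLED));

    if (sta_after == sta_before)
        return EJECT_NO_CHANGE;
    if (!(sta_after & STA_PRESENT))
        return EJECT_CPU_REMOVED;
    if (!(sta_after & STA_ENABLED))
        return EJECT_CPU_DISABLED;
    return EJECT_UNEXPECTED;
}
```

Under these assumptions a CPU going from 0xF to 0xD is classified as disabled (present still set), whereas a return value of 0 is indistinguishable from a physical removal.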
Does anyone think we need _OSI strings for this? If so, now is the time to define them. My suggestions would be "ACPI0007 present updates" and "ACPI0007 enabled updates", which makes it fairly clear this is about toggling bits in the processor object.
It sounds like Kubernetes might have some kind of GUI that shows something based on the cpu present mask. If it's needed, my current thinking is to have a cpu_enabled_mask in the kernel (that defaults to the cpu_present_mask on all but arm64), and use that to filter CPUs out of the sysfs 'offline' file. This avoids creating new files in sysfs, and may even fix whatever it is that Kubernetes is doing...
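A rough sketch of that filtering idea, assuming the 'offline' file is currently computed as "possible and not online" and using a 64-bit integer as a toy cpumask. Nothing here is kernel API: cpumask64_t and both functions are made up for illustration, and the enabled argument stands in for the proposed cpu_enabled_mask.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t cpumask64_t;   /* toy cpumask: bit n stands for CPU n */

/* Assumed current behaviour: every possible CPU that is not online. */
static cpumask64_t offline_mask(cpumask64_t possible, cpumask64_t online)
{
    assert((online & ~possible) == 0);   /* online CPUs must be possible */
    return possible & ~online;
}

/* Proposed behaviour: also hide CPUs that firmware has not enabled. */
static cpumask64_t offline_mask_filtered(cpumask64_t possible,
                                         cpumask64_t online,
                                         cpumask64_t enabled)
{
    return offline_mask(possible, online) & enabled;
}
```

With four possible CPUs, CPU0 online and only CPU0 and CPU1 enabled, the unfiltered mask lists CPUs 1-3 while the filtered one lists only CPU1.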
Thanks,
James
On Mon, 31 Oct 2022 17:51:13 +0000 James Morse via Linaro-open-discussions linaro-open-discussions@op-lists.linaro.org wrote:
Does anyone think we need _OSI strings for this? If so, now is the time to define them. My suggestions would be "ACPI0007 present updates" and "ACPI0007 enabled updates", which makes it fairly clear this is about toggling bits in the processor object.
I think we benefit from something along those lines. Wrote an email to relevant people, then got distracted and forgot to send it until this week so thought I'd leave it until after the meeting.
Jonathan
Hi James,
From: James Morse [mailto:james.morse@arm.com] Sent: Monday, October 31, 2022 5:51 PM To: Salil Mehta salil.mehta@huawei.com; joyce.qi@linaro.org Cc: linaro-open-discussions@op-lists.linaro.org; lorenzo.pieralisi@linaro.org; ilkka@os.amperecomputing.com; Jean-Philippe Brucker jean-philippe.brucker@arm.com; salil.mehta@opnsrc.net Subject: Re: [Linaro-open-discussions] Re: Invitation: Linaro Open Discussions monthly meeting @ Wed 5 Oct 2022 19:00 - 20:00 (HKT) (linaro-open-discussions@op-lists.linaro.org)
This works - but now it looks like you are returning 0 from _STA after _EJ0. Please don't do this, the present bit needs to remain set as while the CPU has been disabled, the redistributor, any RAS ERR nodes, and anything else that hangs from the CPU is still present.
Sure. As mentioned in the earlier update, I intentionally deferred this change in QEMU. I will introduce it, i.e. after _EJ0 the flags will be _STA.ENABLED=0 and _STA.PRESENT=1.
Does anyone think we need _OSI strings for this? If so, now is the time to define them. My suggestions would be "ACPI0007 present updates" and "ACPI0007 enabled updates", which makes it fairly clear this is about toggling bits in the processor object.
It sounds like kubernetes might have some kind of gui that shows something based on the cpu present mask. If its needed, my current thinking is to have a cpu_enabled_mask in the kernel, (that defaults to be the cpu_present_mask on all but arm64), and use that to filter cpus out of the sysfs 'offline' file. This avoids creating new files in sysfs, and may even fix whatever it is that kubernetes is doing...
Maybe, but if possible, I would still prefer to use cpu_present_mask. It is dynamic. I do understand, and as you mentioned earlier, that right now present mask might be tied up with the presence of CPU in the ACPI.
File: https://elixir.bootlin.com/linux/latest/source/include/linux/cpumask.h
[...]
 * If HOTPLUG is enabled, then cpu_present_mask varies dynamically,
 * depending on what ACPI reports as currently plugged in, otherwise
 * cpu_present_mask is just a copy of cpu_possible_mask.
[...]
Any upper layer using present mask should already know that it is dynamic and should have proper handling when it changes. It's about the association between the CPU being present in the Linux and being present at the ACPI. Can we make this association conditional?
After all, we are already making presence of CPU devices conditional(by calling arch_{un}register_cpu() selectively) although CPU will always be present at the ACPI level?
AFAICS nothing breaks inside the kernel?
But yes, maybe it might be more safe to introduce a new mask. If we decide to use another mask then there would be more changes around the corner than just hiding sysfs entries/conditionally making CPU devices appear/disappear.
For example, any upper layer which now rely on the present mask might need to refer to this new enabled/available mask as well like CPUHP state machine in kernel?
https://elixir.bootlin.com/linux/latest/source/kernel/cpu.c#L1335
static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
{
        [...]
        if (!cpu_present(cpu)) { /* and similar check in _cpu_down() */
                ret = -EINVAL;
                goto out;
        }
        [...]
}
Also, all future changes would now need to be careful with this new meaning of vCPU 'being enabled' while it is present inside the kernel.
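The concern about the hotplug state machine can be illustrated with a small userspace sketch. It assumes a separate "enabled" mask exists alongside the present mask; struct toy_cpu_masks, toy_test() and toy_cpu_up_check() are hypothetical and only mirror the shape of the cpu_present() check quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 64

/* Toy stand-ins for cpu_present_mask and a hypothetical cpu_enabled_mask. */
struct toy_cpu_masks {
    uint64_t present;
    uint64_t enabled;
};

static bool toy_test(uint64_t mask, unsigned int cpu)
{
    assert(cpu < TOY_NR_CPUS);
    return (mask >> cpu) & 1u;
}

/* Returns 0 if the CPU may be brought up, -EINVAL otherwise. */
static int toy_cpu_up_check(const struct toy_cpu_masks *m, unsigned int cpu)
{
    if (!toy_test(m->present, cpu))
        return -EINVAL;   /* the check the quoted _cpu_up() already makes */
    if (!toy_test(m->enabled, cpu))
        return -EINVAL;   /* hypothetical extra check for the new mask */
    return 0;
}
```

Every caller that today tests only the present mask would need a similar second test, which is the extra churn being pointed out.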
Thanks Salil
Hi James,
From: James Morse [mailto:james.morse@arm.com] Sent: Monday, October 31, 2022 4:49 PM To: Salil Mehta salil.mehta@huawei.com; joyce.qi@linaro.org Cc: linaro-open-discussions@op-lists.linaro.org; Jonathan Cameron jonathan.cameron@huawei.com; lorenzo.pieralisi@linaro.org; ilkka@os.amperecomputing.com; Jean-Philippe Brucker jean-philippe.brucker@arm.com; salil.mehta@opnsrc.net; mehta.salil.lnk@gmail.com Subject: Re: [Linaro-open-discussions] Re: Invitation: Linaro Open Discussions monthly meeting @ Wed 5 Oct 2022 19:00 - 20:00 (HKT) (linaro-open-discussions@op-lists.linaro.org)
Hi Salil,
On 31/10/2022 12:40, Salil Mehta wrote:
From: James Morse [mailto:james.morse@arm.com] Sent: Friday, October 28, 2022 5:17 PM To: Salil Mehta salil.mehta@huawei.com; joyce.qi@linaro.org Cc: linaro-open-discussions@op-lists.linaro.org; lorenzo.pieralisi@linaro.org; ilkka@os.amperecomputing.com; Jean-Philippe Brucker jean-philippe.brucker@arm.com; salil.mehta@opnsrc.net Subject: Re: [Linaro-open-discussions] Re: Invitation: Linaro Open Discussions monthly meeting @ Wed 5 Oct 2022 19:00 - 20:00 (HKT) (linaro-open-discussions@op-lists.linaro.org)
On the qemu side: Using KVM and your Qemu tree, PSCI returns success, when it should return denied for CPUs that are forbidden due to firmware policy:
[ 0.027597] smp: Bringing up secondary CPUs ...
[ 5.108204] CPU1: failed to come online
[ 5.109198] CPU1: failed in unknown state : 0x0
[ 5.110384] smp: Brought up 1 node, 1 CPU
Did you try to use below branch(with some suggested fixes) which I shared with you?
[1] James Approach with online-capable and present==possible (some fixes) https://github.com/salil-mehta/linux.git virt-cpuhp-arm64/rfc-v2/jmorse-pres-eq-poss-cpu
No, its the qemu changes I'm after.
Returning "success" to CPU_ON, but not actually bringing the CPU online is an abomination. I can't imagine what the maintainer will throw at you if you suggest it.
This is mostly harmless for linux, but using PSCI_DENIED squashes the error, and avoids the timeout waiting for the secondary to turn up. Mostly harmless as this may cause 'cpus stuck in kernel' to get set, which would prevent kexec.
This probably means you're not using the HVC_TO_USER support for KVM that gets added as part of this series. You can then use the PSCI_TO_USER cap to disable the kernel's PSCI handler, which lets the VMM manage this directly.
Yes, that is true. I intentionally removed the PSCI-to-userspace patches because I was wondering: if the CPU devices are not available in the guest kernel for the disabled CPUs, how can they be brought online?
... and if your guest kernel is malicious?
Ok for the security, Yes, it makes sense but wanted to hear clearly from you the reasoning so took off those patches. I will add back those patches both in the linux and the QEMU(changes to handle HVC exit calls in userspace and enable capabilities in the KVM at arch init) repositories.
We would need to explain this reason clearly in the patches.
PSCI has to be able to return an error if you try to online a disabled CPU, (as without enforcement the feature is pointless). So there is no need to track which CPUs are disabled in the MADT - just online the lot, and handle the error gracefully.
Without security being compromised(and CPU devices not existing in kernel) CPU online'ing cannot be done. Security and feature functionality are two different aspects of the policy enforcement.
Do you mean no need to track disabled CPUs as in MADT !GICC.flags.Enabled?
Doing this then lines up nicely with the ACPI changes, where if firmware isn't doing any enforcement, then the CPUs get registered and a nasty firmware-bug message is printed. This is what makes those changes safe on x86 where the _STA enabled bit may be wrong on mass-produced laptops. If their 'firmware' isn't enforcing any policy, then the _STA bit gets ignored.
Still using KVM, firmware should clear the enabled bit from _STA for any call after the eject-request. I see this is still 0xF before _EJx is called.
This is how the existing handshake protocol works. Firmware intimates the OS about ejection request which confirms its status about ejection-in-progress. Firmware/VMM then sends the ACPI event to remove the CPU from the kernel(which effectively is try-to-offline process and making not-present), Until this point _STA.ENABLED=1 (which is a correct behavior since firmware/VMM cannot disable the CPU before it has been made offline by the OS - can we snatch the physical resources before completing the housekeeping at OS?). Once offline'ing/removal is done, OS then informs the firmware/VMM about this completion using _EJ0 method (Which is effective way of giving green signal to the firmware/VMM to go ahead and remove the physical resources).
Okay, the scope is narrower than after the eject-request... The problem is that acpi_processor's remove method calls _STA, and finds "no change", so it does nothing. But it appears this is what Qemu does on x86 too.
Exactly, and this was the reason for the crash during the remove leg: arch_unregister_cpu() was not getting called, so during the next processor_add() arch_register_cpu() would fail with the trace below:
[ 73.647991] sysfs_warn_dup+0x60/0x80
[ 73.648414] sysfs_create_dir_ns+0xe4/0x100
[ 73.648885] kobject_add_internal+0x98/0x220
[ 73.649367] kobject_add+0x94/0x108
[ 73.649759] device_add+0xf8/0x8a8
[ 73.650145] device_register+0x20/0x30
[ 73.650569] register_cpu+0xf0/0x1b0
[ 73.650974] arch_register_cpu+0x5c/0x70
[ 73.651415] acpi_processor_add+0x410/0x680
[ 73.651886] acpi_bus_attach+0x12c/0x228
[ 73.652334] acpi_bus_scan+0x58/0x110
[ 73.652745] acpi_device_hotplug+0x208/0x470
[ 73.653227] acpi_hotplug_work_fn+0x24/0x40
[ 73.653697] process_one_work+0x1d0/0x320
[ 73.654148] worker_thread+0x4c/0x400
[ 73.654561] kthread+0x110/0x120
[ 73.654924] ret_from_fork+0x10/0x20
Once _EJ0 has been called, I'd expect _STA to reflect the conclusion of the eject-request.
Yes, agreed. As I mentioned earlier, this change is pending in QEMU. I deferred it intentionally since it did not matter with respect to your changes, as the places where it was being used in your code would always have _STA.Enabled=1.
Linux needs to see _STA change, as otherwise it can't know which bits in _STA have changed.
Ok we can do that now. But even with this change in the QEMU, the remove leg in the kernel where it is being referred, flag _STA.Enabled would still be 1.
It is better to use online-capable Bit there or we would need to hack the existing handshake Hotplug protocol to fit the _STA.Enabled=0 which is not very clean as the eject-in-progress handling in OS should be exactly same as that of the physical Hotplug feature. The only difference is what we do with the _STA.PRESENT Bits later at the ACPI level in the firmware/VMM. It should be exposed as present by the VMM/firmware after _EJ0 method has been evaluated by the OS.
Firmware then removes the device and any further evaluation of _STA would show _STA.ENABLED=0 and ideally should show is not present(but this is something which I have not changed in the QEMU - maybe we need to do or does it even matter at QEMU level?).
No, it looks like linux is calling these in the wrong order. The ACPI processor driver needs to have its remove call made once the CPU is offline and _EJ0 has been called,...
I beg to differ on this. I think order during eject-in-progress handling in OS should be exactly same as that of the physical Hotplug feature.
Hence, the acpi_bus_trim() should come before acpi_evaluate_ej0().
acpi_scan_hot_remove(struct acpi_device *device)
{
        [...]
        1. acpi_scan_try_to_offline(device);
        [...]
        2. acpi_bus_trim(device); /* processor_remove() gets called here */
        [...]
        3. acpi_evaluate_ej0(handle); /* firmware will make _STA.Enabled=0, _STA.PRESENT=1(not doing right now) */
        [...]
        4. acpi_evaluate_integer(handle, "_STA", NULL, &sta); /* this is not checking PRESENT Bit right now */
        [...]
        5. acpi_evaluate_ost(...); /* NOT required still? */
        [...]
}
(I thought this is what qemu was doing on x86, but now that I test it again...!)
[..]
It looks like a post_eject callback on the scan handler is the right way to do this.
Would look like a hack to be frank.
This means the cpu remains registered, and user-space doesn't get notified that the CPU is no longer available. User-space can still try to enable the CPU, which will fail.
That is true and to fix this we could use the online-capable Bit instead of using the _STA.ENABLED Bit status as discussed in the 5th Oct LOD meeting. I have done those changes in the above branch [1].
But the MADT online-capable bit is static, it only ever has one value. We need something that can be changed by firmware, (and the OS _knows_ is being changed by firmware). This is what _STA is for.
You can't guess from an eject-request that the CPU is being disabled, what happens when we need to add support for physical hot-remove?
Isn't the presence of online-capability (different architectures could interpret this differently and would just need to add their own bit) enough to take a different route from the physical hotplug hot-remove path?
You might want to check the changes below, which achieve this.
Repository: https://github.com/salil-mehta/linux.git virt-cpuhp-arm64/rfc-v2/jmorse-pres-eq-poss-cpu
Patches:
ACPI: APIs to check online-capability of the processor
ACPI: Fix: Introduce online-capable check to fix remove cpu crash

- if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_ENABLED))
+ if (cpu_present(pr->id) && acpi_check_online_capable(pr)) {
+         arch_unregister_cpu(pr->id);
+ } else {
+         pr_err_once(FW_BUG "CPU%u not online-capable is being made not present\n", pr->id);
+         add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
+ }
Thanks Salil
On Tue, Nov 01, 2022 at 11:09:51AM +0000, Salil Mehta wrote:
PSCI has to be able to return an error if you try to online a disabled CPU, (as without enforcement the feature is pointless). So there is no need to track which CPUs are disabled in the MADT - just online the lot, and handle the error gracefully.
To reconfirm this point...
"without enforcement the feature is pointless" I completely agree - without that, having vCPU hotplug is an utterly pointless feature and a waste of time.
One of the whole points for vCPU hotplug is for the environment outside of the guest to control how many CPUs guests can use, and if the guest can online CPUs that the external environment has thought it has taken offline, then really this is a waste of time and is not what people want when they talk about vCPU hotplug.
There has to be enforcement - and that enforcement can not be done solely by the guest kernel, it has to be done by the environment outside of the guest for the feature to have any meaningful application beyond an academic exercise.
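As a sketch of what enforcement outside the guest could look like, the following C fragment models a VMM-side handler for PSCI CPU_ON. The return values are the ones the PSCI specification defines; struct vmm_vcpu, VMM_MAX_VCPUS and vmm_handle_cpu_on() are assumptions made for illustration and do not correspond to actual QEMU or KVM code.

```c
#include <assert.h>
#include <stdbool.h>

/* PSCI return codes as defined by the Arm PSCI specification. */
#define PSCI_SUCCESS             0
#define PSCI_INVALID_PARAMETERS (-2)
#define PSCI_DENIED             (-3)
#define PSCI_ALREADY_ON         (-4)

#define VMM_MAX_VCPUS 8   /* hypothetical limit for this sketch */

/* Hypothetical per-vCPU state kept by the VMM, outside the guest. */
struct vmm_vcpu {
    bool allowed;   /* policy: may the guest use this vCPU right now? */
    bool running;
};

/* The guest only ever sees the return code; the policy lives in the VMM. */
static int vmm_handle_cpu_on(struct vmm_vcpu *vcpus, unsigned int target)
{
    assert(vcpus);

    if (target >= VMM_MAX_VCPUS)
        return PSCI_INVALID_PARAMETERS;
    if (!vcpus[target].allowed)
        return PSCI_DENIED;        /* enforced regardless of guest state */
    if (vcpus[target].running)
        return PSCI_ALREADY_ON;

    vcpus[target].running = true;
    return PSCI_SUCCESS;
}
```

The point of the model is that a guest ignoring its own bookkeeping still cannot start a vCPU the VMM has not allowed.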
Hi Russell,
From: Russell King [mailto:linux@armlinux.org.uk] Sent: Tuesday, November 1, 2022 11:24 AM To: Salil Mehta salil.mehta@huawei.com Cc: James Morse james.morse@arm.com; joyce.qi@linaro.org; linaro-open-discussions@op-lists.linaro.org; Jonathan Cameron jonathan.cameron@huawei.com; lorenzo.pieralisi@linaro.org; ilkka@os.amperecomputing.com; Jean-Philippe Brucker jean-philippe.brucker@arm.com; salil.mehta@opnsrc.net; mehta.salil.lnk@gmail.com Subject: Re: [Linaro-open-discussions] Re: Invitation: Linaro Open Discussions monthly meeting @ Wed 5 Oct 2022 19:00 - 20:00 (HKT) (linaro-open-discussions@op-lists.linaro.org)
On Tue, Nov 01, 2022 at 11:09:51AM +0000, Salil Mehta wrote:
PSCI has to be able to return an error if you try to online a disabled CPU, (as without enforcement the feature is pointless). So there is no need to track which CPUs are disabled in the MADT - just online the lot, and handle the error gracefully.
To reconfirm this point...
"without enforcement the feature is pointless" I completely agree - without that, having vCPU hotplug is an utterly pointless feature and a waste of time.
Sure, I agreed as well. There is no contention in that.
One of the whole points for vCPU hotplug is for the environment outside of the guest to control how many CPUs guests can use, and if the guest can online CPUs that the external environment has thought it has taken offline, then really this is a waste of time and is not what people want when they talk about vCPU hotplug.
Yes, of course. that's the key feature of vCPU Hotplug.
There has to be enforcement - and that enforcement can not be done solely by the guest kernel, it has to be done by the environment outside of the guest for the feature to have any meaningful application beyond an academic exercise.
Sure, agreed. We cannot trust the guest, as it can be compromised, so there has to be a check in the VMM/firmware as well.
Thanks Salil
Hi Salil,
On 01/11/2022 11:09, Salil Mehta wrote:
From: James Morse [mailto:james.morse@arm.com] Sent: Monday, October 31, 2022 4:49 PM To: Salil Mehta salil.mehta@huawei.com; joyce.qi@linaro.org Cc: linaro-open-discussions@op-lists.linaro.org; Jonathan Cameron jonathan.cameron@huawei.com; lorenzo.pieralisi@linaro.org; ilkka@os.amperecomputing.com; Jean-Philippe Brucker jean-philippe.brucker@arm.com; salil.mehta@opnsrc.net; mehta.salil.lnk@gmail.com Subject: Re: [Linaro-open-discussions] Re: Invitation: Linaro Open Discussions monthly meeting @ Wed 5 Oct 2022 19:00 - 20:00 (HKT) (linaro-open-discussions@op-lists.linaro.org)
This probably means you're not using the HVC_TO_USER support for KVM that gets added as part of this series. You can then use the PSCI_TO_USER cap to disable the kernel's PSCI handler, which lets the VMM manage this directly.
Yes, that is true. I intentionally removed the PSCI-to-userspace patches because I was wondering: if the CPU devices are not available in the guest kernel for the disabled CPUs, how can they be brought online?
... and if your guest kernel is malicious?
Ok for the security,
I wouldn't call it security, just you can't trust the guest to play nice.
Yes, it makes sense but wanted to hear clearly from you the reasoning so took off those patches. I will add back those patches both in the linux and the QEMU(changes to handle HVC exit calls in userspace and enable capabilities in the KVM at arch init) repositories.
We would need to explain this reason clearly in the patches.
It's described in the documentation patch as: | Firmware can enforce its policy via PSCI's return codes. e.g. ``DENIED``
I'm coming from a position of "PSCI has always been able to return errors", but this is the second time I've been surprised by people not seeing it this way - I'll try and expand the commit messages to explain this some more...
PSCI has to be able to return an error if you try to online a disabled CPU, (as without enforcement the feature is pointless). So there is no need to track which CPUs are disabled in the MADT - just online the lot, and handle the error gracefully.
Without security being compromised(and CPU devices not existing in kernel) CPU online'ing cannot be done. Security and feature functionality are two different aspects of the policy enforcement.
Do you mean no need to track disabled CPUs as in MADT !GICC.flags.Enabled?
Linux doesn't need those GICC flags, they were added for another operating system.
For linux, it's simpler to try to enable everything, and handle the failure gracefully. This lets us disappear the feature if it turns out firmware isn't enforcing any policy. (as described below:)
Doing this then lines up nicely with the ACPI changes, where if firmware isn't doing any enforcement, then the CPUs get registered and a nasty firmware-bug message is printed. This is what makes those changes safe on x86 where the _STA enabled bit may be wrong on mass-produced laptops. If their 'firmware' isn't enforcing any policy, then the _STA bit gets ignored.
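The "online the lot and handle the error" approach might look roughly like this from the guest's side. This is an illustrative userspace model, not kernel code: cpu_on_fn, struct bringup_result, bring_up_secondaries() and demo_firmware_cpu_on() are hypothetical, and the demo firmware policy (only CPUs 1 and 2 allowed) is invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PSCI_SUCCESS  0
#define PSCI_DENIED  (-3)

/* Hypothetical hook standing in for the PSCI CPU_ON call. */
typedef int (*cpu_on_fn)(unsigned int cpu);

struct bringup_result {
    uint64_t online;   /* CPUs that came up */
    uint64_t denied;   /* CPUs firmware refused: policy, not a fault */
    uint64_t failed;   /* any other error */
};

static struct bringup_result bring_up_secondaries(unsigned int nr_cpus,
                                                  cpu_on_fn cpu_on)
{
    struct bringup_result r = { .online = 1 };   /* CPU0 is the boot CPU */

    assert(nr_cpus >= 1 && nr_cpus <= 64 && cpu_on);

    for (unsigned int cpu = 1; cpu < nr_cpus; cpu++) {
        int ret = cpu_on(cpu);

        if (ret == PSCI_SUCCESS)
            r.online |= 1ull << cpu;
        else if (ret == PSCI_DENIED)
            r.denied |= 1ull << cpu;   /* known immediately, no timeout needed */
        else
            r.failed |= 1ull << cpu;
    }
    return r;
}

/* Invented firmware policy for the example: only CPUs 1 and 2 are allowed. */
static int demo_firmware_cpu_on(unsigned int cpu)
{
    return (cpu == 1 || cpu == 2) ? PSCI_SUCCESS : PSCI_DENIED;
}
```

In this model the guest never consults the MADT to decide which CPUs to try; the denied set falls out of the return codes alone.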
Still using KVM, firmware should clear the enabled bit from _STA for any call after the eject-request. I see this is still 0xF before _EJx is called.
This is how the existing handshake protocol works. Firmware intimates the OS about ejection request which confirms its status about ejection-in-progress. Firmware/VMM then sends the ACPI event to remove the CPU from the kernel(which effectively is try-to-offline process and making not-present), Until this point _STA.ENABLED=1 (which is a correct behavior since firmware/VMM cannot disable the CPU before it has been made offline by the OS - can we snatch the physical resources before completing the housekeeping at OS?). Once offline'ing/removal is done, OS then informs the firmware/VMM about this completion using _EJ0 method (Which is effective way of giving green signal to the firmware/VMM to go ahead and remove the physical resources).
Okay, the scope is narrower than after the eject-request... The problem is that acpi_processor's remove method calls _STA, and finds "no change", so it does nothing. But it appears this is what Qemu does on x86 too.
Exactly. and this was the reason why below crash was happening during remove leg as arch_unregister_cpu() was not getting called and then during next processor_add() arch_register_cpu() would fail with below
Yes, linux was calling processor_remove and _EJ0 in the wrong order. I've fixed that, but now Qemu is reporting 'not present' via _STA, so acpi_processor_make_not_present() prints a nastygram about this not being supported, and arch_unregister_cpu() is never called.
The work to make a CPU not-present has not been defined, we must not paper over this.
For arm64 we know there are no physical hotplug systems, if we can get the OSI strings defined, we can call this a firmware-bug if it tries when the OS didn't advertise support. (for x86 that ship has sailed, so a nasty print message about support being disabled is all we can do)
Once _EJ0 has been called, I'd expect _STA to reflect the conclusion of the eject-request.
Yes, agreed. As I mentioned earlier, this change is pending in QEMU. I deferred it intentionally since it did not matter with respect to your changes, as the places where it was being used in your code would always have _STA.Enabled=1.
Fair enough, but I'd really expect _STA to be a reflection of Qemu's device-model. Maybe it's more complex than I think.
Linux needs to see _STA change, as otherwise it can't know which bits in _STA have changed.
Ok we can do that now. But even with this change in the QEMU, the remove leg in the kernel where it is being referred, flag _STA.Enabled would still be 1.
I've added a post_eject handler to ACPI's bus scanning stuff. That fixes it for me. https://gitlab.arm.com/linux-arm/linux-jm/-/tree/virtual_cpu_hotplug/rfc/v0....
It is better to use online-capable Bit
But it's static! It can't change. All you can use this for is to _guess_ that an eject-request is to make a CPU disabled.
It's much better to change linux to call _STA at the right point!
there or we would need to hack the existing handshake Hotplug protocol to fit the _STA.Enabled=0 which is not very clean as the eject-in-progress handling in OS should be exactly same as that of the physical Hotplug feature.
The existing processor remove code is equally broken: It assumes an eject-request is to make the CPU not-present.
The only difference is what we do with the _STA.PRESENT Bits later at the ACPI level in the firmware/VMM. It should be exposed as present by the VMM/firmware after _EJ0 method has been evaluated by the OS.
For arm64 yes.
Firmware then removes the device, and any further evaluation of _STA would show _STA.ENABLED=0 and ideally should show it as not present (but this is something which I have not changed in QEMU; maybe we need to, or does it even matter at the QEMU level?).
No, it looks like Linux is calling these in the wrong order. The ACPI processor driver needs to have its remove call made once the CPU is offline and _EJ0 has been called,...
I beg to differ on this. I think the order during eject-in-progress handling in the OS should be exactly the same as for physical hotplug.
Hence, the acpi_bus_trim() should come before acpi_evaluate_ej0().
I agree, but the reason is all the other drivers that have a detach callback, which needs to happen while the device is still present so that they can quiesce it.
This is really another phase, as some drivers need work to be done 'post eject'. The ACPI processor driver currently gets away with it by making assumptions.
acpi_scan_hot_remove(struct acpi_device *device) { [...]
- acpi_scan_try_to_offline(device); [...]
- acpi_bus_trim(device); /* processor_remove() gets called here */ [...]
- acpi_evaluate_ej0(handle); /* firmware will make _STA.Enabled=0, _STA.PRESENT=1(not doing right now) */ [...]
- acpi_evaluate_integer(handle, "_STA", NULL, &sta); /* this is not checking PRESENT Bit right now */ [...]
- acpi_evaluate_ost(...); /* NOT required still? */ [...]
_OST is very important for real systems that do this. _OST is the only way the firmware can know that the OS received the eject-request and has started work on it. _OST is the only way firmware can know if the eject-request failed for some reason. Without it, firmware has to guess (a timeout - how long?) whether the eject failed ... or is just taking longer ...
(I thought this is what qemu was doing on x86, but now that I test it again...!)
[..]
It looks like a post_eject callback on the scan handler is the right way to do this.
That would look like a hack, to be frank.
I think it's fair that drivers have some cleanup work that needs to be done after their device has been removed. Especially if they don't know how it's been 'ejected' until _EJ0 has completed. _STA is already called after _EJ0 to 'see' if the eject worked.
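To make the ordering being argued about concrete, here is a minimal user-space sketch, not kernel code. It assumes a simulated firmware whose _EJ0 clears only the enabled bit, and every name in it (fake_device, fake_scan_handler, fake_hot_remove and the callbacks) is hypothetical; in particular the post_eject signature is an assumption and is not taken from the branch linked above. It only shows detach running while the device is still enabled, then _EJ0, then _STA being re-read and handed to an optional post-eject step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* _STA bits; values match Linux's ACPI_STA_DEVICE_PRESENT/_ENABLED. */
#define STA_PRESENT 0x01u
#define STA_ENABLED 0x02u

/* Hypothetical stand-in for an ACPI device backed by simulated firmware. */
struct fake_device {
    uint32_t sta;   /* what the simulated firmware would return from _STA */
    char log[64];   /* records the order in which the steps ran */
};

/* Hypothetical stand-in for a scan handler with an optional post_eject. */
struct fake_scan_handler {
    void (*detach)(struct fake_device *dev);
    void (*post_eject)(struct fake_device *dev, uint32_t sta);
};

static void log_step(struct fake_device *dev, const char *step)
{
    strncat(dev->log, step, sizeof(dev->log) - strlen(dev->log) - 1);
}

/* Driver quiesces the device while it is still present and enabled. */
static void fake_detach(struct fake_device *dev)
{
    log_step(dev, "detach,");
}

/* Simulated _EJ0 for the virtual case: clears enabled, leaves present set. */
static void fake_ej0(struct fake_device *dev)
{
    dev->sta &= ~STA_ENABLED;
    log_step(dev, "ej0,");
}

/* Cleanup that depends on what _STA says after the eject. */
static void fake_post_eject(struct fake_device *dev, uint32_t sta)
{
    log_step(dev, (sta & STA_ENABLED) ? "post:still-enabled" : "post:disabled");
}

/* Same ordering as the hot-remove list above, plus the post-eject step. */
static void fake_hot_remove(struct fake_device *dev,
                            const struct fake_scan_handler *handler)
{
    assert(dev != NULL && handler != NULL && handler->detach != NULL);

    handler->detach(dev);        /* plays the role of acpi_bus_trim() */
    fake_ej0(dev);               /* plays the role of acpi_evaluate_ej0() */
    uint32_t sta = dev->sta;     /* plays the role of re-evaluating _STA */
    if (handler->post_eject)
        handler->post_eject(dev, sta);
}
```

The point of the sketch is only that the existing trim-before-_EJ0 order can stay as it is for drivers that need to quiesce, while work that depends on the result of the eject moves to a later step.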
I think the alternative would be to add a new disable-request that calls _DIS on completion. It would be even harder to make this look like hotplug to user-space.
(then again, all the world is not a VHS)
This means the cpu remains registered, and user-space doesn't get notified that the CPU is no longer available. User-space can still try to enable the CPU, which will fail.
That is true and to fix this we could use the online-capable Bit instead of using the _STA.ENABLED Bit status as discussed in the 5th Oct LOD meeting. I have done those changes in the above branch [1].
But the MADT online-capable bit is static, it only ever has one value. We need something that can be changed by firmware, (and the OS _knows_ is being changed by firmware). This is what _STA is for.
You can't guess from an eject-request that the CPU is being disabled. What happens when we need to add support for physical hot-remove?
Isn't the presence of online-capability (different architectures could interpret this differently, and they just need to add their bit) enough to take a different route from the physical hotplug path of hot-remove?
No - that bit was only needed by another operating system to prevent their boot code from choking on errors returned by PSCI.
We never did need it for Linux, and I don't think we should add code to carry it around to quirk how ACPI behaves.
You might want to check the changes below, which achieve the above.
Repository: https://github.com/salil-mehta/linux.git virt-cpuhp-arm64/rfc-v2/jmorse-pres-eq-poss-cpu
Patches:
  ACPI: APIs to check online-capability of the processor
  ACPI: Fix: Introduce online-capable check to fix remove cpu crash
- if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_ENABLED))
- if (cpu_present(pr->id) && acpi_check_online_capable(pr)) {
arch_unregister_cpu(pr->id);
- } else {
pr_err_once(FW_BUG "CPU%u not online-capable is being made not present\n", pr->id);
add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
- }
But if you don't read _STA, you don't know whether the present bit or the enabled bit is being toggled. How are stable kernels with this code supposed to know on hardware that supports physical hot-remove? What about platforms that support both!?
Don't guess - call _STA to know for sure. acpi_scan_hot_remove() already calls _STA after _EJ0.
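As a rough illustration of "call _STA to know for sure", the standalone sketch below compares _STA before and after _EJ0 and reports which bit was cleared. It is not kernel code: the enum and the classify_eject() helper are hypothetical names made up for this example, and only the two bit values are taken from Linux's ACPI_STA_DEVICE_PRESENT and ACPI_STA_DEVICE_ENABLED.

```c
#include <assert.h>
#include <stdint.h>

/* _STA bits; values match Linux's ACPI_STA_DEVICE_PRESENT/_ENABLED. */
#define STA_PRESENT 0x01u
#define STA_ENABLED 0x02u

/* Hypothetical result type for this example only. */
enum eject_result {
    EJECT_NO_CHANGE,     /* neither bit was cleared by the eject */
    EJECT_CPU_DISABLED,  /* enabled cleared, still present (virtual case) */
    EJECT_CPU_REMOVED    /* present cleared (physical hot-remove) */
};

/* Decide what the eject did from the bits that went from 1 to 0. */
static enum eject_result classify_eject(uint32_t sta_before, uint32_t sta_after)
{
    uint32_t cleared = sta_before & ~sta_after;

    if (cleared & STA_PRESENT)
        return EJECT_CPU_REMOVED;
    if (cleared & STA_ENABLED)
        return EJECT_CPU_DISABLED;
    return EJECT_NO_CHANGE;
}
```

With something along these lines, the same code path would cope with a platform that clears the present bit, one that clears only the enabled bit, or one that supports both, without consulting a static MADT flag.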
Thanks,
James
linaro-open-discussions@op-lists.linaro.org