> From: Linaro-open-discussions
> [mailto:linaro-open-discussions-bounces@op-lists.linaro.org] On Behalf Of
> Lorenzo Pieralisi via Linaro-open-discussions
> Sent: Tuesday, June 22, 2021 10:39 AM
>
> On Mon, Jun 21, 2021 at 09:33:42PM +0000, Song Bao Hua (Barry Song) wrote:
> >
> >
> > > -----Original Message-----
> > > From: Linaro-open-discussions
> > > [mailto:linaro-open-discussions-bounces@op-lists.linaro.org] On Behalf Of
> > > Lorenzo Pieralisi via Linaro-open-discussions
> > > Sent: Monday, June 21, 2021 9:32 PM
> > > To: Jammy Zhou <jammy.zhou(a)linaro.org>
> > > Cc: Lorenzo Pieralisi via Linaro-open-discussions
> > > <linaro-open-discussions(a)op-lists.linaro.org>
> > > Subject: Re: [Linaro-open-discussions] LOD Meeting Agenda for June 28
> > >
> > > On Mon, Jun 21, 2021 at 12:37:03AM +0000, Jammy Zhou via Linaro-open-discussions
> > > wrote:
> > > > Hi all,
> > > >
> > > > We are only one week away from the next LOD meeting on June 28. If you
> > > > have any topic to discuss, please let me know.
> > >
> > > We could discuss the virt CPU hotplug status/forward plan.
> > >
> > > I don't know if Barry's scheduler discussion can benefit from an LOD
> > > session, please chime in if that's the case.
> >
> > Hi Lorenzo,
> > Thanks for asking. I have no important updates at this time. That
> > discussion is mainly for co-working with Tim Chen on the plan of
> > upstreaming and benchmarking, also figuring out whether the patches
> > are leading to the same result on arm and x86 platforms.
> >
> > >
> > > AOB ?
>
> Ok. So, anything else to discuss other than the PSCI KVM patches
> that are currently on ML ? Jonathan, Salil, anyone please do let
> me know.
Hi Lorenzo,
Do you think it would be useful and feasible for you to present the kernel changes
for vCPU hotplug at the upcoming Linux Plumbers Conference 2021?
There are issues like the ones below which might need the involvement of a larger audience.
1. Kernel sizing over possible vCPUs (some patches by Bharath Rao are floating around)
2. Any use cases which might be impacted by this change?
3. Issues related to QEMU (legal?), etc.
The conference is in September, so there is plenty of time to push the patches in
the meantime.
The proposal submission cutoff date is 25 June 2021.
Thanks
Salil
>
> Thanks,
> Lorenzo
> --
> Linaro-open-discussions mailing list
> https://collaborate.linaro.org/display/LOD/Linaro+Open+Discussions+Home
> https://op-lists.linaro.org/mailman/listinfo/linaro-open-discussions
Hi Yicong and Tim,
This is the 2nd patchset following the 1st one:
https://op-lists.linaro.org/pipermail/linaro-open-discussions/2021-June/000…
While the 1st patchset focused only on the spreading path, this patchset
is mainly for the packing path.
I have only tested tbench4 on one NUMA node. With the spreading path only
(without this patchset), I am seeing up to a 5% performance decrease on
tbench4; with it, I see up to a 28% performance increase, compared to the
case without the cluster scheduler.
I am running the benchmark with Mel's mmtests using this config file:
configs/config-scheduler-schbench-1numa
# MM Test Parameters
export MMTESTS="tbench4"
# List of monitors
export RUN_MONITOR=yes
export MONITORS_ALWAYS=
export MONITORS_GZIP="proc-vmstat mpstat"
export MONITORS_WITH_LATENCY="vmstat"
export MONITOR_UPDATE_FREQUENCY=10
# TBench
export TBENCH_DURATION=60
export TBENCH_MIN_CLIENTS=1
export TBENCH_MAX_CLIENTS=96
with commands like:
numactl -N 0 -m 0 ./run-mmtests.sh --no-monitor -c configs/config-scheduler-schbench-1numa testtag
My machine has 4 NUMA nodes; each node has 24 cores (6 clusters).
Hopefully we will add more benchmark cases like pgbench, hackbench,
etc. on both one NUMA node and four NUMA nodes.
Hi Yicong,
Note we might need to test the case where jump label (CONFIG_JUMP_LABEL) is disabled.
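For reference, a minimal sketch of the kind of static-key guard patch 1 could add,
modelled on the existing sched_smt_present key; the sched_cluster_present name is
an assumption, not necessarily what the patch uses. With CONFIG_JUMP_LABEL disabled,
static_branch_likely() falls back to reading an atomic counter instead of a patched
branch, which is why that configuration deserves its own test run:

#include <linux/jump_label.h>

/* Hypothetical sketch only; names are assumptions, not taken from the patches. */
DEFINE_STATIC_KEY_FALSE(sched_cluster_present);

static inline bool sched_cluster_active(void)
{
        /* Patched branch with CONFIG_JUMP_LABEL=y; plain atomic read without it. */
        return static_branch_likely(&sched_cluster_present);
}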
Thanks
Barry
Barry Song (4):
sched: Add infrastructure to describe if cluster scheduler is really
running
sched: Add per_cpu cluster domain info and cpus_share_cluster API
sched/fair: Scan cluster before scanning llc in wake-up path
sched/fair: Use cpus_share_cluster to further pull wakee
include/linux/sched/cluster.h | 19 ++++++++++++++
include/linux/sched/sd_flags.h | 9 +++++++
include/linux/sched/topology.h | 8 +++++-
kernel/sched/core.c | 28 ++++++++++++++++++++
kernel/sched/fair.c | 58 +++++++++++++++++++++++++++++++++++++++---
kernel/sched/sched.h | 3 +++
kernel/sched/topology.c | 11 ++++++++
7 files changed, 131 insertions(+), 5 deletions(-)
create mode 100644 include/linux/sched/cluster.h
--
1.8.3.1
On Tue, 22 Jun 2021 12:17:33 +0000
Jonathan Cameron via Linaro-open-discussions <linaro-open-discussions(a)op-lists.linaro.org> wrote:
> On Tue, 22 Jun 2021 09:39:15 +0000
> Lorenzo Pieralisi via Linaro-open-discussions <linaro-open-discussions(a)op-lists.linaro.org> wrote:
>
> > On Mon, Jun 21, 2021 at 09:33:42PM +0000, Song Bao Hua (Barry Song) wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Linaro-open-discussions
> > > > [mailto:linaro-open-discussions-bounces@op-lists.linaro.org] On Behalf Of
> > > > Lorenzo Pieralisi via Linaro-open-discussions
> > > > Sent: Monday, June 21, 2021 9:32 PM
> > > > To: Jammy Zhou <jammy.zhou(a)linaro.org>
> > > > Cc: Lorenzo Pieralisi via Linaro-open-discussions
> > > > <linaro-open-discussions(a)op-lists.linaro.org>
> > > > Subject: Re: [Linaro-open-discussions] LOD Meeting Agenda for June 28
> > > >
> > > > On Mon, Jun 21, 2021 at 12:37:03AM +0000, Jammy Zhou via Linaro-open-discussions
> > > > wrote:
> > > > > Hi all,
> > > > >
> > > > > We are only one week away from the next LOD meeting on June 28. If you
> > > > > have any topic to discuss, please let me know.
> > > >
> > > > We could discuss the virt CPU hotplug status/forward plan.
> > > >
> > > > I don't know if Barry's scheduler discussion can benefit from an LOD
> > > > session, please chime in if that's the case.
> > >
> > > Hi Lorenzo,
> > > Thanks for asking. I have no important updates at this time. That
> > > discussion is mainly for co-working with Tim Chen on the plan of
> > > upstreaming and benchmarking, also figuring out whether the patches
> > > are leading to the same result on arm and x86 platforms.
> > >
> > > >
> > > > AOB ?
> >
> > Ok. So, anything else to discuss other than the PSCI KVM patches
> > that are currently on ML ? Jonathan, Salil, anyone please do let
> > me know.
> >
>
> On the near term horizon is the confidential compute stuff that there
> is an event about tomorrow, but next week feels a little early to try
> and have a detailed discussion about that (I assume)?
>
> A few other things will surface by next month, but not next week ;)
>
> Otherwise, nothing immediate jumps out to me, beyond any useful
> discussion that can be had around the PSCI KVM patches you mention.
>
> Thanks,
>
> Jonathan
>
Leaving that one aside, it seems the main open question from our side
is around planning / progress on steps for the vCPU hotplug.
Perhaps that's something best formulated over email...
My current understanding is that the next item would be the guest kernel
changes? Perhaps a good target for sending out an RFC on that would
be just after the merge window?
Thanks,
Jonathan
>
>
>
> > Thanks,
> > Lorenzo
>
As planned during the vCPU hot-add discussions from previous LOD
meetings, this prototype lets userspace handle PSCI calls from a guest.
The vCPU hot-add model preferred by Arm presents all possible resources
through ACPI at boot time, only marking unavailable vCPUs as hidden.
The VMM prevents bringing up those vCPUs by rejecting PSCI CPU_ON calls.
This keeps things simple for vCPU scaling enablement, while
leaving the door open for hardware CPU hot-add.
This series focuses on moving PSCI support into userspace. Patches 1-3
allow userspace to request WFI to be executed by KVM. That way the VMM
can easily implement the CPU_SUSPEND function, which is mandatory from
PSCI v0.2 onwards (even if it doesn't have a more useful implementation
than WFI, natively available to the guest). An alternative would be to
poll the vGIC implemented in KVM for interrupts, but I haven't explored
that solution. Patches 4 and 5 let the VMM request PSCI calls.
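For illustration only, a rough VMM-side sketch of the CPU_ON path, assuming the
forwarded call shows up as a KVM_EXIT_HYPERCALL exit; the exit layout and the
vmm_* helpers below are assumptions for the sketch, not the interface this series
actually adds:

#include <stdbool.h>
#include <linux/kvm.h>
#include <linux/psci.h>

/* Assumed VMM bookkeeping helpers, for the sketch only. */
bool vmm_vcpu_is_present(__u64 target_mpidr);
long vmm_cpu_on(__u64 target_mpidr, __u64 entry, __u64 context_id);

static void handle_psci_exit(struct kvm_run *run)
{
        __u64 *args = run->hypercall.args;

        switch (run->hypercall.nr) {
        case PSCI_0_2_FN64_CPU_ON:
                if (!vmm_vcpu_is_present(args[0]))      /* args[0]: target MPIDR */
                        run->hypercall.ret = PSCI_RET_NOT_PRESENT;
                else
                        run->hypercall.ret = vmm_cpu_on(args[0], args[1], args[2]);
                break;
        default:
                run->hypercall.ret = PSCI_RET_NOT_SUPPORTED;
        }
}

The idea, as described above, is that NOT_PRESENT lets the guest tell a hidden
vCPU apart from a hard failure.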
The guest needs additional support to deal with hidden CPUs and to
gracefully handle the "NOT_PRESENT" return value from PSCI CPU_ON.
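For illustration, one place the guest could surface that value is
psci_to_linux_errno() in drivers/firmware/psci/psci.c; the existing cases below
follow mainline, and the NOT_PRESENT mapping is only a sketch of a possible
change, not what the prototype does:

#include <linux/errno.h>
#include <uapi/linux/psci.h>

static int psci_to_linux_errno(int errno)
{
        switch (errno) {
        case PSCI_RET_SUCCESS:
                return 0;
        case PSCI_RET_NOT_SUPPORTED:
                return -EOPNOTSUPP;
        case PSCI_RET_INVALID_PARAMS:
        case PSCI_RET_INVALID_ADDRESS:
                return -EINVAL;
        case PSCI_RET_DENIED:
                return -EPERM;
        case PSCI_RET_NOT_PRESENT:
                /* Sketch: let cpu_psci_cpu_boot() tell a hidden vCPU apart. */
                return -ENODEV;
        }

        return -EINVAL;
}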
The full prototype can be found here:
https://jpbrucker.net/git/linux/log/?h=cpuhp/devel
https://jpbrucker.net/git/qemu/log/?h=cpuhp/devel
Jean-Philippe Brucker (5):
KVM: arm64: Replace power_off with mp_state in struct kvm_vcpu_arch
KVM: arm64: Move WFI execution to check_vcpu_requests()
KVM: arm64: Allow userspace to request WFI
KVM: arm64: Pass hypercalls to userspace
KVM: arm64: Pass PSCI calls to userspace
Documentation/virt/kvm/api.rst | 46 +++++++++++++++----
Documentation/virt/kvm/arm/psci.rst | 1 +
arch/arm64/include/asm/kvm_host.h | 10 ++++-
include/kvm/arm_hypercalls.h | 1 +
include/kvm/arm_psci.h | 4 ++
include/uapi/linux/kvm.h | 3 ++
arch/arm64/kvm/arm.c | 66 +++++++++++++++++++--------
arch/arm64/kvm/handle_exit.c | 3 +-
arch/arm64/kvm/hypercalls.c | 28 +++++++++++-
arch/arm64/kvm/psci.c | 69 ++++++++++++++---------------
10 files changed, 165 insertions(+), 66 deletions(-)
--
2.31.1
On Tue, 22 Jun 2021 09:39:15 +0000
Lorenzo Pieralisi via Linaro-open-discussions <linaro-open-discussions(a)op-lists.linaro.org> wrote:
> On Mon, Jun 21, 2021 at 09:33:42PM +0000, Song Bao Hua (Barry Song) wrote:
> >
> >
> > > -----Original Message-----
> > > From: Linaro-open-discussions
> > > [mailto:linaro-open-discussions-bounces@op-lists.linaro.org] On Behalf Of
> > > Lorenzo Pieralisi via Linaro-open-discussions
> > > Sent: Monday, June 21, 2021 9:32 PM
> > > To: Jammy Zhou <jammy.zhou(a)linaro.org>
> > > Cc: Lorenzo Pieralisi via Linaro-open-discussions
> > > <linaro-open-discussions(a)op-lists.linaro.org>
> > > Subject: Re: [Linaro-open-discussions] LOD Meeting Agenda for June 28
> > >
> > > On Mon, Jun 21, 2021 at 12:37:03AM +0000, Jammy Zhou via Linaro-open-discussions
> > > wrote:
> > > > Hi all,
> > > >
> > > > We are only one week away from the next LOD meeting on June 28. If you
> > > > have any topic to discuss, please let me know.
> > >
> > > We could discuss the virt CPU hotplug status/forward plan.
> > >
> > > I don't know if Barry's scheduler discussion can benefit from an LOD
> > > session, please chime in if that's the case.
> >
> > Hi Lorenzo,
> > Thanks for asking. I have no important updates at this time. That
> > discussion is mainly for co-working with Tim Chen on the plan of
> > upstreaming and benchmarking, also figuring out whether the patches
> > are leading to the same result on arm and x86 platforms.
> >
> > >
> > > AOB ?
>
> Ok. So, anything else to discuss other than the PSCI KVM patches
> that are currently on ML ? Jonathan, Salil, anyone please do let
> me know.
>
On the near-term horizon is the confidential compute work, for which there
is an event tomorrow, but next week feels a little early to try to have a
detailed discussion about that (I assume)?
A few other things will surface by next month, but not next week ;)
Otherwise, nothing immediate jumps out to me, beyond any useful
discussion that can be had around the PSCI KVM patches you mention.
Thanks,
Jonathan
> Thanks,
> Lorenzo
> -----Original Message-----
> From: Linaro-open-discussions
> [mailto:linaro-open-discussions-bounces@op-lists.linaro.org] On Behalf Of
> Lorenzo Pieralisi via Linaro-open-discussions
> Sent: Monday, June 21, 2021 9:32 PM
> To: Jammy Zhou <jammy.zhou(a)linaro.org>
> Cc: Lorenzo Pieralisi via Linaro-open-discussions
> <linaro-open-discussions(a)op-lists.linaro.org>
> Subject: Re: [Linaro-open-discussions] LOD Meeting Agenda for June 28
>
> On Mon, Jun 21, 2021 at 12:37:03AM +0000, Jammy Zhou via Linaro-open-discussions
> wrote:
> > Hi all,
> >
> > We are only one week away from the next LOD meeting on June 28. If you
> > have any topic to discuss, please let me know.
>
> We could discuss the virt CPU hotplug status/forward plan.
>
> I don't know if Barry's scheduler discussion can benefit from an LOD
> session, please chime in if that's the case.
Hi Lorenzo,
Thanks for asking. I have no important updates at this time. That
discussion is mainly about working with Tim Chen on the plan for
upstreaming and benchmarking, and figuring out whether the patches
lead to the same results on Arm and x86 platforms.
>
> AOB ?
>
> Lorenzo
> --
> Linaro-open-discussions mailing list
> https://collaborate.linaro.org/display/LOD/Linaro+Open+Discussions+Home
> https://op-lists.linaro.org/mailman/listinfo/linaro-open-discussions
Thanks
Barry
On Mon, Jun 21, 2021 at 12:37:03AM +0000, Jammy Zhou via Linaro-open-discussions wrote:
> Hi all,
>
> We are only one week away from the next LOD meeting on June 28. If you
> have any topic to discuss, please let me know.
We could discuss the virt CPU hotplug status/forward plan.
I don't know if Barry's scheduler discussion can benefit from an LOD
session, please chime in if that's the case.
AOB ?
Lorenzo
> -----Original Message-----
> From: Linaro-open-discussions
> [mailto:linaro-open-discussions-bounces@op-lists.linaro.org] On Behalf Of
> James Morse via Linaro-open-discussions
> Sent: 04 June 2021 17:56
> To: Jean-Philippe Brucker <jean-philippe(a)linaro.org>;
> linaro-open-discussions(a)op-lists.linaro.org
> Subject: Re: [Linaro-open-discussions] [RFC linux 0/5] KVM: arm64: Let
> userspace handle PSCI
>
> 'lo
>
> On 20/05/2021 14:07, Jean-Philippe Brucker wrote:
> > As planned during the vCPU hot-add discussions from previous LOD
> > meetings, this prototype lets userspace handle PSCI calls from a guest.
> >
> > The vCPU hot-add model preferred by Arm presents all possible resources
> > through ACPI at boot time, only marking unavailable vCPUs as hidden.
> > The VMM prevents bringing up those vCPUs by rejecting PSCI CPU_ON calls.
> > This allows to keep things simple for vCPU scaling enablement, while
> > leaving the door open for hardware CPU hot-add.
> >
> > This series focuses on moving PSCI support into userspace. Patches 1-3
> > allow userspace to request WFI to be executed by KVM. That way the VMM
> > can easily implement the CPU_SUSPEND function, which is mandatory from
> > PSCI v0.2 onwards (even if it doesn't have a more useful implementation
> > than WFI, natively available to the guest). An alternative would be to
> > poll the vGIC implemented in KVM for interrupts, but I haven't explored
> > that solution. Patches 4 and 5 let the VMM request PSCI calls.
>
> As mentioned on the call, I've tested the udev output on x86 and arm64, as
> expected its
> the same:
> | root@vm:~# udevadm monitor
> | monitor will print the received events for:
> | UDEV - the event which udev sends out after rule processing
> | KERNEL - the kernel uevent
> |
> | KERNEL[33.935817] add /devices/system/cpu/cpu1 (cpu)
> | KERNEL[33.946333] bind /devices/system/cpu/cpu1 (cpu)
> | UDEV [33.953251] add /devices/system/cpu/cpu1 (cpu)
> | UDEV [33.958676] bind /devices/system/cpu/cpu1 (cpu)
>
>
> (I've not played with the KVM changes yet)
I also had a little play on my setup with the cpuhp kernel and QEMU,
and added the below udev rule to online CPUs by default.
SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1"
And on,
(qemu) device_add host-arm-cpu,id=core2,core-id=2
KERNEL[266.623545] add /devices/system/cpu/cpu2 (cpu)
KERNEL[266.686160] online /devices/system/cpu/cpu2 (cpu)
UDEV [266.691808] add /devices/system/cpu/cpu2 (cpu)
UDEV [266.692216] online /devices/system/cpu/cpu2 (cpu)
The new CPU now comes online without explicitly onlining it.
But with a guest kernel image without cpuhp support, the behavior
is different (obviously!).
On boot, it reports failures,
[ 0.981712] psci: failed to boot CPU2 (-22)
[ 0.982438] CPU2: failed to boot: -22
But all the cpus are visible under,
root@ubuntu:~# cat /sys/devices/system/cpu/
cpu0/ cpu5/ kernel_max power/
cpu1/ cpufreq/ modalias present
cpu2/ cpuidle/ offline smt/
cpu3/ hotplug/ online uevent
cpu4/ isolated possible vulnerabilities/
And on,
(qemu) device_add host-arm-cpu,id=core2,core-id=2
No udev event "add" is reported.
So you have to explicitly online the new CPU in this case:
root@ubuntu:~# echo 1 >/sys/devices/system/cpu/cpu2/online
KERNEL[357.211520] online /devices/system/cpu/cpu2 (cpu)
UDEV [357.213550] online /devices/system/cpu/cpu2 (cpu)
And the new CPU becomes available to the VM.
I am not sure whether this is a major concern, or whether there are better
ways to handle this more gracefully in QEMU (a warning, or preventing
hot-adding CPUs, etc.).
Thanks,
Shameer
On Thu, 29 Apr 2021 15:17:36 +0000
Jonathan Cameron via Linaro-open-discussions <linaro-open-discussions(a)op-lists.linaro.org> wrote:
> On Thu, 29 Apr 2021 15:25:38 +0100
> Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com> wrote:
>
> > On Thu, Apr 29, 2021 at 09:50:06AM +0100, Jonathan Cameron wrote:
> > > > is not really working for me. If you have a command script to share
> > > > it is welcome - looking forward to testing and reviewing the DOE
> > > > patches.
> > >
> > > I'm mostly running aarch64 emulated on top of x86 - I should sanity check
> > > it on KVM at some point.
> > >
> > > I'm copy typing this across machines, so whilst I hope there are no typos
> > > there might be. I've stripped back my normal case (which has a big
> > > complex topology to try and hit corner cases...)
> > > First I'd suggest checking that you have the right EDK2 etc. to bring up pxb with a normal PCI
> > > rp and a device.
> > >
> > > qemu-system-aarch64 -M virt,nvdimm=on -m 4g,maxmem=8G,slots=2 -cpu max -smp 4 \
> > > -kernel Image \
> > > -drive if=none,file=full.qcow2,format=qcow2,id=hd \
> > > -nographic -no-reboot -append 'earlycon root=/dev/vda2 fsck.mode=skip' \
> > > -bios QEMU_EFI.fd \
> > > -object memory-backend-ram,size=4G,id=mem0 \
> > > -numa node,nodeid=0,cpus=0-3,memdev=mem0 \
> > > -object memory-backend-file,id=cxl-mem1,share,mem-path=/tmp/cxltest.raw,size=2G,align=2G \
> > > -device pxb-cxl,bus_nr=128,id=cxl1.1,uid=0,len-window-base=1,window-base[0]=0x4c0000000,memdev[0]=cxl-mem1 \
> > > -device cxl-rp,bus=cxl1.1,id=root_port13,chassis=0,slot=1 \
> > > -device cxl-type3,bus=root_port13,memdev=cxl-mem1,lsa=cxl-mem1,id=cxl-pmem0,size=2G
> > > # Notes: QEMU_EFI.fd needs the pxb enablement patches (maybe upstream by now), and the
> > > # window-base range just needs to not trample on anything.
> >
> > Done, thank you very much. Is there a commit base on top of which I
> > can apply Dan's CXL port enumeration patches + your DOE series ?
>
> Nope. Looks like I need to send out a rebase as some of Dan's other patches hit
> mainline. Just did the merge to see how bad it is and other than a goto label
> having disappeared (and a bunch of fuzz) seems to go in fairly cleanly.
>
> I'm not sure which order the various Intel patches currently on list would apply in.
> I tried a few possible orders and got issues. Looks like they need to rebase as well.
>
> Best of all mainline is booting on this setup mid merge window which is always
> a pleasant surprise :)
>
> I'll wait to send the rebase until near the end or after the merge window.
> We'll probably still have some issues with merging until we have an order in which
> various series will merge.
>
I sent the updated DOE patches out today.
https://lore.kernel.org/linux-pci/20210524133938.2815206-1-Jonathan.Cameron…
There is still a bit of merge mess if you also pick up Dan's Port series.
I was waiting for that to merge, but as it seems to be going slowly I went
ahead with the DOE update. There is a lot of churn in the CXL code at the moment.
Also, a highly dubious blog / setup guide can be found at:
https://people.kernel.org/jic23/howto-test-cxl-enablement-on-arm64-using-qe…
May well eat babies.
Jonathan
> J
>
> >
> > Thanks a lot Jonathan,
> > Lorenzo
> >
> > > Hmm. Perhaps I should just write a blog post and include all the random corners needed
> > > to get this up. Problem then is I'd actually have to figure out what some of the
> > > parts are doing having long forgotten the answer so might take a day or two.
> > >
> > > In meantime we can carry on here.
> > >
> > > J
> > >
> > >
> > >
> > >
> > > >
> > > > Thanks,
> > > > Lorenzo
> > > >
> > > > > >
> > > > > > I remember reading there is an IRC channel cxl related (#cxl @ OFTC ?) -
> > > > > > if there is happy to switch to it rather than bothering you with these
> > > > > > queries.
> > > > >
> > > > > I tend to avoid IRC because of potential auditing issues (no logs) so
> > > > > email is the way to go.
> > > > >
> > > > > >
> > > > > > Thanks a lot !
> > > > > > Lorenzo
> > > > > >
> > > > > > > As there are some US folks who are interested in this topic (but super busy),
> > > > > > > can we do a straw poll of whether any of them can make a call on Monday?
> > > > > > >
> > > > > > > If not go for a more China/Europe friendly time?
> > > > > > >
> > > > > > > We had a few more topics brewing, but I'm not sure they will be in a state
> > > > > > > to discuss next week (or to give anyone else time to think about them
> > > > > > > in advance).
> > > > > > >
> > > > > > > Obviously good to touch on any updates to older topics as well if anyone
> > > > > > > has any!
> > > > > > >
> > > > > > > Jonathan
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Lorenzo
> > > > > > >
> > > > >
> > >
>