Hi Yicong and Tim,

This is the 2nd patchset, following the 1st one:
https://op-lists.linaro.org/pipermail/linaro-open-discussions/2021-June/0001...
While the 1st patchset focused only on the spreading path, this patchset is mainly for the packing path. I have only tested tbench4 on one NUMA node. Without this patchset, I am seeing up to a 5% performance decrease on tbench4 from the spreading path alone; with it, I see up to a 28% performance increase, compared to the case without the cluster scheduler. I am running the benchmark with Mel's mmtests and this config file, configs/config-scheduler-schbench-1numa:

# MM Test Parameters
export MMTESTS="tbench4"

# List of monitors
export RUN_MONITOR=yes
export MONITORS_ALWAYS=
export MONITORS_GZIP="proc-vmstat mpstat"
export MONITORS_WITH_LATENCY="vmstat"
export MONITOR_UPDATE_FREQUENCY=10

# TBench
export TBENCH_DURATION=60
export TBENCH_MIN_CLIENTS=1
export TBENCH_MAX_CLIENTS=96
with a command like:

numactl -N 0 -m 0 ./run-mmtests.sh --no-monitor -c configs/config-scheduler-schbench-1numa testtag
My machine has 4 NUMA nodes; each node has 24 cores (6 clusters).
Hopefully, we are going to get more benchmark cases, such as pgbench and hackbench, on both one NUMA node and four NUMA nodes.
Hi Yicong, note that we might also need to test the case where jump label is disabled.
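Since the cluster-active flag presumably sits behind a static key, a quick way to see whether the running kernel was built with jump labels at all is to look for CONFIG_JUMP_LABEL in the kernel config. This is just a convenience sketch; it assumes the distro ships a /boot/config-* file and falls back to a hint otherwise:

```shell
# Check whether the running kernel has jump labels compiled in.
# Assumes a /boot/config-$(uname -r) file; adjust for /proc/config.gz
# or your build tree's .config if your distro lays things out differently.
grep CONFIG_JUMP_LABEL "/boot/config-$(uname -r)" 2>/dev/null \
    || echo "no /boot config found; check CONFIG_JUMP_LABEL in your .config"
```

A kernel with CONFIG_JUMP_LABEL=n would exercise the non-static-key fallback path, which is the case worth covering in testing.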
Thanks
Barry
Barry Song (4):
  sched: Add infrastructure to describe if cluster scheduler is really running
  sched: Add per_cpu cluster domain info and cpus_share_cluster API
  sched/fair: Scan cluster before scanning llc in wake-up path
  sched/fair: Use cpus_share_cluster to further pull wakee
 include/linux/sched/cluster.h  | 19 ++++++++++++++
 include/linux/sched/sd_flags.h |  9 +++++++
 include/linux/sched/topology.h |  8 +++++-
 kernel/sched/core.c            | 28 ++++++++++++++++++++
 kernel/sched/fair.c            | 58 +++++++++++++++++++++++++++++++++++++++---
 kernel/sched/sched.h           |  3 +++
 kernel/sched/topology.c        | 11 ++++++++
 7 files changed, 131 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/sched/cluster.h