Hi Hesham,
On 24/11/2022 12:29, James Morse via Linaro-open-discussions wrote:
> On 24/11/2022 12:21, Hesham Almatary wrote:
>> Thanks for explaining that. This is indeed my case: 4 different NUMA nodes,
>> each with a memory controller and an MSC. No RIS support and no
>> channels/slices. So for my understanding, and please correct me if I am
>> wrong, you are saying that each memory controller/MSC (each with a different
>> 'proximity domain') in my case should be in its own class? If yes, this
>> means that each one should have the same type (MPAM_CLASS_MEMORY) but a
>> different class id?
> AH - you're right. I think I was off by one in how the driver structures
> these things. (Maybe I need to expand those comments for my own benefit!)
> Let me try and reproduce what you are seeing to work out what it should be
> doing. (Then I'll get back to your reply.) I have ACPI tables to test this
> with, combining/splitting the FVP's caches...
I've reproduced something like your setup in the FVP, but I can't reproduce the problem you describe:
| root@(none):/sys/fs/resctrl# cat schemata
| MB:0=100;1=100
I have found one issue with cpumask_of_node() - it only knows about online CPUs, meaning the sanity check that a class covers all possible CPUs fails if not all CPUs are online, and no MB resources are created. This doesn't match what you described, but it might be relevant. Does [0] help?
Your diff is creating domains when any CPU comes online ... this is the wrong thing to do, and would be an ABI break for resctrl. If all the CPUs in a domain are offline, it is supposed to disappear from the schemata file. When a CPU in the domain returns, it should re-create the domain, or at least reset all the configuration.
If [0] doesn't change things, could you share your MPAM and SRAT ACPI tables? I suspect something is going wrong when working out the MSC affinity. The output of:
| find /sys/kernel/debug/mpam/ -type f -print -exec cat {} \;
might help me decode your tables. Feel free to send these things in private if that's going to be easier. (I just won't be allowed to reply.)
Thanks,
James
[0] Don't use cpumask_of_node, it only works for online CPUs
--------------------%<--------------------
diff --git a/drivers/platform/mpam/mpam_devices.c b/drivers/platform/mpam/mpam_devices.c
index c1b2bd9caa29..d4853c567b4e 100644
--- a/drivers/platform/mpam/mpam_devices.c
+++ b/drivers/platform/mpam/mpam_devices.c
@@ -418,6 +418,21 @@ static int get_cpumask_from_cache_id(u32 cache_id, u32 cache_level,
 	return 0;
 }
+
+/*
+ * cpumask_of_node() only knows about online CPUs. This can't tell us whether
+ * a class is represented on all possible CPUs.
+ */
+static void get_cpumask_from_node_id(u32 node_id, cpumask_t *affinity)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		if (node_id == cpu_to_node(cpu))
+			cpumask_set_cpu(cpu, affinity);
+	}
+}
+
 static int get_cpumask_from_cache(struct device_node *cache,
 				  cpumask_t *affinity)
 {
@@ -460,7 +475,7 @@ static int mpam_ris_get_affinity(struct mpam_msc *msc, cpumask_t *affinity,
 		break;
 	case MPAM_CLASS_MEMORY:
-		*affinity = *cpumask_of_node(comp->comp_id);
+		get_cpumask_from_node_id(comp->comp_id, affinity);
 		if (cpumask_empty(affinity))
 			pr_warn_once("%s no CPUs associated with memory node",
 				     dev_name(&msc->pdev->dev));
--------------------%<--------------------