Hello James,
Thanks for getting back to me on this. I have done some changes to my ACPI tables and got your code working fine without this patch. In particular, I matched the proximity domain field with the NUMA node ID for each memory controller. If they differ, the code won't work (as it has the assumption that the proximity domain is the same as NUMA id, from which the affinity/accessibility is set). This leaves me with a few questions regarding the design and implementation of the driver. I'd appreciate your input on that.
1) What does a memory MSC correspond to? A class (with a unique ID) or a component? From the code, it seems like it maps to a component to me.
2) Could we have a use case in which we have different class IDs with the same class type? If yes could you please give an example?
3) What should a component ID for a memory MSC be/represent? The code assumes it's a (NUMA?) node ID.
4) What should a class ID represent for a memory MSC? Which is different from the class type itself.
5) How would 4 memory MSCs (with different proximity domains) map to classes and components?
6) How would 2 Memory MSCs with(in) the same proximity domain and/or same NUMA node work, if at all?
7) Should the ACPI/MPAM MSC's "identifier" field be mapped to class IDs or component IDs at all?
Regards, Hesham
On Wed, 21 Dec 2022 13:54:01 +0000 James Morse james.morse@arm.com wrote:
Hi Hesham,
On 24/11/2022 12:29, James Morse via Linaro-open-discussions wrote:
On 24/11/2022 12:21, Hesham Almatary wrote:
Thanks for explaining that. This is indeed my case; 4 different NUMA nodes, each with a memory controller and an MSC. No RIS support and no channel/slices. So for my understanding, and please correct me if I am wrong, you are saying that each memory controller/MSC (each with a different 'proximity domain') in my case should be on its own class? If yes, this means that each one should have the same type (MPAM_CLASS_MEMORY), but different class id?
AH - you're right. I think I was off by one in how the driver structures these things. (Maybe I need to expand those comments for my own benefit!)
Let me try and reproduce what you are seeing to work out what it should be doing. (then I'll get back to your reply). I have ACPI tables to test this with combining/splitting the FVP's 'cache's...
I've reproduced something like your setup in the FVP, but I can't reproduce the problem you describe: | root@(none):/sys/fs/resctrl# cat schemata | MB:0=100;1=100
I have found one issue with cpumask_of_node() - it only knows about online CPUs meaning the sanity checks that a class covers all possible CPUs fails if not all CPUs are online, and no MB resources are created. This doesn't match what you described, but it might be relevant. Does [0] help?
Your diff is creating domains when any CPU comes online ... this is the wrong thing to do, and would be an ABI break for resctrl. If all the CPUs in a domain are offline, it is supposed to disappear from the schemata file. When a CPU in the domain returns, it should re-create the domain, or at least reset all the configuration.
If [0] doesn't change things, could you share your MPAM and SRAT ACPI tables? I suspect something is going wrong when working out the MSC affinity. The output of: | find /sys/kernel/debug/mpam/ -type f -print -exec cat {} ; might help me decode your tables. Feel free to send these things in private if that's going to be easier. (I just won't be allowed to reply)
Thanks,
James
[0] Don't use cpumask_of_node, it only works for online CPUs --------------------%<-------------------- diff --git a/drivers/platform/mpam/mpam_devices.c b/drivers/platform/mpam/mpam_d evices.c index c1b2bd9caa29..d4853c567b4e 100644 --- a/drivers/platform/mpam/mpam_devices.c +++ b/drivers/platform/mpam/mpam_devices.c @@ -418,6 +418,21 @@ static int get_cpumask_from_cache_id(u32 cache_id, u32 cach e_level, return 0; }
+/*
- cpumask_of_node() only knows about online CPUs. This can't tell
us whether
- a class is represented on all possible CPUs.
- */
+static void get_cpumask_from_node_id(u32 node_id, cpumask_t *affinity) +static void get_cpumask_from_node_id(u32 node_id, cpumask_t *affinity) +{
int cpu;
for_each_possible_cpu(cpu) {
if (node_id == cpu_to_node(cpu))
cpumask_set_cpu(cpu, affinity);
}
+}
static int get_cpumask_from_cache(struct device_node *cache, cpumask_t *affinity) { @@ -460,7 +475,7 @@ static int mpam_ris_get_affinity(struct mpam_msc *msc, cpumask_t *affinity,
break; case MPAM_CLASS_MEMORY:
*affinity = *cpumask_of_node(comp->comp_id);
get_cpumask_from_node_id(comp->comp_id, affinity); if (cpumask_empty(affinity)) pr_warn_once("%s no CPUs associated with
memory node", dev_name(&msc->pdev->dev)); --------------------%<--------------------