Given you mentioned rebasing MPAM :) One trivial bug that might still be hanging around. When adding resources parsed from the MPAM ACPI table I think you are adding one byte too much for IORESOURCE_MEM https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/tree/drivers... should be a -1 similar to IORT handling. https://elixir.bootlin.com/linux/latest/source/drivers/acpi/arm64/iort.c#L15...
As a side note, I'm emulating MPAM on QEMU so I can poke the corners of the code beyond the real implementations we have access to and in a fashion I can talk about more freely.
Other than the above, mostly seems to be working fine beyond a bug in the upstream PPTT parsing code that doesn't affect what you have implemented so far.
There is an infinite loop / deadlock hiding somewhere though in one of the error paths though that I haven't chased down yet. Maybe I'll be lucky and you've fixed it already :)
It's holiday season so might take a while to get the QEMU implementation finished / through internal review - but I'll post any bugs I find in the meantime.
Jonathan
Hi Jonathan,
On 7/27/23 11:36, Jonathan Cameron wrote:
Given you mentioned rebasing MPAM :)
It's the same thing I do every night ....
One trivial bug that might still be hanging around. When adding resources parsed from the MPAM ACPI table I think you are adding one byte too much for IORESOURCE_MEM https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/tree/drivers... should be a -1 similar to IORT handling. https://elixir.bootlin.com/linux/latest/source/drivers/acpi/arm64/iort.c#L15...
Thanks, fixed.
I still haven't gotten to the bottom of the difference between proximity-domain numbers and numa-nodes that Hesham reported.
As a side note, I'm emulating MPAM on QEMU so I can poke the corners of the code beyond the real implementations we have access to and in a fashion I can talk about more freely.
Awesome! Something else to test with is always good.
Other than the above, mostly seems to be working fine beyond a bug in the upstream PPTT parsing code that doesn't affect what you have implemented so far.
There is an infinite loop / deadlock hiding somewhere though in one of the error paths though that I haven't chased down yet. Maybe I'll be lucky and you've fixed it already :)
Oooeer, that sounds new .
There were some false-positive lockdep warnings that I've fixed/replaced-with-comments.
Otherwise the response to the error interrupt isn't fantastic at the moment, resctrl mostly manages to tear itself down, but there is evidently a reference counting problem somewhere as a few of the files live on. I can't find a way of triggering that code on x86, so its not too surprising. I haven't tested that for a while...
Thanks,
James
On Fri, 28 Jul 2023 11:56:44 +0100 James Morse via Linaro-open-discussions linaro-open-discussions@op-lists.linaro.org wrote:
Hi Jonathan,
On 7/27/23 11:36, Jonathan Cameron wrote:
Given you mentioned rebasing MPAM :)
It's the same thing I do every night ....
:)
One trivial bug that might still be hanging around. When adding resources parsed from the MPAM ACPI table I think you are adding one byte too much for IORESOURCE_MEM https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/tree/drivers... should be a -1 similar to IORT handling. https://elixir.bootlin.com/linux/latest/source/drivers/acpi/arm64/iort.c#L15...
Thanks, fixed.
I still haven't gotten to the bottom of the difference between proximity-domain numbers and numa-nodes that Hesham reported.
I'll poke that one later... Too much fun to be had adding features first and generating piles of JSON so I can review the state as I poke it.
As a side note, I'm emulating MPAM on QEMU so I can poke the corners of the code beyond the real implementations we have access to and in a fashion I can talk about more freely.
Awesome! Something else to test with is always good.
Speaking of which... I got some unexpected write values in the bandwidth bitmap registers. I had a small bit map - 16 entries and got 0x1ffff written wd is coming from CPBM_WD which (unlike many register fields in the spec) is not offset by 1.
Bug looks to be an off by one.
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/tree/drivers...
static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd) { u32 bm = ~0; int i;
lockdep_assert_held(&msc->part_sel_lock);
/* * Write all but the last full-32bit-word. * The number of portions is 1<->wd, start at 1 instead of 0. */ for (i = 1; i < wd / 32; i++, reg += sizeof(bm)) __mpam_write_reg(msc, reg, bm);
/* and the last partial 32bit word */ bm = GENMASK(wd % 32, 0);
wd % 32 - 1
I think.
if (bm) __mpam_write_reg(msc, reg, bm); }
Other than the above, mostly seems to be working fine beyond a bug in the upstream PPTT parsing code that doesn't affect what you have implemented so far.
There is an infinite loop / deadlock hiding somewhere though in one of the error paths though that I haven't chased down yet. Maybe I'll be lucky and you've fixed it already :)
Oooeer, that sounds new .
There were some false-positive lockdep warnings that I've fixed/replaced-with-comments.
Otherwise the response to the error interrupt isn't fantastic at the moment, resctrl mostly manages to tear itself down, but there is evidently a reference counting problem somewhere as a few of the files live on. I can't find a way of triggering that code on x86, so its not too surprising. I haven't tested that for a while...
I was being lazy on interrupts - will wire that up at some point. If I get really bored I might even wire up the PPC channel interfaces but don't hold your breath on that one. By coincidence I wired up PCC in QEMU for something else last week so it wouldn't be too tricky - it's just not that high on my list unless you'll find it 'really useful'.
I hope everyone else is reading along ;) There will be a test on Thursday.
Jonathan
Thanks,
James
On Fri, 28 Jul 2023 14:39:15 +0100 Jonathan Cameron via Linaro-open-discussions linaro-open-discussions@op-lists.linaro.org wrote:
On Fri, 28 Jul 2023 11:56:44 +0100 James Morse via Linaro-open-discussions linaro-open-discussions@op-lists.linaro.org wrote:
Hi Jonathan,
On 7/27/23 11:36, Jonathan Cameron wrote:
Given you mentioned rebasing MPAM :)
It's the same thing I do every night ....
:)
One trivial bug that might still be hanging around. When adding resources parsed from the MPAM ACPI table I think you are adding one byte too much for IORESOURCE_MEM https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/tree/drivers... should be a -1 similar to IORT handling. https://elixir.bootlin.com/linux/latest/source/drivers/acpi/arm64/iort.c#L15...
Thanks, fixed.
I still haven't gotten to the bottom of the difference between proximity-domain numbers and numa-nodes that Hesham reported.
I'll poke that one later... Too much fun to be had adding features first and generating piles of JSON so I can review the state as I poke it.
As a side note, I'm emulating MPAM on QEMU so I can poke the corners of the code beyond the real implementations we have access to and in a fashion I can talk about more freely.
Awesome! Something else to test with is always good.
Speaking of which... I got some unexpected write values in the bandwidth bitmap registers. I had a small bit map - 16 entries and got 0x1ffff written wd is coming from CPBM_WD which (unlike many register fields in the spec) is not offset by 1.
Bug looks to be an off by one.
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/tree/drivers...
static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd) { u32 bm = ~0; int i;
lockdep_assert_held(&msc->part_sel_lock);
/* * Write all but the last full-32bit-word. * The number of portions is 1<->wd, start at 1 instead of 0. */ for (i = 1; i < wd / 32; i++, reg += sizeof(bm))
for (i = 1; i < (wd + 31) /32; i++, reg += sizeof(bm))
DIV_ROUND_UP()
as you want to round up rather than down here.
Testing with a width of 48 we want this to run once and the below to pick up last 16 bits.
__mpam_write_reg(msc, reg, bm);
/* and the last partial 32bit word */ bm = GENMASK(wd % 32, 0);
wd % 32 - 1
I think.
if (bm) __mpam_write_reg(msc, reg, bm); }
Other than the above, mostly seems to be working fine beyond a bug in the upstream PPTT parsing code that doesn't affect what you have implemented so far.
There is an infinite loop / deadlock hiding somewhere though in one of the error paths though that I haven't chased down yet. Maybe I'll be lucky and you've fixed it already :)
Oooeer, that sounds new .
There were some false-positive lockdep warnings that I've fixed/replaced-with-comments.
Otherwise the response to the error interrupt isn't fantastic at the moment, resctrl mostly manages to tear itself down, but there is evidently a reference counting problem somewhere as a few of the files live on. I can't find a way of triggering that code on x86, so its not too surprising. I haven't tested that for a while...
I was being lazy on interrupts - will wire that up at some point. If I get really bored I might even wire up the PPC channel interfaces but don't hold your breath on that one. By coincidence I wired up PCC in QEMU for something else last week so it wouldn't be too tricky - it's just not that high on my list unless you'll find it 'really useful'.
I hope everyone else is reading along ;) There will be a test on Thursday.
Jonathan
Thanks,
James
Hi Jonathan,
On 28/07/2023 15:15, Jonathan Cameron wrote:
On Fri, 28 Jul 2023 14:39:15 +0100 Jonathan Cameron via Linaro-open-discussions linaro-open-discussions@op-lists.linaro.org wrote:
Speaking of which... I got some unexpected write values in the bandwidth bitmap registers. I had a small bit map - 16 entries and got 0x1ffff written wd is coming from CPBM_WD which (unlike many register fields in the spec) is not offset by 1.
Bug looks to be an off by one.
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/tree/drivers...
static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd) { u32 bm = ~0; int i;
lockdep_assert_held(&msc->part_sel_lock);
/* * Write all but the last full-32bit-word. * The number of portions is 1<->wd, start at 1 instead of 0. */ for (i = 1; i < wd / 32; i++, reg += sizeof(bm))
for (i = 1; i < (wd + 31) /32; i++, reg += sizeof(bm))
DIV_ROUND_UP()
as you want to round up rather than down here.
Rounding down was so this always stops early, but starting at 1 covers that ... which just makes this code more confusing.
Testing with a width of 48 we want this to run once and the below to pick up last 16 bits.
You're winning here - I'm still waiting for the FVP to boot, and it won't emulate CPOR bitmaps larger than 32...
__mpam_write_reg(msc, reg, bm);
/* and the last partial 32bit word */ bm = GENMASK(wd % 32, 0);
wd % 32 - 1
I think.
Blerg. This needs to produce '31' when wd is a multiple of 32, so yes.
That gave me enough of a headache that this thing now looks like this: ----------%<---------- /* * Write ~0 to all but the last 32bit-word, which may * have fewer bits... */ num_words = DIV_ROUND_UP(wd, 32); for (i = 0; i < num_words - 1; i++, reg += sizeof(bm)) __mpam_write_reg(msc, reg, bm);
/* * ....and then the last (maybe) partial 32bit word. When wd is a * multiple of 32, msb should be 31 to write a full 32bit word. */ msb = (wd - 1) % 32; bm = GENMASK(msb , 0); if (bm) __mpam_write_reg(msc, reg, bm); ----------%<----------
As this is an interaction I've not explored, beware that resctrl only supports 32bit bitmaps. Arbitrary size bitmaps would mean invasive changes to resctrl, but bumping it to 64bits is probably easy enough to do.
Other than the above, mostly seems to be working fine beyond a bug in the upstream PPTT parsing code that doesn't affect what you have implemented so far.
There is an infinite loop / deadlock hiding somewhere though in one of the error paths though that I haven't chased down yet. Maybe I'll be lucky and you've fixed it already :)
Oooeer, that sounds new .
There were some false-positive lockdep warnings that I've fixed/replaced-with-comments.
Otherwise the response to the error interrupt isn't fantastic at the moment, resctrl mostly manages to tear itself down, but there is evidently a reference counting problem somewhere as a few of the files live on. I can't find a way of triggering that code on x86, so its not too surprising. I haven't tested that for a while...
I was being lazy on interrupts - will wire that up at some point.
If I get really bored I might even wire up the PPC channel interfaces but don't hold your breath on that one. By coincidence I wired up PCC in QEMU for something else last week so it wouldn't be too tricky - it's just not that high on my list unless you'll find it 'really useful'.
I've not touched that thing, I was waiting for someone to show up with a model to prove the spec had everything that was needed. Last I heard this was likely to get deprecated in favour of PCC type... 3? I'd hold off for now.
If you want to find more bugs, the monitor code and the SMMU stuff are two things I can't test!
I'm about to repost the x86 patches at the bottom of all this, is it useful to CC linuxarm@huawei.com for that? Currently Shameerali and Bobo get CC'd... I'm certain my list is not up-to-date for the folk that care about this.
Thanks!
James
On Fri, 28 Jul 2023 16:28:35 +0100 James Morse via Linaro-open-discussions linaro-open-discussions@op-lists.linaro.org wrote:
Hi Jonathan,
On 28/07/2023 15:15, Jonathan Cameron wrote:
On Fri, 28 Jul 2023 14:39:15 +0100 Jonathan Cameron via Linaro-open-discussions linaro-open-discussions@op-lists.linaro.org wrote:
Speaking of which... I got some unexpected write values in the bandwidth bitmap registers. I had a small bit map - 16 entries and got 0x1ffff written wd is coming from CPBM_WD which (unlike many register fields in the spec) is not offset by 1.
Bug looks to be an off by one.
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/tree/drivers...
static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd) { u32 bm = ~0; int i;
lockdep_assert_held(&msc->part_sel_lock);
/* * Write all but the last full-32bit-word. * The number of portions is 1<->wd, start at 1 instead of 0. */ for (i = 1; i < wd / 32; i++, reg += sizeof(bm))
for (i = 1; i < (wd + 31) /32; i++, reg += sizeof(bm))
DIV_ROUND_UP()
as you want to round up rather than down here.
Rounding down was so this always stops early, but starting at 1 covers that ... which just makes this code more confusing.
Testing with a width of 48 we want this to run once and the below to pick up last 16 bits.
You're winning here - I'm still waiting for the FVP to boot, and it won't emulate CPOR bitmaps larger than 32...
*sniggers* I'll see if I can expedite pushing a tree out somewhere next week
__mpam_write_reg(msc, reg, bm);
/* and the last partial 32bit word */ bm = GENMASK(wd % 32, 0);
wd % 32 - 1
I think.
Blerg. This needs to produce '31' when wd is a multiple of 32, so yes.
That gave me enough of a headache that this thing now looks like this: ----------%<---------- /* * Write ~0 to all but the last 32bit-word, which may * have fewer bits... */ num_words = DIV_ROUND_UP(wd, 32); for (i = 0; i < num_words - 1; i++, reg += sizeof(bm)) __mpam_write_reg(msc, reg, bm);
/* * ....and then the last (maybe) partial 32bit word. When wd is a * multiple of 32, msb should be 31 to write a full 32bit word. */ msb = (wd - 1) % 32; bm = GENMASK(msb , 0); if (bm)
bm always true I think (unless wd == 0 which I don't think needs to work?)
__mpam_write_reg(msc, reg, bm);
I think the above works... Needs a unit test :)
Or perhaps something like (untested and probably wrong)
num_whole_words = wd / 32; for (i = 0; i < num_whole_words; i++, reg += sizeof(bm)) __mpam_write_reg(msc, reg, bm);
bits_left = wd % 32; if (bits_left) __mapm_write_reg(msc, reg, GENMASK(bits_left - 1, 0));
----------%<----------
As this is an interaction I've not explored, beware that resctrl only supports 32bit bitmaps. Arbitrary size bitmaps would mean invasive changes to resctrl, but bumping it to 64bits is probably easy enough to do.
I noticed. Caches controls don't come up with > 32 bits though it does get to setting them to fully on.
Other than the above, mostly seems to be working fine beyond a bug in the upstream PPTT parsing code that doesn't affect what you have implemented so far.
There is an infinite loop / deadlock hiding somewhere though in one of the error paths though that I haven't chased down yet. Maybe I'll be lucky and you've fixed it already :)
Oooeer, that sounds new .
There were some false-positive lockdep warnings that I've fixed/replaced-with-comments.
Otherwise the response to the error interrupt isn't fantastic at the moment, resctrl mostly manages to tear itself down, but there is evidently a reference counting problem somewhere as a few of the files live on. I can't find a way of triggering that code on x86, so its not too surprising. I haven't tested that for a while...
I was being lazy on interrupts - will wire that up at some point.
If I get really bored I might even wire up the PPC channel interfaces but don't hold your breath on that one. By coincidence I wired up PCC in QEMU for something else last week so it wouldn't be too tricky - it's just not that high on my list unless you'll find it 'really useful'.
I've not touched that thing, I was waiting for someone to show up with a model to prove the spec had everything that was needed. Last I heard this was likely to get deprecated in favour of PCC type... 3? I'd hold off for now.
If you want to find more bugs, the monitor code and the SMMU stuff are two things I can't test!
I'll get the basic control stuff out first, but can circle back to those after that.
I'm about to repost the x86 patches at the bottom of all this, is it useful to CC linuxarm@huawei.com for that? Currently Shameerali and Bobo get CC'd... I'm certain my list is not up-to-date for the folk that care about this.
Whilst in theory that's the list we use to track submissions from our side, it would do no harm to CC that address.
Maybe just ccing linux-arm-kernel is enough though as I think everyone who cares will pick it up from there (even if it is all x86 related :)
Jonathan
Thanks!
James
Hi James,
The way I read the spec, these IDs are inclusive. PARTID_MAX : Maximum supported value of PARTID so 0..PARTID_MAX and hence PARTID_MAX + 1 partitions....
whereas a lot of the structures you allocate and the interactions over them don't add 1 to get the number of elements.
For example I think it should be valid (if odd) to narrow to 1 ID. Currently that results in a zero sized allocation and cat /sys/fs/resctrl/schemata goes boom with predictable null pointer derefernce in resctrl_arch_get_config
The note about the fact that partid 0 should be hidden from the userspace control (as it potentially includes the kernel) is fair enough, but having a situation where it goes boom isn't good in the meantime.
Looks like in a lot of cases the memory allocations are big enough (by luck) that we don't get a segfault making the issue harder to spot, but they are still running over the range that was intentionally initialized.
Of course I might be missing some subtlety in my emulation.
Jonathan
linaro-open-discussions@op-lists.linaro.org