On Wed, Dec 21, 2022 at 05:39:32PM +0000, Teo Couprie Diaz wrote:
On 21/12/2022 17:00, Beata Michalska wrote:
On Wed, Dec 21, 2022 at 09:59:04AM +0000, Teo Couprie Diaz wrote:
On 20/12/2022 17:18, Kevin Brodsky wrote:
On 20/12/2022 10:45, Teo Couprie Diaz wrote:
In our Debian-based distribution, the two files used in lib/tst_pid are not available, but systemd still imposes a task limit far lower than the kernel's pid_max. Add another file, which appears to always be available, from which to read the maximum number of PIDs.
This fixed msgstress04, but it appeared that msgstress03 didn't account for all of its PIDs, so it still hit the limit. Reduce the number of free PIDs by 10% in msgstress03 to account for this.
Signed-off-by: Teo Couprie Diaz <teo.coupriediaz@arm.com>
 lib/tst_pid.c                                          | 4 ++++
 testcases/kernel/syscalls/ipc/msgstress/msgstress03.c  | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/tst_pid.c b/lib/tst_pid.c
index 21cadef2a..3c10e0298 100644
--- a/lib/tst_pid.c
+++ b/lib/tst_pid.c
@@ -33,6 +33,7 @@
 #define PID_MAX_PATH "/proc/sys/kernel/pid_max"
 #define CGROUPS_V1_SLICE_FMT "/sys/fs/cgroup/pids/user.slice/user-%d.slice/pids.max"
 #define CGROUPS_V2_SLICE_FMT "/sys/fs/cgroup/user.slice/user-%d.slice/pids.max"
+#define CGROUPS_V2_INIT_SCOPE "/sys/fs/cgroup/pids/init.scope/pids.max"
On Arch I don't get pids/ under cgroup/; what I have is init.scope/ directly underneath, that is:
    /sys/fs/cgroup/init.scope/pids.max
Not sure if that's a configuration difference? In any case the idea sounds sensible.
Interesting, thanks for pointing it out. I do have the same on my personal Arch machine, but on my Ubuntu laptop I have the same layout as in our Debian image. Looking at the defines more closely, it seems like it could be a cgroups v1/v2 difference: the path I added looks like a v1 path (similar to CGROUPS_V1_SLICE_FMT) rather than a v2 one.
I'll try to have a look and see if I can get our image to use cgroups v2.
Would it make sense, if I don't find another solution, to add two paths to cover all of our bases?
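Something like this, maybe (just a sketch: it keeps the path from this patch as the v1 variant and the path you reported as the v2 one; the define names are made up):

#define CGROUPS_V1_INIT_SCOPE "/sys/fs/cgroup/pids/init.scope/pids.max"
#define CGROUPS_V2_INIT_SCOPE "/sys/fs/cgroup/init.scope/pids.max"

	/* Prefer the cgroup v2 location, fall back to the v1-style one. */
	if (max_pids < 0)
		max_pids = read_session_pids_limit(CGROUPS_V2_INIT_SCOPE, uid,
						   cleanup_fn);
	if (max_pids < 0)
		max_pids = read_session_pids_limit(CGROUPS_V1_INIT_SCOPE, uid,
						   cleanup_fn);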
 /* Leave some available processes for the OS */
 #define PIDS_RESERVE 50
@@ -103,6 +104,9 @@ static int get_session_pids_limit(void (*cleanup_fn) (void))
 	if (max_pids < 0)
 		max_pids = read_session_pids_limit(CGROUPS_V1_SLICE_FMT, uid,
 						   cleanup_fn);
+	if (max_pids < 0)
+		max_pids = read_session_pids_limit(CGROUPS_V2_INIT_SCOPE, uid,
+						   cleanup_fn);
 
 	if (max_pids < 0)
 		return -1;
diff --git a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c
index 3cb70ab18..f0a631479 100644
--- a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c
+++ b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c
@@ -109,7 +109,7 @@ int main(int argc, char **argv)
 		}
 	}
-	free_pids = tst_get_free_pids(cleanup);
+	free_pids = tst_get_free_pids(cleanup) * 0.9;
Floating point calculations on integers are typically avoided where possible; you could simply use * 9 / 10 here.
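I.e. something like:

	/* Integer arithmetic, no float round trip; still keeps ~90%. */
	free_pids = tst_get_free_pids(cleanup) * 9 / 10;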
Interesting, I didn't know: thanks for pointing it out!
Otherwise that idea sounds sensible to me too; in fact, I found it quite strange that the test attempts to use literally all the remaining PIDs.
It's one of the more aggressive tests, but in theory it leaves 50 PIDs free (that's handled directly by tst_get_free_pids). But it goes over that quite a bit anyway, as explained.
I think I am still missing something: "....it seems like msgstress03 had some accounting issues, the real number of PIDs in use always being 5-10% greater than what it thought it was."
Isn't that expected? Especially if the system is busy?
I think it _would_ indeed be expected if the system were busy. The error message does highlight it!
However, this is clearly not the case on my system during the test: before running msgstress03, the current task count in the scope was always less than 10, and quite stable. As the test is the only thing loading the system in my case, that should be quite close to the real task count. Yet when it hit the limit (which was about 4900 PIDs), it was forking its 4500-4600th child, which is what led me to this conclusion.
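For reference, the live task count can be read from the scope's pids.current file; watching it could look something like this (a minimal sketch, with the path assuming the v1-style layout from our image):

#include <stdio.h>

/* Sketch: read the current task count of the init scope.
 * The path is an assumption; adjust it to the scope under test. */
static long read_pids_current(void)
{
	FILE *f = fopen("/sys/fs/cgroup/pids/init.scope/pids.current", "r");
	long count = -1;

	if (f) {
		if (fscanf(f, "%ld", &count) != 1)
			count = -1;
		fclose(f);
	}
	return count;
}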
msgstress04 doesn't seem to suffer from such a large disparity. (It being much more cautious and halving free_pids probably helps!)
Any reason for not incorporating free_pids in msgstress03? (Just a quick thought)
tst_get_free_pids takes a 'snapshot' of the number of processes running at the time the function is called, but that does not mean that number will stand still - the number of running processes may change behind the scenes.
Yep, completely agree.
The worrying part is that the buffer of 50 is not big enough to accommodate that. Is my understanding correct that the test reports a failure if it fails to spawn the expected number of processes?
The test fails if any of the forks fail, so in a way, yes? (To be exact, it will report a TFAIL for each failed fork.)
Is there a chance the fork fails for a different reason than exceeding the max pids limit?
If so, wouldn't it be better to, at that point, compare the number of running processes minus the number of test forks against the number of expected forks, to see if there are other processes eating up the max pids allowance? Or did I misunderstand the issue in the first place?
I believe that's what I tried to do in my explanation above. Do tell me if I didn't understand the suggestion properly. I could always go back and be more thorough in investigating why there is such a disparity, but there's not really much going on on the system when running those tests, and other tests don't appear to have such an issue.
So what I was thinking is: at the point of failure, check the number of active processes (within a cgroup) and compare that with the max pids - for debugging purposes only. There also seems to be an event counter in the pids controller reflecting the number of forks that failed due to the imposed limit - it might be worth checking that too (again for debugging purposes; it can be done outside of the test itself).

I must admit I am not a big fan of arbitrary limits which might work for some cases but not others, and it does feel slightly like covering up the issue rather than fixing it. On the other hand, the test failure might be somewhat 'expected'.

BTW: did we try to run that on a cgroup-free system?
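Reading that counter could look roughly like this (a sketch only, meant for debugging; it assumes a cgroup v2 pids.events file, where forks rejected because of the limit show up on the "max" line):

#include <stdio.h>
#include <string.h>

/* Sketch: return the "max" event count from a pids.events file,
 * i.e. how many forks failed because the pids limit was hit.
 * Which path to pass in depends on the cgroup under test. */
static long read_failed_fork_count(const char *path)
{
	FILE *f = fopen(path, "r");
	char key[32];
	long val;

	if (!f)
		return -1;
	while (fscanf(f, "%31s %ld", key, &val) == 2) {
		if (!strcmp(key, "max")) {
			fclose(f);
			return val;
		}
	}
	fclose(f);
	return -1;
}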
---
BR
B.
Thanks for the comments, Téo
BR
B.
Kevin
Thanks for the review!
Téo
	if (nprocs >= free_pids) {
		tst_resm(TINFO,
			 "Requested number of processes higher than limit (%d > %d), "
linux-morello-ltp mailing list -- linux-morello-ltp@op-lists.linaro.org
To unsubscribe send an email to linux-morello-ltp-leave@op-lists.linaro.org