This fixes two tests that were failing in our debian distribution because of a PID limit imposed by systemd which was not properly detected.
This was fixed by adding another file to read the max_pid from that should be available everywhere.
This worked for msgstress04, but doing some testing msgstress03 still hit the limit. Printing the number of PIDs in use, it seems like msgstress03 had some accounting issues, the real number of PIDs in use always being 5-10% greater than what it thought it was. Not finding where this accounting error was, I reduced the number of free PIDs by 10% which proved to do the trick.
I'm not too sure about these solutions and would be happy to receive some feedback, hence the RFC state. (It might make sense to split the patch as well, but it would be two very small diffs.)
Thanks in advance !
Teo Couprie Diaz (1): lib/tst_pid: Add a new file to get pid_max
lib/tst_pid.c | 4 ++++ testcases/kernel/syscalls/ipc/msgstress/msgstress03.c | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-)
In our debian-based distribution, the two files used in lib/tst_pid are not available, but systemd still imposes a task limit far lesser than the kernel pid_max. Add another file that seems to be always available to read the maximum number of PIDs.
This fixed msgstress04, but it appeared that msgstress03 didn't account for all of its PIDs, so it still hit the limit. Reduce the number of free PIDs by 10% in msgstress03 to account for it.
Signed-off-by: Teo Couprie Diaz teo.coupriediaz@arm.com --- lib/tst_pid.c | 4 ++++ testcases/kernel/syscalls/ipc/msgstress/msgstress03.c | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/tst_pid.c b/lib/tst_pid.c index 21cadef2a..3c10e0298 100644 --- a/lib/tst_pid.c +++ b/lib/tst_pid.c @@ -33,6 +33,7 @@ #define PID_MAX_PATH "/proc/sys/kernel/pid_max" #define CGROUPS_V1_SLICE_FMT "/sys/fs/cgroup/pids/user.slice/user-%d.slice/pids.max" #define CGROUPS_V2_SLICE_FMT "/sys/fs/cgroup/user.slice/user-%d.slice/pids.max" +#define CGROUPS_V2_INIT_SCOPE "/sys/fs/cgroup/pids/init.scope/pids.max" /* Leave some available processes for the OS */ #define PIDS_RESERVE 50
@@ -103,6 +104,9 @@ static int get_session_pids_limit(void (*cleanup_fn) (void)) if (max_pids < 0) max_pids = read_session_pids_limit(CGROUPS_V1_SLICE_FMT, uid, cleanup_fn); + if (max_pids < 0) + max_pids = read_session_pids_limit(CGROUPS_V2_INIT_SCOPE, uid, + cleanup_fn);
if (max_pids < 0) return -1; diff --git a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c index 3cb70ab18..f0a631479 100644 --- a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c +++ b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c @@ -109,7 +109,7 @@ int main(int argc, char **argv) } }
- free_pids = tst_get_free_pids(cleanup); + free_pids = tst_get_free_pids(cleanup) * 0.9; if (nprocs >= free_pids) { tst_resm(TINFO, "Requested number of processes higher than limit (%d > %d), "
On 20/12/2022 10:45, Teo Couprie Diaz wrote:
In our debian-based distribution, the two files used in lib/tst_pid are not available, but systemd still imposes a task limit far lesser than the kernel pid_max. Add another file that seems to be always available to read the maximum number of PIDs.
This fixed msgstress04, but it appeared that msgstress03 didn't account for all of its PIDs, so it still hit the limit. Reduce the number of free PIDs by 10% in msgstress03 to account for it.
Signed-off-by: Teo Couprie Diaz teo.coupriediaz@arm.com
lib/tst_pid.c | 4 ++++ testcases/kernel/syscalls/ipc/msgstress/msgstress03.c | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/tst_pid.c b/lib/tst_pid.c index 21cadef2a..3c10e0298 100644 --- a/lib/tst_pid.c +++ b/lib/tst_pid.c @@ -33,6 +33,7 @@ #define PID_MAX_PATH "/proc/sys/kernel/pid_max" #define CGROUPS_V1_SLICE_FMT "/sys/fs/cgroup/pids/user.slice/user-%d.slice/pids.max" #define CGROUPS_V2_SLICE_FMT "/sys/fs/cgroup/user.slice/user-%d.slice/pids.max" +#define CGROUPS_V2_INIT_SCOPE "/sys/fs/cgroup/pids/init.scope/pids.max"
On Arch I don't get pids/ under cgroup/, what I have is init.scope/ directly underneath, that is:
/sys/fs/cgroup/init.scope/pids.max
Not sure if that's a configuration difference? In any case the idea sounds sensible.
/* Leave some available processes for the OS */ #define PIDS_RESERVE 50 @@ -103,6 +104,9 @@ static int get_session_pids_limit(void (*cleanup_fn) (void)) if (max_pids < 0) max_pids = read_session_pids_limit(CGROUPS_V1_SLICE_FMT, uid, cleanup_fn);
- if (max_pids < 0)
max_pids = read_session_pids_limit(CGROUPS_V2_INIT_SCOPE, uid,
cleanup_fn);
if (max_pids < 0) return -1; diff --git a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c index 3cb70ab18..f0a631479 100644 --- a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c +++ b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c @@ -109,7 +109,7 @@ int main(int argc, char **argv) } }
- free_pids = tst_get_free_pids(cleanup);
- free_pids = tst_get_free_pids(cleanup) * 0.9;
Floating point calculations on integers are typically avoided if possible, you could simply use * 9 / 10 here. Otherwise that idea sounds sensible to me too, I found it quite strange that the test attempts to use literally all the remaining PIDs in fact.
Kevin
if (nprocs >= free_pids) { tst_resm(TINFO, "Requested number of processes higher than limit (%d > %d), "
On 20/12/2022 17:18, Kevin Brodsky wrote:
On 20/12/2022 10:45, Teo Couprie Diaz wrote:
In our debian-based distribution, the two files used in lib/tst_pid are not available, but systemd still imposes a task limit far lesser than the kernel pid_max. Add another file that seems to be always available to read the maximum number of PIDs.
This fixed msgstress04, but it appeared that msgstress03 didn't account for all of its PIDs, so it still hit the limit. Reduce the number of free PIDs by 10% in msgstress03 to account for it.
Signed-off-by: Teo Couprie Diaz teo.coupriediaz@arm.com
lib/tst_pid.c | 4 ++++ testcases/kernel/syscalls/ipc/msgstress/msgstress03.c | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/tst_pid.c b/lib/tst_pid.c index 21cadef2a..3c10e0298 100644 --- a/lib/tst_pid.c +++ b/lib/tst_pid.c @@ -33,6 +33,7 @@ #define PID_MAX_PATH "/proc/sys/kernel/pid_max" #define CGROUPS_V1_SLICE_FMT "/sys/fs/cgroup/pids/user.slice/user-%d.slice/pids.max" #define CGROUPS_V2_SLICE_FMT "/sys/fs/cgroup/user.slice/user-%d.slice/pids.max" +#define CGROUPS_V2_INIT_SCOPE "/sys/fs/cgroup/pids/init.scope/pids.max"
On Arch I don't get pids/ under cgroup/, what I have is init.scope/ directly underneath, that is:
/sys/fs/cgroup/init.scope/pids.max
Not sure if that's a configuration difference? In any case the idea sounds sensible.
Interesting, thanks for pointing it out. I do have the same on my personal Arch machine, but on my Ubuntu laptop I have the same as in our Debian image. Looking at the defines with a bit more attention, it seems like it could be a cgroups v1/v2 difference. The path I added looking like a v1 thing (similar to CGRUPS_V1_SLICE_FMT), rather than a v2.
I'll try to have a look and see if I can get our image to use cgroups v2.
Would it make sense to add two paths if I don't find another solution, to cover all of our bases ?
/* Leave some available processes for the OS */ #define PIDS_RESERVE 50 @@ -103,6 +104,9 @@ static int get_session_pids_limit(void (*cleanup_fn) (void)) if (max_pids < 0) max_pids = read_session_pids_limit(CGROUPS_V1_SLICE_FMT, uid, cleanup_fn);
- if (max_pids < 0)
max_pids = read_session_pids_limit(CGROUPS_V2_INIT_SCOPE, uid,
cleanup_fn);
if (max_pids < 0) return -1; diff --git a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c index 3cb70ab18..f0a631479 100644 --- a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c +++ b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c @@ -109,7 +109,7 @@ int main(int argc, char **argv) } }
- free_pids = tst_get_free_pids(cleanup);
- free_pids = tst_get_free_pids(cleanup) * 0.9;
Floating point calculations on integers are typically avoided if possible, you could simply use * 9 / 10 here.
Interesting, I didn't know : thanks for pointing it out !
Otherwise that idea sounds sensible to me too, I found it quite strange that the test attempts to use literally all the remaining PIDs in fact.
It's one of the more aggressive tests, but in theory it leaves 50 PIDs free (that's directly handled by tst_get_free_pids_. But it goes over that quite a bit anyway, as explained.
Kevin
Thanks for the review ! Téo
if (nprocs >= free_pids) { tst_resm(TINFO, "Requested number of processes higher than limit (%d > %d), "
linux-morello-ltp mailing list -- linux-morello-ltp@op-lists.linaro.org To unsubscribe send an email to linux-morello-ltp-leave@op-lists.linaro.org
On 21/12/2022 10:59, Teo Couprie Diaz wrote:
+#define CGROUPS_V2_INIT_SCOPE "/sys/fs/cgroup/pids/init.scope/pids.max"
On Arch I don't get pids/ under cgroup/, what I have is init.scope/ directly underneath, that is:
/sys/fs/cgroup/init.scope/pids.max
Not sure if that's a configuration difference? In any case the idea sounds sensible.
Interesting, thanks for pointing it out. I do have the same on my personal Arch machine, but on my Ubuntu laptop I have the same as in our Debian image. Looking at the defines with a bit more attention, it seems like it could be a cgroups v1/v2 difference. The path I added looking like a v1 thing (similar to CGRUPS_V1_SLICE_FMT), rather than a v2.
I'll try to have a look and see if I can get our image to use cgroups v2.
Would it make sense to add two paths if I don't find another solution, to cover all of our bases ?
Right I see, in that case it would make sense to have both the v1 and v2 paths, especially since the test already handles both for the user slice.
Kevin
On 21/12/2022 10:22, Kevin Brodsky wrote:
On 21/12/2022 10:59, Teo Couprie Diaz wrote:
+#define CGROUPS_V2_INIT_SCOPE "/sys/fs/cgroup/pids/init.scope/pids.max"
On Arch I don't get pids/ under cgroup/, what I have is init.scope/ directly underneath, that is:
/sys/fs/cgroup/init.scope/pids.max
Not sure if that's a configuration difference? In any case the idea sounds sensible.
Interesting, thanks for pointing it out. I do have the same on my personal Arch machine, but on my Ubuntu laptop I have the same as in our Debian image. Looking at the defines with a bit more attention, it seems like it could be a cgroups v1/v2 difference. The path I added looking like a v1 thing (similar to CGRUPS_V1_SLICE_FMT), rather than a v2.
I'll try to have a look and see if I can get our image to use cgroups v2.
Would it make sense to add two paths if I don't find another solution, to cover all of our bases ?
Right I see, in that case it would make sense to have both the v1 and v2 paths, especially since the test already handles both for the user slice.
Agreed, I reached the same conclusion digging a bit deeper. I can confirm this is indeed a v1/v2 difference, so it makes sense to go that way.
Kevin
Téo
On Wed, Dec 21, 2022 at 10:40:57AM +0000, Teo Couprie Diaz wrote:
On 21/12/2022 10:22, Kevin Brodsky wrote:
On 21/12/2022 10:59, Teo Couprie Diaz wrote:
+#define CGROUPS_V2_INIT_SCOPE "/sys/fs/cgroup/pids/init.scope/pids.max"
On Arch I don't get pids/ under cgroup/, what I have is init.scope/ directly underneath, that is:
/sys/fs/cgroup/init.scope/pids.max
Not sure if that's a configuration difference? In any case the idea sounds sensible.
Interesting, thanks for pointing it out. I do have the same on my personal Arch machine, but on my Ubuntu laptop I have the same as in our Debian image. Looking at the defines with a bit more attention, it seems like it could be a cgroups v1/v2 difference. The path I added looking like a v1 thing (similar to CGRUPS_V1_SLICE_FMT), rather than a v2.
I'll try to have a look and see if I can get our image to use cgroups v2.
Would it make sense to add two paths if I don't find another solution, to cover all of our bases ?
Right I see, in that case it would make sense to have both the v1 and v2 paths, especially since the test already handles both for the user slice.
Agreed, I reached the same conclusion digging a bit deeper. I can confirm this is indeed a v1/v2 difference, so it makes sense to go that way.
Kevin
Téo
Out of curiosity: what if the LTP itself will be running under cgroup with higher restrictions that 'root' cgroup ? Wouldn't all the tests fail ? Shouldn't we try to find out under which cgroup the test process is being controlled ? FWIW: this is a long game but it would be nicer to have some kind of a parser rather then hardcoding paths. (simple 'ls -l /sys/fs/cgroup/*/*/pids.max' gives the pids.max for top of the hierarchy)
--- BR B.
linux-morello-ltp mailing list -- linux-morello-ltp@op-lists.linaro.org To unsubscribe send an email to linux-morello-ltp-leave@op-lists.linaro.org
On 21/12/2022 16:39, Beata Michalska wrote:
On Wed, Dec 21, 2022 at 10:40:57AM +0000, Teo Couprie Diaz wrote:
On 21/12/2022 10:22, Kevin Brodsky wrote:
On 21/12/2022 10:59, Teo Couprie Diaz wrote:
+#define CGROUPS_V2_INIT_SCOPE "/sys/fs/cgroup/pids/init.scope/pids.max"
On Arch I don't get pids/ under cgroup/, what I have is init.scope/ directly underneath, that is:
/sys/fs/cgroup/init.scope/pids.max
Not sure if that's a configuration difference? In any case the idea sounds sensible.
Interesting, thanks for pointing it out. I do have the same on my personal Arch machine, but on my Ubuntu laptop I have the same as in our Debian image. Looking at the defines with a bit more attention, it seems like it could be a cgroups v1/v2 difference. The path I added looking like a v1 thing (similar to CGRUPS_V1_SLICE_FMT), rather than a v2.
I'll try to have a look and see if I can get our image to use cgroups v2.
Would it make sense to add two paths if I don't find another solution, to cover all of our bases ?
Right I see, in that case it would make sense to have both the v1 and v2 paths, especially since the test already handles both for the user slice.
Agreed, I reached the same conclusion digging a bit deeper. I can confirm this is indeed a v1/v2 difference, so it makes sense to go that way.
Kevin
Téo
Out of curiosity: what if the LTP itself will be running under cgroup with higher restrictions that 'root' cgroup ? Wouldn't all the tests fail ?
They probably would, yes. I feel that the way it is retrieved is kind of hacky and not very reliable, as indeed it could be in a different cgroup, or not under a user slice...
Shouldn't we try to find out under which cgroup the test process is being controlled ?
Ideally yes absolutely. I have tried looking for something for a bit, but couldn't find anything readily available with systemd. I could (and probably should) delve a bit deeper. Maybe use a syscall directly to get the information (I've been working on getrlimit in parallel, it looks like it could be useful ?) would be more reliable ?
Anyway, I'll abstain from sending a v2 for now as the file I suggest using in this patch is now not there anymore on my test system ! Don't really know how or why, but it surely highlights how fragile this solution is...
FWIW: this is a long game but it would be nicer to have some kind of a parser rather then hardcoding paths. (simple 'ls -l /sys/fs/cgroup/*/*/pids.max' gives the pids.max for top of the hierarchy)
I did have this thought while adding the v1/v2 paths : it would be quite a nice thing to have !
Thanks for the comments Beata,
Téo
BR B.
linux-morello-ltp mailing list -- linux-morello-ltp@op-lists.linaro.org To unsubscribe send an email to linux-morello-ltp-leave@op-lists.linaro.org
linux-morello-ltp mailing list -- linux-morello-ltp@op-lists.linaro.org To unsubscribe send an email to linux-morello-ltp-leave@op-lists.linaro.org
On Wed, Dec 21, 2022 at 05:06:06PM +0000, Teo Couprie Diaz wrote:
On 21/12/2022 16:39, Beata Michalska wrote:
On Wed, Dec 21, 2022 at 10:40:57AM +0000, Teo Couprie Diaz wrote:
On 21/12/2022 10:22, Kevin Brodsky wrote:
On 21/12/2022 10:59, Teo Couprie Diaz wrote:
> +#define CGROUPS_V2_INIT_SCOPE > "/sys/fs/cgroup/pids/init.scope/pids.max" On Arch I don't get pids/ under cgroup/, what I have is init.scope/ directly underneath, that is:
/sys/fs/cgroup/init.scope/pids.max
Not sure if that's a configuration difference? In any case the idea sounds sensible.
Interesting, thanks for pointing it out. I do have the same on my personal Arch machine, but on my Ubuntu laptop I have the same as in our Debian image. Looking at the defines with a bit more attention, it seems like it could be a cgroups v1/v2 difference. The path I added looking like a v1 thing (similar to CGRUPS_V1_SLICE_FMT), rather than a v2.
I'll try to have a look and see if I can get our image to use cgroups v2.
Would it make sense to add two paths if I don't find another solution, to cover all of our bases ?
Right I see, in that case it would make sense to have both the v1 and v2 paths, especially since the test already handles both for the user slice.
Agreed, I reached the same conclusion digging a bit deeper. I can confirm this is indeed a v1/v2 difference, so it makes sense to go that way.
Kevin
Téo
Out of curiosity: what if the LTP itself will be running under cgroup with higher restrictions that 'root' cgroup ? Wouldn't all the tests fail ?
They probably would, yes. I feel that the way it is retrieved is kind of hacky and not very reliable, as indeed it could be in a different cgroup, or not under a user slice...
Shouldn't we try to find out under which cgroup the test process is being controlled ?
Ideally yes absolutely. I have tried looking for something for a bit, but couldn't find anything readily available with systemd. I could (and probably should) delve a bit deeper. Maybe use a syscall directly to get the information (I've been working on getrlimit in parallel, it looks like it could be useful ?) would be more reliable ?
I'm afraid that rlimit and cgroup pids limit might be actually considered separately (reasoning based on having a quick look at copy_process and pids controller's callback can_fork -> pids_can_fork). Furthermore: RLIMIT_NPROC, pid_max and cgroup's pid.max seem to actually define separate limits, LTP seems to be ignoring rlimit (or maybe I have missed that one). I'll try to help in finding a way to consider all three, and to preferably not to rely of fixed path fr the proc.max limits.
Anyway, I'll abstain from sending a v2 for now as the file I suggest using in this patch is now not there anymore on my test system ! Don't really know how or why, but it surely highlights how fragile this solution is...
It is worrying indeed.
FWIW: this is a long game but it would be nicer to have some kind of a parser rather then hardcoding paths. (simple 'ls -l /sys/fs/cgroup/*/*/pids.max' gives the pids.max for top of the hierarchy)
I did have this thought while adding the v1/v2 paths : it would be quite a nice thing to have !
Thanks for the comments Beata,
Téo
BR B.
linux-morello-ltp mailing list -- linux-morello-ltp@op-lists.linaro.org To unsubscribe send an email to linux-morello-ltp-leave@op-lists.linaro.org
linux-morello-ltp mailing list -- linux-morello-ltp@op-lists.linaro.org To unsubscribe send an email to linux-morello-ltp-leave@op-lists.linaro.org
On Wed, Dec 21, 2022 at 09:59:04AM +0000, Teo Couprie Diaz wrote:
On 20/12/2022 17:18, Kevin Brodsky wrote:
On 20/12/2022 10:45, Teo Couprie Diaz wrote:
In our debian-based distribution, the two files used in lib/tst_pid are not available, but systemd still imposes a task limit far lesser than the kernel pid_max. Add another file that seems to be always available to read the maximum number of PIDs.
This fixed msgstress04, but it appeared that msgstress03 didn't account for all of its PIDs, so it still hit the limit. Reduce the number of free PIDs by 10% in msgstress03 to account for it.
Signed-off-by: Teo Couprie Diaz teo.coupriediaz@arm.com
lib/tst_pid.c | 4 ++++ testcases/kernel/syscalls/ipc/msgstress/msgstress03.c | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/tst_pid.c b/lib/tst_pid.c index 21cadef2a..3c10e0298 100644 --- a/lib/tst_pid.c +++ b/lib/tst_pid.c @@ -33,6 +33,7 @@ #define PID_MAX_PATH "/proc/sys/kernel/pid_max" #define CGROUPS_V1_SLICE_FMT "/sys/fs/cgroup/pids/user.slice/user-%d.slice/pids.max" #define CGROUPS_V2_SLICE_FMT "/sys/fs/cgroup/user.slice/user-%d.slice/pids.max" +#define CGROUPS_V2_INIT_SCOPE "/sys/fs/cgroup/pids/init.scope/pids.max"
On Arch I don't get pids/ under cgroup/, what I have is init.scope/ directly underneath, that is:
/sys/fs/cgroup/init.scope/pids.max
Not sure if that's a configuration difference? In any case the idea sounds sensible.
Interesting, thanks for pointing it out. I do have the same on my personal Arch machine, but on my Ubuntu laptop I have the same as in our Debian image. Looking at the defines with a bit more attention, it seems like it could be a cgroups v1/v2 difference. The path I added looking like a v1 thing (similar to CGRUPS_V1_SLICE_FMT), rather than a v2.
I'll try to have a look and see if I can get our image to use cgroups v2.
Would it make sense to add two paths if I don't find another solution, to cover all of our bases ?
/* Leave some available processes for the OS */ #define PIDS_RESERVE 50 @@ -103,6 +104,9 @@ static int get_session_pids_limit(void (*cleanup_fn) (void)) if (max_pids < 0) max_pids = read_session_pids_limit(CGROUPS_V1_SLICE_FMT, uid, cleanup_fn);
- if (max_pids < 0)
max_pids = read_session_pids_limit(CGROUPS_V2_INIT_SCOPE, uid,
if (max_pids < 0) return -1;cleanup_fn);
diff --git a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c index 3cb70ab18..f0a631479 100644 --- a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c +++ b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c @@ -109,7 +109,7 @@ int main(int argc, char **argv) } }
- free_pids = tst_get_free_pids(cleanup);
- free_pids = tst_get_free_pids(cleanup) * 0.9;
Floating point calculations on integers are typically avoided if possible, you could simply use * 9 / 10 here.
Interesting, I didn't know : thanks for pointing it out !
Otherwise that idea sounds sensible to me too, I found it quite strange that the test attempts to use literally all the remaining PIDs in fact.
It's one of the more aggressive tests, but in theory it leaves 50 PIDs free (that's directly handled by tst_get_free_pids_. But it goes over that quite a bit anyway, as explained.
I think I am missing smth still: "....it seems like msgstress03 had some accounting issues, the real number of PIDs in use always being 5-10% greater than what it thought it was."
Isn't that expected ? Especially if the system is busy? tst_get_free_pids_ takes a 'snapshot' of the number of processes running at the time that function is being called but that does not mean that number will stand still - number of running processes may change behind the scenes. The worrying part is that the buffer of 50 is not big enough to accommodate that. Is my understanding correct that the test reports a failure if it fails to spawn expected number of processes ? If so, wouldn't it be better to, at that point, compare number of running processes - number of test forks against number of expected forks to see if there are other processes eating up max pid allowance ? Or did I misundertand the issue in the first place ?
--- BR B.
Kevin
Thanks for the review ! Téo
if (nprocs >= free_pids) { tst_resm(TINFO, "Requested number of processes higher than limit (%d > %d), "
linux-morello-ltp mailing list -- linux-morello-ltp@op-lists.linaro.org To unsubscribe send an email to linux-morello-ltp-leave@op-lists.linaro.org
linux-morello-ltp mailing list -- linux-morello-ltp@op-lists.linaro.org To unsubscribe send an email to linux-morello-ltp-leave@op-lists.linaro.org
On 21/12/2022 17:00, Beata Michalska wrote:
On Wed, Dec 21, 2022 at 09:59:04AM +0000, Teo Couprie Diaz wrote:
On 20/12/2022 17:18, Kevin Brodsky wrote:
On 20/12/2022 10:45, Teo Couprie Diaz wrote:
In our debian-based distribution, the two files used in lib/tst_pid are not available, but systemd still imposes a task limit far lesser than the kernel pid_max. Add another file that seems to be always available to read the maximum number of PIDs.
This fixed msgstress04, but it appeared that msgstress03 didn't account for all of its PIDs, so it still hit the limit. Reduce the number of free PIDs by 10% in msgstress03 to account for it.
Signed-off-by: Teo Couprie Diaz teo.coupriediaz@arm.com
lib/tst_pid.c | 4 ++++ testcases/kernel/syscalls/ipc/msgstress/msgstress03.c | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/tst_pid.c b/lib/tst_pid.c index 21cadef2a..3c10e0298 100644 --- a/lib/tst_pid.c +++ b/lib/tst_pid.c @@ -33,6 +33,7 @@ #define PID_MAX_PATH "/proc/sys/kernel/pid_max" #define CGROUPS_V1_SLICE_FMT "/sys/fs/cgroup/pids/user.slice/user-%d.slice/pids.max" #define CGROUPS_V2_SLICE_FMT "/sys/fs/cgroup/user.slice/user-%d.slice/pids.max" +#define CGROUPS_V2_INIT_SCOPE "/sys/fs/cgroup/pids/init.scope/pids.max"
On Arch I don't get pids/ under cgroup/, what I have is init.scope/ directly underneath, that is:
� /sys/fs/cgroup/init.scope/pids.max
Not sure if that's a configuration difference? In any case the idea sounds sensible.
Interesting, thanks for pointing it out. I do have the same on my personal Arch machine, but on my Ubuntu laptop I have the same as in our Debian image. Looking at the defines with a bit more attention, it seems like it could be a cgroups v1/v2 difference. The path I added looking like a v1 thing (similar to CGRUPS_V1_SLICE_FMT), rather than a v2.
I'll try to have a look and see if I can get our image to use cgroups v2.
Would it make sense to add two paths if I don't find another solution, to cover all of our bases ?
/* Leave some available processes for the OS */ #define PIDS_RESERVE 50 @@ -103,6 +104,9 @@ static int get_session_pids_limit(void (*cleanup_fn) (void)) if (max_pids < 0) max_pids = read_session_pids_limit(CGROUPS_V1_SLICE_FMT, uid, cleanup_fn);
- if (max_pids < 0)
max_pids = read_session_pids_limit(CGROUPS_V2_INIT_SCOPE, uid,
if (max_pids < 0) return -1;cleanup_fn);
diff --git a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c index 3cb70ab18..f0a631479 100644 --- a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c +++ b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c @@ -109,7 +109,7 @@ int main(int argc, char **argv) } }
- free_pids = tst_get_free_pids(cleanup);
- free_pids = tst_get_free_pids(cleanup) * 0.9;
Floating point calculations on integers are typically avoided if possible, you could simply use * 9 / 10 here.
Interesting, I didn't know : thanks for pointing it out !
Otherwise that idea sounds sensible to me too, I found it quite strange that the test attempts to use literally all the remaining PIDs in fact.
It's one of the more aggressive tests, but in theory it leaves 50 PIDs free (that's directly handled by tst_get_free_pids_. But it goes over that quite a bit anyway, as explained.
I think I am missing smth still: "....it seems like msgstress03 had some accounting issues, the real number of PIDs in use always being 5-10% greater than what it thought it was."
Isn't that expected ? Especially if the system is busy?
I think it _would_ indeed be expected if the system was busy. The error message does highlight it !
However, this is clearly not the case of my system during the test : before running msgstress03, the current task count in the scope was always less than 10, and quite stable. As the test is the only thing loading the system in my case, it should be quite close to the real task count. However, when it hit the limit (which was about 4900 PIDs), it was forking its 4500-4600th children, which is what lead me to this conclusion.
msgstress04 doesn't seem to suffer from such a large disparity. (It being much more cautious and halving the free_pids probably helps !)
tst_get_free_pids_ takes a 'snapshot' of the number of processes running at the time that function is being called but that does not mean that number will stand still - number of running processes may change behind the scenes.
Yep, completely agree.
The worrying part is that the buffer of 50 is not big enough to accommodate that. Is my understanding correct that the test reports a failure if it fails to spawn expected number of processes ?
The test fails if any of the forks fail, so in a way yes ? (It will report a TFAIL for each failing fork to be exact)
If so, wouldn't it be better to, at that point, compare number of running processes - number of test forks against number of expected forks to see if there are other processes eating up max pid allowance ? Or did I misundertand the issue in the first place ?
I believe that's what I tried to do, in my explanation above. Do tell me if I didn't understand the suggestion properly. I could always go back and be more thorough in investigating the why there is such a disparity, but there's not really much going on on the system when doing those tests, and others tests don't appear to have such an issue.
Thanks for the comments, Téo
BR B.
Kevin
Thanks for the review ! T�o
if (nprocs >= free_pids) { tst_resm(TINFO, "Requested number of processes higher than limit (%d > %d), "
linux-morello-ltp mailing list -- linux-morello-ltp@op-lists.linaro.org To unsubscribe send an email to linux-morello-ltp-leave@op-lists.linaro.org
linux-morello-ltp mailing list -- linux-morello-ltp@op-lists.linaro.org To unsubscribe send an email to linux-morello-ltp-leave@op-lists.linaro.org
On Wed, Dec 21, 2022 at 05:39:32PM +0000, Teo Couprie Diaz wrote:
On 21/12/2022 17:00, Beata Michalska wrote:
On Wed, Dec 21, 2022 at 09:59:04AM +0000, Teo Couprie Diaz wrote:
On 20/12/2022 17:18, Kevin Brodsky wrote:
On 20/12/2022 10:45, Teo Couprie Diaz wrote:
In our debian-based distribution, the two files used in lib/tst_pid are not available, but systemd still imposes a task limit far lesser than the kernel pid_max. Add another file that seems to be always available to read the maximum number of PIDs.
This fixed msgstress04, but it appeared that msgstress03 didn't account for all of its PIDs, so it still hit the limit. Reduce the number of free PIDs by 10% in msgstress03 to account for it.
Signed-off-by: Teo Couprie Diaz teo.coupriediaz@arm.com
lib/tst_pid.c | 4 ++++ testcases/kernel/syscalls/ipc/msgstress/msgstress03.c | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/lib/tst_pid.c b/lib/tst_pid.c index 21cadef2a..3c10e0298 100644 --- a/lib/tst_pid.c +++ b/lib/tst_pid.c @@ -33,6 +33,7 @@ #define PID_MAX_PATH "/proc/sys/kernel/pid_max" #define CGROUPS_V1_SLICE_FMT "/sys/fs/cgroup/pids/user.slice/user-%d.slice/pids.max" #define CGROUPS_V2_SLICE_FMT "/sys/fs/cgroup/user.slice/user-%d.slice/pids.max" +#define CGROUPS_V2_INIT_SCOPE "/sys/fs/cgroup/pids/init.scope/pids.max"
On Arch I don't get pids/ under cgroup/, what I have is init.scope/ directly underneath, that is:
� /sys/fs/cgroup/init.scope/pids.max
Not sure if that's a configuration difference? In any case the idea sounds sensible.
Interesting, thanks for pointing it out. I do have the same on my personal Arch machine, but on my Ubuntu laptop I have the same as in our Debian image. Looking at the defines with a bit more attention, it seems like it could be a cgroups v1/v2 difference. The path I added looking like a v1 thing (similar to CGRUPS_V1_SLICE_FMT), rather than a v2.
I'll try to have a look and see if I can get our image to use cgroups v2.
Would it make sense to add two paths if I don't find another solution, to cover all of our bases ?
/* Leave some available processes for the OS */ #define PIDS_RESERVE 50 @@ -103,6 +104,9 @@ static int get_session_pids_limit(void (*cleanup_fn) (void)) if (max_pids < 0) max_pids = read_session_pids_limit(CGROUPS_V1_SLICE_FMT, uid, cleanup_fn);
- if (max_pids < 0)
max_pids = read_session_pids_limit(CGROUPS_V2_INIT_SCOPE, uid,
if (max_pids < 0) return -1;cleanup_fn);
diff --git a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c index 3cb70ab18..f0a631479 100644 --- a/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c +++ b/testcases/kernel/syscalls/ipc/msgstress/msgstress03.c @@ -109,7 +109,7 @@ int main(int argc, char **argv) } }
- free_pids = tst_get_free_pids(cleanup);
- free_pids = tst_get_free_pids(cleanup) * 0.9;
Floating point calculations on integers are typically avoided if possible, you could simply use * 9 / 10 here.
Interesting, I didn't know : thanks for pointing it out !
Otherwise that idea sounds sensible to me too, I found it quite strange that the test attempts to use literally all the remaining PIDs in fact.
It's one of the more aggressive tests, but in theory it leaves 50 PIDs free (that's directly handled by tst_get_free_pids_. But it goes over that quite a bit anyway, as explained.
I think I am missing smth still: "....it seems like msgstress03 had some accounting issues, the real number of PIDs in use always being 5-10% greater than what it thought it was."
Isn't that expected ? Especially if the system is busy?
I think it _would_ indeed be expected if the system was busy. The error message does highlight it !
However, this is clearly not the case of my system during the test : before running msgstress03, the current task count in the scope was always less than 10, and quite stable. As the test is the only thing loading the system in my case, it should be quite close to the real task count. However, when it hit the limit (which was about 4900 PIDs), it was forking its 4500-4600th children, which is what lead me to this conclusion.
msgstress04 doesn't seem to suffer from such a large disparity. (It being much more cautious and halving the free_pids probably helps !)
Any reason for not incorporating free_pids in msgstress03 ? (Just a quick thought)
tst_get_free_pids_ takes a 'snapshot' of the number of processes running at the time that function is being called but that does not mean that number will stand still - number of running processes may change behind the scenes.
Yep, completely agree.
The worrying part is that the buffer of 50 is not big enough to accommodate that. Is my understanding correct that the test reports a failure if it fails to spawn expected number of processes ?
The test fails if any of the forks fail, so in a way yes ? (It will report a TFAIL for each failing fork to be exact)
Is there a chance the fork fails for a different reason that exceeding the max pids limit ?
If so, wouldn't it be better to, at that point, compare number of running processes - number of test forks against number of expected forks to see if there are other processes eating up max pid allowance ? Or did I misundertand the issue in the first place ?
I believe that's what I tried to do, in my explanation above. Do tell me if I didn't understand the suggestion properly. I could always go back and be more thorough in investigating the why there is such a disparity, but there's not really much going on on the system when doing those tests, and others tests don't appear to have such an issue.
So what I was thinking is, at the point of failure, check the number of active processes (within a cgroup) and compare that with the max pids - for debugging purposes only. There seems to be also an event counter in the pids controller reflecting number of forks failed due to imposed limit - might be worth to check that to (again for debugging purposes, can be done outside of the test itself). I must admit I am not a big fan of arbitrary limits which might work for some cases, but not the others, and it does slightly feel like being more of covering the issue instead of fixing it. On the other hand, the test failure might be somewhat 'expected'. BTW: did we try to run that on cgrup-free system ?
--- BR B.
Thanks for the comments, Téo
BR B.
Kevin
Thanks for the review ! T�o
if (nprocs >= free_pids) { tst_resm(TINFO, "Requested number of processes higher than limit (%d > %d), "
linux-morello-ltp mailing list -- linux-morello-ltp@op-lists.linaro.org To unsubscribe send an email to linux-morello-ltp-leave@op-lists.linaro.org
linux-morello-ltp mailing list -- linux-morello-ltp@op-lists.linaro.org To unsubscribe send an email to linux-morello-ltp-leave@op-lists.linaro.org
linux-morello-ltp@op-lists.linaro.org