On 18/10/2022 18:46, Tudor Cretu wrote:
On 18-10-2022 17:17, Teo Couprie Diaz wrote:
On 18/10/2022 15:18, Tudor Cretu wrote:
On 18-10-2022 13:49, Teo Couprie Diaz wrote:
On 18/10/2022 13:07, Tudor Cretu wrote:
Hi Teo,
On 18-10-2022 09:46, Teo Couprie Diaz wrote:
Direct usage of brk is discouraged in favor of using malloc and
s/and/as maybe?
I guess it might be a consequence of the removal from the standard, not sure ? I can replace it if you feel that's better.
No preference really.
it was removed from POSIX in POSIX.1-2001. In particular, the musl libc always returns -ENOMEM for brk which means that the LTP tests fail when calling the kernel directly would pass.
I find this sentence a bit confusing, maybe something along this lines would be more clear:
In particular, the musl libc's brk always returns -ENOMEM which causes the LTP tests to fail. Invoking the syscall directly allows them to pass.
That's clearer indeed, thanks !
sbrk has the same issues so this patch reworks the brk tests slightly to only use direct brk syscalls.
Do you mean that you rewrite the sbrk calls to use the brk syscall instead? If so, I would just say that given that sbrk might have the same issues, you removed the dependency on it. Also, I would make it clear why tst_syscall(__NR_brk, 0) can replace sbrk(0). Up to you!
Exactly. I can try to reword it to make it clearer as you suggest. Explaining why it can replace sbrk is a great point.
Signed-off-by: Teo Couprie Diaz teo.coupriediaz@arm.com
As per internal discussion I believe this patch could be proposed upstream once it is in a good shape. It would fix the brk tests for musl-based systems like Alpine where they currently fail. Note that sbrk is not fixed by this patch as it is fully implemented in the libc: there is no syscall.
testcases/kernel/syscalls/brk/brk01.c | 22 ++++++++++++++++++---- testcases/kernel/syscalls/brk/brk02.c | 17 ++++++++++------- 2 files changed, 28 insertions(+), 11 deletions(-)
diff --git a/testcases/kernel/syscalls/brk/brk01.c b/testcases/kernel/syscalls/brk/brk01.c index 5f482b943..c16b46eaa 100644 --- a/testcases/kernel/syscalls/brk/brk01.c +++ b/testcases/kernel/syscalls/brk/brk01.c
Missing: #include "lapi/syscalls.h" I think
Correct, it slipped in the next patch.
@@ -16,7 +16,15 @@ void verify_brk(void) size_t inc = getpagesize() * 2 - 1; unsigned int i; - cur_brk = (uintptr_t)sbrk(0); + long ret = tst_syscall(__NR_brk, 0);
Is there any need to have ret? I suggest keeping cur_brk.
See below.
+ if (ret < 0) + {
Open brace should be on the previous line. Also below.
+ tst_res(TFAIL, "Failed to get current break"); > + return; + }
I am not sure that the result of the brk syscalls can ever be less than 0. The Linux syscall returns the new program break on success, while on failure, the syscall returns the current break. It should always fail if you call tst_syscall(__NR_brk, 0), so it should always give the current program break. I believe this is what Musl also does for sbrk(0). Therefore, there is no need to have this block. Also below.
In normal operation it cannot, but in one case it can. See `mm/mmap.c` line 204, in the definition of brk :
if (mmap_write_lock_killable(mm)) return -EINTR;
So there's one case where it returns an errno and thus is negative. Using cur_brk would overflow when hitting the error as it is an unsigned variable, this is why I added the ret variable.
However, I might be missing something regarding errno, syscall errors and EINTR which would make it so it really only returns the previous break or the new one.
Commit dc0ef0df7b6a90892ec41933212ac701152a254c says this:
" ... Most of the patches are really trivial because the lock is help from a shallow syscall paths where we can return EINTR trivially and allow the current task to die (note that EINTR will never get to the userspace as the task has fatal signal pending). ... "
So, I don't think brk can ever return EINTR to the userspace.
From what I gather of the commit message context it seems to relate to an out-of-memory situation, so I can understand that it would not reach userspace.
And regarding EINTR, the man page for signal(7) says: " If a signal handler is invoked while a system call or library function call is blocked, then either:
- the call is automatically restarted after the signal handler
returns; or
- the call fails with the error EINTR.
Which of these two behaviors occurs depends on the interface and whether or not the signal handler was established using the SA_RESTART flag "
In our case, the syscall would just be restarted as we don't have a signal handler without the SA_RESTART flag.
But we don't really have a signal handler _with_ SA_RESTART either, right ? Which I guess means that anyway if there's a signal that pops up and can prevent the mmap lock, we wouldn't have a signal handler for it and it would kill the test anyway ?
I guess what confuses me is that I don't really understand the mechanism that would prevent EINTR to be returned to userspace, given that's the point of the error return and that you need to handle it for other syscalls. My interpretation of the signal man page would be that either you have a signal handler with the flag and it restarts, or you don't and you get EINTR. But I can't find any signal handler that would be set in those tests that would have SA_RESTART, so it would be possible to see it in userspace.
I did see it in strace, so it seems to be restarting and not ending up in userspace, but I feel like I'm missing something to understand.
I think mentioning the signal man page was misleading from my side. I don't think the signal system has anything to do with regards to brk/mmap returning EINTR or not actually.
Ok, I was confused and felt like I was missing something !
I think that what stops brk/mmap to actually return EINTR is that the only path that reach that point is already on a task with fatal signal pending. At least this is what I understand from tracking the code a bit and this commit: 916633a40370.
I agree, that would make the most sense.
You're right related that we don't have a signal handler here. If we don't have a signal handler, then the signal default action is executed, which for some, might be to try to restart the signal.
Hope I clarified a bit all the confusion I created. 🙂
Yes, I think we agree and it's much clearer for me ! I'll remove the checks and related code then.
Thanks, Tudor
Thanks for the discussion ! Téo
+ cur_brk = (uintptr_t)ret; > for (i = 0; i < 33; i++) { switch (i % 3) { @@ -31,11 +39,17 @@ void verify_brk(void) break; } - TST_EXP_PASS_SILENT(brk((void *)new_brk), "brk()"); - if (!TST_PASS) + tst_syscall(__NR_brk, new_brk);
In the original code, if brk((void *)new_brk), "brk()") fails, the test prints a TFAIL message and ends. This doesn't happen anymore. Just keep the TST_EXP_PASS_SILENT and only replace the brk(..) with tst_syscall(__NR_brk, ...).
According to the documentation :
> The 'TST_EXP_PASS()' can be used for calls that return -1 on failure and 0 on > success. It will check for the return value and reports failure if the return > value is not equal to 0. The call also sets the 'TST_PASS' variable to 1 if > the call succeeeded.
As tst_syscall returns with the return value of the syscall, I don't believe it would work. (The only difference with the _SILENT version is the announcement or not of the PASS.) I could create a wrapper to replace that but that seems unnecessary as I already check that it hasn't errored and the following test checks that the result is correct.
Indeed, it wouldn't work. Also, doing brk twice seems unnecessary. I think you could assign "cur_break = tst_syscall(__NR_brk, new_brk)" directly and just keep the check for cur_brk != new_brk. Would that work?
I believe it would indeed, I will make the changes. I tried to stay close to the original code which is why it is weirdly duplicated.
+ ret = tst_syscall(__NR_brk, 0);
+ if (ret < 0) + > + tst_res(TFAIL, "Failed to get current break"); return; + } - cur_brk = (uintptr_t)sbrk(0); + cur_brk = (uintptr_t)ret; if (cur_brk != new_brk) { tst_res(TFAIL, diff --git a/testcases/kernel/syscalls/brk/brk02.c b/testcases/kernel/syscalls/brk/brk02.c index 11e803cb4..19514febb 100644 --- a/testcases/kernel/syscalls/brk/brk02.c +++ b/testcases/kernel/syscalls/brk/brk02.c @@ -14,24 +14,27 @@ #include <unistd.h> #include <sys/mman.h> #include "tst_test.h" +#include "lapi/syscalls.h" void brk_down_vmas(void) { - void *brk_addr = sbrk(0); + long ret = tst_syscall(__NR_brk, 0);
I suggest keeping brk_addr here as well. No need to store the result into a long that you cast to void* afterwards. You can change the brk_addr's type to uintptr_t though if you want.
Same as above, will change depending on what we decide.
- if (brk_addr == (void *) -1) - tst_brk(TBROK, "sbrk() failed"); + if (ret < 0) + tst_brk(TBROK, "Failed to get current break");
Same as above, tst_syscall(__NR_brk, 0) never returns below 0, so no need to check it.
Same as above, will change depending on discussion.
Good patch overall, only a few nits! Make sure to verify the LTP patches against checkpatch as well.
Thanks for the review Tudor !
Thanks, Tudor
Thanks, Téo
+ void *brk_addr = (void *) ret; unsigned long page_size = getpagesize(); void *addr = brk_addr + page_size; - if (brk(addr)) { + if (tst_syscall(__NR_brk, addr) < addr) { tst_res(TFAIL | TERRNO, "Cannot expand brk() by page size"); return; } addr += page_size; - if (brk(addr)) { + if (tst_syscall(__NR_brk, addr) < addr) { tst_res(TFAIL | TERRNO, "Cannot expand brk() by 2x page size"); return; } @@ -42,12 +45,12 @@ void brk_down_vmas(void) } addr += page_size; - if (brk(addr)) { + if (tst_syscall(__NR_brk, addr) < addr) { tst_res(TFAIL | TERRNO, "Cannot expand brk() after mprotect"); return; } - if (brk(brk_addr)) { + if (tst_syscall(__NR_brk, brk_addr) != brk_addr) { tst_res(TFAIL | TERRNO, "Cannot restore brk() to start address"); return; }