On 18/10/2022 17:18, Szabolcs Nagy wrote:
The 10/18/2022 16:46, Kevin Brodsky wrote:
On 17/10/2022 19:06, Szabolcs Nagy wrote:
currently a capability store to shared memory (MAP_SHARED) segfaults, but process shared robust mutex requires this to work.
the linux design for robust mutex is that each thread has a list of shared robust mutex objects (see set_robust_list system call) that the kernel can walk on thread exit and wake all other waiters of that mutex (with FUTEX_OWNER_DIED).
the list pointer is stored in the mutex object itself which is in shared memory in case of a process shared robust mutex.
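For reference, the per-thread list head described above can be inspected from userspace. The following is a minimal sketch, assuming a Linux system with the kernel uapi headers installed; the helper name query_robust_list is made up for illustration. It uses the get_robust_list system call to ask the kernel which head is currently registered for the calling thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for the syscall() declaration */
#endif
#include <assert.h>
#include <linux/futex.h>       /* struct robust_list, struct robust_list_head */
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: query the robust list head registered for the
 * calling thread (pid 0 means "this thread").
 * Returns 0 on success, -1 on error with errno set.
 * *head may come back NULL if libc has not registered a list yet.
 */
int query_robust_list(struct robust_list_head **head, size_t *len)
{
    return (int)syscall(SYS_get_robust_list, 0, head, len);
}
```

When a head is registered, its futex_offset and list.next members can be read directly. Whether libc registers the head at thread start or only on first use of a robust mutex depends on the libc.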
Thank you for bringing this up, this is a very unusual piece of ABI indeed and I wasn't familiar with the details myself.
It does feel like some significant change is required to keep this mechanism working, I have a few ideas (summarised below) but none of them is ideal.
I wonder if the list pointer really has to be in the mutex object? Would it be somehow possible to store all the list pointers in some internal libc struct, and only store some index in the mutex object? Granted, that defeats the original design of robust futexes, but it would be the safest option from a CHERI perspective.
the robust list head contains a futex_offset which is the distance between the futex and list pointer members within the mutex object so when the kernel walks the list it can wake on the futex.
https://git.morello-project.org/morello/kernel/linux/-/blob/morello/master/i...
with this approach i don't think we can store the list pointer elsewhere
Right, of course, I hadn't realised the kernel needed to find where the futex is from the address of the list pointer. In this case I agree the list pointer clearly cannot be stored anywhere but next to the mutex object (i.e. in the shared mapping here).
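To make the futex_offset dependency concrete, here is a small userspace simulation of the walk. It is only a sketch: demo_mutex, demo_head and the helper functions are hypothetical names, the layout is not the one any real libc uses, and the real kernel code additionally checks ownership and handles list_op_pending. The point is that the walker only holds the address of a list node and reaches the futex word by adding futex_offset to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OWNER_DIED 0x40000000u   /* same value as FUTEX_OWNER_DIED */

/* Hypothetical layout: the list node lives inside the mutex itself. */
struct list_node { struct list_node *next; };

struct demo_mutex {
    uint32_t futex;              /* lock word the walker must reach */
    struct list_node node;       /* robust-list link */
};

/* Simplified stand-in for struct robust_list_head. */
struct demo_head {
    struct list_node list;       /* circular: empty list points at itself */
    long futex_offset;           /* futex address minus node address */
};

void head_init(struct demo_head *h)
{
    h->list.next = &h->list;
    h->futex_offset = (long)offsetof(struct demo_mutex, futex)
                    - (long)offsetof(struct demo_mutex, node);
}

/* Link a mutex at the front of the list. */
void head_push(struct demo_head *h, struct demo_mutex *m)
{
    m->node.next = h->list.next;
    h->list.next = &m->node;
}

/* Simulated exit-time walk: mark every listed futex, return the count. */
int walk_and_mark(struct demo_head *h)
{
    int n = 0;
    for (struct list_node *p = h->list.next; p != &h->list; p = p->next) {
        /* Only the node address is known; the futex is found via the offset. */
        uint32_t *futex = (uint32_t *)((char *)p + h->futex_offset);
        *futex |= OWNER_DIED;
        n++;
    }
    return n;
}
```

If the node were stored somewhere other than inside the mutex object, the single per-thread futex_offset would no longer lead from a node to its futex.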
If this option is not realistic, I have an alternative idea, based on two assumptions:
1. It is possible to change the mmap() call used to set up the shared mapping.
2. Userspace never reads list pointers (or at least it doesn't need to retrieve a valid capability), it only writes them.
well the code does read pointers back when it removes items from the list.
only the thread that wrote a particular pointer may access it.
If both are true (please confirm as I'm really not sure!), then it should be possible to make mmap() return a capability with StoreCap but without LoadCap. This way userspace could write capabilities there as needed, which seems safe as it could never read them back. The kernel would then rebuild capabilities when walking the list to add the LoadCap permission. This is unfortunately a fairly complicated scheme.
If everything else fails I think we will have no choice but to disable capability checking here, that is libc would have to store invalid capabilities and the kernel would fake their tag when walking the list. I don't think we can reasonably have a shared mapping that allows both reading and writing capabilities here, because it would be very easy to exchange arbitrary capabilities between processes by abusing this mapping.
instead of a pointer it could be just an address, but then removing an item from the list needs a lookup table in libc to get an actual capability to the prev/next entries. this requires memory, so it introduces new failure modes in pthread_mutex_init. the kernel side would also have to create a capability from the address.
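A rough sketch of what such a lookup table could look like, purely for illustration: the names (table_register, table_lookup, table_unregister), the fixed capacity and the linear scan are all assumptions, not anything an existing libc implements. The shared memory would hold only the integer address, while this process-private table maps it back to a usable pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SLOTS 64           /* hypothetical fixed capacity */

/* Process-private mapping from a plain address to a real pointer. */
struct slot {
    uint64_t addr;               /* integer address, as stored in shared memory */
    void *ptr;                   /* pointer (capability) usable by this process */
};

static struct slot table[TABLE_SLOTS];

/* Register a mutex; returns 0 on success, -1 if the table is full. */
int table_register(void *p)
{
    assert(p != NULL);
    for (size_t i = 0; i < TABLE_SLOTS; i++) {
        if (table[i].ptr == NULL) {
            table[i].addr = (uint64_t)(uintptr_t)p;
            table[i].ptr = p;
            return 0;
        }
    }
    return -1;                   /* the new failure mode at init time */
}

/* Turn an address read from shared memory back into a pointer, or NULL. */
void *table_lookup(uint64_t addr)
{
    for (size_t i = 0; i < TABLE_SLOTS; i++) {
        if (table[i].ptr != NULL && table[i].addr == addr)
            return table[i].ptr;
    }
    return NULL;                 /* unknown address: refuse to dereference */
}

/* Drop a mutex from the table, e.g. when it is destroyed. */
void table_unregister(void *p)
{
    for (size_t i = 0; i < TABLE_SLOTS; i++) {
        if (table[i].ptr == p) {
            table[i].ptr = NULL;
            table[i].addr = 0;
            return;
        }
    }
}
```

An address that was never registered by this process simply fails the lookup, so a value written by another process cannot be turned into a dereferenceable pointer on the libc side.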
Right so this is the second thing I hadn't realised - libc needs to be able to walk the list to remove items, so userspace is both a reader and a writer. That eliminates my remaining ideas :(
It does feel that this piece of ABI is fundamentally dangerous with shared mutexes: with P1 the process owning the mutex and P2 another process having access to the mutex, AFAICT nothing prevents P2 from writing an arbitrary pointer in the mutex object, causing P1 to make arbitrary accesses while walking the robust list. Simply using capabilities does not prevent this (P2 can write a valid capability that is still completely arbitrary from P1's perspective).
Having libc use a lookup table would improve the situation in userspace, as it prevents arbitrary accesses on the libc side. However the kernel side may still perform arbitrary accesses, which can be acceptable for reads but less so for the write that occurs when the kernel decides to set FUTEX_OWNER_DIED in the futex word.
Considering how fundamentally difficult it is to solve this issue (potentially requiring a whole new approach to do it safely), I wonder if it would be acceptable to leave it unresolved for the time being and add it to the list of known issues (notably in the PCuABI spec).
Kevin
not sure if we can make this work on morello.
example posix code:
#include <pthread.h>
#include <sys/mman.h>

int main()
{
	pthread_mutex_t *m = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_SHARED, -1, 0);
	pthread_mutexattr_t a;
	pthread_mutexattr_init(&a);
	pthread_mutexattr_setrobust(&a, PTHREAD_MUTEX_ROBUST);
	pthread_mutex_init(m, &a);
	pthread_mutexattr_destroy(&a);
	pthread_mutex_lock(m); // segfaults in libc
	return 0;
}

_______________________________________________
linux-morello mailing list -- linux-morello@op-lists.linaro.org
To unsubscribe send an email to linux-morello-leave@op-lists.linaro.org