On 25/10/2022 16:49, Matthew Wilcox wrote:
Keeping in mind the distinction between addresses and pointers, I think that having an automated way to convert the kernel would definitely make our lives easier. Something I have played with in the past is combining Python with Coccinelle, which might be useful in this situation as well.
When I last spoke to Torvalds about uintptr_t, he was completely opposed. In fact, we already do make a distinction inside the kernel between addresses (unsigned long) and pointers (void * / void __user *).
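To make the address/pointer distinction concrete, here is a minimal userspace sketch (not kernel code). The helper names addr_align_down, ptr_advance and demo_roundtrip are hypothetical and only serve the illustration. On a conventional target the integer round trip is harmless; on a capability architecture an integer narrower than the capability would be expected to drop the capability metadata, which is why the two uses need to stay separate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: an *address* is a plain integer, so bit
 * arithmetic on it is natural. 'align' must be a power of two. */
static unsigned long addr_align_down(unsigned long addr, unsigned long align)
{
    return addr & ~(align - 1);
}

/* Hypothetical helper: a *pointer* is moved with pointer arithmetic,
 * so whatever the pointer carries (provenance, bounds) travels with it. */
static void *ptr_advance(void *p, size_t bytes)
{
    return (char *)p + bytes;
}

/* Round-trips a pointer through uintptr_t. Assumption: a conventional
 * target where uintptr_t can represent the whole pointer. */
static int demo_roundtrip(void)
{
    char buf[64];
    void *p = ptr_advance(buf, 8);
    uintptr_t a = (uintptr_t)p;   /* pointer -> integer */
    void *q = (void *)a;          /* integer -> pointer */

    return q == p;
}
```

Code that only ever needs the numeric value (alignment, range checks) can keep using the integer form; code that must later dereference the value should keep it as a pointer type throughout.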
As I said in my previous email, this all depends on whether or not we decide to support capability-based architectures in the kernel together with 128-bit ones.
If we say we do not support capability-based architectures in the kernel, this approach is perfect (e.g. RV128).
If we decide to support capability-based architectures, the first thing we need to consider is that userspace on these architectures will most likely not artificially widen unsigned long to 128 bits (it will continue to use 64-bit ones), mainly for memory footprint and performance reasons. In such a scenario, in my view, we (as in Morello) are "forced" to head towards a pure capability kernel (not a big deal, since it is my preferred choice), but we are still left with one mass conversion at the ABI level. In fact, IMHO we will need to replace unsigned long with something like __kernel_ulong_t (already present in the kernel) or similar, to distinguish the semantic meaning of longs in userspace from what the kernel does.
One open question on this front is the performance impact of such a change, since we have not yet had time to measure it.
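The ABI-level replacement described above can be sketched roughly as follows. This is only an illustration under stated assumptions: abi_ulong_t, struct demo_abi_args and demo_pack are hypothetical names, and the fixed 64-bit typedef stands in for whatever type (something like __kernel_ulong_t) would carry the userspace meaning of "long". The point is that the ABI layout is pinned independently of how wide the kernel's own unsigned long ends up being.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the userspace view of "unsigned long":
 * assumed to stay 64-bit even if the kernel's own long is widened. */
typedef uint64_t abi_ulong_t;

/* Hypothetical argument block shared with userspace. Its layout depends
 * only on abi_ulong_t, not on the kernel's unsigned long. */
struct demo_abi_args {
    abi_ulong_t flags;
    abi_ulong_t length;
};

_Static_assert(sizeof(abi_ulong_t) == 8, "ABI long is pinned to 64 bits");
_Static_assert(sizeof(struct demo_abi_args) == 16, "ABI layout is fixed");

/* Converts kernel-side values to the ABI representation at the boundary. */
static struct demo_abi_args demo_pack(unsigned long flags, unsigned long length)
{
    struct demo_abi_args args;

    args.flags = (abi_ulong_t)flags;
    args.length = (abi_ulong_t)length;
    return args;
}
```

Keeping the conversion in one boundary helper is a design choice of the sketch, not something stated in this thread; it just makes the places where the two meanings of "long" meet easy to find.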
An alternative, for the sake of completeness, would be to say that unsigned long can contain capabilities, but I mention it only for academic reasons, because I think we all agree that making unsigned long 129 bits wide (128-bit data + capability valid bit) would have a devastating effect on performance.
Yes, there are places where we do confuse the two, but I think they're fixable.
In our current implementation we redefine the meaning of __user to capability, to make sure that we propagate capabilities at the ABI level correctly. This approach clearly has some limitations and is not meant for upstream, but it was the fastest way we found to enable userspace.
Another advantage of this approach is that it identifies all the places where __user is used improperly (the kernel does not build, so we cannot bypass them). We fixed a few, and where it makes sense we are contributing our findings back upstream, for instance [1].
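As a rough illustration of the mechanism (not the actual Morello code), the annotation can be thought of as a macro placed on every pointer that crosses the user/kernel boundary. In the sketch below demo_user is a hypothetical stand-in for __user and demo_copy_from_user is a hypothetical helper; the macro expands to nothing here so the example builds with an ordinary compiler. The assumption is that a capability port would expand it to the compiler's capability qualifier instead, at which point any place that mixes annotated and plain pointers stops compiling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's __user annotation. It expands
 * to nothing here; a capability port is assumed to expand it to a
 * capability qualifier so the type system enforces the annotation. */
#define demo_user

/* Hypothetical helper in the style of a user-copy routine: returns the
 * number of bytes that could NOT be copied (0 on success). */
static size_t demo_copy_from_user(void *dst, const void demo_user *src, size_t n)
{
    if (dst == NULL || src == NULL)
        return n;               /* nothing copied */

    memcpy(dst, (const void *)src, n);
    return 0;
}
```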
[1] https://lore.kernel.org/all/20220907121230.21252-1-vincenzo.frascino@arm.com...
This was what I was trying to propose in my talk, that we widen both long and pointer to 128-bit at the same time. It means we don't need the mass-conversion of long -> intptr_t.
We watched your presentation very carefully because we think that most of the concepts are in line with a pure capability-based kernel implementation, and we believe that this is the only sane way to bring capabilities upstream.
Though the doubt I presented above still stands.