On Tue, Oct 25, 2022, at 17:13, Vincenzo Frascino wrote:
On 10/11/22 11:45, Arnd Bergmann wrote:
On Mon, Sep 26, 2022, at 10:09 AM, Vincenzo Frascino wrote:
I've skimmed through the documents and your current implementation to get a rough feeling where you are now. I have a number of questions and some concerns regarding how this would end up being upstreamed as well as about possible sources of bugs. Here are some initial thoughts I have:
We are not planning to upstream our work, at least in its current form. As I was mentioning in my previous email, Morello is not a committed architecture hence arm64 maintainers are not keen to support it in the long term, unless something changes. Our current focus is to enable userspace (as quickly as possible) to allow research on capabilities. We are really keen to align our implementation with upstream and contribute back (where it makes sense) because we feel it helps us to maintain our implementation and makes it simple to rebase our patches on a recent kernel.
Right, that was my understanding already.
=== System call ABI ===
- Repurposing the 32-bit compat layer to support normal LP64 tasks is clearly a clever trick to get things going. However, this is something that worries me because you end up implicitly defining the native ABI with 128-bit pointers. This will likely repeat some of the mistakes that got introduced with the first 64-bit ABIs. My feeling is that the compat namespace (struct compat_*, in_compat_syscall(), ...) is better left for 32-bit tasks, but adding a compat64_* and/or compat128_* namespace would be fine.
This approach was taken because it was the fastest for us to be able to run multiple PCuABI compatible binaries without having to port a shell and bought us time to focus on the kernel implementation. It has definitely limitations and problems, we recognize it, but it served its purpose. Once we complete the implementation of the PCuABI and enforce capability properties across the kernel (our main focus at the moment) we might revisit this. Having a different namespace for compat64 sounds a good approach.
Ok.
One immediate issue I see with the new native ABI is that it creates structures with holes in them like this one from your documentation:
struct clone_args { __aligned_u64 flags; __kernel_uintptr_t pidfd; __kernel_uintptr_t child_tid; __kernel_uintptr_t parent_tid; __aligned_u64 exit_signal; __kernel_uintptr_t stack; __aligned_u64 stack_size; __kernel_uintptr_t tls; __kernel_uintptr_t set_tid; __aligned_u64 set_tid_size; __aligned_u64 cgroup; };
 Since __kernel_uintptr_t has 128-bit alignment, there are now holes after flags, exit_signal, and stack_size. We try hard not to define ABIs like this in the kernel because that risks information leaks when copying the structure from kernel to user space (clone3 does not copy its arguments back, but other syscalls do).
I discussed this topic with the team and we agree on the potential risk of leaking information when data structures containing holes are copied from kernel to userspace.
Said that, we tried to define our PCuABI with two concepts in mind: maintainability and simplicity in porting applications that today do not support capabilities. One of the approaches we could have followed to prevent holes would have been to rearrange the data structures (this is possible because we are defining a new ABI) but it means increasing the complexity in maintaining different data structures across ABIs and makes generally more difficult porting applications because it might break some userspace assumptions.
This does not mean that we take the leaks problem less seriously, in fact we were thinking to run, in future, KMSAN sanitizer [1] (support recently merged upstream) as part of our CI, which should be able to detect these kind of problems for both structs on the stack and on the heap. We have a ticket against our project that describes the details [2].
Note: It requires LLVM 14.0.6 for arm64 support.
[1] https://www.kernel.org/doc/html/latest/dev-tools/kmsan.html [2] https://git.morello-project.org/morello/kernel/linux/-/issues/38
I've created an account there now and subscribed to this issue.
Most likely we should end up with a combination of new interfaces where appropriate and other cases where the implicit padding gets turned into explicit padding with an #ifdef. We can discuss this further in that issue.
=== ioctl interfaces ===
- Getting driver ioctl interfaces right is going to be the most tedious part of this work. You have so far sidestepped a lot of this by not submitting the code to the upstream kernel and only supporting a small set of drivers that actually make sense on the Morello SoC, but I would still hope that we can come up with a plan that lets us do this in the mainline kernel in a way that is helpful to both you and others that need changes to the way we handle ioctls.
I hope so as well, and I agree on the fact that dealing with ioctl() is the most tedious part of the whole project. We decided therefore to leave it for the end and enable in the meantime only what was strictly needed to focus on the PCuABI.
How do you plan to deal with cases where you currently have an accidental ABI that may or may not work when the correct one involves an ABI change? If you do a series of internal releases that are not required to be strictly backwards compatible, that won't be much of a problem, but I'm worried that defining interfaces ad-hoc causes more work later if you need to stay compatible.
Arnd