On 12/07/2023 12:19, Kevin Brodsky wrote:
On 08/06/2023 10:42, Zachary Leaf wrote:
The bpf syscall does not require a compat handler for 32-bit compat. This is achieved by using u64 instead of pointer types in the bpf_attr union used to pass arguments to the syscall. This means that in a system where pointers are 32-bit, the struct/union layouts and offsets are the same as in a 64-bit arch, since the u64 field is split into two u32 fields/registers.
This greatly simplifies 32-bit compat at the small cost of requiring casting pointers passed in through the uAPI to u64 (generally via ptr_to_u64() helper functions).
This poses a problem in architectures where user pointers are longer than 64b such as Morello/PCuABI where pointers are represented as 129b capabilities. In order to extend the bpf syscall interface to accept capabilities and still retain compatibility with the existing 64/32b ABI, a 64-bit compat layer and appropriate conversions must be added to handle the different union/struct sizes caused by this pointer size mis-match.
Before extending the number of bits in union bpf_attr to accept capabilitities, set the groundwork with a compat64 handler and
"capabilitities" sounds rather wrong, better with one less "ti" :D
conversion function to take a compat64 sized bpf_attr and convert it to what will be the new native offsets.
Inbound conversion is handled upfront to minimise impact on existing code and reduce overall diff size. After dispatch_bpf the majority of code can remain unchanged. The cases where conversion back out to userspace is required are handled in subsequent commits.
Signed-off-by: Zachary Leaf zachary.leaf@arm.com
[...] +#ifdef CONFIG_COMPAT64 +static void convert_compat_bpf_attr(union bpf_attr *dest, const union compat_bpf_attr *cattr, int cmd) +{
- struct bpf_prog *prog;
- switch (cmd) {
[snip]
- case BPF_PROG_BIND_MAP:
dest->prog_bind_map.prog_fd = cattr->prog_bind_map.prog_fd;
dest->prog_bind_map.map_fd = cattr->prog_bind_map.map_fd;
dest->prog_bind_map.flags = cattr->prog_bind_map.flags;
break;
- };
Considering the really large number of assignments, I think it would be worth introducing a macro that takes dest, cattr and the name of the member to copy. That would shorten things a bit for nested structs, but most importantly it would prevent typos slipping in.
That's a good idea.
I've added a generic copy_field() macro in include/linux/stddef.h:
#define copy_field(DEST, SRC, FIELD) \ ((DEST)->FIELD = (SRC)->FIELD)
A similar macro for pointer conversion could be introduced in the last patch (which would also hide the very noisy cast to __kernel_aligned_uintptr_t).
This is a bit more tricky to name and figure out where it should live. So far I have compat_uptr_to_kern() in arch/arm64/include/asm/compat.h next to compat_ptr() define just for lack of a better place. It's kind of generic (not bpf related) but also pretty specific at the same time, hence hard to name without the name being really long. Maybe we just make it bpf specific and limit it to bpf/syscall.c + bpf/btf.c where it's used.
#define compat_uptr_to_kern(DEST, SRC, FIELD) \ ((DEST)->FIELD = (__kernel_aligned_uintptr_t)compat_ptr((SRC)->FIELD))
+} +#endif /* CONFIG_COMPAT64 */
+static int bpf_check_perms(int cmd) +{
- bool capable = bpf_capable() || !sysctl_unprivileged_bpf_disabled;
- /* Intent here is for unprivileged_bpf_disabled to block key object
* creation commands for unprivileged users; other actions depend
* of fd availability and access to bpffs, so are dependent on
* object creation success. Capabilities are later verified for
* operations such as load and map create, so even with unprivileged
* BPF disabled, capability checks are still carried out for these
* and other operations.
*/
- if (!capable &&
(cmd == BPF_MAP_CREATE || cmd == BPF_PROG_LOAD))
return -EPERM;
- return 0;
+}
+static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size) +{
- union bpf_attr attr;
- int err;
- err = bpf_check_perms(cmd);
- if (err)
return err;
- err = bpf_check_uarg_tail_zero(uattr, sizeof(attr), size);
- if (err)
return err;
- size = min_t(u32, size, sizeof(attr));
- /* copy attributes from user space, may be less than sizeof(bpf_attr) */
- memset(&attr, 0, sizeof(attr));
- if (copy_from_bpfptr_with_ptr(&attr, uattr, size) != 0)
I would prefer if the switch to the *_with_ptr() variants happened in the final patch, like in the io_uring series. That's because it is more logical this way: at this stage there is no actual pointer in bpf_attr, the need for *_with_ptr() arises when changing the types in the final patch. This also applies to patch 5-8.
Ack
return -EFAULT;
- return dispatch_bpf(cmd, &attr, uattr, size);
+}
SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size) { return __sys_bpf(cmd, USER_BPFPTR(uattr), size); } +#ifdef CONFIG_COMPAT64 +static int __sys_compat_bpf(int cmd, bpfptr_t uattr, unsigned int size) +{
- union bpf_attr attr;
- union compat_bpf_attr cattr;
- int err;
- err = bpf_check_perms(cmd);
- if (err)
return err;
- err = bpf_check_uarg_tail_zero(uattr, sizeof(cattr), size);
- if (err)
return err;
- size = min_t(u32, size, sizeof(cattr));
- /* copy attributes from user space, may be less than sizeof(bpf_attr) */
- memset(&cattr, 0, sizeof(cattr));
- if (copy_from_bpfptr_with_ptr(&cattr, uattr, size) != 0)
compat structs/unions do not hold actual pointers so we shouldn't use *_with_ptr() in that case.
Ack
return -EFAULT;
- convert_compat_bpf_attr(&attr, &cattr, cmd);
- return dispatch_bpf(cmd, &attr, uattr, sizeof(attr));
+}
+COMPAT_SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size) +{
- return __sys_compat_bpf(cmd, USER_BPFPTR(uattr), size);
+}
Adding a compat handler is the "normal" way to handle this situation, so the approach in this patch is sound. However, it seems preferable to keep a native handler only and use in_compat_syscall() in __sys_bpf() for a couple of reasons:
- AFAICT the diff in this file would be a lot smaller - just a few lines
changed in __sys_bpf(), and of course the new conversion function.
It's a bit more compact - but there's some extra handling to make sure we're selecting the right size and union for everything:
inline bool in_compat64_syscall() { // generic - can go in arch/arm64/include/asm/compat.h return IS_ENABLED(CONFIG_COMPAT64) && in_compat_syscall(); }
#define select_attr (in_compat64_syscall() ? &cattr : attr) #define attr_size (in_compat64_syscall() ? sizeof(union compat_bpf_attr) : \ sizeof(union bpf_attr)) #define bpfptr_copy (in_compat64_syscall() ? copy_from_bpfptr : \ copy_from_bpfptr_with_ptr) static int copy_bpf_attr_from_user(union bpf_attr *attr, int cmd, bpfptr_t uattr, unsigned int *size) { union compat_bpf_attr cattr; int err;
err = bpf_check_uarg_tail_zero(uattr, attr_size, *size); if (err) return err; *size = min_t(u32, *size, attr_size);
/* copy attributes from user space, may be less than sizeof(bpf_attr) */ memset(select_attr, 0, attr_size); if (bpfptr_copy(select_attr, uattr, *size) != 0) return -EFAULT;
err = check_attr(cmd, select_attr); if (err) return -EINVAL;
if(in_compat64_syscall()) convert_compat_bpf_attr(attr, &cattr, cmd);
return 0; } #undef select_attr #undef attr_size #undef bpfptr_copy
static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size) { union bpf_attr attr; int err;
err = bpf_check_perms(cmd); if (err) return err;
err = copy_bpf_attr_from_user(&attr, cmd, uattr, &size); if (err) return err;
return dispatch_bpf(cmd, &attr, uattr, size); }
- It would avoid overriding the handler in sys_compat64.c, which is not
ideal as it makes it hard to figure out what handler is actually used. The situation would be different if we were adding a "generic" compat handler, as in that case we could just change include/uapi/asm-generic/unistd.h (the normal way to do this).
I don't follow that - it adds a handler in sys_compat64.c, and the handler in bpf/syscall.c is clearly guarded by #ifdef CONFIG_COMPAT64. We can even add a comment stating the compat handler is for compat64 only.
With that approach, a further small improvement can be made. On the same model as Tudor's "io_uring: Implement compat versions of uAPI structs and handle them", __sys_bpf() could call a generic function, say copy_bpf_attr_from_user(). This function would then call the conversion function in compat64. Besides a better encapsulation, the main advantage is that the union compat_bpf_attr variable is allocated in the conversion function, and it is therefore live only for the duration of the conversion. Conversely, allocating it in __sys_compat_bpf() means it lives for the whole duration of the syscall handling. In other words, it slightly improves the stack usage. Not essential but nice to have. I think that approach could also be used in patch 5-8.
Makes sense. That should be possible with the __sys_compat_bpf() handler too, just call out into another small helper function.
Finally to reduce the amount of #ifdef, we could have a macro like io_in_compat64() that only returns true in compat64. This allows keeping most of the code out of any #ifdef, assuming struct definitions and such are not #ifdef'd either (which is not particularly useful anyway).
I had preferred the other option just because everything compat64 is behind an #ifdef; so in the CONFIG_CHERI_PURECAP_UABI=n case it's as if compat64 doesn't even exist. We don't allocate, define or otherwise do anything compat64 related when CONFIG_COMPAT64 is disabled. So what's the problem with #ifdef? Maybe that is not that useful since I doubt anyone is using this kernel without PCuABI. Just seems a bit "cleaner" and straight forward to me, at the cost of some minor duplication.
Either way should work really so I don't know if you have strong feelings one way or the other after seeing the above. I'm leaning towards __sys_compat_bpf() as before but I might have mis-understood some stuff or not implemented it correctly.
Thanks,
Zach
Kevin