On 15/09/2023 15:47, Kevin Brodsky wrote:
Nit (in title): "xyz" might be misinterpreted :D bpf_*_info would be clearer.
I went with bpf_{btf,prog,link}_info to be even clearer (:
On 12/09/2023 11:51, Zachary Leaf wrote:
[...]
+static int copy_bpf_btf_info_from_user(const union bpf_attr *attr,
struct bpf_btf_info *info,
u32 *info_len)
+{
- struct compat_bpf_btf_info cinfo;
- int err;
- void *select_info = in_compat64_syscall() ? &cinfo : info;
- size_t info_size = in_compat64_syscall() ? sizeof(struct compat_bpf_btf_info)
: sizeof(struct bpf_btf_info);
- void __user *uinfo = u64_to_user_ptr(attr->info.info);
Nit: there is normally always an empty line between declarations and the first non-declaration statement.
- *info_len = attr->info.info_len;
- err = bpf_check_uarg_tail_zero(USER_BPFPTR(uinfo),
info_size, *info_len);
It took me a long time to figure out why this suddenly appeared - eventually I realised that it got moved from bpf_btf_get_info_by_fd(). It's a small thing but could we still have it in a separate patch? That will also make the diff in syscall.c less noisy around that function.
Sure
- if (err)
return err;
- *info_len = min_t(u32, info_size, *info_len);
- memset(info, 0, sizeof(struct bpf_btf_info));
Can use sizeof(*info), similarly to the original code.
Done
Thinking some more about the zeroing, having looked at some other conversion functions, I feel that we might be missing some. There are two situations where zeroing is crucial:
- When copying in and the user-supplied struct is smaller than the
kernel one. If we miss zeroing, then we're going to read garbage from the last struct members.
I think you're right. Thinking out loud:
We're handling different userspace struct sizes by getting the size from userspace, and using that if it's smaller i.e. *info_len = min_t(u32, kernel_size, user_size);
So if the user supplied struct is smaller, then we'd only copy in user_size bytes. That part is fine.
The problem is now when we're doing the compat64 conversion, and copying fields from that compat struct into a native struct. For the last remaining members, if we never zero initialised with memset and never copied in those last fields because userspace had a smaller struct, we could copy in garbage there.
So agree we need to zero initialise there, because we don't want garbage in those fields to be used by the kernel.
I think those fields shouldn't ever get written out. When we do copy_to_user, we would again use the userspace provided size if it was smaller. Still - it's a problem because the garbage could be used elsewhere.
- When copying out the entire struct. We should have set all the
members at this point, but if we miss zeroing then any potential padding will be garbage. This is even worse than 1. as arbitrary kernel data should never be written to user memory.
Agreed, the padding could then contain garbage and get written out, so this is a problem. We need to zero initialise before writing out.
- was not an issue before because that padding was provided by
userspace, when we copy_from_user(). Overall because we now have both a native and a compat path, it gets harder to track if the zeroing is right or not. In this case, I think we're missing 1. in the compat case, since we don't zero the compat struct and then read it to initialise members of the native struct. I sense that in general we should zero both the native and compat structs.
In that case we need to explicitly memset() rather than initialise the struct with {0}, since the latter doesn't set the padding.
So we end up with if (in_compat64_syscall()) memset(cinfo, 0, sizeof(cinfo));
in all the copy to/from functions.
Similar thing for bpf_attr in relevant commit.
It's worth mentioning that because Morello Clang (possibly GCC too) supports it, we get CONFIG_INIT_STACK_ALL_ZERO=y by default (see security/Kconfig.hardening). This means that we won't actually see whether we get zeroing right or not, in fact the codegen should be unchanged. We could verify it using CONFIG_INIT_STACK_ALL_PATTERN=y, but I don't think it's worth the hassle. Let's just try our best to be consistent.
Okay thanks, good to know about security/Kconfig.hardening.
- if (copy_from_user(select_info, uinfo, *info_len))
return -EFAULT;
- if (in_compat64_syscall())
convert_compat_btf_info_in(info, &cinfo);
- return 0;
+}
+static int copy_bpf_btf_info_to_user(const union bpf_attr *attr,
union bpf_attr __user *uattr,
struct bpf_btf_info *info,
u32 *info_len)
+{
- struct compat_bpf_btf_info cinfo;
- void *select_info = in_compat64_syscall() ? &cinfo : info;
- void __user *uinfo = (void __user *)attr->info.info;
That won't build, you could stick to u64_to_user_ptr() as this is temporary anyway.
Ah yep. Missed this one or it somehow found it's way back in.
- if (in_compat64_syscall())
convert_compat_btf_info_out(&cinfo, info);
- if (copy_to_user(uinfo, select_info, *info_len) ||
bpf_put_uattr(uattr, *info_len, info.info_len))
return -EFAULT;
- return 0;
+}
int btf_get_info_by_fd(const struct btf *btf, const union bpf_attr *attr, union bpf_attr __user *uattr) {
- struct bpf_btf_info __user *uinfo; struct bpf_btf_info info;
- u32 info_copy, btf_copy;
- u32 btf_copy; void __user *ubtf; char __user *uname; u32 uinfo_len, uname_len, name_len; int ret = 0;
Nit: better remove the 0 initialiser since ret is now assigned just after.
Done.
- uinfo = u64_to_user_ptr(attr->info.info);
- uinfo_len = attr->info.info_len;
- info_copy = min_t(u32, uinfo_len, sizeof(info));
- memset(&info, 0, sizeof(info));
- if (copy_from_user(&info, uinfo, info_copy))
return -EFAULT;
- ret = copy_bpf_btf_info_from_user(attr, &info, &uinfo_len);
- if (ret)
return ret;
info.id = btf->id; ubtf = u64_to_user_ptr(info.btf); @@ -7268,9 +7332,9 @@ int btf_get_info_by_fd(const struct btf *btf, } }
- if (copy_to_user(uinfo, &info, info_copy) ||
put_user(info_copy, &uattr->info.info_len))
return -EFAULT;
- ret = copy_bpf_btf_info_to_user(attr, uattr, &info, &uinfo_len);
There is a semantic change here in case ret is set to something else than 0 (3 lines above). Given that these out functions only return -EFAULT as an error, we may be better off keeping the existing style in general, that is if (copy...) return -EFAULT.
Yeah I see. Agreed for copy..**to** here. For copy_**from**, then bpf_check_uarg_tail_zero() can return something other than -EFAULT so need to keep that as is.
[...]
+static void +convert_compat_link_info_out(struct compat_bpf_link_info *dest,
const struct bpf_link_info *info,
enum bpf_link_type type)
+{
- copy_field(dest, info, type);
- copy_field(dest, info, id);
- copy_field(dest, info, prog_id);
- switch (type) {
- case BPF_LINK_TYPE_RAW_TRACEPOINT:
copy_field(dest, info, raw_tracepoint.tp_name);
copy_field(dest, info, raw_tracepoint.tp_name_len);
return;
- case BPF_LINK_TYPE_TRACING:
copy_field(dest, info, tracing.attach_type);
copy_field(dest, info, tracing.target_obj_id);
copy_field(dest, info, tracing.target_btf_id);
return;
- case BPF_LINK_TYPE_CGROUP:
copy_field(dest, info, cgroup.cgroup_id);
copy_field(dest, info, cgroup.attach_type);
return;
- case BPF_LINK_TYPE_ITER:
copy_field(dest, info, iter.target_name);
copy_field(dest, info, iter.target_name_len);
/* remaining struct is fixed size integers, so identical in
* purecap/compat64 - since figuring out what members are active
Nit: to avoid being too specific in generic code, we could just say "any 64-bit ABI" (excluding 32-bit because alignof(__u64) is 4 on x86_32, unlike any other ABI). This is compat64-specific so 32-bit ABIs are irrelevant.
Done
* is non-trivial, memcpy to the end of the struct */
memcpy((u8 *)dest+offsetof(struct compat_bpf_link_info, iter.map),
(u8 *)info+offsetof(struct bpf_link_info, iter.map),
offsetofend(struct bpf_link_info, iter.cgroup.order) -
offsetof(struct bpf_link_info, iter.map.map_id));
We could slightly simplify this by referring to the structs and not the innermost members, that is iter.map instead of iter.map_id and iter.cgroup instead of iter.cgroup.order. The latter may result in a slightly larger copy size due to padding, but that's not an issue because we (should) have zeroed that padding to start with (see comment above). In any case, this memcpy() relies on any padding being initialised.
That works too.
Thanks,
Zach
Kevin