Hello all,
Apologies for taking so long to send a v2. It needed a few big reworks, which
were difficult to find time for.
They are now done, and all the comments raised on v1 are hopefully addressed,
as are the few issues I raised myself.
Now that I have everything set up again and the changes should be smaller,
I should be able to address comments much quicker!
This patch properly restricts the bounds of argv and envp strings in purecap.
It handles the padding and alignment changes required for setting exact bounds.
The strings are still passed to userspace on the stack. Changing this would be
covered by some future work.
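To illustrate the padding and alignment problem, here is a minimal, portable sketch of placing one string below a given stack address. It is not taken from the patch: `struct placement` and `place_below()` are hypothetical names, and the `align` parameter is an assumption standing in for whatever alignment exact bounds would require for a given length. On Morello that requirement would come from the capability encoding (for instance through `cheri_representable_length()` and `cheri_representable_alignment_mask()` from <cheriintrin.h>), not from a caller-supplied constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: where the string goes and how much room it takes. */
struct placement {
	uint64_t base;		/* aligned start address of the padded region */
	size_t padded_len;	/* string length rounded up to the alignment */
};

/*
 * Place a string of `len` bytes below `top` on a downward-growing stack.
 * `align` must be a power of two; it stands in for the alignment that
 * exact bounds would need (an assumption of this sketch).
 */
static struct placement place_below(uint64_t top, size_t len, uint64_t align)
{
	struct placement p;

	assert(align != 0 && (align & (align - 1)) == 0);

	/* Round the length up so the bounds can cover the string exactly. */
	p.padded_len = (size_t)((len + align - 1) & ~(align - 1));
	/* Move down by the padded length, then align the base downwards. */
	p.base = (top - p.padded_len) & ~(align - 1);
	return p;
}
```

For example, `place_below(0x10000, 20001, 64)` reserves 20032 bytes starting at address 45504, so 31 bytes of padding follow the string. With an alignment of 1, no padding is added at all.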
# Remarks
I am a bit unsure of a few things. Mainly, how the capabilities passed to
`put_str_array()` are copied and need updating outside, and the way I force the
alignment of the stack in `setup_arg_pages()`.
# Testing
For validation purposes, LTP was run in purecap with the full syscalls suite
and the morello skip lists, with no regression observed.
To quickly test the patch, one can build a program that tries to access an argv
string outside of its bounds. Here is a sample program I used for testing.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <inttypes.h>

    int main(int argc, char **argv)
    {
        printf("argc : %d\n", argc);

        /* First pass: in-bounds accesses only, always expected to work. */
        for (int i = 0; i < argc; i++) {
            size_t arglen = strlen(argv[i]);
            printf("arg #%d (len %zu):\n", i, arglen);
            printf("\t%s\n", argv[i]);
        }

        /* Second pass: deliberately read past the end of each string. */
        for (int i = 0; i < argc; i++) {
            size_t arglen = strlen(argv[i]);
            printf("arg #%d @ %zu:\n", i, arglen + 5);
            printf("\t%c\n", argv[i][arglen + 5]);
        }

        return 0;
    }
Without the patch, or in COMPAT: all the args are printed, followed by argc
out-of-bounds characters.
With the patch, in purecap: all the args are printed, but the program segfaults
before the first out-of-bounds character is printed.
You can generate a large enough argument that would need padding with this:
    bigarg=""; count=0
    while [ $(echo $bigarg | wc -c) -le 20000 ]; do
        bigarg=${bigarg}"==${count}==This_should_be_a_really_big_arg_string"
        count=$(($count + 1))
    done
Thanks in advance for your comments,
Best regards
Téo
# Changes from v1[0]
- Rebased on top of the last release
- Update to make use of the properly derived stack capability in binfmt_elf
- Move the copying of argv and envp strings in binfmt_elf to a helper function
- Check for padding after an arg only
- Greatly simplify the change in exec by completely skipping the padding,
rather than looping through it and allocating pages in exec
- Add more details to the comments explaining the padding process
- Rename most variables. I'm not great with names so hopefully they are OK,
otherwise feel free to suggest new ones !
- Proper COMPAT handling, but with a slight loss compared to a regular kernel
- Detail trade-off in comments and in commit message
- Force a greater stack alignment in exec to mitigate issues during relocation
- Simplify accesses and use `get_user()` rather than `copy_from_user()`
- Use wrappers provided by <cheriintrin.h> rather than builtins
- Get rid of the elf_stack_put_user_cap macro
Gitlab patch for review :
https://git.morello-project.org/Teo-CD/linux/-/commit/089b18d666efd567530a7…
[0]: https://op-lists.linaro.org/archives/list/linux-morello@op-lists.linaro.org…
Teo Couprie Diaz (1):
fs: Handle exact bounds for argv and envp
fs/binfmt_elf.c | 111 ++++++++++++++++++++++++++++++++++++++++--------
fs/exec.c | 50 ++++++++++++++++++++--
2 files changed, 139 insertions(+), 22 deletions(-)
--
2.34.1
Hi,
Here's v2 of the bpf syscall patches, following on from the RFC[1] and
v1[2].
The bpf syscall is updated to propagate user pointers as capabilities in
the pure-capability kernel-user ABI (PCuABI). It also includes an
approach to support the existing aarch64 ABI (compat64).
One complication is that this syscall multiplexes many sub-commands,
some of which are themselves multiplexed with a number of nested
options.
A further complication is that the existing syscall uses a trick of
storing user pointers as u64 to avoid needing a compat handler for
32-bit systems. To retain compatibility with the aarch64 ABI and add
Morello support, special compat64 conversion and handling is
implemented.
Inbound (userspace->kernel) conversion between compat64/native struct
layouts is handled upfront on entry to the syscall (with the exception
of bpf_xyz_info structs). This minimises changes to subcommand handlers.
Some subcommands require conversion back out to userspace and that is by
necessity handled where it occurs.
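As a rough illustration of the inbound conversion described above, the sketch below converts a compat64-style struct, where the user pointer travels as a plain u64, into a native-style struct with a real pointer field. Everything here is hypothetical: `struct demo_attr`, `struct demo_attr_compat64`, `DEMO_COPY_FIELD()` and `demo_attr_from_compat64()` are invented for this example and are not the series' definitions. In particular, the real code deals with user capabilities rather than a plain `void *`, which this portable sketch cannot show.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical compat64 layout: the user pointer is stored as a u64. */
struct demo_attr_compat64 {
	uint32_t prog_type;
	uint32_t insn_cnt;
	uint64_t insns;		/* address only, no pointer metadata */
};

/* Hypothetical native layout: the same field has a pointer type. */
struct demo_attr {
	uint32_t prog_type;
	uint32_t insn_cnt;
	const void *insns;
};

/* Assumed helper for same-named, same-meaning fields (illustrative only). */
#define DEMO_COPY_FIELD(dst, src, field) ((dst)->field = (src)->field)

/* Convert once, up front, so later code only sees the native layout. */
static void demo_attr_from_compat64(struct demo_attr *dst,
				    const struct demo_attr_compat64 *src)
{
	assert(dst && src);

	memset(dst, 0, sizeof(*dst));
	DEMO_COPY_FIELD(dst, src, prog_type);
	DEMO_COPY_FIELD(dst, src, insn_cnt);
	/* Pointer fields need an explicit conversion from the u64 address. */
	dst->insns = (const void *)(uintptr_t)src->insns;
}
```

Doing the conversion at the entry point means the sub-command handlers can keep working on a single struct layout, which matches the stated goal of minimising changes to them.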
Patch 1 is not essential to this series but it's a nice debug feature to
have and works[3]. It enables BPF_PROG_TYPE_TRACEPOINT which many eBPF
kselftests use.
Patches 2-4 and 9 are setup and helper functions.
Patches 5-7 implement the core compat64 handling. Each commit compiles
cleanly but relevant parts will be broken in between.
Patch 8 fixes a check so that it also covers configs passed in via compat64.
Patch 10 finally enables capabilities in the kernel.
Patches 11 and 12 handle uaccess that occurs in two eBPF helper functions.
Testing-wise, see the associated LTP changes below, as posted to the LTP
mailing list[4]. The eBPF LTP tests are fairly minimal and test only a small
part of the changes here. There's a new test covering CHECK_ATTR from
patch 8.
The kernel kselftests contain much more extensive eBPF tests. They can
be built fairly easily natively on aarch64 which is useful for testing
compat64. More work needs to be done here though to:
a) enable out-of-tree cross-compilation for purecap as well as
x86->aarch64
b) replace ptr_to_u64() with casts to uintptr_t in tests
c) general libbpf/bpftool enablement and fixes since many tests rely
on this
d) CONFIG_DEBUG_INFO_BTF is required for many tests, but this requires
the build system to have a recent version of the pahole tool
Next steps once we have the core kernel support would be porting libbpf
and bpftool for purecap plus work on enabling kselftests as above.
Kernel branch available at:
https://git.morello-project.org/zdleaf/linux/-/tree/morello/bpf_v2
Associated LTP test/changes at:
https://git.morello-project.org/zdleaf/morello-linux-test-project/-/tree/mo…
Thanks,
Zach
[1] [RFC PATCH 0/9] update bpf syscall for PCuABI/compat64
https://op-lists.linaro.org/archives/list/linux-morello@op-lists.linaro.org…
[2] [PATCH 00/10] update bpf syscall for PCuABI/compat64
https://op-lists.linaro.org/archives/list/linux-morello@op-lists.linaro.org…
[3] [PATCH v3 0/5] Restore syscall tracing on Morello
https://op-lists.linaro.org/archives/list/linux-morello@op-lists.linaro.org…
[4] [PATCH 0/3] add eBPF support
https://op-lists.linaro.org/archives/list/linux-morello-ltp@op-lists.linaro…
-----------------------------------------------------------------------
v2:
- Fixed copy_from_bpfptr_offset_with_ptr - this no longer uses sockptr
as of 6.1 (see patch 9)
- Rebase on 6.4 - new struct members need conversion/handling:
- New BPF_PROG_TYPE_NETFILTER + associated bpf_attr BPF_LINK_CREATE
members
- New bpf_link_info types BPF_LINK_TYPE_NETFILTER +
BPF_LINK_TYPE_STRUCT_OPS
- Renamed in_32bit_compat_syscall() to in_compat32_syscall() and added
in_compat64_syscall()
- Handled uaccess from bpf helper programs
bpf/helpers.c:bpf_copy_from_user + bpf_copy_from_user_task
- Added new stddef.h macro copy_field() to simplify conversion
assignments
- bpf: compat64: add handler and convert bpf_attr in
- Replaced #ifdef CONFIG_COMPAT64 with in_compat64_syscall() + use
copy_bpf_attr_from_user() helper function for inbound compat
conversion
- This removes the __sys_compat_bpf compat64 handler, now using the
existing __sys_bpf + in_compat64_syscall() for reduced diff size,
less duplication and clearer code
- Conversion now happens in copy_bpf_attr_from_user() for better
encapsulation + stack usage
- bpf: compat64: bpf_attr convert out
- Renamed PUT_USER_ATTR() to bpf_put_uattr() + moved to bpf.h
- Introduced bpfptr_put_uattr() in bpfptr.h
- Originally missed handling cases writing out to userspace with
copy_to_bpfptr_offset() - now replaced with new bpfptr_put_uattr()
macro
- 6.4 now has a new field showing what log size will be needed - see
47a71c1f9af0 ("bpf: Add log_true_size output field to return
necessary log buffer size")
- This also requires new macro bpf_field_exists() to handle compat64
when checking that userspace supports this new field
- bpf: compat64: handle bpf_xyz_info
- Simplified bpf_link_info conversion handling using memcpy
- Removed bpf_map_info patch since struct is the same in
compat64/native (no ptrs)
- Replaced #ifdef CONFIG_COMPAT64 with in_compat64_syscall() + use
copy_bpf_xyz_info_{from,to}_user() helper functions for compat
conversions
- Merged bpf_{btf,prog,link}_info into a single patch
- bpf: use user pointer types in uAPI structs
- Added new compat_uptr_to_kern() macro to simplify casting/converting
in compat ptrs
- Usage of copy_{to,from}_user_with_ptr variants now correctly applied
with new helpers
-----------------------------------------------------------------------
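The idea of checking whether userspace supports a newer field, mentioned in the changelog above, can be sketched as a size comparison: a field "exists" for a given caller if the struct size that caller passed in reaches the end of that field. The names below (`struct demo_log_attr`, `DEMO_OFFSETOFEND()`, `DEMO_FIELD_EXISTS()`) are hypothetical and this is not the definition of bpf_field_exists(). Presumably the real macro also has to select the compat64 or native layout, since the field offsets can differ between the two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute struct that gained `log_true_size` later on. */
struct demo_log_attr {
	uint64_t log_buf;
	uint32_t log_size;
	uint32_t log_level;
	uint32_t log_true_size;	/* newer field; older callers pass less */
};

/* Offset of the first byte after `field` within `type`. */
#define DEMO_OFFSETOFEND(type, field) \
	(offsetof(type, field) + sizeof(((type *)0)->field))

/* True if a caller that passed `usize` bytes knows about `field`. */
#define DEMO_FIELD_EXISTS(type, field, usize) \
	((usize) >= DEMO_OFFSETOFEND(type, field))

/* With the layout above, the new field ends at byte 20. */
static_assert(DEMO_OFFSETOFEND(struct demo_log_attr, log_true_size) == 20,
	      "unexpected layout for the demo struct");
```

A caller that passed only 16 bytes would therefore not be written to at `log_true_size`, while one that passed the full struct would.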
Zachary Leaf (12):
arm64: morello: enable syscall tracing
arch: rename to in_compat32_syscall
arch: add compat helpers specific to 64-bit
stddef: introduce copy_field helper
bpf: compat64: add handler and convert bpf_attr in
bpf: compat64: bpf_attr convert out
bpf: compat64: handle bpf_xyz_info
bpf: compat64: support CHECK_ATTR macro
bpf: copy_{to,from}_user_with_ptr helpers
bpf: use user pointer types in uAPI structs
bpf: use addr for bpf_copy_from_user_with_task
bpf: use addr for bpf_copy_from_user
.../morello_transitional_pcuabi_defconfig | 3 +-
arch/arm64/include/asm/compat.h | 5 +
arch/sparc/include/asm/compat.h | 2 +-
arch/x86/include/asm/compat.h | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 +-
drivers/input/input.c | 2 +-
drivers/media/rc/bpf-lirc.c | 7 +-
fs/ext4/dir.c | 2 +-
fs/nfs/dir.c | 2 +-
include/linux/bpf.h | 13 +
include/linux/bpf_compat.h | 423 ++++++++
include/linux/bpfptr.h | 27 +-
include/linux/compat.h | 14 +-
include/linux/stddef.h | 3 +
include/uapi/linux/bpf.h | 94 +-
kernel/bpf/bpf_iter.c | 2 +-
kernel/bpf/btf.c | 100 +-
kernel/bpf/cgroup.c | 10 +-
kernel/bpf/hashtab.c | 13 +-
kernel/bpf/helpers.c | 9 +-
kernel/bpf/net_namespace.c | 7 +-
kernel/bpf/offload.c | 2 +-
kernel/bpf/syscall.c | 991 ++++++++++++++----
kernel/bpf/verifier.c | 21 +-
kernel/time/time.c | 2 +-
kernel/trace/bpf_trace.c | 6 +-
net/bpf/bpf_dummy_struct_ops.c | 9 +-
net/bpf/test_run.c | 32 +-
net/core/sock_map.c | 7 +-
30 files changed, 1449 insertions(+), 365 deletions(-)
create mode 100644 include/linux/bpf_compat.h
--
2.34.1
Next version of Harry's tcp_zerocopy_receive series. It aims to enable
purecap applications to utilise tcp_zerocopy_receive to map packets
directly from a network interface card into a shared space with the
process and kernel.
v4..v5:
- Change error code for failed check_cheri_cap() from EFAULT to EINVAL
v3..v4:
- Remove function tcp_zerocopy_receive_size() as it's used in only
one spot
- Change from compat to compat64 in the commit messages.
- Make initialisation of compat_zc more compact in
set_compat64_tcp_zerocopy_receive()
- Split copy_{from,to}_sockptr_with_ptr() into a separate commit
- Leave the address in tcp_zerocopy_vm_insert_batch() as just an address,
not a pointer
- Change the check_user_ptr_read() to a check for CHERI_PERM_SW_VMEM
v2..v3:
- Fix the split between compat enablement and capability support
- Reorganise comments in struct tcp_zerocopy_receive
- Fix the order of arguments for the *_to_sockptr functions
- Change a few variables to user_uintptr_t as they were originally
unsigned long
- Remove unnecessary checks
- Fix cast warnings
- Add compat handling for offsetofend
- Move a check_user_ptr closer to where its metadata is dropped
- Format misc nits
Gitlab Issue:
https://git.morello-project.org/morello/kernel/linux/-/issues/46
Review branch:
https://git.morello-project.org/tudcre01/linux/-/commits/morello/tcp_v5
Cc: Harry Ramsey <harry.ramsey(a)arm.com>
Harry Ramsey (3):
sockptr: Preserve capability tags with copy_{from,to}_sockptr_with_ptr
tcp: Implement compat64 handling for struct tcp_zerocopy_receive
tcp: Support capabilities in tcp_zerocopy_receive
include/linux/sockptr.h | 28 ++++++++
include/uapi/linux/tcp.h | 10 +--
net/ipv4/tcp.c | 148 +++++++++++++++++++++++++++++++--------
3 files changed, 151 insertions(+), 35 deletions(-)
--
2.34.1
Syscalls operating on memory mappings manage their address space via
owning capabilities. They must adhere to a certain set of rules[1] in
order to ensure memory safety. Address space management syscalls are
only allowed to manipulate mappings that are within the range of the
owning capability and have the appropriate permissions. Tests to check
the capability's tag, bounds, range as well as permissions have been
added. Finally, as certain flags and syscalls conflict with the
reservation model or are not yet implemented, a check verifying that
they are handled appropriately has also been added.
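The kinds of checks being tested can be pictured with a toy model of an owning capability. This is only a sketch under stated assumptions: `struct demo_cap`, `DEMO_PERM_VMEM` and `demo_check_owning_cap()` are hypothetical, a real capability is a hardware type rather than a struct, and the order of the checks and the way failures are reported are choices made for this example, not the kernel's behaviour or the rules in [1].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PERM_VMEM (1u << 0)	/* hypothetical permission bit */

/* Toy stand-in for an owning capability. */
struct demo_cap {
	bool tag;		/* validity tag */
	uint64_t base;		/* lower bound */
	uint64_t length;	/* size of the bounded region */
	uint32_t perms;		/* permission bits */
};

enum demo_check {
	DEMO_OK,
	DEMO_BAD_TAG,
	DEMO_OUT_OF_BOUNDS,
	DEMO_BAD_PERMS,
};

/* Check that [addr, addr + len) may be managed through `cap`. */
static enum demo_check demo_check_owning_cap(const struct demo_cap *cap,
					     uint64_t addr, uint64_t len)
{
	assert(cap);

	if (!cap->tag)
		return DEMO_BAD_TAG;
	/* Written to avoid overflow in addr + len. */
	if (addr < cap->base || len > cap->length ||
	    addr - cap->base > cap->length - len)
		return DEMO_OUT_OF_BOUNDS;
	if (!(cap->perms & DEMO_PERM_VMEM))
		return DEMO_BAD_PERMS;
	return DEMO_OK;
}
```

With a capability covering 0x2000 bytes from 0x1000, a request for the whole region passes, while a request starting at 0x2000 for 0x1001 bytes is rejected as out of bounds.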
The mincore() tests are expected to fail in this iteration as they are
not fully supported. The next iterations will contain representability
testcases.
Review branch:
https://git.morello-project.org/chaitanya_prakash/linux/-/tree/review/purec…
This patch series has been tested on:
https://git.morello-project.org/amitdaniel/linux/-/tree/review/purecap_mm_r…
[1] https://git.morello-project.org/morello/kernel/linux/-/wikis/Morello-pure-c…
Changes in V3:
- Added get_pagesize() function and VERIFY_ERRNO() macro
- Added LoadCap and StoreCap permissions testcase
- Added validity_tag_check testcases
- Added reservation tests
- Renamed variable "addr" to "ptr" to avoid confusion when manipulating
both addresses and capabilities
- Cleaned up syscall_mmap and syscall_mmap2 testcases
- Restructured code into testcases that check tags, range, bounds
and permissions
- Improved range_check testcases
- Improved commit messages
- Removed helper functions, tests directly written in testcase functions
- Removed signal handling and ddc register testcases
Changes in V2:
https://op-lists.linaro.org/archives/list/linux-morello@op-lists.linaro.org…
- Added link to the review branch
- Removed unnecessary whitespace
Changes in V1:
https://op-lists.linaro.org/archives/list/linux-morello@op-lists.linaro.org…
Chaitanya S Prakash (11):
kselftests/arm64: morello: Create wrapper functions for frequently
invoked syscalls
kselftests/arm64: morello: Add get_pagesize() function
kselftests/arm64: morello: Add VERIFY_ERRNO() macro
kselftests/arm64: morello: mmap: Clean up existing testcases
kselftests/arm64: morello: mmap: Add MAP_GROWSDOWN testcase
kselftests/arm64: morello: mmap: Add validity tag check testcases
kselftests/arm64: morello: mmap: Add capability range testcases
kselftests/arm64: morello: mmap: Add mmap() bounds check testcases
kselftests/arm64: morello: mmap: Add mremap() bounds check testcases
kselftests/arm64: morello: mmap: Add permission check testcases
kselftests/arm64: morello: mmap: Add brk() testcase
.../selftests/arm64/morello/bootstrap.c | 13 -
.../selftests/arm64/morello/freestanding.c | 16 +-
.../selftests/arm64/morello/freestanding.h | 74 ++-
tools/testing/selftests/arm64/morello/mmap.c | 547 +++++++++++++++++-
4 files changed, 606 insertions(+), 44 deletions(-)
--
2.25.1
Next version of Harry's tcp_zerocopy_receive series. It aims to enable
purecap applications to utilise tcp_zerocopy_receive to map packets
directly from a network interface card into a shared space with the
process and kernel.
v3..v4:
- Remove function tcp_zerocopy_receive_size() as it's used in only
one spot
- Change from compat to compat64 in the commit messages.
- Make initialisation of compat_zc more compact in
set_compat64_tcp_zerocopy_receive()
- Split copy_{from,to}_sockptr_with_ptr() into a separate commit
- Leave the address in tcp_zerocopy_vm_insert_batch() as just an address,
not a pointer
- Change the check_user_ptr_read() to a check for CHERI_PERM_SW_VMEM
v2..v3:
- Fix the split between compat enablement and capability support
- Reorganise comments in struct tcp_zerocopy_receive
- Fix the order of arguments for the *_to_sockptr functions
- Change a few variables to user_uintptr_t as they were originally
unsigned long
- Remove unnecessary checks
- Fix cast warnings
- Add compat handling for offsetofend
- Move a check_user_ptr closer to where its metadata is dropped
- Format misc nits
Gitlab Issue:
https://git.morello-project.org/morello/kernel/linux/-/issues/46
Review branch:
https://git.morello-project.org/tudcre01/linux/-/commits/morello/tcp_v4
Cc: Harry Ramsey <harry.ramsey(a)arm.com>
Harry Ramsey (3):
sockptr: Preserve capability tags with copy_{from,to}_sockptr_with_ptr
tcp: Implement compat64 handling for struct tcp_zerocopy_receive
tcp: Support capabilities in tcp_zerocopy_receive
include/linux/sockptr.h | 28 ++++++++
include/uapi/linux/tcp.h | 10 +--
net/ipv4/tcp.c | 148 +++++++++++++++++++++++++++++++--------
3 files changed, 151 insertions(+), 35 deletions(-)
--
2.34.1
Next version of Harry's tcp_zerocopy_receive series. It aims to enable
purecap applications to utilise tcp_zerocopy_receive to map packets
directly from a network interface card into a shared space with the
process and kernel.
v2..v3:
- Fix the split between compat enablement and capability support
- Reorganise comments in struct tcp_zerocopy_receive
- Fix the order of arguments for the *_to_sockptr functions
- Change a few variables to user_uintptr_t as they were originally
unsigned long
- Remove unnecessary checks
- Fix cast warnings
- Add compat handling for offsetofend
- Move a check_user_ptr closer to where its metadata is dropped
- Format misc nits
Gitlab Issue:
https://git.morello-project.org/morello/kernel/linux/-/issues/46
Review branch:
https://git.morello-project.org/tudcre01/linux/-/commits/morello/tcp_v3
Cc: Harry Ramsey <harry.ramsey(a)arm.com>
Harry Ramsey (2):
tcp: Implement compat handling for struct tcp_zerocopy_receive
tcp: Support capabilities in tcp_zerocopy_receive
include/linux/sockptr.h | 28 +++++++
include/uapi/linux/tcp.h | 10 +--
net/ipv4/tcp.c | 157 +++++++++++++++++++++++++++++++--------
3 files changed, 157 insertions(+), 38 deletions(-)
--
2.34.1
Hello,
This patch series enables purecap applications to utilise
tcp_zerocopy_receive to map packets directly from a network interface
card into a shared space with the process and kernel.
I do not think I shall have time to continue revising this patch series
and debugging the bus error generated by the latest patch. Hopefully
this provides a start to tcp_zerocopy_receive for capability
architecture.
v2:
- There appears to be a new error generated against musl resulting in a
BUS error when using memcpy to copy between allocated shared space
and other regions of memory.
- Rebase patch order to ensure aarch64 support throughout commits.
- Fix naming convention format issues for compat structs and functions.
- Remove uaddr_to_user_ptr usage to enforce capability model.
v1:
I have tested these changes against musl and there still exists an issue
in this implementation with copybuf and potentially msg_control
generating an EFAULT error.
Gitlab Issue:
- https://git.morello-project.org/morello/kernel/linux/-/issues/46
Review branch:
- https://git.morello-project.org/harryramsey/linux/-/commits/tcp_zerocopy
Thanks,
Harry
Harry Ramsey (2):
tcp: Implement compat version of tcp_zerocopy_receive
tcp: Support userspace capabilities for tcp_zerocopy_receive
include/linux/sockptr.h | 28 ++++++++
include/uapi/linux/tcp.h | 6 +-
net/ipv4/tcp.c | 135 ++++++++++++++++++++++++++++++++++-----
3 files changed, 149 insertions(+), 20 deletions(-)
--
2.34.1