Hi,
This series implements the restriction of bounds and permissions of all capabilities provided to a new process. This includes registers (PCC, CSP, C1-C3), capabilities in the auxiliary vector and capabilities to strings in argv/envp (issue #19 [1]). To complete the alignment with the PCuABI specification, AT_BASE is restored as a simple address, DDC is nullified and CCTLR_EL0.SBL is set (issue #20 [2]).
This series is composed of the following patches:
* Patch 1-4 fix issues revealed by the restrictions made in the following patches, notably DDC being set to null.
* Patch 5 adds a generic helper.
* Patch 6-8 refactor the start_thread() machinery so that binfmt_elf provides all the capabilities to set the initial capability registers to. This allows to centralise all calculations in binfmt_elf, and guarantees that capabilities in registers and in the auxiliary vector are consistent. No functional change at this point (the capabilities remain unrestricted).
* Patch 9-11 perform the actual restriction of capabilities, simultaneously in initial registers and the auxiliary vector. Patch 11 is a simplified version of Téo's patch [3], without ensuring bounds representability (see below).
* Patch 12-15 take care of the remaining alignment with the spec (AT_BASE, DDC, CCTLR_EL0.SBL).
With respect to the tightening of capability bounds, two important caveats should be noted:
* In the absence of stack reservation (issue #21 [4]), the stack remains notionally unbounded, and so are the capabilities covering the entire stack (CSP, AT_CHERI_STACK_CAP).
* No particular effort is made with respect to bounds representability. This means notably that capabilities to argv/envp strings may overlap, as well as capabilities to the argv/envp arrays. This is however very unlikely in practice (it would require very large strings or a very large number of arguments).
Review branch:
https://git.morello-project.org/kbrodsky-arm/linux/-/commits/morello/initial...
Thanks, Kevin
[1] https://git.morello-project.org/morello/kernel/linux/-/issues/19 [2] https://git.morello-project.org/morello/kernel/linux/-/issues/20 [3] https://op-lists.linaro.org/archives/list/linux-morello@op-lists.linaro.org/... [4] https://git.morello-project.org/morello/kernel/linux/-/issues/21
Kevin Brodsky (15): arm64: barrier: Make arch_counter_enforce_ordering() purecap-friendly kselftests/arm64: morello: Process caprelocs using appropriate root caps kselftests/arm64: morello: Remove invalid assumptions about initial data kselftests/arm64: morello: Seal reconstructed function pointer linux/user_ptr.h: Introduce user_ptr_set_bounds fs/binfmt_elf: Add entry member to elf_load_info binfmt: Store initial user pointers in bprm in PCuABI arm64: Use bprm->pcuabi to set initial pointers in PCuABI fs/binfmt_elf: Restrict executable/interpreter capabilities fs/binfmt_elf: Restrict stack capabilities fs/binfmt_elf: Restrict string capabilities fs/binfmt_elf: Make AT_BASE an address again elf: Remove elf_uaddr_to_user_ptr() arm64: morello: Nullify DDC in PCuABI arm64: morello: Set CCTLR_EL0.SBL in PCuABI
arch/arm64/include/asm/barrier.h | 13 + arch/arm64/include/asm/elf.h | 2 +- arch/arm64/include/asm/morello.h | 4 +- arch/arm64/include/asm/processor.h | 47 +-- arch/arm64/include/asm/sysreg.h | 2 + arch/arm64/kernel/morello.c | 44 ++- arch/arm64/kernel/process.c | 17 + fs/binfmt_elf.c | 306 ++++++++++++++---- fs/compat_binfmt_elf.c | 3 - include/linux/binfmts.h | 13 + include/linux/elf.h | 1 - include/linux/user_ptr.h | 22 ++ .../selftests/arm64/morello/bootstrap.c | 64 ++-- .../arm64/morello/freestanding_init_globals.c | 54 ++-- .../arm64/morello/freestanding_start.S | 4 +- 15 files changed, 417 insertions(+), 179 deletions(-)
The vDSO uses arch_counter_enforce_ordering(), so it needs to work as intended in purecap (when PCuABI is selected). At the moment it creates a dummy pointer from the 64-bit value of SP and then dereferences it. To make this valid in purecap, derive a valid capability from CSP instead.
This issue went unnoticed due to DDC allowing all X-based loads/stores, even in purecap. Once DDC is nullified, this patch is required to avoid a capability fault when arch_counter_enforce_ordering() is called from the purecap vDSO.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- arch/arm64/include/asm/barrier.h | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h index cf2987464c18..38bf8e0d9655 100644 --- a/arch/arm64/include/asm/barrier.h +++ b/arch/arm64/include/asm/barrier.h @@ -106,6 +106,18 @@ static inline unsigned long array_index_mask_nospec(unsigned long idx, * * https://lore.kernel.org/r/alpine.DEB.2.21.1902081950260.1662@nanos.tec.linut... */ +#ifdef __CHERI_PURE_CAPABILITY__ +#define arch_counter_enforce_ordering(val) do { \ + u64 tmp, _val = (val); \ + void *ptr; \ + \ + asm volatile( \ + " eor %0, %2, %2\n" \ + " add %1, csp, %0\n" \ + " ldr xzr, [%1]" \ + : "=r" (tmp), "=r"(ptr) : "r" (_val)); \ +} while (0) +#else /* __CHERI_PURE_CAPABILITY__ */ #define arch_counter_enforce_ordering(val) do { \ u64 tmp, _val = (val); \ \ @@ -115,6 +127,7 @@ static inline unsigned long array_index_mask_nospec(unsigned long idx, " ldr xzr, [%0]" \ : "=r" (tmp) : "r" (_val)); \ } while (0) +#endif /* __CHERI_PURE_CAPABILITY__ */
#define __smp_mb() dmb(ish) #define __smp_rmb() dmb(ishld)
We are about to restrict the bounds of initial capabilities, such as PCC, and nullify DDC, in line with the PCuABI specification. Additionally, Clang/LLD still generate caprelocs when linking a static executable, so we need to ensure that we process them in a way that is PCuABI-compliant.
Accordingly, this patch switches the caprelocs processing to using AT_CHERI_EXEC_*_CAP as root capabilities instead of DDC, just like when processing dynamic relocations.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- .../arm64/morello/freestanding_init_globals.c | 54 ++++++++++--------- .../arm64/morello/freestanding_start.S | 3 +- 2 files changed, 32 insertions(+), 25 deletions(-)
diff --git a/tools/testing/selftests/arm64/morello/freestanding_init_globals.c b/tools/testing/selftests/arm64/morello/freestanding_init_globals.c index ddb1635a8762..12eb0af97887 100644 --- a/tools/testing/selftests/arm64/morello/freestanding_init_globals.c +++ b/tools/testing/selftests/arm64/morello/freestanding_init_globals.c @@ -47,6 +47,25 @@ struct cap_reloc { size_t perms_to_clear; };
+static void get_caps(uintptr_t *cap_rx, uintptr_t *cap_rw, const uintptr_t *auxv) +{ + for (;;) { + switch ((unsigned long)auxv[0]) { + case AT_NULL: + return; + case AT_CHERI_EXEC_RX_CAP: + *cap_rx = auxv[1]; + *cap_rx = cheri_perms_and(*cap_rx, CHERI_PERM_MASK_RX); + break; + case AT_CHERI_EXEC_RW_CAP: + *cap_rw = auxv[1]; + *cap_rw = cheri_perms_and(*cap_rw, CHERI_PERM_MASK_RW); + break; + } + auxv += 2; + } +} + /* * Process capability relocations stored in the __cap_relocs section. Each * entry in that section has a layout corresponding to struct cap_reloc. @@ -55,10 +74,11 @@ struct cap_reloc { * because capability relocations must have already been processed in order to * refer to such symbols. */ -void __morello_process_cap_relocs(void) +void __morello_process_cap_relocs(void *auxv) { const struct cap_reloc *start_cap_relocs, *end_cap_relocs, *reloc; - uintcap_t root_cap; + uintptr_t cap_rx = 0; + uintptr_t cap_rw = 0;
/* * References to the linker-generated start/stop section symbols must @@ -77,15 +97,20 @@ void __morello_process_cap_relocs(void) "add %0, %0, #:lo12:__stop___cap_relocs" : "=C"(end_cap_relocs));
- root_cap = (uintcap_t)cheri_ddc_get(); + get_caps(&cap_rx, &cap_rw, auxv);
for (reloc = start_cap_relocs; reloc < end_cap_relocs; ++reloc) { + bool is_writable = + (reloc->perms_to_clear & CHERI_PERM_STORE) == 0; bool is_executable = (reloc->perms_to_clear & CHERI_PERM_EXECUTE) == 0; uintcap_t cap; uintcap_t *target;
- cap = cheri_address_set(root_cap, reloc->base); + if (is_writable) + cap = cheri_address_set(cap_rw, reloc->base); + else + cap = cheri_address_set(cap_rx, reloc->base);
if (!is_executable && reloc->size) cap = cheri_bounds_set(cap, reloc->size); @@ -96,31 +121,12 @@ void __morello_process_cap_relocs(void) if (is_executable) cap = cheri_sentry_create(cap);
- target = (uintcap_t *)cheri_address_set(root_cap, + target = (uintcap_t *)cheri_address_set(cap_rw, reloc->capability_location); *target = cap; } }
-static void get_caps(uintptr_t *cap_rx, uintptr_t *cap_rw, const uintptr_t *auxv) -{ - for (;;) { - switch ((unsigned long)auxv[0]) { - case AT_NULL: - return; - case AT_CHERI_EXEC_RX_CAP: - *cap_rx = auxv[1]; - *cap_rx = cheri_perms_and(*cap_rx, CHERI_PERM_MASK_RX); - break; - case AT_CHERI_EXEC_RW_CAP: - *cap_rw = auxv[1]; - *cap_rw = cheri_perms_and(*cap_rw, CHERI_PERM_MASK_RW); - break; - } - auxv += 2; - } -} - static inline uintptr_t morello_relative(uint64_t base, uintptr_t cap_rx, uintptr_t cap_rw, Elf64_Rela *reloc, void *reloc_addr) { diff --git a/tools/testing/selftests/arm64/morello/freestanding_start.S b/tools/testing/selftests/arm64/morello/freestanding_start.S index 92260fdc384d..755f24de79f2 100644 --- a/tools/testing/selftests/arm64/morello/freestanding_start.S +++ b/tools/testing/selftests/arm64/morello/freestanding_start.S @@ -72,7 +72,8 @@ FUNCTION_START(_start) mov c22, c2 mov c23, c3
- /* call void __morello_process_cap_relocs(void) */ + /* call void __morello_process_cap_relocs(char *auxv) */ + mov c0, c23 bl __morello_process_cap_relocs /* call void __morello_process_dynamic_relocs(char *auxv) */ mov c0, c23
The bootstrap tests make a number of assumptions that are not in line with the PCuABI specification, notably that argv/envp have the same bounds as CSP and that AT_ENTRY is unsealed. We are about to improve the alignment with the spec, so remove those assumptions.
Since the initial data cannot be assumed to live on the stack at all in PCuABI, the parsing of argc/argv/envp/auxv on the stack is removed, and instead the argv/envp pointers (passed in c1/c2) are asserted to be equal to AT_ARGV/AT_ENVP.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- .../selftests/arm64/morello/bootstrap.c | 64 ++++++++----------- 1 file changed, 26 insertions(+), 38 deletions(-)
diff --git a/tools/testing/selftests/arm64/morello/bootstrap.c b/tools/testing/selftests/arm64/morello/bootstrap.c index a2719fe5e883..2891318052d9 100644 --- a/tools/testing/selftests/arm64/morello/bootstrap.c +++ b/tools/testing/selftests/arm64/morello/bootstrap.c @@ -87,7 +87,6 @@ int verify_auxval(struct morello_auxv *auxv) if ((void *)auxv->a_val == NULL) break; /* fallthrough */ - case AT_ENTRY: case AT_EXECFN: case AT_PLATFORM: case AT_RANDOM: @@ -97,8 +96,6 @@ int verify_auxval(struct morello_auxv *auxv) case AT_CHERI_STACK_CAP: case AT_CHERI_SEAL_CAP: case AT_CHERI_CID_CAP: - case AT_ARGV: - case AT_ENVP: /* valid and unsealed */ ASSERT_TRUE(cheri_tag_get(auxv->a_val)) TH_LOG("auxv (%ld) value is invalid", auxv->a_type); @@ -106,6 +103,17 @@ int verify_auxval(struct morello_auxv *auxv) TH_LOG("auxv (%ld) value is sealed", auxv->a_type); } break; + case AT_ENTRY: + /* valid */ + ASSERT_TRUE(cheri_tag_get(auxv->a_val)) + TH_LOG("auxv (%ld) value is invalid", auxv->a_type); + break; + case AT_ARGV: + ASSERT_CAP_EQ(reg_data.argv, auxv->a_val); + break; + case AT_ENVP: + ASSERT_CAP_EQ(reg_data.envp, auxv->a_val); + break; default: /* "null capability with its address set to the usual value" */ ASSERT_CAP_EQ( @@ -149,56 +157,36 @@ TEST(test_c64) ASSERT_TRUE(cheri_tag_get(cap)) TH_LOG("not running in C64"); }
-TEST(test_stack) +TEST(test_initial_data) { - /* copy the pointer so we can modify it */ - char **stack = stack_from_kernel; - int argc_stack; - /* start with an argc check */ - VERIFY_CAP(stack, sizeof(void *), STACK_PERMS, "argc"); - - /* dereference and go past argc */ - argc_stack = *(int *)stack; - ASSERT_EQ(argc_stack, reg_data.argc); - stack += 1; + int envc = 0; + int auxc = 0;
/* argv + null */ - VERIFY_CAP(stack, sizeof(void *) * (argc_stack + 1), STACK_PERMS, "argv_stack"); - ASSERT_CAP_EQ(reg_data.argv, stack); + VERIFY_CAP(reg_data.argv, sizeof(void *) * (reg_data.argc + 1), STACK_PERMS, "argv");
/* we are clear to dereference all argv */ - for (int i = 0; i < argc_stack; i++) { - char *arg = *(stack + i); - verify_string(arg); + for (int i = 0; i < reg_data.argc; i++) { + verify_string(reg_data.argv[i]); }
- /* go past argv */ - stack += argc_stack; - ASSERT_NULL(*stack) TH_LOG("argv was not null terminated on stack"); - /* go past the null terminator */ - stack += 1; + ASSERT_NULL(reg_data.argv[reg_data.argc]) TH_LOG("argv is not null terminated");
/* progressively check bounds for envp and dereference */ - ASSERT_CAP_EQ(reg_data.envp, stack); while (1) { - char *envp_stack = *stack; - - VERIFY_CAP(stack, sizeof(void *), STACK_PERMS, "envp"); - stack += 1; - if (envp_stack == NULL) + VERIFY_CAP(reg_data.envp + envc, sizeof(void *), STACK_PERMS, "envp"); + if (reg_data.envp[envc] == NULL) break; - verify_string(envp_stack); + verify_string(reg_data.envp[envc]); + envc++; }
/* finally, go through auxv */ - ASSERT_CAP_EQ(reg_data.auxv, stack); while (1) { - struct morello_auxv *auxv_stack = ((struct morello_auxv *) stack); - - VERIFY_CAP(auxv_stack, sizeof(struct morello_auxv), STACK_PERMS, "auxv"); - if (verify_auxval(auxv_stack) == 0) + VERIFY_CAP(reg_data.auxv + auxc, sizeof(struct morello_auxv), STACK_PERMS, "auxv"); + if (verify_auxval(®_data.auxv[auxc]) == 0) break; - stack += 2; + auxc++; } }
@@ -226,7 +214,7 @@ int main(int argc, char **argv, char **envp, struct morello_auxv *auxv) test_write(); /* from now on write and exit work, so go wild */ test_c64(); - test_stack(); + test_initial_data(); /* stack is good, use it so we don't overflow the temp one */ install_kernel_stack(); test_set_tid_address_initial();
switch_to_restricted() must explicitly reconstruct the given function pointer, as it needs to remove a permission bit, which clears the tag due to the capability being sealed. However, it does not currently reseal the capability after using the BUILD instruction. This will cause a capability fault on BLRR once CCTLR_EL0.SBL is set, as per the PCuABI specification.
Fix this by explicitly sealing the new capability before branching to it.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- tools/testing/selftests/arm64/morello/freestanding_start.S | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/arm64/morello/freestanding_start.S b/tools/testing/selftests/arm64/morello/freestanding_start.S index 755f24de79f2..c5da0b2d6d41 100644 --- a/tools/testing/selftests/arm64/morello/freestanding_start.S +++ b/tools/testing/selftests/arm64/morello/freestanding_start.S @@ -248,6 +248,7 @@ FUNCTION_START(switch_to_restricted) */ adr c1, #0 build c0, c0, c1 + seal c0, c0, rb /* Branch (restricted) */ blrr c0
Add a generic helper to set the bounds of a user pointer, if it carries such metadata. This will be useful to implement capability narrowing (in PCuABI) without additional #ifdef'ing.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- include/linux/user_ptr.h | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+)
diff --git a/include/linux/user_ptr.h b/include/linux/user_ptr.h index 685586bc0d89..db0f52a05344 100644 --- a/include/linux/user_ptr.h +++ b/include/linux/user_ptr.h @@ -221,4 +221,26 @@ static inline bool user_ptr_is_same(const void __user *p1, const void __user *p2 #endif }
+/** + * user_ptr_set_bounds() - Extract the lower and upper bounds of a user pointer. + * @ptr: The input user pointer. + * @len: The length of the new bounds. + * + * The lower bound (base) of @ptr is set to its address, and its upper bound + * (limit) is set to its address + @len. The lower and upper bounds may be + * adjusted downwards (resp. upwards) if they cannot be exactly represented. If + * @ptr does not carry any bound information, this function returns @ptr + * unchanged. + * + * Return: @ptr with adjusted bounds. + */ +static inline void __user *user_ptr_set_bounds(void __user *ptr, size_t len) +{ +#ifdef CONFIG_CHERI_PURECAP_UABI + return __builtin_cheri_bounds_set(ptr, len); +#else + return ptr; +#endif +} + #endif /* _LINUX_USER_PTR_H */
We can simplify things a little in create_elf_tables() and load_elf_binary() by storing the final entry address for the executable and interpreter in their respective elf_load_info struct, replacing the e_entry variable/argument (executable entry address).
In PCuABI, this will be useful to create the PCC capability in create_elf_tables().
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- fs/binfmt_elf.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index c10ba610be50..f1d9b57d35af 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -163,12 +163,14 @@ struct elf_load_info { unsigned long end_elf_rx; unsigned long start_elf_rw; unsigned long end_elf_rw; + + unsigned long entry; };
static int create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, unsigned long interp_load_addr, - unsigned long e_entry, unsigned long phdr_addr, + unsigned long phdr_addr, const struct elf_load_info *exec_load_info, const struct elf_load_info *interp_load_info) { @@ -273,7 +275,7 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, if (bprm->interp_flags & BINPRM_FLAGS_PRESERVE_ARGV0) flags |= AT_FLAGS_PRESERVE_ARGV0; NEW_AUX_ENT(AT_FLAGS, flags); - NEW_AUX_ENT(AT_ENTRY, elf_uaddr_to_user_ptr(e_entry)); + NEW_AUX_ENT(AT_ENTRY, elf_uaddr_to_user_ptr(exec_load_info->entry)); NEW_AUX_ENT(AT_UID, from_kuid_munged(cred->user_ns, cred->uid)); NEW_AUX_ENT(AT_EUID, from_kuid_munged(cred->user_ns, cred->euid)); NEW_AUX_ENT(AT_GID, from_kgid_munged(cred->user_ns, cred->gid)); @@ -908,7 +910,6 @@ static int load_elf_binary(struct linux_binprm *bprm) unsigned long elf_brk; int retval, i; unsigned long elf_entry; - unsigned long e_entry; unsigned long interp_load_addr = 0; unsigned long start_code, end_code, start_data, end_data; unsigned long reloc_func_desc __maybe_unused = 0; @@ -1280,7 +1281,7 @@ static int load_elf_binary(struct linux_binprm *bprm) exec_load_info.end_elf_rx = max(k, exec_load_info.end_elf_rx); }
- e_entry = elf_ex->e_entry + load_bias; + exec_load_info.entry = elf_ex->e_entry + load_bias; phdr_addr += load_bias; elf_brk += load_bias; start_code += load_bias; @@ -1309,6 +1310,7 @@ static int load_elf_binary(struct linux_binprm *bprm) */ interp_load_addr = elf_entry; elf_entry += interp_elf_ex->e_entry; + interp_load_info.entry = elf_entry; } if (BAD_ADDR(elf_entry)) { retval = IS_ERR_VALUE(elf_entry) ? @@ -1323,7 +1325,7 @@ static int load_elf_binary(struct linux_binprm *bprm) kfree(interp_elf_ex); kfree(interp_elf_phdata); } else { - elf_entry = e_entry; + elf_entry = exec_load_info.entry; if (BAD_ADDR(elf_entry)) { retval = -EINVAL; goto out_free_dentry; @@ -1341,7 +1343,7 @@ static int load_elf_binary(struct linux_binprm *bprm) #endif /* ARCH_HAS_SETUP_ADDITIONAL_PAGES */
retval = create_elf_tables(bprm, elf_ex, interp_load_addr, - e_entry, phdr_addr, + phdr_addr, &exec_load_info, &interp_load_info); if (retval < 0) goto out;
In PCuABI, the initial pointers (program counter, stack pointer, argv/envp/auxv) are written to registers in the arch-specific START_THREAD(). argv/envp are also stored in the auxiliary vector itself.
We are about to narrow down the bounds and permissions of those pointers, so to avoid duplicating capability operations, make the pointers directly available to START_THREAD() by storing them in the bprm struct.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- fs/binfmt_elf.c | 36 ++++++++++++++++++++++++++++++------ include/linux/binfmts.h | 13 +++++++++++++ 2 files changed, 43 insertions(+), 6 deletions(-)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index f1d9b57d35af..62c79b87729a 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -167,6 +167,27 @@ struct elf_load_info { unsigned long entry; };
+#if defined(CONFIG_CHERI_PURECAP_UABI) && (ELF_COMPAT == 0) +static void set_bprm_stack_caps(struct linux_binprm *bprm, void __user *sp) +{ + void __user *p; + size_t len; + + bprm->pcuabi.csp = sp; + + p = sp + sizeof(elf_stack_item_t); + len = (bprm->argc + 1) * sizeof(elf_stack_item_t); + bprm->pcuabi.argv = p; + + p += len; + len = (bprm->envc + 1) * sizeof(elf_stack_item_t); + bprm->pcuabi.envp = p; + + p += len; + bprm->pcuabi.auxv = p; +} +#endif /* CONFIG_CHERI_PURECAP_UABI && ELF_COMPAT == 0 */ + static int create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, unsigned long interp_load_addr, @@ -303,6 +324,9 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, /* * TODO [PCuABI] - Restrict bounds/perms for AT_CHERI_* entries */ + bprm->pcuabi.pcc = elf_uaddr_to_user_ptr(interp_load_addr ? + interp_load_info->entry : + exec_load_info->entry); NEW_AUX_ENT(AT_CHERI_EXEC_RW_CAP, (exec_load_info->start_elf_rw != ~0UL ? elf_uaddr_to_user_ptr(exec_load_info->start_elf_rw) : @@ -349,6 +373,12 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, sp = STACK_ROUND(sp, items); bprm->p = user_ptr_addr(sp);
+#if defined(CONFIG_CHERI_PURECAP_UABI) && (ELF_COMPAT == 0) + set_bprm_stack_caps(bprm, sp); + *mm_at_argv = (elf_stack_item_t)bprm->pcuabi.argv; + *mm_at_envp = (elf_stack_item_t)bprm->pcuabi.envp; +#endif + /* Point sp at the lowest address on the stack */ #ifdef CONFIG_STACK_GROWSUP sp -= (items + ei_index) * sizeof(elf_stack_item_t); @@ -374,9 +404,6 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, return -EFAULT;
/* Populate list of argv pointers back to argv strings. */ -#if defined(CONFIG_CHERI_PURECAP_UABI) && (ELF_COMPAT == 0) - *mm_at_argv = (elf_stack_item_t)stack_item; -#endif /* In PCuABI, this derives a capability from SP pointing to arg_start */ ustr = sp + (mm->arg_start - user_ptr_addr(sp)); while (argc-- > 0) { @@ -393,9 +420,6 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, mm->arg_end = user_ptr_addr(ustr);
/* Populate list of envp pointers back to envp strings. */ -#if defined(CONFIG_CHERI_PURECAP_UABI) && (ELF_COMPAT == 0) - *mm_at_envp = (elf_stack_item_t)stack_item; -#endif mm->env_start = user_ptr_addr(ustr); while (envc-- > 0) { size_t len; diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h index 16d03f38f92e..e0e26c9a0ff9 100644 --- a/include/linux/binfmts.h +++ b/include/linux/binfmts.h @@ -12,6 +12,15 @@ struct coredump_params;
#define CORENAME_MAX_SIZE 128
+struct pcuabi_binprm { + void __user *pcc; + void __user *csp; + /* Capabilities to the initial arrays in the new process's address space */ + void __user *argv; + void __user *envp; + void __user *auxv; +}; + /* * This structure is used to hold the arguments that are used when loading binaries. */ @@ -62,6 +71,10 @@ struct linux_binprm { struct rlimit rlim_stack; /* Saved RLIMIT_STACK used during exec. */
char buf[BINPRM_BUF_SIZE]; + +#ifdef CONFIG_CHERI_PURECAP_UABI + struct pcuabi_binprm pcuabi; +#endif } __randomize_layout;
#define BINPRM_FLAGS_ENFORCE_NONDUMP_BIT 0
The bprm struct now contains the full initial user pointers (PCC, CSP, argv/envp/auxv) as calculated by binfmt_elf, so we can use them to set pcc, csp and c1-c3 in start_thread() in PCuABI. This also ensures that the capability metadata of argv/envp is the same in the auxiliary vector and c1-c2.
To centralise the PCuABI handling in morello.c, x0 and c1-c3 are now set in morello_thread_start() instead of the separate init_gp_regs().
start_thread() had to be moved to process.c as it is not possible to include linux/binfmts.h from asm/processor.h (circular reference), in order to obtain the definition of struct linux_binprm.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- arch/arm64/include/asm/elf.h | 2 +- arch/arm64/include/asm/morello.h | 4 ++- arch/arm64/include/asm/processor.h | 47 +++--------------------------- arch/arm64/kernel/morello.c | 27 ++++++++++++++--- arch/arm64/kernel/process.c | 17 +++++++++++ 5 files changed, 48 insertions(+), 49 deletions(-)
diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h index 63f6f50599e6..7d253fc3961c 100644 --- a/arch/arm64/include/asm/elf.h +++ b/arch/arm64/include/asm/elf.h @@ -135,7 +135,7 @@ typedef elf_greg_t elf_gregset_t[ELF_NGREG]; typedef struct user_fpsimd_state elf_fpregset_t;
#define START_THREAD(elf_ex, regs, elf_entry, bprm) \ - start_thread(regs, elf_entry, bprm->p, bprm->argc, bprm->envc) + start_thread(regs, elf_entry, bprm)
#define SET_PERSONALITY_AARCH64() \ ({ \ diff --git a/arch/arm64/include/asm/morello.h b/arch/arm64/include/asm/morello.h index da198622b35e..df85e838ffdf 100644 --- a/arch/arm64/include/asm/morello.h +++ b/arch/arm64/include/asm/morello.h @@ -10,6 +10,7 @@ struct pt_regs; struct task_struct; struct user_cap; +struct linux_binprm;
#ifdef CONFIG_ARM64_MORELLO
@@ -54,7 +55,8 @@ void morello_thread_set_csp(struct pt_regs *regs, user_uintptr_t sp); * Any invalid usage will result in an error at link time. */
-void morello_thread_start(struct pt_regs *regs, unsigned long pc); +int morello_thread_start(struct pt_regs *regs, unsigned long pc, + struct linux_binprm *bprm); void morello_thread_init_user(void); void morello_thread_save_user_state(struct task_struct *tsk); void morello_thread_restore_user_state(struct task_struct *tsk); diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h index ab02589247f4..c8b91f1a6b88 100644 --- a/arch/arm64/include/asm/processor.h +++ b/arch/arm64/include/asm/processor.h @@ -267,35 +267,6 @@ void tls_preserve_current_state(void); .fpsimd_cpu = NR_CPUS, \ }
-static inline int init_gp_regs(struct pt_regs *regs, unsigned long sp, - int argc, int envc) -{ - int retval = 0; -#ifdef CONFIG_CHERI_PURECAP_UABI - unsigned long argv, envp, auxv; - /* - * TODO [PCuABI]: - * - When argv/envp/auxv is moved off the stack, update the registers x1-x3 - * with the new pointer values, and ensure c1-c3 contain appropriate - * capabilities (currently set to csp). - * - Restrict bounds/perms of c1-c3. - * - * In ret_to_user c regs are first loaded then merged with x regs if their values - * are different. Hence we load capabilities in c regs and the value in x regs. - */ - retval = argc; /* Placed in x0 */ - argv = sp + 1 * sizeof(user_uintptr_t); /* Increment past argc on stack */ - envp = argv + (argc + 1) * sizeof(user_uintptr_t); /* Go past arg vals + NULL */ - auxv = envp + (envc + 1) * sizeof(user_uintptr_t); /* Go past env vals + NULL */ - regs->cregs[1] = regs->cregs[2] = regs->cregs[3] = regs->csp; - regs->regs[1] = argv; - regs->regs[2] = envp; - regs->regs[3] = auxv; -#endif /* CONFIG_CHERI_PURECAP_UABI */ - - return retval; -} - static inline void start_thread_common(struct pt_regs *regs, unsigned long pc) { s32 previous_syscall = regs->syscallno; @@ -307,19 +278,8 @@ static inline void start_thread_common(struct pt_regs *regs, unsigned long pc) regs->pmr_save = GIC_PRIO_IRQON; }
-static inline int start_thread(struct pt_regs *regs, unsigned long pc, - unsigned long sp, int argc, int envc) -{ - start_thread_common(regs, pc); - regs->pstate = PSR_MODE_EL0t; - spectre_v4_enable_task_mitigation(current); - regs->sp = sp; - - if (system_supports_morello()) - morello_thread_start(regs, pc); - - return init_gp_regs(regs, sp, argc, envc); -} +int start_thread(struct pt_regs *regs, unsigned long pc, + struct linux_binprm *bprm);
#ifdef CONFIG_COMPAT #ifdef CONFIG_COMPAT64 @@ -333,7 +293,8 @@ static inline void compat_start_thread(struct pt_regs *regs, unsigned long pc, regs->sp = sp;
if (system_supports_morello()) - morello_thread_start(regs, pc); + /* The third argument is only used in PCuABI */ + morello_thread_start(regs, pc, NULL); }
#else /* CONFIG_COMPAT64 */ diff --git a/arch/arm64/kernel/morello.c b/arch/arm64/kernel/morello.c index 7e14ff42db68..ae5977ca6420 100644 --- a/arch/arm64/kernel/morello.c +++ b/arch/arm64/kernel/morello.c @@ -5,6 +5,7 @@
#define pr_fmt(fmt) "morello: " fmt
+#include <linux/binfmts.h> #include <linux/cache.h> #include <linux/capability.h> #include <linux/cheri.h> @@ -65,6 +66,14 @@ static void update_regs_c64(struct pt_regs *regs, unsigned long pc) } }
+#ifdef CONFIG_CHERI_PURECAP_UABI +static void set_creg_user_ptr(struct pt_regs *regs, int r, void __user *val) +{ + regs->regs[r] = user_ptr_addr(val); + regs->cregs[r] = (uintcap_t)val; +} +#endif + void morello_cap_get_val_tag(uintcap_t cap, __uint128_t *val, u8 *tag) { *((uintcap_t *)val) = cheri_tag_clear(cap); @@ -86,8 +95,11 @@ uintcap_t morello_build_any_user_cap(const __uint128_t *val, u8 tag) return cap; }
-void morello_thread_start(struct pt_regs *regs, unsigned long pc) +int morello_thread_start(struct pt_regs *regs, unsigned long pc, + struct linux_binprm *bprm) { + int ret = 0; + update_regs_c64(regs, pc);
/* @@ -96,14 +108,21 @@ void morello_thread_start(struct pt_regs *regs, unsigned long pc) * register merging automatically happens during ret_to_user. */ if (is_pure_task()) { - /* TODO [PCuABI] - Adjust the bounds/permissions properly */ - regs->pcc = cheri_user_root_cap; +#ifdef CONFIG_CHERI_PURECAP_UABI + regs->pcc = (uintcap_t)bprm->pcuabi.pcc; + regs->csp = (uintcap_t)bprm->pcuabi.csp;
- regs->csp = cheri_user_root_cap; + ret = bprm->argc; /* Set x0 */ + set_creg_user_ptr(regs, 1, bprm->pcuabi.argv); + set_creg_user_ptr(regs, 2, bprm->pcuabi.envp); + set_creg_user_ptr(regs, 3, bprm->pcuabi.auxv); +#endif } else /* Hybrid */ { regs->pcc = cheri_user_root_allperms_cap; /* CSP is null-derived in hybrid */ } + + return ret; }
void morello_thread_init_user(void) diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index 03b48b6bc9a3..f560f92c4c3b 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -41,6 +41,7 @@ #include <linux/thread_info.h> #include <linux/prctl.h> #include <linux/stacktrace.h> +#include <linux/binfmts.h>
#include <asm/alternative.h> #include <asm/compat.h> @@ -456,6 +457,22 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) return 0; }
+int start_thread(struct pt_regs *regs, unsigned long pc, + struct linux_binprm *bprm) +{ + int ret = 0; + + start_thread_common(regs, pc); + regs->pstate = PSR_MODE_EL0t; + spectre_v4_enable_task_mitigation(current); + regs->sp = bprm->p; + + if (system_supports_morello()) + ret = morello_thread_start(regs, pc, bprm); + + return ret; +} + void tls_preserve_current_state(void) { task_save_user_tls(current);
This patch narrows down the bounds and permissions of all initial capabilities pointing to executable and interpreter mappings in PCuABI, as per the specification. This includes:
* PCC (the initial capability program counter) * The root executable/interpreter capabilities (AT_CHERI_{EXEC,INTERP}_*_CAP) * Other auxv entries that point to the executable (AT_ENTRY, AT_PHDR)
The capabilities are currently derived from the root userspace capability. While the resulting bounds should be correct, they should ideally be derived from the appropriate owning capability instead, once reservations are implemented.
AT_BASE remains a simple address in the PCuABI specification, this will be amended in a subsequent patch.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- fs/binfmt_elf.c | 127 +++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 105 insertions(+), 22 deletions(-)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 62c79b87729a..fafe1202dae6 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -168,6 +168,86 @@ struct elf_load_info { };
#if defined(CONFIG_CHERI_PURECAP_UABI) && (ELF_COMPAT == 0) +static elf_stack_item_t make_ro_at_from_addr(ptraddr_t addr, size_t len) +{ + void __user *cap; + cheri_perms_t perms; + + perms = CHERI_PERM_GLOBAL | CHERI_PERM_LOAD; + cap = cheri_build_user_cap_inexact_bounds(addr, len, perms); + + return (elf_stack_item_t)cap; +} + +static elf_stack_item_t make_at_entry(const struct elf_load_info *load_info) +{ + void __user *cap; + size_t len; + cheri_perms_t perms; + + len = PAGE_ALIGN(load_info->end_elf_rx - load_info->start_elf_rx); + perms = CHERI_PERM_GLOBAL | CHERI_PERMS_READ | CHERI_PERMS_EXEC; + + cap = cheri_build_user_cap_inexact_bounds(load_info->start_elf_rx, + len, perms); + cap = cheri_address_set(cap, load_info->entry); + cap = cheri_sentry_create(cap); + + return (elf_stack_item_t)cap; +} + +static void __user *make_user_pcc(const struct elf_load_info *load_info) +{ + void __user *pcc; + size_t len; + cheri_perms_t perms; + + len = PAGE_ALIGN(load_info->end_elf_rx - load_info->start_elf_rx); + perms = CHERI_PERM_GLOBAL | CHERI_PERMS_READ | CHERI_PERMS_EXEC; + + pcc = cheri_build_user_cap_inexact_bounds(load_info->start_elf_rx, + len, perms); + pcc = cheri_address_set(pcc, load_info->entry); + + return pcc; +} + +static void __user *make_elf_rw_cap(const struct elf_load_info *load_info) +{ + ptraddr_t start_addr; + size_t len; + cheri_perms_t perms; + + if (load_info->start_elf_rw == ~0UL) + return NULL; + + /* + * The RW region typically starts after the first segment, and the + * offset may not be page-aligned. + */ + start_addr = PAGE_ALIGN_DOWN(load_info->start_elf_rw); + len = PAGE_ALIGN(load_info->end_elf_rw - start_addr); + perms = CHERI_PERMS_ROOTCAP | CHERI_PERMS_READ | CHERI_PERMS_WRITE; +#ifdef CONFIG_ARM64_MORELLO + perms |= ARM_CAP_PERMISSION_BRANCH_SEALED_PAIR; +#endif + return cheri_build_user_cap_inexact_bounds(start_addr, len, perms); +} + +static void __user *make_elf_rx_cap(const struct elf_load_info *load_info) +{ + size_t len; + cheri_perms_t perms; + + len = PAGE_ALIGN(load_info->end_elf_rx - load_info->start_elf_rx); + perms = CHERI_PERMS_ROOTCAP | CHERI_PERMS_READ | CHERI_PERMS_EXEC; +#ifdef CONFIG_ARM64_MORELLO + perms |= ARM_CAP_PERMISSION_BRANCH_SEALED_PAIR; +#endif + return cheri_build_user_cap_inexact_bounds(load_info->start_elf_rx, + len, perms); +} + static void set_bprm_stack_caps(struct linux_binprm *bprm, void __user *sp) { void __user *p; @@ -186,6 +266,16 @@ static void set_bprm_stack_caps(struct linux_binprm *bprm, void __user *sp) p += len; bprm->pcuabi.auxv = p; } +#else /* CONFIG_CHERI_PURECAP_UABI && ELF_COMPAT == 0 */ +static elf_stack_item_t make_ro_at_from_addr(ptraddr_t addr, size_t len) +{ + return addr; +} + +static elf_stack_item_t make_at_entry(const struct elf_load_info *load_info) +{ + return load_info->entry; +} #endif /* CONFIG_CHERI_PURECAP_UABI && ELF_COMPAT == 0 */
static int @@ -289,14 +379,15 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, NEW_AUX_ENT(AT_HWCAP, ELF_HWCAP); NEW_AUX_ENT(AT_PAGESZ, ELF_EXEC_PAGESIZE); NEW_AUX_ENT(AT_CLKTCK, CLOCKS_PER_SEC); - NEW_AUX_ENT(AT_PHDR, elf_uaddr_to_user_ptr(phdr_addr)); + NEW_AUX_ENT(AT_PHDR, make_ro_at_from_addr(phdr_addr, + exec->e_phnum * sizeof(struct elf_phdr))); NEW_AUX_ENT(AT_PHENT, sizeof(struct elf_phdr)); NEW_AUX_ENT(AT_PHNUM, exec->e_phnum); NEW_AUX_ENT(AT_BASE, elf_uaddr_to_user_ptr(interp_load_addr)); if (bprm->interp_flags & BINPRM_FLAGS_PRESERVE_ARGV0) flags |= AT_FLAGS_PRESERVE_ARGV0; NEW_AUX_ENT(AT_FLAGS, flags); - NEW_AUX_ENT(AT_ENTRY, elf_uaddr_to_user_ptr(exec_load_info->entry)); + NEW_AUX_ENT(AT_ENTRY, make_at_entry(exec_load_info)); NEW_AUX_ENT(AT_UID, from_kuid_munged(cred->user_ns, cred->uid)); NEW_AUX_ENT(AT_EUID, from_kuid_munged(cred->user_ns, cred->euid)); NEW_AUX_ENT(AT_GID, from_kgid_munged(cred->user_ns, cred->gid)); @@ -321,26 +412,18 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, NEW_AUX_ENT(AT_RSEQ_ALIGN, __alignof__(struct rseq)); #endif #if defined(CONFIG_CHERI_PURECAP_UABI) && (ELF_COMPAT == 0) - /* - * TODO [PCuABI] - Restrict bounds/perms for AT_CHERI_* entries - */ - bprm->pcuabi.pcc = elf_uaddr_to_user_ptr(interp_load_addr ? - interp_load_info->entry : - exec_load_info->entry); - NEW_AUX_ENT(AT_CHERI_EXEC_RW_CAP, - (exec_load_info->start_elf_rw != ~0UL ? - elf_uaddr_to_user_ptr(exec_load_info->start_elf_rw) : - NULL)); - NEW_AUX_ENT(AT_CHERI_EXEC_RX_CAP, - elf_uaddr_to_user_ptr(exec_load_info->start_elf_rx)); - NEW_AUX_ENT(AT_CHERI_INTERP_RW_CAP, - ((interp_load_addr && interp_load_info->start_elf_rw != ~0UL) ? - elf_uaddr_to_user_ptr(interp_load_info->start_elf_rw) : - NULL)); - NEW_AUX_ENT(AT_CHERI_INTERP_RX_CAP, - (interp_load_addr ? - elf_uaddr_to_user_ptr(interp_load_info->start_elf_rx) : - NULL)); + NEW_AUX_ENT(AT_CHERI_EXEC_RW_CAP, make_elf_rw_cap(exec_load_info)); + NEW_AUX_ENT(AT_CHERI_EXEC_RX_CAP, make_elf_rx_cap(exec_load_info)); + if (interp_load_addr) { + NEW_AUX_ENT(AT_CHERI_INTERP_RW_CAP, make_elf_rw_cap(interp_load_info)); + NEW_AUX_ENT(AT_CHERI_INTERP_RX_CAP, make_elf_rx_cap(interp_load_info)); + bprm->pcuabi.pcc = make_user_pcc(interp_load_info); + } else { + NEW_AUX_ENT(AT_CHERI_INTERP_RW_CAP, NULL); + NEW_AUX_ENT(AT_CHERI_INTERP_RX_CAP, NULL); + bprm->pcuabi.pcc = make_user_pcc(exec_load_info); + } + NEW_AUX_ENT(AT_CHERI_STACK_CAP, elf_uaddr_to_user_ptr(0)); NEW_AUX_ENT(AT_CHERI_SEAL_CAP, cheri_user_root_seal_cap); NEW_AUX_ENT(AT_CHERI_CID_CAP, cheri_user_root_cid_cap);
This patch narrows down the permissions, and bounds if possible, of all initial capabilities pointing to the stack in PCuABI, as per the specification. This includes:
* CSP (the capability stack pointer) * The argv/envp/auxv pointers * The root stack capability (AT_CHERI_STACK_CAP) * Other auxv entries that point to the stack (AT_RANDOM, AT_EXECFN, AT_PLATFORM, AT_BASE_PLATFORM)
Capabilities giving access to the whole stack (CSP, AT_CHERI_STACK_CAP) remain unbounded for now, as the stack is itself unbounded. It will be possible to reduce their bounds once a stack reservation with a fixed size is introduced, as per the PCuABI specification.
argv and envp may overlap if a very large number of arguments / environment variables is provided (see TODO in set_bprm_stack_caps()). This is however unlikely in practice.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- fs/binfmt_elf.c | 86 ++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 68 insertions(+), 18 deletions(-)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index fafe1202dae6..1b3892cdd21d 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -140,12 +140,12 @@ static int padzero(unsigned long address) USER_PTR_ALIGN((sp) + (items) * sizeof(elf_stack_item_t), 16) #define STACK_ALLOC(sp, len) ({ \ void __user *old_sp = sp; sp += len; \ - old_sp; }) + user_ptr_set_bounds(old_sp, len); }) #else #define STACK_ADD(sp, items) ((sp) - (items) * sizeof(elf_stack_item_t)) #define STACK_ROUND(sp, items) \ USER_PTR_ALIGN_DOWN((sp) - (items) * sizeof(elf_stack_item_t), 16) -#define STACK_ALLOC(sp, len) (sp -= len) +#define STACK_ALLOC(sp, len) ({ sp -= len; user_ptr_set_bounds(sp, len); }) #endif static_assert(sizeof(elf_stack_item_t) <= 16);
@@ -168,6 +168,20 @@ struct elf_load_info { };
#if defined(CONFIG_CHERI_PURECAP_UABI) && (ELF_COMPAT == 0) +static void __user *make_user_sp(unsigned long p) +{ + /* + * TODO [PCuABI] - derive a capability with bounds matching the stack + * reservation. + */ + void __user *sp = uaddr_to_user_ptr_safe(p); + + sp = cheri_perms_and(sp, CHERI_PERM_GLOBAL | + CHERI_PERMS_READ | CHERI_PERMS_WRITE); + + return sp; +} + static elf_stack_item_t make_ro_at_from_addr(ptraddr_t addr, size_t len) { void __user *cap; @@ -179,6 +193,13 @@ static elf_stack_item_t make_ro_at_from_addr(ptraddr_t addr, size_t len) return (elf_stack_item_t)cap; }
+static elf_stack_item_t make_ro_at_from_uptr(void __user *ptr) +{ + cheri_perms_t perms = CHERI_PERM_GLOBAL | CHERI_PERM_LOAD; + + return (elf_stack_item_t)cheri_perms_and(ptr, perms); +} + static elf_stack_item_t make_at_entry(const struct elf_load_info *load_info) { void __user *cap; @@ -248,30 +269,63 @@ static void __user *make_elf_rx_cap(const struct elf_load_info *load_info) len, perms); }
-static void set_bprm_stack_caps(struct linux_binprm *bprm, void __user *sp) +static void __user *make_root_stack_cap(void) +{ + /* + * TODO [PCuABI] - derive a capability with bounds matching the stack + * reservation. + */ + void __user *cap = uaddr_to_user_ptr_safe(0); + + cap = cheri_perms_and(cap, CHERI_PERMS_ROOTCAP | + CHERI_PERMS_READ | CHERI_PERMS_WRITE); + + return cap; +} + +static void set_bprm_stack_caps(struct linux_binprm *bprm, void __user *sp, + int auxv_size) { void __user *p; size_t len;
bprm->pcuabi.csp = sp;
+ /* + * TODO [PCuABI] - argc and envc may be very large (up to + * MAX_ARG_STRINGS), and as a result the bounds of argv/envp may not be + * exact. Padding should ideally be introduced between the arrays in + * such a situation. This should be easily done once they are moved out + * of the stack, as per the PCuABI specification. + */ p = sp + sizeof(elf_stack_item_t); len = (bprm->argc + 1) * sizeof(elf_stack_item_t); - bprm->pcuabi.argv = p; + bprm->pcuabi.argv = cheri_bounds_set(p, len);
p += len; len = (bprm->envc + 1) * sizeof(elf_stack_item_t); - bprm->pcuabi.envp = p; + bprm->pcuabi.envp = cheri_bounds_set(p, len);
p += len; - bprm->pcuabi.auxv = p; + len = auxv_size * sizeof(elf_stack_item_t); + bprm->pcuabi.auxv = cheri_bounds_set(p, len); } #else /* CONFIG_CHERI_PURECAP_UABI && ELF_COMPAT == 0 */ +static void __user *make_user_sp(unsigned long p) +{ + return uaddr_to_user_ptr_safe(p); +} + static elf_stack_item_t make_ro_at_from_addr(ptraddr_t addr, size_t len) { return addr; }
+static elf_stack_item_t make_ro_at_from_uptr(void __user *ptr) +{ + return user_ptr_addr(ptr); +} + static elf_stack_item_t make_at_entry(const struct elf_load_info *load_info) { return load_info->entry; @@ -315,12 +369,7 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, */
p = arch_align_stack(p); - - /* - * TODO [PCuABI] - derive an appropriate capability (bounds matching - * the stack reservation, RW permissions) - */ - sp = uaddr_to_user_ptr_safe(p); + sp = make_user_sp(p);
/* * If this architecture has a platform capability string, copy it @@ -393,16 +442,17 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, NEW_AUX_ENT(AT_GID, from_kgid_munged(cred->user_ns, cred->gid)); NEW_AUX_ENT(AT_EGID, from_kgid_munged(cred->user_ns, cred->egid)); NEW_AUX_ENT(AT_SECURE, bprm->secureexec); - NEW_AUX_ENT(AT_RANDOM, (user_uintptr_t)u_rand_bytes); + NEW_AUX_ENT(AT_RANDOM, make_ro_at_from_uptr(u_rand_bytes)); #ifdef ELF_HWCAP2 NEW_AUX_ENT(AT_HWCAP2, ELF_HWCAP2); #endif - NEW_AUX_ENT(AT_EXECFN, elf_uaddr_to_user_ptr(bprm->exec)); + NEW_AUX_ENT(AT_EXECFN, make_ro_at_from_addr(bprm->exec, + strnlen(bprm->filename, MAX_ARG_STRLEN) + 1)); if (k_platform) { - NEW_AUX_ENT(AT_PLATFORM, (user_uintptr_t)u_platform); + NEW_AUX_ENT(AT_PLATFORM, make_ro_at_from_uptr(u_platform)); } if (k_base_platform) { - NEW_AUX_ENT(AT_BASE_PLATFORM, (user_uintptr_t)u_base_platform); + NEW_AUX_ENT(AT_BASE_PLATFORM, make_ro_at_from_uptr(u_base_platform)); } if (bprm->have_execfd) { NEW_AUX_ENT(AT_EXECFD, bprm->execfd); @@ -424,7 +474,7 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, bprm->pcuabi.pcc = make_user_pcc(exec_load_info); }
- NEW_AUX_ENT(AT_CHERI_STACK_CAP, elf_uaddr_to_user_ptr(0)); + NEW_AUX_ENT(AT_CHERI_STACK_CAP, make_root_stack_cap()); NEW_AUX_ENT(AT_CHERI_SEAL_CAP, cheri_user_root_seal_cap); NEW_AUX_ENT(AT_CHERI_CID_CAP, cheri_user_root_cid_cap);
@@ -457,7 +507,7 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, bprm->p = user_ptr_addr(sp);
#if defined(CONFIG_CHERI_PURECAP_UABI) && (ELF_COMPAT == 0) - set_bprm_stack_caps(bprm, sp); + set_bprm_stack_caps(bprm, sp, ei_index); *mm_at_argv = (elf_stack_item_t)bprm->pcuabi.argv; *mm_at_envp = (elf_stack_item_t)bprm->pcuabi.envp; #endif
This patch narrows down the permissions and bounds of initial capabilities pointing to argument and environment variable strings (elements of argv and envp).
This is a reduced version of Téo's patch ("fs: Handle exact bounds for argv and envp"), which had to be reverted as it could not handle empty strings. This version does not handle representability at all, as a result string capabilities may overlap. To properly handle string representability, it seems unavoidable to pass additional metadata to binfmt_elf, to clarify if and where padding was added by copy_strings().
Co-developed-by: Teo Couprie Diaz teo.coupriediaz@arm.com Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- fs/binfmt_elf.c | 59 +++++++++++++++++++++++++++++++++++++------------ 1 file changed, 45 insertions(+), 14 deletions(-)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 1b3892cdd21d..2bf664fcc3c0 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -133,6 +133,37 @@ static int padzero(unsigned long address) return 0; }
+/* + * Put a pointer to a string on the stack, return the size of the string on the + * stack or an error. + */ +static long put_str_ptr(char __user *ustr, + elf_stack_item_t __user *stack_item) +{ + size_t len; + + len = strnlen_user(ustr, MAX_ARG_STRLEN); + if (!len || len > MAX_ARG_STRLEN) + return -EINVAL; + + /* + * TODO [PCuABI] - it is possible (though unlikely) that the string + * length makes it impossible to represent the bounds of ustr exactly. + * To avoid string pointers overlapping, padding should therefore be + * introduced at the point where strings are copied into the new + * process (copy_strings() in exec.c). Because strings can be empty + * (just one zero byte), the padding cannot simply be skipped here; some + * padding metadata should instead be made available by exec.c, for + * instance in struct linux_binprm. + */ + ustr = user_ptr_set_bounds(ustr, len); + + if (elf_stack_put_user(ustr, stack_item++)) + return -EFAULT; + + return len; +} + /* Let's use some macros to make this stack manipulation a little clearer */ #ifdef CONFIG_STACK_GROWSUP #define STACK_ADD(sp, items) ((sp) + (items) * sizeof(elf_stack_item_t)) @@ -539,14 +570,16 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, /* Populate list of argv pointers back to argv strings. */ /* In PCuABI, this derives a capability from SP pointing to arg_start */ ustr = sp + (mm->arg_start - user_ptr_addr(sp)); +#if defined(CONFIG_CHERI_PURECAP_UABI) && (ELF_COMPAT == 0) + ustr = cheri_perms_and(ustr, CHERI_PERM_GLOBAL | + CHERI_PERM_LOAD | CHERI_PERM_STORE); +#endif while (argc-- > 0) { - size_t len; - if (elf_stack_put_user(ustr, stack_item++)) - return -EFAULT; - len = strnlen_user(ustr, MAX_ARG_STRLEN); - if (!len || len > MAX_ARG_STRLEN) - return -EINVAL; - ustr += len; + long str_ret = put_str_ptr(ustr, stack_item++); + + if (str_ret < 0) + return str_ret; + ustr += str_ret; } if (elf_stack_put_user(0, stack_item++)) return -EFAULT; @@ -555,13 +588,11 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, /* Populate list of envp pointers back to envp strings. */ mm->env_start = user_ptr_addr(ustr); while (envc-- > 0) { - size_t len; - if (elf_stack_put_user(ustr, stack_item++)) - return -EFAULT; - len = strnlen_user(ustr, MAX_ARG_STRLEN); - if (!len || len > MAX_ARG_STRLEN) - return -EINVAL; - ustr += len; + long str_ret = put_str_ptr(ustr, stack_item++); + + if (str_ret < 0) + return str_ret; + ustr += str_ret; } if (elf_stack_put_user(0, stack_item++)) return -EFAULT;
Discussed directly with Kevin, agreed that without properly handling padding and passing data between `fs/exec.c` and `fs/binfmt_elf.c` this is our best option.
Looks good to me !
Signed-off-by: Teo Couprie Diaz teo.coupriediaz@arm.com
On 11/04/2024 13:03, Kevin Brodsky wrote:
This patch narrows down the permissions and bounds of initial capabilities pointing to argument and environment variable strings (elements of argv and envp).
This is a reduced version of Téo's patch ("fs: Handle exact bounds for argv and envp"), which had to be reverted as it could not handle empty strings. This version does not handle representability at all, as a result string capabilities may overlap. To properly handle string representability, it seems unavoidable to pass additional metadata to binfmt_elf, to clarify if and where padding was added by copy_strings().
Co-developed-by: Teo Couprie Diaz teo.coupriediaz@arm.com Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com
fs/binfmt_elf.c | 59 +++++++++++++++++++++++++++++++++++++------------ 1 file changed, 45 insertions(+), 14 deletions(-)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 1b3892cdd21d..2bf664fcc3c0 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -133,6 +133,37 @@ static int padzero(unsigned long address) return 0; } +/*
- Put a pointer to a string on the stack, return the size of the string on the
- stack or an error.
- */
+static long put_str_ptr(char __user *ustr,
elf_stack_item_t __user *stack_item)
+{
- size_t len;
- len = strnlen_user(ustr, MAX_ARG_STRLEN);
- if (!len || len > MAX_ARG_STRLEN)
return -EINVAL;
- /*
* TODO [PCuABI] - it is possible (though unlikely) that the string
* length makes it impossible to represent the bounds of ustr exactly.
* To avoid string pointers overlapping, padding should therefore be
* introduced at the point where strings are copied into the new
* process (copy_strings() in exec.c). Because strings can be empty
* (just one zero byte), the padding cannot simply be skipped here; some
* padding metadata should instead be made available by exec.c, for
* instance in struct linux_binprm.
*/
- ustr = user_ptr_set_bounds(ustr, len);
- if (elf_stack_put_user(ustr, stack_item++))
return -EFAULT;
- return len;
+}
- /* Let's use some macros to make this stack manipulation a little clearer */ #ifdef CONFIG_STACK_GROWSUP #define STACK_ADD(sp, items) ((sp) + (items) * sizeof(elf_stack_item_t))
@@ -539,14 +570,16 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, /* Populate list of argv pointers back to argv strings. */ /* In PCuABI, this derives a capability from SP pointing to arg_start */ ustr = sp + (mm->arg_start - user_ptr_addr(sp)); +#if defined(CONFIG_CHERI_PURECAP_UABI) && (ELF_COMPAT == 0)
- ustr = cheri_perms_and(ustr, CHERI_PERM_GLOBAL |
CHERI_PERM_LOAD | CHERI_PERM_STORE);
+#endif while (argc-- > 0) {
size_t len;
if (elf_stack_put_user(ustr, stack_item++))
return -EFAULT;
len = strnlen_user(ustr, MAX_ARG_STRLEN);
if (!len || len > MAX_ARG_STRLEN)
return -EINVAL;
ustr += len;
long str_ret = put_str_ptr(ustr, stack_item++);
if (str_ret < 0)
return str_ret;
} if (elf_stack_put_user(0, stack_item++)) return -EFAULT;ustr += str_ret;
@@ -555,13 +588,11 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, /* Populate list of envp pointers back to envp strings. */ mm->env_start = user_ptr_addr(ustr); while (envc-- > 0) {
size_t len;
if (elf_stack_put_user(ustr, stack_item++))
return -EFAULT;
len = strnlen_user(ustr, MAX_ARG_STRLEN);
if (!len || len > MAX_ARG_STRLEN)
return -EINVAL;
ustr += len;
long str_ret = put_str_ptr(ustr, stack_item++);
if (str_ret < 0)
return str_ret;
} if (elf_stack_put_user(0, stack_item++)) return -EFAULT;ustr += str_ret;
AT_BASE was once used in purecap as root capability for the interpreter. It was superseded by AT_CHERI_INTERP_*_CAP and no known current software relies on it being a capability. Turn it back to a plain address, consistently with the PCuABI specification (which makes no special provision for AT_BASE, implying that it is a 64-bit value).
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- fs/binfmt_elf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 2bf664fcc3c0..cf96d703b333 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -463,7 +463,7 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, exec->e_phnum * sizeof(struct elf_phdr))); NEW_AUX_ENT(AT_PHENT, sizeof(struct elf_phdr)); NEW_AUX_ENT(AT_PHNUM, exec->e_phnum); - NEW_AUX_ENT(AT_BASE, elf_uaddr_to_user_ptr(interp_load_addr)); + NEW_AUX_ENT(AT_BASE, interp_load_addr); if (bprm->interp_flags & BINPRM_FLAGS_PRESERVE_ARGV0) flags |= AT_FLAGS_PRESERVE_ARGV0; NEW_AUX_ENT(AT_FLAGS, flags);
This macro is no longer used after the recent changes to binfmt_elf.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- fs/compat_binfmt_elf.c | 3 --- include/linux/elf.h | 1 - 2 files changed, 4 deletions(-)
diff --git a/fs/compat_binfmt_elf.c b/fs/compat_binfmt_elf.c index 6f108d3e1278..3a7fe15cebd9 100644 --- a/fs/compat_binfmt_elf.c +++ b/fs/compat_binfmt_elf.c @@ -26,9 +26,6 @@ #undef elf_stack_item_t #define elf_stack_item_t elf_addr_t
-#undef elf_uaddr_to_user_ptr -#define elf_uaddr_to_user_ptr(addr) addr - #undef elf_copy_to_user_stack #define elf_copy_to_user_stack(to, from, len) copy_to_user(to, from, len)
diff --git a/include/linux/elf.h b/include/linux/elf.h index abfa9b96259b..f998627b6642 100644 --- a/include/linux/elf.h +++ b/include/linux/elf.h @@ -65,7 +65,6 @@ extern Elf64_Dyn _DYNAMIC []; #endif
#define elf_stack_item_t user_uintptr_t -#define elf_uaddr_to_user_ptr(addr) uaddr_to_user_ptr_safe(addr) #define elf_copy_to_user_stack(to, from, len) copy_to_user_with_ptr(to, from, len) #define elf_stack_put_user(val, ptr) put_user_ptr((user_uintptr_t)(val), (ptr))
Now that our implementation of PCuABI is getting close to the specification, it is time to zero out DDC in PCuABI. Root capabilities are available separately for all the relevant regions (AT_CHERI_*_CAP in the auxiliary vector, and capabilities returned by mmap() and related syscalls).
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- arch/arm64/kernel/morello.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/morello.c b/arch/arm64/kernel/morello.c index ae5977ca6420..f20e0b386828 100644 --- a/arch/arm64/kernel/morello.c +++ b/arch/arm64/kernel/morello.c @@ -127,10 +127,14 @@ int morello_thread_start(struct pt_regs *regs, unsigned long pc,
void morello_thread_init_user(void) { - /* TODO [PCuABI] - Set DDC to the null capability */ - uintcap_t ddc = is_pure_task() ? cheri_user_root_cap - : cheri_user_root_allperms_cap; struct morello_state *morello_state = ¤t->thread.morello_user_state; + uintcap_t ddc; + + if (is_pure_task()) { + ddc = 0; + } else { + ddc = cheri_user_root_allperms_cap; + }
/* * CTPIDR doesn't need to be initialised explicitly:
As per the PCuABI specification, CCTLR_EL0.SBL should be set in purecap, ensuring that CLR is sealed by BL* instructions, and requiring the target capability to be sealed for register-based branch instructions.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com --- arch/arm64/include/asm/sysreg.h | 2 ++ arch/arm64/kernel/morello.c | 7 +++++-- 2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 7a0ee4677d09..5b51d5ed0493 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -962,6 +962,8 @@ #define TRFCR_ELx_ExTRE BIT(1) #define TRFCR_ELx_E0TRE BIT(0)
+#define CCTLR_ELx_SBL BIT(7) + /* GIC Hypervisor interface registers */ /* ICH_MISR_EL2 bit definitions */ #define ICH_MISR_EOI (1 << 0) diff --git a/arch/arm64/kernel/morello.c b/arch/arm64/kernel/morello.c index f20e0b386828..12116b725460 100644 --- a/arch/arm64/kernel/morello.c +++ b/arch/arm64/kernel/morello.c @@ -129,11 +129,14 @@ void morello_thread_init_user(void) { struct morello_state *morello_state = ¤t->thread.morello_user_state; uintcap_t ddc; + u64 cctlr;
if (is_pure_task()) { ddc = 0; + cctlr = CCTLR_ELx_SBL; } else { ddc = cheri_user_root_allperms_cap; + cctlr = 0; }
/* @@ -159,8 +162,8 @@ void morello_thread_init_user(void) write_cap_sysreg(0, cid_el0); morello_state->cid = (uintcap_t)0;
- write_sysreg(0, cctlr_el0); - morello_state->cctlr = 0; + write_sysreg(cctlr, cctlr_el0); + morello_state->cctlr = cctlr; }
void morello_thread_save_user_state(struct task_struct *tsk)
On 11/04/2024 14:03, Kevin Brodsky wrote:
Hi,
This series implements the restriction of bounds and permissions of all capabilities provided to a new process. This includes registers (PCC, CSP, C1-C3), capabilities in the auxiliary vector and capabilities to strings in argv/envp (issue #19 [1]). To complete the alignment with the PCuABI specification, AT_BASE is restored as a simple address, DDC is nullified and CCTLR_EL0.SBL is set (issue #20 [2]).
This series is composed of the following patches:
Patch 1-4 fix issues revealed by the restrictions made in the following patches, notably DDC being set to null.
Patch 5 adds a generic helper.
Patch 6-8 refactor the start_thread() machinery so that binfmt_elf provides all the capabilities to set the initial capability registers to. This allows to centralise all calculations in binfmt_elf, and guarantees that capabilities in registers and in the auxiliary vector are consistent. No functional change at this point (the capabilities remain unrestricted).
Patch 9-11 perform the actual restriction of capabilities, simultaneously in initial registers and the auxiliary vector. Patch 11 is a simplified version of Téo's patch [3], without ensuring bounds representability (see below).
Patch 12-15 take care of the remaining alignment with the spec (AT_BASE, DDC, CCTLR_EL0.SBL).
With respect to the tightening of capability bounds, two important caveats should be noted:
In the absence of stack reservation (issue #21 [4]), the stack remains notionally unbounded, and so are the capabilities covering the entire stack (CSP, AT_CHERI_STACK_CAP).
No particular effort is made with respect to bounds representability. This means notably that capabilities to argv/envp strings may overlap, as well as capabilities to the argv/envp arrays. This is however very unlikely in practice (it would require very large strings or a very large number of arguments).
Review branch:
https://git.morello-project.org/kbrodsky-arm/linux/-/commits/morello/initial...
Thanks, Kevin
[1] https://git.morello-project.org/morello/kernel/linux/-/issues/19 [2] https://git.morello-project.org/morello/kernel/linux/-/issues/20 [3] https://op-lists.linaro.org/archives/list/linux-morello@op-lists.linaro.org/... [4] https://git.morello-project.org/morello/kernel/linux/-/issues/21
Kevin Brodsky (15): arm64: barrier: Make arch_counter_enforce_ordering() purecap-friendly kselftests/arm64: morello: Process caprelocs using appropriate root caps kselftests/arm64: morello: Remove invalid assumptions about initial data kselftests/arm64: morello: Seal reconstructed function pointer linux/user_ptr.h: Introduce user_ptr_set_bounds fs/binfmt_elf: Add entry member to elf_load_info binfmt: Store initial user pointers in bprm in PCuABI arm64: Use bprm->pcuabi to set initial pointers in PCuABI fs/binfmt_elf: Restrict executable/interpreter capabilities fs/binfmt_elf: Restrict stack capabilities fs/binfmt_elf: Restrict string capabilities fs/binfmt_elf: Make AT_BASE an address again elf: Remove elf_uaddr_to_user_ptr() arm64: morello: Nullify DDC in PCuABI arm64: morello: Set CCTLR_EL0.SBL in PCuABI
Applied on next.
Kevin
arch/arm64/include/asm/barrier.h | 13 + arch/arm64/include/asm/elf.h | 2 +- arch/arm64/include/asm/morello.h | 4 +- arch/arm64/include/asm/processor.h | 47 +-- arch/arm64/include/asm/sysreg.h | 2 + arch/arm64/kernel/morello.c | 44 ++- arch/arm64/kernel/process.c | 17 + fs/binfmt_elf.c | 306 ++++++++++++++---- fs/compat_binfmt_elf.c | 3 - include/linux/binfmts.h | 13 + include/linux/elf.h | 1 - include/linux/user_ptr.h | 22 ++ .../selftests/arm64/morello/bootstrap.c | 64 ++-- .../arm64/morello/freestanding_init_globals.c | 54 ++-- .../arm64/morello/freestanding_start.S | 4 +- 15 files changed, 417 insertions(+), 179 deletions(-)
linux-morello@op-lists.linaro.org