With this patch, the capabilities provided to userspace to access argv and
envp are properly bounded and carry the permissions defined in the PCuABI
spec.
This change is split between two files, as they handle two parts of the
process. This cover letter tries to explain the process surrounding the
argument strings, hopefully to facilitate reviewing the patch but also
to check my understanding, and the logic implemented by the patch.
Hopefully I can find some time to refine it after reviews, but in case
I cannot I hope the details will be useful.
# General patch comments
One thing that might be an issue is that I don't see how to update
fs/exec.c:bprm_stack_limits to handle this new process short of
reproducing it almost entirely inside the function.
There are warnings because elf_stack_put_user_cap expects uintcap_t and
cheri_build_user_cap builds a capability, but I didn't really know
how to handle that.
The function itself might be entirely unnecessary.
checkpatch complains about elf_stack_put_user_cap, but it follows the
style of the other functions. It also complains about missing blank lines
which appear to be there, so I'm not too sure what to do about it.
# Giving argument strings to userspace
Calling a new executable is done through execve, where userspace passes
the argv and envp pointer arrays to the kernel.
In fs/exec.c:do_execveat_common, the kernel creates a struct linux_binprm
which is used throughout the process of loading a new binary.
Among other things, it keeps track of the stack for the new executable,
both the memory space and its current top, which it allocates here.
The arguments are copied to this new stack in fs/exec.c:copy_strings.
It does so from the top of the stack, starting with the last envp element
and ending with the first of argv: opposite to how it will be read later.
The function allocates pages as it goes, thus splitting the string in at
most page-sized chunks, and handles offsets where the page was already
partially used or the remainder of the string will not fill it.
This only puts the /strings/ themselves on the stack.
Later, in fs/binfmt_elf.c:create_elf_tables, the kernel writes the auxv,
argv and envp arrays themselves to the stack, that is, the pointers to the
strings copied there earlier.
It does so in the order expected by userspace: from bottom to top of the
strings, so from the first of argv to the last of envp.
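
To make the two passes concrete, here is a minimal userspace sketch of the
layout described above. It is not kernel code: struct toy_stack and both
helpers are hypothetical names, and pages, auxv and the filename string are
ignored. The first helper pushes strings downwards from last to first, the
second walks upwards and records where each string starts.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only; every name here is hypothetical. */
#define TOY_STACK_SIZE 256

struct toy_stack {
	char mem[TOY_STACK_SIZE];
	size_t p;	/* current top, grows downwards */
};

/* Pass 1 (copy_strings-like): push the strings from last to first. */
static int toy_copy_strings(struct toy_stack *s, int n,
			    const char *const *strs)
{
	for (int i = n - 1; i >= 0; i--) {
		size_t len = strlen(strs[i]) + 1;	/* include the NUL */

		if (len > s->p)
			return -1;	/* out of toy stack space */
		s->p -= len;
		memcpy(&s->mem[s->p], strs[i], len);
	}
	return 0;
}

/*
 * Pass 2 (create_elf_tables-like): walk upwards from 'p', recording the
 * start of each string. Returns the offset just past the last string.
 */
static size_t toy_collect_pointers(const struct toy_stack *s, size_t p,
				   int n, const char **out)
{
	for (int i = 0; i < n; i++) {
		out[i] = &s->mem[p];
		p += strlen(&s->mem[p]) + 1;
	}
	return p;
}
```

Calling toy_copy_strings() for envp and then for argv, and afterwards
toy_collect_pointers() for argv and then for envp starting from the final
top, yields the pointers in the order userspace expects.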
# Making it CHERI
We thus have a potential issue: the moment where the strings themselves
are allocated is different from the place where the capabilities are
created, with limited ways to pass information in between.
However, one piece of information is available in both cases: the length
of the null-terminated string.
This way, we can get the same representable length from the length of the
string, which remains unchanged, while allocating the string *and* when
creating the capability.
Assuming we can properly align the strings in copy_strings, we don't need
more information to properly create a capability with exact bounds
in create_elf_tables.
Thankfully, we can re-use most of the machinery in copy_strings to do so.
If the representable length of a string differs from its real length, we
need to get the mask needed for aligning it for exact bounds.
Given this new length, we can compute the position where it would end up
without alignment and ALIGN_DOWN() as we are going downwards the stack.
This gives the position where the string *must* start for exact capability
bounds. Thus, all the padding will be after the string.
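
As an illustration of that computation, here is a small userspace sketch.
The toy_* helpers are hypothetical stand-ins for the CHERI representable
length and alignment mask intrinsics (I believe these are exposed as
cheri_representable_length() and cheri_representable_alignment_mask()).
The 14-bit mantissa model below is a simplification chosen for the example
and is NOT the real Morello bounds encoding.

```c
#include <stdint.h>

/* Toy model: lengths that fit in this many bits need no alignment. */
#define TOY_MANTISSA_BITS 14

/* Hypothetical stand-in for the representable alignment mask. */
static uint64_t toy_repr_align_mask(uint64_t len)
{
	unsigned int shift = 0;

	/* Drop low bits until the length fits in the toy mantissa. */
	while ((len >> shift) >> TOY_MANTISSA_BITS)
		shift++;
	return ~((UINT64_C(1) << shift) - 1);
}

/* Hypothetical stand-in for the representable length. */
static uint64_t toy_repr_length(uint64_t len)
{
	uint64_t mask = toy_repr_align_mask(len);

	return (len + ~mask) & mask;	/* round up to the alignment */
}

/*
 * Address where a string of 'len' bytes (NUL included) must start when
 * the current stack top is 'p': subtract the representable length, then
 * align down. Any padding ends up between the string's end and 'p'.
 */
static uint64_t toy_string_start(uint64_t p, uint64_t len)
{
	uint64_t mask = toy_repr_align_mask(len);

	return (p - toy_repr_length(len)) & mask;
}
```

In this toy model a 100-byte string needs no padding at all, while a
20001-byte string gets a representable length of 20002 and must start on
an even address.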
However as we are going backwards, we need to put the padding first and
the string second.
The code already goes through the string chunk by chunk, handling smaller
chunks at page and argument boundaries.
We can re-use this by first going through the length of padding, and once
this is done go through the string as normal.
This guarantees the string will start at the properly aligned address with
the appropriate amount of padding behind the previous allocations.
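
A rough userspace sketch of that loop follows. It is not the patch itself:
toy_push_aligned(), the flat 'stack' buffer and the 16-byte TOY_PAGE_SIZE
are assumptions made for the example, and the caller is expected to pass a
representable length and alignment obtained elsewhere. The same chunked
loop first consumes the padding, then the string, copying the string from
its tail since we are moving downwards.

```c
#include <stddef.h>
#include <string.h>

/* Tiny "page" size so that chunking is visible in a small buffer. */
#define TOY_PAGE_SIZE 16

/*
 * Place 'str' below offset 'p' so that it starts on a multiple of
 * 'align' (a power of two) with 'repr_len' bytes reserved for it.
 * Assumes repr_len >= strlen(str) + 1 and p >= repr_len + align.
 * Returns the new top, which is where the string starts.
 */
static size_t toy_push_aligned(char *stack, size_t p, const char *str,
			       size_t repr_len, size_t align)
{
	size_t len = strlen(str) + 1;
	size_t start = (p - repr_len) & ~(align - 1);
	size_t pad = p - start - len;

	while (pad + len) {
		/* Bytes available before hitting the next page boundary. */
		size_t room = p % TOY_PAGE_SIZE ? p % TOY_PAGE_SIZE
						: TOY_PAGE_SIZE;
		/* Padding is consumed first, the string second. */
		size_t *left = pad ? &pad : &len;
		size_t n = *left < room ? *left : room;

		p -= n;
		if (left == &pad)
			memset(&stack[p], 0, n);
		else
			memcpy(&stack[p], str + len - n, n);
		*left -= n;
	}
	return p;
}
```

With a 20-character string, repr_len = 24, align = 8 and p = 61, the
padding lands in [53, 61) and the string starts at offset 32.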
We do need to handle this padding and to detect when strings might need
proper handling for exact bounds.
As there isn't padding normally, we can simply check if there is some.
If there is, get the representable length of the string and use it for
the exact capability bounds.
As all the padding is after the strings, if there was padding needed for
alignment the next argument will start in the padding. This is slow, but
just go through the zeroes until we find the real start of the argument
and go from there.
This is not done properly in this revision: it is a remnant of
the first draft and as such will improperly flag arguments as needing
adjustment when they follow one that needed it.
This is known but should be simply fixable by moving this looping
through the zeroes after allocating the capability.
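
For reference, the zero-skipping step could look roughly like the sketch
below. toy_skip_padding() is a hypothetical helper working on a flat
buffer, not the code from the patch. Note one limitation of the sketch: an
empty-string argument is itself a single NUL byte, so it cannot be told
apart from padding by this scan alone.

```c
#include <stddef.h>

/*
 * Starting at offset 'p' (just past the previous string's NUL), skip
 * any zero padding and return the offset of the next non-zero byte,
 * or 'end' if only zeroes remain. '*skipped' receives the number of
 * padding bytes that were skipped.
 */
static size_t toy_skip_padding(const char *stack, size_t p, size_t end,
			       size_t *skipped)
{
	size_t from = p;

	while (p < end && stack[p] == '\0')
		p++;
	if (skipped)
		*skipped = p - from;
	return p;
}
```

A non-zero '*skipped' tells the caller that the previous string was padded.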
In fs/binfmt_elf.c:create_elf_tables argv and envp are handled exactly the
same but in different loops. I removed the comments from the envp section
for conciseness.
# Testing
To generate a big enough argument, you can use this dirty bash script:
bigarg=""; count=0; while [ $(echo $bigarg | wc -c) -le 20000 ]; do
bigarg=${bigarg}"==${count}==This_should_be_a_really_big_arg_string"
count=$(($count + 1))
done
It is a bit unnecessary and slow but it does allow easily checking that
it starts and ends in the proper places.
Without the patch, the kernel log should show a CPU fault if $bigarg is
passed as an argument or in the environment. With the patch, it should
work fine.
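
If a C test program is more convenient than the shell loop, the same kind
of argument can be built with a helper along these lines. This is only a
sketch: build_big_arg() is a hypothetical name, and the caller would still
need to pass the result to execv() or similar for a binary of its choice.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build a string of at least 'min_len' characters made of numbered
 * chunks ("==0==This_should...==1==This_should..."), so that the start
 * and end of the argument are easy to check. The caller frees it.
 */
static char *build_big_arg(size_t min_len)
{
	static const char chunk[] = "This_should_be_a_really_big_arg_string";
	/* Room for one extra chunk, its "==N==" prefix and the NUL. */
	size_t cap = min_len + sizeof(chunk) + 32;
	char *buf = malloc(cap);
	size_t len = 0;
	unsigned int count = 0;

	if (!buf)
		return NULL;
	buf[0] = '\0';
	while (len < min_len)
		len += (size_t)snprintf(buf + len, cap - len, "==%u==%s",
					count++, chunk);
	return buf;
}
```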
Hopefully this is useful for reviewing, and correct in the first place!
Review branch:
https://git.morello-project.org/Teo-CD/linux/-/tree/review-arg-str-resticte…
Commit itself:
https://git.morello-project.org/Teo-CD/linux/-/commit/a4d9fe880d098e9f9fe1e…
Thanks very much in advance!
Téo
Teo Couprie Diaz (1):
fs: Handle exact bounds for argv and envp
fs/binfmt_elf.c | 111 ++++++++++++++++++++++++++++++++++++++++++--
fs/exec.c | 66 ++++++++++++++++++++++++++
include/linux/elf.h | 4 ++
3 files changed, 178 insertions(+), 3 deletions(-)
--
2.34.1
This series of patches enables nfs rootfs
support on the Morello board.
Patch 01 fixes the initial kernel build error
associated with a wrong function pointer type within
the sunrpc modules due to the unlocked_ioctl fp.
The error occurs upon enabling nfs within the defconfig.
Patch 02 deals with the fallout caused by the changes
made in patch 01. See details in the description
of the patch.
Patch 03 enables nfs rootfs by default in the kernel.
It was confirmed that the kernel can boot with an nfs rootfs.
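
As background on the kind of build error involved, here is a userspace
sketch of a handler whose signature must match the ops struct exactly.
Everything in it is a stand-in: toy_proc_ops and toy_ioctl are hypothetical
names, and user_uintptr_t is typedef'd to uintptr_t only so the example
builds anywhere. My understanding is that in a PCuABI kernel the type is
capability-sized, so a handler taking unsigned long no longer matches the
function pointer type.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel type, for illustration only. */
typedef uintptr_t user_uintptr_t;

struct file;	/* opaque here */

/* Hypothetical, reduced version of an ops struct with an ioctl hook. */
struct toy_proc_ops {
	long (*proc_ioctl)(struct file *, unsigned int, user_uintptr_t);
};

/*
 * The handler takes user_uintptr_t for its argument. Declaring it with
 * unsigned long instead would be an incompatible function pointer type
 * wherever the two types differ.
 */
static long toy_ioctl(struct file *filp, unsigned int cmd,
		      user_uintptr_t arg)
{
	(void)filp;
	if (cmd != 1)
		return -ENOTTY;
	return arg != 0;
}

static const struct toy_proc_ops toy_ops = {
	.proc_ioctl = toy_ioctl,
};
```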
V2 changes:
- patch only the modules that are actually being used
- change description and fix nits
- address the incorrect proc_compat_ioctl
Pawel Zalewski (3):
net: sunrpc: fix unlocked_ioctl handler signature
include: linux: fix proc_ioctl
arm64: morello: enable nfs rootfs by default
arch/arm64/configs/morello_transitional_pcuabi_defconfig | 2 ++
drivers/pci/proc.c | 4 ++--
include/linux/proc_fs.h | 2 +-
net/sunrpc/cache.c | 6 +++---
net/sunrpc/rpc_pipe.c | 2 +-
5 files changed, 9 insertions(+), 7 deletions(-)
--
2.34.1
Some io_uring operations' SQEs store user_data values in the addr2 field.
These don't need to be modified as they're not dereferenced by the kernel.
Reported-by: Kevin Brodsky <kevin.brodsky(a)arm.com>
Signed-off-by: Tudor Cretu <tudor.cretu(a)arm.com>
---
Review branch:
https://git.morello-project.org/tudcre01/linux/-/commits/morello/addr2_fix
---
io_uring/io_uring.h | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 5b4f0f298ad9..db4f91cc64b2 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -132,9 +132,23 @@ static inline void convert_compat64_io_uring_sqe(struct io_ring_ctx *ctx,
sqe->ioprio = READ_ONCE(compat_sqe->ioprio);
sqe->fd = READ_ONCE(compat_sqe->fd);
BUILD_BUG_COMPAT_SQE_UNION_ELEM(addr2, addr);
- sqe->addr2 = (__kernel_uintptr_t)compat_ptr(READ_ONCE(compat_sqe->addr2));
- BUILD_BUG_COMPAT_SQE_UNION_ELEM(addr, len);
+ /*
+ * Some opcodes set a user_data value in the addr2 field to be matched
+ * with a pre-existing IO event's user_data or to propagate it to the
+ * user_data field of a CQE. It's not dereferenced by the kernel, so
+ * don't modify it.
+ */
+ switch (sqe->opcode) {
+ case IORING_OP_POLL_REMOVE:
+ case IORING_OP_MSG_RING:
+ sqe->addr2 = (__kernel_uintptr_t)READ_ONCE(compat_sqe->addr2);
+ break;
+ default:
+ sqe->addr2 = (__kernel_uintptr_t)compat_ptr(READ_ONCE(compat_sqe->addr2));
+ break;
+ }
+ BUILD_BUG_COMPAT_SQE_UNION_ELEM(addr, len);
/*
* Some opcodes set a user_data value in the addr field to be matched
* with a pre-existing IO event's user_data. It's not dereferenced by
--
2.34.1
Hi,
This is a short announcement following the latest merge of next into
master today. This merge had been delayed due to the 1.6 integration
process, and a discovered incompatibility with GDB, which has now been
addressed. Special thanks to Luis Machado for looking into this on the
GDB side!
As a result, please make sure to use the *latest GDB* [1] when running
the latest kernel from master.
Also part of this merge, an index of CHERI-related documentation is now
available under Documentation/cheri [2], and the PCuABI porting guide
was moved there.
Cheers,
Kevin
[1]
https://sourceware.org/git/?p=binutils-gdb.git;a=shortlog;h=refs/heads/user…
[2]
https://git.morello-project.org/morello/kernel/linux/-/tree/morello/master/…
Hi All,
I am glad to inform you that our SDK for Morello is finally ready. After months
of hard work we are happy to share with you the results of our work.
Our motto is "Let Linux developers focus on the porting of their own
application" and today we are taking the first steps to deliver on that.
[Morello SDK]
In less than 10 minutes you should be able to set up a docker container with
everything you need to build an application for Morello.
- Documentation: https://sdk.morello-project.org/
- Code repository: https://git.morello-project.org/morello/morello-sdk
If you want to try a demo of the SDK that runs on a Morello FVP (for more
information on what an FVP is, see www.morello-project.org), please have a
look below:
[Morello Linux]
In less than 10 minutes you should be able to set up a docker container with
everything you need to build and boot into a Morello Debian environment.
- Documentation: https://linux.morello-project.org/
- Code repository: https://git.morello-project.org/morello/morello-linux
Note: The documentation covers the instructions for Linux but if you know what
you are doing and are familiar with docker no one stops you from running our
solution on Windows or Mac.
Are we done with it?
No, not by any means. This is just the beginning and we need your help and
collaboration to make sure that we improve our solution to meet developers'
needs: your needs!
So why don't you try it and let us know your thoughts?
Thanks and Regards,
Vincenzo
Hi,
The top of the master branch has been tagged [0] as part of the
integration drop 1.6.
Below is the changelog for kernel users, since the previous integration
drop (1.5).
New features
------------
- Read/write tag access in mappings is now advertised as rc/wc in
/proc/<pid>/smaps [1].
- The clone3 syscall now reads full capabilities from userspace in
PCuABI, thanks to an updated layout for struct clone_args. The new
struct definition is documented in the PCuABI specification [2].
- Device trees for the Morello board and FVP are now included [3].
- A minimal set of options to run a graphical environment is now enabled
in the transitional PCuABI defconfig [4].
- The branch was rebased on the 6.1 upstream release. The only
user-facing Morello-related change is an updated value for
HWCAP2_MORELLO; see the announcement email [5] for details.
Bug fixes
---------
- Fixed ptrace(PTRACE_POKEDATA) writing too much data in purecap.
- Fixed most remaining LTP failures in compat64 and purecap (notably by
fixing struct layouts and updating internal constants for compat64).
- Fixed a Bionic test failure in compat64, which was due to the
preadv/pwritev syscall handlers misreading the offset argument.
- Fixed the vast majority of warnings when building the kernel with GCC.
Cheers,
Kevin
[0]
https://git.morello-project.org/morello/kernel/linux/-/commits/morello-rele…
[1]
https://git.morello-project.org/morello/kernel/linux/-/blob/morello-release…
[2]
https://git.morello-project.org/morello/kernel/linux/-/wikis/Morello-pure-c…
[3] arch/arm64/boot/dts/arm/morello-{soc,fvp}.dts
[4] arch/arm64/configs/morello_transitional_pcuabi_defconfig
[5]
https://op-lists.linaro.org/archives/list/linux-morello@op-lists.linaro.org…
This series of patches enables nfs rootfs support on the Morello board.
Patches 01 and 02 fix the initial kernel build error associated with a wrong
function pointer type within the sunrpc modules due to the unlocked_ioctl fp.
The error occurs upon enabling nfs within the defconfig.
Patches 03-09 deal with the fallout caused by the changes made in patches 01 and 02.
Details can be found in the description of patch 03.
Patch 10 enables nfs rootfs by default in the kernel.
It was confirmed that the kernel can boot with an nfs rootfs;
the other affected modules were not tested.
Pawel Zalewski (10):
net:sunrpc: fix incompatible function pointer type in cache
net:sunrpc: fix incompatible function pointer type in rpc_pipe
include:linux: proc_ioctl should take user_uintptr_t as argument
net:sunrpc: fix incompatible function pointer type in cache
sound:core: fix incompatible function pointer type in info
scsi:esas2r: fix incompatible function pointer type in esas2r_main
pci: fix incompatible function pointer type in proc
hwmon: fix incompatible function pointer type in dell-smm-hwmon
cpu:mtrr: fix incompatible function pointer type in if
defconfig: enable nfs rootfs by default
arch/arm64/configs/morello_transitional_pcuabi_defconfig | 2 ++
arch/x86/kernel/cpu/mtrr/if.c | 2 +-
drivers/hwmon/dell-smm-hwmon.c | 2 +-
drivers/pci/proc.c | 2 +-
drivers/scsi/esas2r/esas2r_main.c | 2 +-
include/linux/proc_fs.h | 4 ++--
include/sound/info.h | 2 +-
net/sunrpc/cache.c | 6 +++---
net/sunrpc/rpc_pipe.c | 2 +-
sound/core/info.c | 2 +-
10 files changed, 14 insertions(+), 12 deletions(-)
--
2.34.1
This commit tackles the issue reported at:
https://git.morello-project.org/morello/kernel/linux/-/issues/6
Commit also available at:
https://git.morello-project.org/Sevenarth/linux/-/commits/morello/futex-v3
v3:
- reworded commit bodies
- removed a redundant include
- fixed whitespace alignment
v2:
- split code in 3 commits as suggested
- added more details in the commit bodies
- updated the TODO notation for futex.h
- updated the prefix for A64/C64 definitions in futex.h
- updated the asm constraint's name to follow naming conventions
- updated the robust list entry fetch code to use the pre-existing
helper USER_PTR_ALIGN_DOWN
- reverted pointer comparisons
Luca Vizzarro (3):
arm64: futex: Enable capability-based uaccess
futex: Handle capability-based robust list entries
futex: Add explicit capability checking TODOs
arch/arm64/include/asm/futex.h | 47 ++++++++++++++++++++++++----------
kernel/futex/core.c | 19 ++++++--------
2 files changed, 41 insertions(+), 25 deletions(-)
--
2.34.1