Hi,
After getting sidetracked by eBPF libraries/tools (libbpf/bpftool) and
kselftest cross-compilation, here are the core kernel changes following on
from the RFC[1] posted late last year.
The bpf syscall is updated to propagate user pointers as capabilities in
the pure-capability kernel-user ABI (PCuABI). It also includes an
approach to support the existing aarch64 ABI as a compatibility layer
(compat64).
One complication is that this syscall supports many multiplexed
sub-commands, some of which are themselves multiplexed with a number of
nested multiplexed options.
A further complication is that the existing syscall uses a trick of
storing user pointers as u64 to avoid needing a compat handler for
32-bit systems (see patch 3). To retain compatibility with the aarch64
ABI and add Morello support, a compat layer is added here only for the
compat64 case, guarded by #ifdef CONFIG_COMPAT64. Normal compat32
operation is therefore unchanged.
Compared to the original RFC, inbound (userspace->kernel) conversion
between compat64/native struct layouts is now handled upfront. This
minimises changes to subcommand handlers. Some subcommands require
conversion back out to userspace and that is by necessity handled where
it occurs.
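As a rough illustration of the upfront inbound conversion described above, here is a minimal userspace sketch. The struct names, fields and the demo_attr_from_compat() helper are hypothetical stand-ins rather than the real bpf_attr layout or any kernel helper, and uintptr_t merely stands in for a capability-sized user pointer type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical compat64 layout: the user pointer travels as a plain u64. */
struct compat_demo_attr {
	uint32_t map_fd;
	uint32_t flags;
	uint64_t key;	/* user address stored as an integer */
};

/* Hypothetical native layout: the same field has a pointer-sized type. */
struct demo_attr {
	uint32_t map_fd;
	uint32_t flags;
	uintptr_t key;	/* stand-in for a capability-carrying field */
};

/*
 * Convert once at syscall entry so that subcommand handlers only ever
 * see the native layout. Returns 0 on success, -1 if size is too large.
 */
static int demo_attr_from_compat(struct demo_attr *dst,
				 const void *src, size_t size)
{
	struct compat_demo_attr c;

	assert(dst && src);
	if (size > sizeof(c))
		return -1;

	memset(&c, 0, sizeof(c));	/* zero-fill fields the caller omitted */
	memcpy(&c, src, size);

	memset(dst, 0, sizeof(*dst));
	dst->map_fd = c.map_fd;
	dst->flags = c.flags;
	dst->key = (uintptr_t)c.key;	/* kernel code would create a user pointer here */
	return 0;
}
```

With the conversion done in one place, a handler written against struct demo_attr does not need to know which ABI the caller used; only the paths that write back to userspace need a matching outbound step.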
Patch 1 is not essential to this series but it's a nice debug feature to
have and works[2]. It enables BPF_PROG_TYPE_TRACEPOINT which many eBPF
kselftests use.
Patch 2 is required setup for the rest of the patches.
Patches 3-8 implement the core compat64 handling. Each commit compiles
cleanly but relevant parts will be broken in between. They're split
mainly to make review here easier.
Patch 9 fixes a check to also check configs passed in via compat64.
Patch 10 finally enables capabilities in the kernel.
Testing-wise, see the associated LTP changes below, which will be posted to
the linux-morello-ltp mailing list. The eBPF LTP tests are fairly minimal
and test only a small part of the changes here. There's a new test to
test patch 9.
The kernel kselftests contain much more extensive eBPF tests. The
kselftests have been used to test many parts of the compat64 handling
but overall more work needs to be done here:
a) enable cross-compilation for purecap as well as x86->aarch64
b) replace ptr_to_u64() with casts to uintptr_t in tests
c) general libbpf/bpftool enablement and fixes since many tests rely
on this
d) CONFIG_DEBUG_INFO_BTF is required for many tests but this requires
the build system to have a recent version of the pahole tool
Next steps once we have the core kernel support are porting libbpf and
bpftool for purecap, plus work on enabling kselftests as above.
Kernel branch available at:
https://git.morello-project.org/zdleaf/linux/-/tree/morello/bpf
Associated LTP test/changes at:
https://git.morello-project.org/zdleaf/morello-linux-test-project/-/tree/mo…
Thanks,
Zach
[1] [RFC PATCH 0/9] update bpf syscall for PCuABI/compat64
https://op-lists.linaro.org/archives/list/linux-morello@op-lists.linaro.org…
[2] [PATCH v3 0/5] Restore syscall tracing on Morello
https://op-lists.linaro.org/archives/list/linux-morello@op-lists.linaro.org…
Zachary Leaf (10):
arm64: morello: enable syscall tracing
bpf/net: copy ptrs from user with bpf/sockptr_t
bpf: compat64: add handler and convert bpf_attr in
bpf: compat64: bpf_attr convert out
bpf: compat64: handle bpf_btf_info
bpf: compat64: handle bpf_prog_info
bpf: compat64: handle bpf_map_info
bpf: compat64: handle bpf_link_info
bpf: compat64: support CHECK_ATTR macro
bpf: use user pointer types in uAPI structs
.../morello_transitional_pcuabi_defconfig | 2 +-
arch/arm64/kernel/sys_compat64.c | 4 +
drivers/media/rc/bpf-lirc.c | 7 +-
include/linux/bpf_compat.h | 413 ++++++
include/linux/bpfptr.h | 18 +-
include/linux/sockptr.h | 9 +
include/uapi/linux/bpf.h | 94 +-
kernel/bpf/bpf_iter.c | 2 +-
kernel/bpf/btf.c | 97 +-
kernel/bpf/cgroup.c | 10 +-
kernel/bpf/hashtab.c | 13 +-
kernel/bpf/net_namespace.c | 7 +-
kernel/bpf/offload.c | 2 +-
kernel/bpf/syscall.c | 1136 +++++++++++++----
kernel/bpf/verifier.c | 2 +-
kernel/trace/bpf_trace.c | 6 +-
net/bpf/bpf_dummy_struct_ops.c | 3 +-
net/bpf/test_run.c | 32 +-
net/core/sock_map.c | 7 +-
19 files changed, 1534 insertions(+), 330 deletions(-)
create mode 100644 include/linux/bpf_compat.h
--
2.34.1
Hi Menna,
Please always keep the list in copy for any communication.
On 7/12/23 14:32, Menna Mahmoud wrote:
> Hi Vincenzo,
>
> On Wed, 12 Jul 2023 at 16:24, Vincenzo Frascino <vincenzo.frascino(a)arm.com>
> wrote:
>
>> Hi Menna,
>>
>> On 7/12/23 14:01, Menna Mahmoud wrote:
>>> Where is fvp.dtb? I couldn't find it.
>>
>> It is the dtb file you are generating with the kernel compilation.
>> You need to copy it over and *rename* it as fvp.dtb.
>>
>
> Sorry for disturbing, but which one:
>
> ```
> menna@menna:~/Desktop/optee-project/linux/arch/arm64/boot/dts$ ls
> actions amlogic broadcom intel microchip renesas tesla
> allwinner apm cavium lg nuvoton rockchip ti
> altera apple exynos Makefile nvidia socionext toshiba
> amazon arm freescale marvell qcom sprd xilinx
> amd bitmain hisilicon mediatek realtek synaptics
> menna@menna:~/Desktop/optee-project/linux/arch/arm64/boot/dts$ cd arm
> menna@menna:~/Desktop/optee-project/linux/arch/arm64/boot/dts/arm$ ls
> corstone1000.dtsi foundation-v8-gicv3.dtb fvp-base-revc.dtb
> juno-r1.dts juno-scmi.dtsi
> corstone1000-fvp.dtb foundation-v8-gicv3.dts fvp-base-revc.dts
> juno-r1-scmi.dtb Makefile
> corstone1000-fvp.dts foundation-v8-gicv3.dtsi juno-base.dtsi
> juno-r1-scmi.dts rtsm_ve-aemv8a.dtb
> corstone1000-mps3.dtb foundation-v8-gicv3-psci.dtb juno-clocks.dtsi
> juno-r2.dtb rtsm_ve-aemv8a.dts
> corstone1000-mps3.dts foundation-v8-gicv3-psci.dts juno-cs-r1r2.dtsi
> juno-r2.dts rtsm_ve-motherboard.dtsi
> foundation-v8.dtb foundation-v8-psci.dtb juno.dtb
> juno-r2-scmi.dtb rtsm_ve-motherboard-rs2.dtsi
> foundation-v8.dts foundation-v8-psci.dts juno.dts
> juno-r2-scmi.dts vexpress-v2f-1xv7-ca53x2.dtb
> foundation-v8.dtsi foundation-v8-psci.dtsi
> juno-motherboard.dtsi juno-scmi.dtb vexpress-v2f-1xv7-ca53x2.dts
> foundation-v8-gicv2.dtsi foundation-v8-spin-table.dtsi juno-r1.dtb
> juno-scmi.dts vexpress-v2m-rs1.dtsi
> menna@menna:~/Desktop/optee-project/linux/arch/arm64/boot/dts/arm$
>
> ```
>
This is not the morello kernel you compiled with the OPTEE fixes.
That kernel should have a "morello-fvp.dtb" which you need to rename into "fvp.dtb".
Thanks.
>>
>> --
>> Regards,
>> Vincenzo
>>
>
--
Regards,
Vincenzo
This patch series enables macvlan and macvtap support on the morello
board. MACVLAN is an important part of containerised hosts.
V2 adds "Signed off" message
0001 makes changes to tap.c such that CHERI capabilities can be used
and "pointer or value" arguments are addressed in a compat function
0002 enables MACVLAN and MACVTAP support in the tree
The change has been tested using Docker
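To illustrate what handling "pointer or value" arguments in a compat function can look like, here is a minimal userspace sketch. The command numbers and the demo_* functions are hypothetical and are not taken from tap.c; demo_compat_ptr() only stands in for the kernel's compat_ptr().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command numbers, for illustration only. */
enum demo_cmd {
	DEMO_SET_FLAG  = 1,	/* argument is a plain integer value */
	DEMO_GET_STATE = 2,	/* argument is a pointer to an int */
};

static int demo_state = 42;
static int demo_flag;

/* Native handler: the argument is wide enough to carry a full pointer. */
static long demo_ioctl(enum demo_cmd cmd, uintptr_t arg)
{
	switch (cmd) {
	case DEMO_SET_FLAG:
		demo_flag = (int)arg;		/* value: use as-is */
		return 0;
	case DEMO_GET_STATE:
		assert(arg != 0);
		*(int *)arg = demo_state;	/* pointer: dereference */
		return 0;
	}
	return -1;
}

/* Stand-in for compat_ptr(): turn a 64-bit address into a usable pointer. */
static void *demo_compat_ptr(uint64_t addr)
{
	return (void *)(uintptr_t)addr;
}

/* Compat entry point: only pointer-carrying commands get converted. */
static long demo_compat_ioctl(enum demo_cmd cmd, uint64_t arg)
{
	switch (cmd) {
	case DEMO_GET_STATE:
		return demo_ioctl(cmd, (uintptr_t)demo_compat_ptr(arg));
	default:
		return demo_ioctl(cmd, (uintptr_t)arg);
	}
}
```

On a conventional 64-bit host the conversion is a no-op cast; in a pure-capability kernel the equivalent step is where a valid user pointer would be derived from the 64-bit address, which is why value-only commands must not go through it.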
Harrison Marcks (2):
net: tap: make PCuABI compliant
arm64: morello: enable MACVLAN by default
.../morello_transitional_pcuabi_defconfig | 2 ++
drivers/net/tap.c | 17 +++++++++++++++--
2 files changed, 17 insertions(+), 2 deletions(-)
--
2.34.1
Hi,
The morello/next branch has been rebased from v6.1 to v6.4. Make sure to
reset/rebase any local branch tracking next. The final 6.1-based commit has been
tagged morello-last-6.1.
There should be no user-visible change following this rebase, aside from upstream
changes between 6.1 and 6.4. The rest of this email provides a detailed changelog
for developers.
Cheers,
Kevin
--------------
Noteworthy changes:
- Reverted access_ok() to its original interface by dropping "uaccess: Allow any
address/pointer type for access_ok()". This was motivated by the
incompatibility that patch created with [1], and the weak justification
behind it. We almost always have (or should have) a user pointer available
when calling access_ok(), so changing its interface creates more problems
than it solves. Related changes:
* Added "uaccess: Fix user pointer downcast" and "arm64: uaccess: Extract
user address before untagging" to replace that patch.
* Dropped "kernel/fork: Remove unnecessary cast when using access_ok()"
* Replaced "mm/(gup, mincore): Remove unnecessary cast when using access_ok()"
with:
- "mm/gup: Create user pointer when calling access_ok()"
- "mm/mincore: Temporarily create a user pointer for access_ok()"
* Updated "Documentation: core-api: Add a new document about user pointers"
accordingly.
- The upsizing in "io_uring: Enlarge struct io_cmd_data in PCuABI" had to be
revised upwards due to [2] and [3] (each adding 16 bytes to struct io_sr_msg
due to alignment requirements). struct io_cmd_data is now 2 cachelines
instead of 1.5 (96 -> 128 bytes).
- Tweaked "io_uring: Implement compat versions of uAPI structs and handle them"
to align it with the following upstream patch series:
* "User mapped provided buffer rings" [4]: aligned with the renaming of
struct io_uring_buf_reg::pad to ::flags; moved the ring size calculation
from io_alloc_pbuf_ring() and io_pin_pbuf_ring() to
io_register_pbuf_ring(), so that the ring size is calculated correctly
for compat on both paths (without duplicating the calculation).
* "io_uring: Pass the whole sqe to commands" [5]: as a result of this
series, io_uring commands get a pointer to a full (native) SQE, even in
compat. To avoid giving them access to uninitialised memory,
changed convert_compat64_io_uring_sqe() to zero out the end of the cmd
array in the output SQE.
- Used new vm_flags accessor in "io_uring: Allow capability tag access on the
shared memory" and "aio: Allow capability tag access on the shared memory",
following [6].
- Updated a few more prototypes to pass user_data as __kernel_uintptr_t in
"io_uring: Use user pointer type in the uAPI structs", due to [7].
- [8] provides similar functionality to "module: Allow arch overrides for ELF
arch check". The latter was dropped and "module: Enable module loading for
PCuABI kernels" was adapted to use the interface introduced by [8] (thanks
Kristina).
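The zeroing mentioned above for convert_compat64_io_uring_sqe() can be pictured with the following userspace sketch. It assumes the compat layout supplies fewer command bytes than the native one; the struct names and both sizes are hypothetical and do not reflect the real io_uring SQE layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_COMPAT_CMD_LEN 16	/* hypothetical: bytes supplied by a compat task */
#define DEMO_NATIVE_CMD_LEN 32	/* hypothetical: bytes in the native layout */

struct demo_compat_sqe {
	uint8_t opcode;
	uint8_t cmd[DEMO_COMPAT_CMD_LEN];
};

struct demo_sqe {
	uint8_t opcode;
	uint8_t cmd[DEMO_NATIVE_CMD_LEN];
};

/* Build a native SQE from a compat one without leaking stale bytes. */
static void demo_convert_sqe(struct demo_sqe *dst,
			     const struct demo_compat_sqe *src)
{
	assert(dst && src);
	dst->opcode = src->opcode;
	memcpy(dst->cmd, src->cmd, sizeof(src->cmd));
	/* Clear the bytes that the compat layout cannot provide. */
	memset(dst->cmd + sizeof(src->cmd), 0,
	       sizeof(dst->cmd) - sizeof(src->cmd));
}
```

A command handler that reads the full native cmd array then sees zeroes past the compat-supplied data instead of whatever was previously in the destination buffer.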
Minor changes:
- Updated "arm64: morello: Disable trapping early and unconditionally" to stash
LR into TPIDR_EL0 in init_kernel_el as [9] causes LR to be clobbered by a
subsequent call (on the Morello board, not FVP).
- Aligned "arm64: morello: Signal handling support" with recent changes to
arch/arm64/kernel/signal.c
- Updated NSIGSEGV in all assertions in "arm64: morello: Handle capability faults"
- Updated "tracing/syscalls: Allow amending metadata macro arguments" to take
care of new uses of SYSCALL_METADATA() in arch code (powerpc).
- Added a #include <linux/user_ptr.h> in "arm64: memory: Always return a u64 in
untagged_addr()", as it looks like asm/signal.h no longer (implicitly)
includes it.
- Extended "fs/ioctl: Modify 3rd argument of fops->unlocked_ioctl to user_uintptr_t"
to handle the ioctl wrappers added by [10].
- Replaced "iov_iter: use copy_from_user_with_ptr for struct iovec" with a new
patch "iov_iter: Use get_user_ptr for PCuABI support", as
copy_iovec_from_user() now uses get_user() instead of copy_from_user() (see
[11]). The new patch was moved earlier in the branch to preserve bisectability.
- Moved "tcp: Explicitly create user pointers" earlier in the branch to preserve
bisectability (without warning).
- "arm64: configs: Add Morello transitional PCuABI defconfig" and "arm64:
morello: Enable basic 9P FS support in transitional defconfig" updated as
per make savedefconfig.
Dropped:
- "arm64: signal: Flatten restore_sigframe() error handling" (it doesn't make
much sense any more, considering recent additions to that function).
- "mm/hugetlb: Use appropriate user pointer conversions" and
"mm/shmem: Use appropriate user pointer conversions" (reverted as part of
the recently merged "New user_ptr helpers for uaccess" series).
- "arm64: entry-ftrace.S: Fix build when CONFIG_ARM64_MORELLO=y" (no longer
needed thanks to [12]).
- Landed in mainline and already included in v6.4:
* "uapi/linux/const.h: Prefer ISO-friendly __typeof__"
* "arm64: compat: Remove defines now in asm-generic"
* "net: Finish up ->msg_control{,_user} split"
* "memfd: Pass argument of memfd_fcntl as int"
- "io_uring/kbuf: Fix size for shared buffer ring" (fixed upstream by [13]).
[1] https://lore.kernel.org/all/20230410174345.4376-2-dev@der-flo.net/
[2] https://lore.kernel.org/all/f1a1ba93-1adf-63fa-6f0f-f3182f165841@kernel.dk/
[3] https://lore.kernel.org/all/0b0d4411-c8fd-4272-770b-e030af6919a0@kernel.dk/
[4] https://lore.kernel.org/io-uring/20230314171641.10542-1-axboe@kernel.dk/
[5] https://lore.kernel.org/all/20230504121856.904491-1-leitao@debian.org/
[6] https://lore.kernel.org/all/20230126193752.297968-3-surenb@google.com/
[7] https://lore.kernel.org/all/20221124093559.3780686-6-dylany@meta.com/
[8] https://lore.kernel.org/all/20221128041539.1742489-2-npiggin@gmail.com/
[9] https://lore.kernel.org/all/20230111102236.1430401-6-ardb@kernel.org/
[10] https://lore.kernel.org/all/20221205123903.159838-3-brgl@bgdev.pl/
[11] https://lore.kernel.org/all/CAHk-=wiC5OBj36LFKYRONF_B19iyuEjK2WQFJpyZ+-w39m…
[12] https://lore.kernel.org/all/20221103170520.931305-5-mark.rutland@arm.com/
[13] https://lore.kernel.org/all/20230218184141.70891-1-wlukowicz01@gmail.com/
Syscalls operating on memory mappings manage their address space via
owning capabilities. They must adhere to a certain set of rules[1] in
order to ensure memory safety. Address space management syscalls are
only allowed to manipulate mappings that are within the range of the
owning capability and have the appropriate permissions. Tests have been
added to validate the parameters passed to each syscall and to check its
bounds, range and permissions. Additionally, a signal handler has been
registered to handle invalid memory accesses. Finally, as certain flags
and syscalls conflict with the reservation model or lack an
implementation, a check verifying that these are handled appropriately
has also been added.
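The kind of rule these tests exercise can be sketched in plain C as follows. This is a simplified model only: struct demo_cap, the DEMO_PERM_* bits and demo_range_ok() are hypothetical and do not reproduce the actual PCuABI rules referenced in [1].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PERM_READ  0x1u
#define DEMO_PERM_WRITE 0x2u
#define DEMO_PERM_EXEC  0x4u

/* Hypothetical model of an owning capability: bounds plus permissions. */
struct demo_cap {
	uint64_t base;
	uint64_t length;
	unsigned int perms;
};

/*
 * Return true if [addr, addr + len) lies within the capability's bounds
 * and every requested permission is present in the capability.
 */
static bool demo_range_ok(const struct demo_cap *cap, uint64_t addr,
			  uint64_t len, unsigned int wanted)
{
	assert(cap);
	if (len == 0 || addr < cap->base)
		return false;
	/* Written this way to avoid overflow in addr + len. */
	if (addr - cap->base > cap->length ||
	    len > cap->length - (addr - cap->base))
		return false;
	return (wanted & ~cap->perms) == 0;
}
```

A request that starts inside the bounds but runs past the end, or that asks for a permission the capability lacks, is rejected; these are the cases the bounds, range and permission testcases are meant to provoke.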
Review branch:
https://git.morello-project.org/chaitanya_prakash/linux/-/tree/review/mmap_…
This patch series has been tested on:
https://git.morello-project.org/amitdaniel/linux/-/tree/review/extern_reser…
[1] https://git.morello-project.org/morello/kernel/linux/-/wikis/Morello-pure-c…
Changes in V2:
- Added link to the review branch
- Removed unnecessary whitespace
Changes in V1:
https://op-lists.linaro.org/archives/list/linux-morello@op-lists.linaro.org…
Chaitanya S Prakash (8):
kselftests/arm64/morello: Add necessary support for mmap testcases
kselftests/arm64/morello: Add MAP_GROWSDOWN testcase
kselftests/arm64/morello: Add parameter check testcases
kselftests/arm64/morello: Add capability range testcases
kselftests/arm64/morello: Add mmap() bounds check testcases
kselftests/arm64/morello: Add mremap() bounds check testcases
kselftests/arm64/morello: Add mremap() permission testcases
kselftests/arm64/morello: Add brk() testcase
.../testing/selftests/arm64/morello/Makefile | 1 +
.../selftests/arm64/morello/freestanding.h | 62 ++-
tools/testing/selftests/arm64/morello/mmap.c | 479 +++++++++++++++++-
3 files changed, 535 insertions(+), 7 deletions(-)
--
2.25.1
This series of patches enables nfs rootfs
support on the Morello board.
Patch 01 fixes the initial kernel build error
caused by a wrong function pointer type for the
unlocked_ioctl handler within the sunrpc modules;
the error occurs upon enabling nfs within the defconfig.
Patch 02 deals with the fallout from the changes
introduced by patch 01. See details in the description
of the patch.
Patch 03 enables nfs rootfs by default in the kernel.
It was confirmed that the kernel can boot with a nfs rootfs.
V3 changes:
- fix commit title @ patch 2
- align the change in proc.c
Pawel Zalewski (3):
net: sunrpc: fix unlocked_ioctl handler signature
proc: change proc_ops.proc_ioct handler signature
arm64: morello: enable nfs rootfs by default
arch/arm64/configs/morello_transitional_pcuabi_defconfig | 2 ++
drivers/pci/proc.c | 4 ++--
include/linux/proc_fs.h | 2 +-
net/sunrpc/cache.c | 6 +++---
net/sunrpc/rpc_pipe.c | 2 +-
5 files changed, 9 insertions(+), 7 deletions(-)
--
2.34.1
This series makes it possible for purecap apps to use the aio_ring
shared memory region to bypass the io_getevents syscall's overhead.
This functionality is also used in libaio.
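The idea of reaping completions directly from a shared ring can be sketched in userspace as follows. The demo_ring layout, the magic value used here and demo_reap() are a simplified, hypothetical model; they are not the kernel's aio_ring definition or libaio's implementation, and real code also needs appropriate memory barriers around the head/tail accesses.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RING_MAGIC 0xa10a10a1u	/* demo value used to validate the ring */
#define DEMO_RING_SLOTS 4u

struct demo_event {
	uint64_t data;
	int64_t res;
};

/* Simplified model of a completion ring shared between kernel and user. */
struct demo_ring {
	unsigned int nr;	/* number of slots */
	unsigned int head;	/* consumer index, advanced by userspace */
	unsigned int tail;	/* producer index, advanced by the kernel */
	unsigned int magic;
	struct demo_event events[DEMO_RING_SLOTS];
};

/*
 * Copy up to max completed events out of the ring without a syscall.
 * Returns the number of events copied, or -1 if the ring is not usable
 * and the caller should fall back to io_getevents().
 */
static int demo_reap(struct demo_ring *ring, struct demo_event *out, int max)
{
	int n = 0;

	assert(ring && out);
	if (ring->magic != DEMO_RING_MAGIC ||
	    ring->nr == 0 || ring->nr > DEMO_RING_SLOTS)
		return -1;

	while (n < max && ring->head != ring->tail) {
		out[n++] = ring->events[ring->head];
		ring->head = (ring->head + 1) % ring->nr;
	}
	return n;
}
```

In a purecap task the event payloads may themselves carry user pointers, which is presumably why the series needs the shared memory to preserve capability tags.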
With these patches, all io_* LTP tests pass in both Purecap and
plain AArch64 modes. Note that the LTP tests only address the basic
functionality of the aio system and a significant portion of the
functionality is untested in LTP.
For more comprehensive testing, libaio has been updated with the new
uAPI and ported. All the tests in libaio pass accordingly, in both
Purecap and plain AArch64 modes.
v4..v3:
- Restore flush_dcache_page in all places with the exception of one
where it is replaced with flush_kernel_vmap_range
- Use ifdef instead of IS_ENABLED in a few places
- Improve formatting
v3..v2:
- Improve the commit messages
- Revert a few unrelated changes
- Change compat_aio_context_t to compat_uptr_t
- Remove io_events_compat union member
- Improve code formatting
- Add copy_to_user_with_ptr in copy_io_events_to_user
- Split copy_from_user_with_ptr for struct __aio_sigset into a
different patch
v2..v1:
- Add Patch 1 that fixes a parameter type for the compat handler
- Split the change of types to user pointers into two patches: one
for aio_context_t, and the other for io_event struct fields.
- vmap all the ring pages at the beginning and cache them in the ctx
- Don't remap the pages while allowing tag access to the shared
memory. Setting the VM flags is enough.
- Change aio_context_t to a void __user *.
- Improve commit messages.
- Refactor some of the functions for compat handling.
- Create valid user pointers for ctx_id when received from a compat task
Gitlab issue:
https://git.morello-project.org/morello/kernel/linux/-/issues/49
Review branch:
https://git.morello-project.org/tudcre01/linux/-/commits/morello/aio_v4
Tudor Cretu (7):
aio: Fix type of nr parameter in compat handler of io_submit
aio: Use copy_from_user_with_ptr for struct __aio_sigset
aio: vmap entire aio_ring instead of kmapping each page
aio: Implement compat handling for the io_event struct
aio: Allow capability tag access on the shared memory
aio: Change aio_context_t to a user pointer
aio: Use user pointer type in the io_event struct
fs/aio.c | 283 ++++++++++++++++++++++-------------
include/asm-generic/compat.h | 4 +-
include/uapi/linux/aio_abi.h | 12 +-
3 files changed, 186 insertions(+), 113 deletions(-)
--
2.34.1
PCuABI enablement entails providing support for userspace
capability pointers, which are 128 bits long instead of 64 bits.
This commit implements the support for the futex_waitv syscall
while maintaining the 64-bit compatibility.
Signed-off-by: Luca Vizzarro <Luca.Vizzarro(a)arm.com>
---
Hello!
Sending in patch v3 for issue #48:
https://git.morello-project.org/morello/kernel/linux/-/issues/48
Here you can find a branch with this patch:
https://git.morello-project.org/Sevenarth/linux/-/commits/morello/futex_wai…
LTP fully passes. In the CI, only the purecap tests fail, since there
has been a change in the Linux headers; LTP needs to be recompiled
against the updated headers for them to pass.
v3:
- updated the copy_from_user function to copy_from_user_with_ptr for
purecap
v2 changes:
- fixes bug where the offset of the futex_waitv list was always
dependent on the size of the purecap struct instead of adapting
to the compat one accordingly.
- fixes the pointer cast in compat appropriately. It was mistakenly
using __kernel_uintcap_t instead of __kernel_uintptr_t.
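The first v2 fix above comes down to element stride: indexing the user array must use the size of the layout the calling task actually uses. A minimal userspace sketch follows, where the demo_* structs are hypothetical models (a 16-byte, 16-byte-aligned field stands in for a capability) and not the real uAPI definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout used by a 64-bit (compat64) task. */
struct demo_compat_waitv {
	uint64_t val;
	uint64_t uaddr;
	uint32_t flags;
	uint32_t reserved;
};

/*
 * Hypothetical model of the purecap layout: the address field is
 * 16 bytes wide and 16-byte aligned, so each element is larger.
 */
struct demo_native_waitv {
	uint64_t val;
	_Alignas(16) unsigned char uaddr[16];
	uint32_t flags;
	uint32_t reserved;
};

/* Byte offset of element i, given which layout the calling task uses. */
static size_t demo_waitv_offset(unsigned int i, int is_compat)
{
	size_t stride = is_compat ? sizeof(struct demo_compat_waitv)
				  : sizeof(struct demo_native_waitv);

	assert(i <= SIZE_MAX / stride);	/* guard against overflow */
	return (size_t)i * stride;
}
```

Casting the user pointer to the compat struct type before indexing, as the patch does with compat_uwaitv[i], gives the compiler the compat stride; indexing through the native type would read each element after the first from the wrong offset.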
Best,
Luca
---
include/uapi/linux/futex.h | 2 +-
kernel/futex/syscalls.c | 32 +++++++++++++++++++++++++++++++-
kernel/futex/waitwake.c | 4 ++--
3 files changed, 34 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/futex.h b/include/uapi/linux/futex.h
index 71a5df8d2689..93f701d89401 100644
--- a/include/uapi/linux/futex.h
+++ b/include/uapi/linux/futex.h
@@ -63,7 +63,7 @@
*/
struct futex_waitv {
__u64 val;
- __u64 uaddr;
+ __kernel_uintptr_t uaddr;
__u32 flags;
__u32 __reserved;
};
diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
index ad4ae797bb84..771cfb51a38d 100644
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -190,6 +190,36 @@ SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, val,
/* Mask of available flags for each futex in futex_waitv list */
#define FUTEXV_WAITER_MASK (FUTEX_32 | FUTEX_PRIVATE_FLAG)
+struct compat_futex_waitv {
+ __u64 val;
+ __u64 uaddr;
+ __u32 flags;
+ __u32 __reserved;
+};
+
+static int copy_futex_waitv_from_user(struct futex_waitv *aux,
+ const struct futex_waitv __user *uwaitv,
+ unsigned int i)
+{
+ if (IS_ENABLED(CONFIG_COMPAT64) && in_compat_syscall()) {
+ const struct compat_futex_waitv __user *compat_uwaitv =
+ (const struct compat_futex_waitv __user *)uwaitv;
+ struct compat_futex_waitv compat_aux;
+
+ if (copy_from_user(&compat_aux, &compat_uwaitv[i], sizeof(compat_aux)))
+ return -EFAULT;
+
+ aux->val = compat_aux.val;
+ aux->uaddr = (__kernel_uintptr_t)compat_ptr(compat_aux.uaddr);
+ aux->flags = compat_aux.flags;
+ aux->__reserved = compat_aux.__reserved;
+
+ return 0;
+ }
+
+ return copy_from_user_with_ptr(aux, &uwaitv[i], sizeof(*aux));
+}
+
/**
* futex_parse_waitv - Parse a waitv array from userspace
* @futexv: Kernel side list of waiters to be filled
@@ -206,7 +236,7 @@ static int futex_parse_waitv(struct futex_vector *futexv,
unsigned int i;
for (i = 0; i < nr_futexes; i++) {
- if (copy_from_user(&aux, &uwaitv[i], sizeof(aux)))
+ if (copy_futex_waitv_from_user(&aux, uwaitv, i))
return -EFAULT;
if ((aux.flags & ~FUTEXV_WAITER_MASK) || aux.__reserved)
diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c
index 1ab5640e7f84..d6b050356f81 100644
--- a/kernel/futex/waitwake.c
+++ b/kernel/futex/waitwake.c
@@ -422,7 +422,7 @@ static int futex_wait_multiple_setup(struct futex_vector *vs, int count, int *wo
if ((vs[i].w.flags & FUTEX_PRIVATE_FLAG) && retry)
continue;
- ret = get_futex_key(u64_to_user_ptr(vs[i].w.uaddr),
+ ret = get_futex_key((u32 __user *)vs[i].w.uaddr,
!(vs[i].w.flags & FUTEX_PRIVATE_FLAG),
&vs[i].q.key, FUTEX_READ);
@@ -433,7 +433,7 @@ static int futex_wait_multiple_setup(struct futex_vector *vs, int count, int *wo
set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE);
for (i = 0; i < count; i++) {
- u32 __user *uaddr = uaddr_to_user_ptr(vs[i].w.uaddr);
+ u32 __user *uaddr = (u32 __user *)vs[i].w.uaddr;
struct futex_q *q = &vs[i].q;
u32 val = (u32)vs[i].w.val;
--
2.34.1
PCuABI enablement entails providing support for userspace
capability pointers, which are 128 bits long instead of 64 bits.
This commit implements the support for the futex_waitv syscall
while maintaining the 64-bit compatibility.
Signed-off-by: Luca Vizzarro <Luca.Vizzarro(a)arm.com>
---
Hello!
Sending in patch v2 for issue #48:
https://git.morello-project.org/morello/kernel/linux/-/issues/48
Here you can find a branch with this patch:
https://git.morello-project.org/Sevenarth/linux/-/commits/morello/futex_wai…
LTP fully passes. In the CI, only the purecap tests fail, since there
has been a change in the Linux headers; LTP needs to be recompiled
against the updated headers for them to pass.
v2 changes:
- fixes bug where the offset of the futex_waitv list was always
dependent on the size of the purecap struct instead of adapting
to the compat one accordingly.
- fixes the pointer cast in compat appropriately. It was mistakenly
using __kernel_uintcap_t instead of __kernel_uintptr_t.
Best,
Luca
---
include/uapi/linux/futex.h | 2 +-
kernel/futex/syscalls.c | 32 +++++++++++++++++++++++++++++++-
kernel/futex/waitwake.c | 4 ++--
3 files changed, 34 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/futex.h b/include/uapi/linux/futex.h
index 71a5df8d2689..93f701d89401 100644
--- a/include/uapi/linux/futex.h
+++ b/include/uapi/linux/futex.h
@@ -63,7 +63,7 @@
*/
struct futex_waitv {
__u64 val;
- __u64 uaddr;
+ __kernel_uintptr_t uaddr;
__u32 flags;
__u32 __reserved;
};
diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
index ad4ae797bb84..c2e4529083a3 100644
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -190,6 +190,36 @@ SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, val,
/* Mask of available flags for each futex in futex_waitv list */
#define FUTEXV_WAITER_MASK (FUTEX_32 | FUTEX_PRIVATE_FLAG)
+struct compat_futex_waitv {
+ __u64 val;
+ __u64 uaddr;
+ __u32 flags;
+ __u32 __reserved;
+};
+
+static int copy_futex_waitv_from_user(struct futex_waitv *aux,
+ const struct futex_waitv __user *uwaitv,
+ unsigned int i)
+{
+ if (IS_ENABLED(CONFIG_COMPAT64) && in_compat_syscall()) {
+ const struct compat_futex_waitv __user *compat_uwaitv =
+ (const struct compat_futex_waitv __user *)uwaitv;
+ struct compat_futex_waitv compat_aux;
+
+ if (copy_from_user(&compat_aux, &compat_uwaitv[i], sizeof(compat_aux)))
+ return -EFAULT;
+
+ aux->val = compat_aux.val;
+ aux->uaddr = (__kernel_uintptr_t)compat_ptr(compat_aux.uaddr);
+ aux->flags = compat_aux.flags;
+ aux->__reserved = compat_aux.__reserved;
+
+ return 0;
+ }
+
+ return copy_from_user(aux, &uwaitv[i], sizeof(*aux));
+}
+
/**
* futex_parse_waitv - Parse a waitv array from userspace
* @futexv: Kernel side list of waiters to be filled
@@ -206,7 +236,7 @@ static int futex_parse_waitv(struct futex_vector *futexv,
unsigned int i;
for (i = 0; i < nr_futexes; i++) {
- if (copy_from_user(&aux, &uwaitv[i], sizeof(aux)))
+ if (copy_futex_waitv_from_user(&aux, uwaitv, i))
return -EFAULT;
if ((aux.flags & ~FUTEXV_WAITER_MASK) || aux.__reserved)
diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c
index 1ab5640e7f84..d6b050356f81 100644
--- a/kernel/futex/waitwake.c
+++ b/kernel/futex/waitwake.c
@@ -422,7 +422,7 @@ static int futex_wait_multiple_setup(struct futex_vector *vs, int count, int *wo
if ((vs[i].w.flags & FUTEX_PRIVATE_FLAG) && retry)
continue;
- ret = get_futex_key(u64_to_user_ptr(vs[i].w.uaddr),
+ ret = get_futex_key((u32 __user *)vs[i].w.uaddr,
!(vs[i].w.flags & FUTEX_PRIVATE_FLAG),
&vs[i].q.key, FUTEX_READ);
@@ -433,7 +433,7 @@ static int futex_wait_multiple_setup(struct futex_vector *vs, int count, int *wo
set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE);
for (i = 0; i < count; i++) {
- u32 __user *uaddr = uaddr_to_user_ptr(vs[i].w.uaddr);
+ u32 __user *uaddr = (u32 __user *)vs[i].w.uaddr;
struct futex_q *q = &vs[i].q;
u32 val = (u32)vs[i].w.val;
--
2.34.1