This series makes it possible for purecap apps to use the io_uring
system.
With these patches, all io_uring LTP tests pass in both Purecap and
compat modes. Note that the LTP tests only address the basic
functionality of the io_uring system and a significant portion of the
multiplexed functionality is untested in LTP.
I have finished investigating Purecap liburing tests and examples and
the series is updated accordingly. I am investigating compat liburing
tests at the moment, so another version might be expected.
v3:
- Introduce Patch 5 which exposes the compat handling logic for
epoll_event. This is then used in io_uring/epoll.c.
- Introduce Patch 6 which makes sure that when struct iovec is copied
from userspace, the capability tags are preserved.
- Fix a few sizeof(var) to sizeof(*var).
- Use iovec_from_user so that compat handling logic is applied instead
of copying directly from userspace.
- Add a few missing copy_from_user_with_ptr where suitable.
v2:
- Rebase on top of release 6.1
- Remove VM_READ_CAPS/VM_LOAD_CAPS patches as they are already merged
- Update commit message in PATCH 1
- Add the generic changes PATCH 2 and PATCH 3 to avoid copying user
pointers from/to userspace unnecessarily. These could be upstreamable.
- Split "pulling the cqes member out" change into PATCH 4
- The changes for PATCH 5 and 6 are now split into their respective
files after the rebase.
- Format and change organization based on the feedback on the
previous version, including creating helpers copy_*_from_* for various
uAPI structs
- Add comments related to handling of setup flags IORING_SETUP_SQE128
and IORING_SETUP_CQE32
- Add handling for new uAPI structs: io_uring_buf, io_uring_buf_ring,
io_uring_buf_reg, io_uring_sync_cancel_reg.
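
For readers unfamiliar with the sizeof(var) to sizeof(*var) item in the v3
changelog, the following is a minimal user-space sketch of that class of bug.
It is not taken from the series: struct demo_reg and the demo_* helpers are
hypothetical names, and memcpy() stands in for the kernel's user-copy helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct standing in for a uAPI struct; not from the series. */
struct demo_reg {
	unsigned long addr;
	unsigned int len;
	unsigned int flags;
	unsigned long resv[2];
};

/* sizeof(p) is the size of the pointer itself, not of the struct. */
static size_t demo_wrong_size(const struct demo_reg *p)
{
	return sizeof(p);
}

/* sizeof(*p) is the size of the pointed-to struct, which is what a copy needs. */
static size_t demo_right_size(const struct demo_reg *p)
{
	return sizeof(*p);
}

/* Copies the whole struct; using sizeof(dst) here would truncate the copy. */
static void demo_copy(struct demo_reg *dst, const struct demo_reg *src)
{
	memcpy(dst, src, sizeof(*dst));
}
```

With sizeof(dst) in demo_copy(), only the first pointer-sized chunk of the
struct would be copied and the trailing fields would keep stale values.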
Gitlab issue:
https://git.morello-project.org/morello/kernel/linux/-/issues/2
Review branch:
https://git.morello-project.org/tudcre01/linux/-/commits/morello/io_uring_v3
Tudor Cretu (9):
compiler_types: Add (u)intcap_t to native_words
io_uring/rw : Restrict copy to only uiov->len from userspace
io_uring/tctx: Copy only the offset field back to user
io_uring: Pull cqes member out from rings struct
epoll: Expose compat handling logic of epoll_event
iov_iter: use copy_from_user_with_ptr for struct iovec
io_uring: Implement compat versions of uAPI structs and handle them
io_uring: Use user pointer type in the uAPI structs
io_uring: Allow capability tag access on the shared memory
fs/eventpoll.c | 39 ++--
include/linux/compiler_types.h | 7 +
include/linux/eventpoll.h | 4 +
include/linux/io_uring_types.h | 160 ++++++++++++++--
include/uapi/linux/io_uring.h | 62 ++++---
io_uring/advise.c | 2 +-
io_uring/cancel.c | 40 +++-
io_uring/cancel.h | 2 +-
io_uring/epoll.c | 4 +-
io_uring/fdinfo.c | 64 ++++++-
io_uring/fs.c | 16 +-
io_uring/io_uring.c | 329 +++++++++++++++++++++++++--------
io_uring/io_uring.h | 126 ++++++++++---
io_uring/kbuf.c | 119 ++++++++++--
io_uring/kbuf.h | 8 +-
io_uring/msg_ring.c | 4 +-
io_uring/net.c | 25 +--
io_uring/openclose.c | 4 +-
io_uring/poll.c | 4 +-
io_uring/rsrc.c | 150 ++++++++++++---
io_uring/rw.c | 17 +-
io_uring/statx.c | 4 +-
io_uring/tctx.c | 57 +++++-
io_uring/timeout.c | 10 +-
io_uring/uring_cmd.c | 5 +
io_uring/uring_cmd.h | 7 +
io_uring/xattr.c | 12 +-
lib/iov_iter.c | 2 +-
28 files changed, 1014 insertions(+), 269 deletions(-)
--
2.34.1
Hi,
These 3 patches fix the failure in the clone test with the GCC toolchain.
They need the recent v4 version of the GCC support patches posted earlier [1].
The full patch series can be found here [2].
Changes in v3:
* Patch 1 updated to use an appropriate variable name.
* Patch 2 newly added to add and use waitpid() in several places.
* Patch 3 updated for cleanups.
* Commit log updated.
[1]: git@git.morello-project.org:amitdaniel/linux.git gcc_kselftests_support_v4
[2]: git@git.morello-project.org:amitdaniel/linux.git gcc_kselftests_clone_fixes_v3
Thanks,
Amit Daniel
Amit Daniel Kachhap (3):
kselftests/arm64: morello: Fix restricted mode tests with Gcc
kselftests/arm64: morello: Add a waitpid() syscall
kselftests/arm64: morello: clone: Remove loop check
tools/testing/selftests/arm64/morello/clone.c | 42 +++++++++++--------
.../selftests/arm64/morello/freestanding.h | 5 +++
.../testing/selftests/arm64/morello/signal.c | 16 +++----
3 files changed, 38 insertions(+), 25 deletions(-)
--
2.25.1
Hi,
These 2 patches fix the failure in the clone test with the GCC toolchain.
They need the recent v4 version of the GCC support patches posted earlier [1].
The full patch series can be found here [2].
[1]: git@git.morello-project.org:amitdaniel/linux.git gcc_kselftests_support_v4
[2]: git@git.morello-project.org:amitdaniel/linux.git gcc_kselftests_clone_fixes_v1
Thanks,
Amit Daniel
Amit Daniel Kachhap (2):
kselftests/arm64: morello: Fix restricted mode tests with Gcc
kselftests/arm64: morello: clone: Remove loop check
tools/testing/selftests/arm64/morello/clone.c | 19 +++++++------------
.../selftests/arm64/morello/freestanding.h | 8 ++++++++
.../arm64/morello/freestanding_start.S | 18 ++++++++++++++++++
3 files changed, 33 insertions(+), 12 deletions(-)
--
2.25.1
Hi,
These 2 patches fix the failure in the clone test with the GCC toolchain.
They need the recent v4 version of the GCC support patches posted earlier [1].
The full patch series can be found here [2].
Changes in v2:
* Patch 1 is modified to calculate function pointer branch address from
PCC.
* Commit log updated.
[1]: git@git.morello-project.org:amitdaniel/linux.git gcc_kselftests_support_v4
[2]: git@git.morello-project.org:amitdaniel/linux.git gcc_kselftests_clone_fixes_v2
Thanks,
Amit Daniel
Amit Daniel Kachhap (2):
kselftests/arm64: morello: Fix restricted mode tests with Gcc
kselftests/arm64: morello: clone: Remove loop check
tools/testing/selftests/arm64/morello/clone.c | 30 +++++++++++--------
.../selftests/arm64/morello/freestanding.h | 8 +++++
2 files changed, 26 insertions(+), 12 deletions(-)
--
2.25.1
The definition of nfs4_proc_setlease in the implementation was originally
changed without updating the matching declaration in the header file.
Fixes: ("fs: Pass argument to fcntl_setlease as int")
Signed-off-by: Luca Vizzarro <Luca.Vizzarro(a)arm.com>
---
Sending in a fix for a bug that was merged and was undetected by the
build script and the CI. Credit goes to Beata for detecting this.
Luca
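
As a small illustration of the rule this fix restores, here is a user-space
sketch in which a prototype and its definition agree on the parameter types.
demo_setlease is a hypothetical stand-in, not the NFS function, and its body
is purely illustrative.

```c
#include <assert.h>

/* Hypothetical stand-in for a prototype that would live in a header file. */
extern int demo_setlease(int fd, int arg);

/*
 * Definition. Its parameter types agree with the prototype above. If the
 * prototype still said "long arg", a compiler seeing both declarations in
 * the same translation unit would report conflicting types.
 */
int demo_setlease(int fd, int arg)
{
	/* Illustrative body only: accept non-negative arguments. */
	return (fd >= 0 && arg >= 0) ? 0 : -1;
}
```

The mismatch is only diagnosed when a translation unit that sees both the
header and the definition is actually compiled.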
fs/nfs/nfs4_fs.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index cfef738d765e..0f9bb0f2f6c5 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -326,7 +326,7 @@ extern int update_open_stateid(struct nfs4_state *state,
const nfs4_stateid *open_stateid,
const nfs4_stateid *deleg_stateid,
fmode_t fmode);
-extern int nfs4_proc_setlease(struct file *file, long arg,
+extern int nfs4_proc_setlease(struct file *file, int arg,
struct file_lock **lease, void **priv);
extern int nfs4_proc_get_lease_time(struct nfs_client *clp,
struct nfs_fsinfo *fsinfo);
--
2.34.1
PCuABI itself is defined by the PCuABI specification. However, the
specification only documents the ABI itself and not internal kernel
implementation aspects. To that effect, create a document under a
new cheri/ subfolder, as well as an index file with some information
about CHERI support in general.
Now that we have a generic PCuABI document, link to it from the
related documents, and remove a now-redundant section from the user
pointer doc. All CHERI / PCuABI-related documents are now reachable
from Documentation/cheri/index.rst.
The PCuABI porting guide was initially added to the root of
Documentation/ for lack of a relevant subfolder; we can now move it to
a more appropriate home.
Reviewed-by: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky(a)arm.com>
---
Rendered version:
https://git.morello-project.org/kbrodsky-arm/linux/-/tree/pcuabi_doc/Docume…
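
As a quick illustration of the "Pointer and address types" section added by
this patch, the sketch below reports the in-memory size of a user pointer
representation. It is a user-space approximation under stated assumptions:
demo_user_ptr_size is a hypothetical helper, and the __CHERI__ branch assumes
a compiler that provides the __capability qualifier as described in the
document.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: size in bytes of a user pointer representation.
 * On a CHERI-enabled compiler this is the capability size; elsewhere it
 * is the plain pointer size. The tag bit is out-of-band, so sizeof does
 * not account for it.
 */
static size_t demo_user_ptr_size(void)
{
#ifdef __CHERI__
	return sizeof(void * __capability);
#else
	return sizeof(void *);
#endif
}
```

On a non-CHERI build this simply returns sizeof(void *); the capability
branch is only compiled when __CHERI__ is defined.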
Documentation/arm64/morello.rst | 14 +-
Documentation/cheri/index.rst | 22 +++
Documentation/{ => cheri}/pcuabi-porting.rst | 18 +-
Documentation/cheri/pcuabi.rst | 177 +++++++++++++++++++
Documentation/core-api/user_ptr.rst | 25 +--
5 files changed, 220 insertions(+), 36 deletions(-)
create mode 100644 Documentation/cheri/index.rst
rename Documentation/{ => cheri}/pcuabi-porting.rst (96%)
create mode 100644 Documentation/cheri/pcuabi.rst
diff --git a/Documentation/arm64/morello.rst b/Documentation/arm64/morello.rst
index bc0d98596762..00f28b76d1a6 100644
--- a/Documentation/arm64/morello.rst
+++ b/Documentation/arm64/morello.rst
@@ -231,12 +231,12 @@ ABIs
In the default kernel configuration, existing aspects of the standard
AArch64 kernel-user ABI remain unchanged.
-As a highly experimental feature, it is possible to choose a different
-kernel-user ABI, the **pure-capability ABI** (PCuABI), by selecting the
-``CONFIG_CHERI_PURECAP_UABI`` option. In this ABI, all pointers at the
-kernel-user boundary are capabilities, providing a native interface for
-pure-capability executables; see the CHERI C/C++ Programming Guide [4]_
-for an overview of this programming model.
+As an experimental feature, it is possible to choose a different
+kernel-user ABI, the `pure-capability kernel-user ABI`_ (PCuABI), by
+selecting the ``CONFIG_CHERI_PURECAP_UABI`` option. In this ABI, all
+pointers at the kernel-user boundary are capabilities, providing a
+native interface for pure-capability executables; see the CHERI C/C++
+Programming Guide [4]_ for an overview of this programming model.
When ``CONFIG_CHERI_PURECAP_UABI`` is selected, the meaning of
``CONFIG_COMPAT`` is modified: instead of providing support for AArch32
@@ -280,6 +280,8 @@ ABI**. These extensions are also available in PCuABI, with a number of
differences. The transitional PCuABI specification [5]_ takes precedence
where it differs from the present document.
+.. _pure-capability kernel-user ABI: Documentation/cheri/pcuabi.rst
+
Register handling
-----------------
diff --git a/Documentation/cheri/index.rst b/Documentation/cheri/index.rst
new file mode 100644
index 000000000000..6955e298b88e
--- /dev/null
+++ b/Documentation/cheri/index.rst
@@ -0,0 +1,22 @@
+=============
+CHERI support
+=============
+
+This directory contains documents related to the support of `CHERI`_.
+CHERI is an architectural extension introducing the concept of hardware
+capabilities. The CHERI model is available on a number of architectures;
+many aspects of CHERI support are arch-agnostic; however, lower-level
+arch-specific enablement is also required. The following CHERI-enabled
+architectures are currently supported in Linux:
+
+* `Morello`_ (arm64-based experimental architecture)
+
+Documentation in this directory pertains only to arch-agnostic aspects of
+CHERI support.
+
+.. toctree::
+ pcuabi
+ pcuabi-porting
+
+.. _CHERI: https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/
+.. _Morello: Documentation/arm64/morello.rst
diff --git a/Documentation/pcuabi-porting.rst b/Documentation/cheri/pcuabi-porting.rst
similarity index 96%
rename from Documentation/pcuabi-porting.rst
rename to Documentation/cheri/pcuabi-porting.rst
index a3ff0c98e6b0..2a38862e869a 100644
--- a/Documentation/pcuabi-porting.rst
+++ b/Documentation/cheri/pcuabi-porting.rst
@@ -3,12 +3,10 @@ Adding PCuABI support to drivers
=================================
This document provides a non-exhaustive overview of the most common
-changes required to support the pure-capability user ABI (PCuABI) in
-arbitrary drivers. It may also be helpful for core subsystems, though
-note that more extensive changes may be required compared to drivers
-with straightforward interactions with userspace.
-
-.. _user pointer documentation: core-api/user_ptr.rst
+changes required to support the `pure-capability kernel-user ABI`_
+(PCuABI) in arbitrary drivers. It may also be helpful for core
+subsystems, though note that more extensive changes may be required
+compared to drivers with straightforward interactions with userspace.
User pointer representation and conversions
===========================================
@@ -342,8 +340,8 @@ typically throw the following error::
error: use of __capability is ambiguous
-A fixup is then required, as described in section "PCuABI-specific
-changes" of the `user pointer documentation`_. For instance::
+A fixup is then required, as described in section "Leveraging ``__user``"
+of the `PCuABI documentation`_. For instance::
diff --git a/net/socket.c b/net/socket.c
index 8597fbacb089..ab2a610825cc 100644
@@ -364,3 +362,7 @@ changes" of the `user pointer documentation`_. For instance::
Fortunately, ``__user`` is mostly used in simple types, and such fixups
are rarely needed in driver code.
+
+.. _user pointer documentation: Documentation/core-api/user_ptr.rst
+.. _PCuABI documentation: Documentation/cheri/pcuabi.rst
+.. _pure-capability kernel-user ABI: `PCuABI documentation`_
diff --git a/Documentation/cheri/pcuabi.rst b/Documentation/cheri/pcuabi.rst
new file mode 100644
index 000000000000..90e8a4200826
--- /dev/null
+++ b/Documentation/cheri/pcuabi.rst
@@ -0,0 +1,177 @@
+===================================
+The pure-capability kernel-user ABI
+===================================
+
+CHERI capabilities can be used in many ways. In the so-called
+pure-capability model, all pointers are represented as capabilities,
+whether they are manipulated explicitly or not. This approach is highly
+attractive as it leverages many of the CHERI mechanisms to strengthen
+memory safety, without disrupting the vast majority of existing C/C++
+software.
+
+The pure-capability model requires a major ABI break, as the
+representation of pointers is fundamentally different from "traditional"
+ABIs, where pointers are simply integer addresses. Supporting such a
+model in userspace therefore requires the introduction of a new
+kernel-user ABI, the pure-capability kernel-user ABI (PCuABI).
+
+A specification for this new uABI, complemented with rationale about its
+design and objectives, is available in the following document:
+
+ `PCuABI specification`_
+
+This specification is currently limited to the Morello architecture, as
+it is the only CHERI-enabled architecture supported in Linux. Adding
+support for other architectures would entail extending the specification
+accordingly.
+
+The present document deals with implementation aspects that are beyond
+the scope of the specification. It aims to provide kernel developers
+with an overview of the changes that have been made to various internal
+kernel APIs in order to support PCuABI.
+
+Note: current limitations
+ Support for PCuABI in Linux is a work in progress, and at this stage
+ it is mostly of a functional nature, with only limited enforcement of
+ capability-related restrictions. The variant of the ABI that is
+ currently implemented in Linux is documented in the `transitional
+ PCuABI specification`_, which is forward-compatible with the full
+ specification. Only **a limited set of syscalls** is supported in this
+ ABI.
+
+Config option
+=============
+
+Selecting the option ``CONFIG_CHERI_PURECAP_UABI`` enables support for
+the pure-capability uABI; in other words, the native userspace ABI
+becomes PCuABI instead of the "traditional" uABI. This option is not
+tied to any particular architecture, but naturally it is only available
+on CHERI-enabled architectures.
+
+
+The hybrid approach
+===================
+
+The way in which PCuABI is currently implemented in Linux is a hybrid
+approach: the native userspace ABI becomes pure-capability while **the
+in-kernel ABI remains unchanged**. Concretely, this means that kernel
+pointers and user pointers are no longer intercompatible; specifically,
+a kernel pointer - still an integer - cannot represent a user pointer -
+now a capability.
+
+Note: different approaches
+ This is only one of a number of plausible strategies to support PCuABI.
+ A more natural approach is to change the in-kernel ABI in line with
+ the userspace ABI, that is to make the kernel itself a pure-capability
+ binary. While this simplifies the handling of user pointers compared
+ to the hybrid approach, and strengthens the kernel itself, building
+ the kernel in the pure-capability ABI is a major undertaking, mainly
+ due to the extremely widespread representation of kernel pointers as
+ ``long``-sized integers. To keep the level of effort reasonable and
+ achieve a complete implementation of PCuABI in a realistic timescale,
+ the hybrid approach has therefore been chosen as a starting point.
+
+
+Leveraging __user
+-----------------
+
+User pointers are currently turned into capabilities by redefining the
+``__user`` macro to expand to ``__capability``. This is a convenient
+approach as all user pointers should already be annotated with
+``__user``, thereby avoiding the extensive changes a new annotation
+would entail.
+
+Unfortunately, the ``__user`` annotation prefixes ``*``, for instance::
+
+ void __user *
+
+This is problematic as ``void __capability *`` is deprecated;
+``__capability`` is only unambiguous when used as a suffix for ``*``.
+In more complex cases, such as double pointers, the compiler is only
+able to parse ``__capability`` as a suffix.
+
+It is therefore occasionally necessary to introduce PCuABI-specific fixup
+blocks to remove that ambiguity by moving ``__capability`` from prefix to
+suffix. It is typically done as follows::
+
+ #ifdef CONFIG_CHERI_PURECAP_UABI
+ void * __capability * __capability p;
+ #else
+ void __user * __user *p;
+ #endif
+
+Fortunately, in the vast majority of cases simple user pointers are used
+and no such fixup is required.
+
+
+Pointer and address types
+=========================
+
+As mentioned previously, user pointers are larger than kernel pointers
+when ``CONFIG_CHERI_PURECAP_UABI`` is selected. Indeed, user pointers
+are represented as capabilities; they are therefore 129 bits wide on
+64-bit architectures: twice the address size, plus an out-of-band tag
+bit. This tag bit is an integral part of the user pointer and can only
+be preserved by representing the user pointer with a compiler-provided
+capability type, such as ``void * __capability`` or ``__uintcap_t``.
+
+For this reason, the representation of certain types changes when the
+kernel is built to support PCuABI. The table below provides the
+*representation* of various types **in the kernel** on a 64-bit
+architecture, depending on the supported user ABI:
+
++----------------------------------+------------------+----------------+--------------------------------------------------------------------------+
+| Type | Traditional uABI | PCuABI | Notes |
++==================================+==================+================+==========================================================================+
+| ``void *`` | 64-bit integer | 64-bit integer | |
++----------------------------------+------------------+----------------+--------------------------------------------------------------------------+
+| ``uintptr_t`` | 64-bit integer | 64-bit integer | |
++----------------------------------+------------------+----------------+--------------------------------------------------------------------------+
+| | ``(unsigned) long`` | 64-bit integer | 64-bit integer | ``ptraddr_t`` is a new generic type that represents an address. |
+| | ``(unsigned) long long`` | | | |
+| | ``ptraddr_t`` | | | |
++----------------------------------+------------------+----------------+--------------------------------------------------------------------------+
+| ``void __user *`` | 64-bit integer | Capability | |
++----------------------------------+------------------+----------------+--------------------------------------------------------------------------+
+| ``user_uintptr_t`` | 64-bit integer | Capability | Represented as ``uintcap_t`` in PCuABI, see below. |
++----------------------------------+------------------+----------------+--------------------------------------------------------------------------+
+| | ``__kernel_uintptr_t`` | 64-bit integer | Capability | * Represented as ``uintcap_t`` in PCuABI, see below. |
+| | ``__kernel_aligned_uintptr_t`` | | | * At least 64-bit regardless of the ABI. |
++----------------------------------+------------------+----------------+--------------------------------------------------------------------------+
+| | ``void __capability *`` | Capability | Capability | Only available on CHERI-enabled architectures (``__CHERI__`` defined). |
+| | ``void * __capability`` | | | |
++----------------------------------+------------------+----------------+--------------------------------------------------------------------------+
+| ``uintcap_t`` | Capability | Capability | * Only available on CHERI-enabled architectures (``__CHERI__`` defined). |
+| | | | * Represented as a capability, but otherwise behaves as a 64-bit integer |
+| | | | (when performing arithmetic, converting to other integer types, etc.). |
++----------------------------------+------------------+----------------+--------------------------------------------------------------------------+
+
+For reference, the table below provides the representation of relevant
+types **in userspace**, depending on the chosen ABI:
+
++----------------------------------+-----------------+---------------------+------------------------------------------------------------------------+
+| Type | Traditional ABI | Pure-capability ABI | Notes |
++==================================+=================+=====================+========================================================================+
+| ``void *`` | 64-bit integer | Capability | |
++----------------------------------+-----------------+---------------------+------------------------------------------------------------------------+
+| ``uintptr_t`` | 64-bit integer | Capability | Represented as ``uintcap_t`` in purecap. |
++----------------------------------+-----------------+---------------------+------------------------------------------------------------------------+
+| | ``(unsigned) long`` | 64-bit integer | 64-bit integer | |
+| | ``(unsigned) long long`` | | | |
+| | ``ptraddr_t`` | | | |
++----------------------------------+-----------------+---------------------+------------------------------------------------------------------------+
+| | ``__kernel_uintptr_t`` | 64-bit integer | Capability | * Represented as ``uintcap_t`` in purecap. |
+| | ``__kernel_aligned_uintptr_t`` | | | * At least 64-bit regardless of the ABI. |
++----------------------------------+-----------------+---------------------+------------------------------------------------------------------------+
+| | ``void __capability *`` | Capability | Capability | Only available on CHERI-enabled architectures (``__CHERI__`` defined). |
+| | ``void * __capability`` | | | |
++----------------------------------+-----------------+---------------------+------------------------------------------------------------------------+
+| ``uintcap_t`` | Capability | Capability | Only available on CHERI-enabled architectures (``__CHERI__`` defined). |
++----------------------------------+-----------------+---------------------+------------------------------------------------------------------------+
+
+For more information about user pointers and related conversions, please
+refer to the `user pointer documentation`_.
+
+.. _PCuABI specification: https://git.morello-project.org/morello/kernel/linux/-/wikis/Morello-pure-c…
+.. _Transitional PCuABI specification: https://git.morello-project.org/morello/kernel/linux/-/wikis/Transitional-M…
+.. _user pointer documentation: Documentation/core-api/user_ptr.rst
diff --git a/Documentation/core-api/user_ptr.rst b/Documentation/core-api/user_ptr.rst
index 21e02d4bd11b..0e14616c0499 100644
--- a/Documentation/core-api/user_ptr.rst
+++ b/Documentation/core-api/user_ptr.rst
@@ -12,7 +12,7 @@ regions:
These two categories of pointers are not interchangeable and, in
particular, the kernel should never directly dereference a user pointer.
-The introduction of the pure-capability kernel-user ABI (PCuABI) has
+The introduction of the `pure-capability kernel-user ABI`_ (PCuABI) has
made this distinction even more important, as in that configuration user
pointers are of a different type altogether and cannot be represented by
kernel pointers or most integer types.
@@ -20,6 +20,8 @@ kernel pointers or most integer types.
This document outlines the available API to represent and manipulate
user pointers in a way that is safe in any kernel-user ABI.
+.. _pure-capability kernel-user ABI: Documentation/cheri/pcuabi.rst
+
Representing user pointers
==========================
@@ -52,27 +54,6 @@ integer types such as ``long``. User **addresses** may however still be
represented like kernel addresses, e.g. using ``long``. The recommended
type for addresses when writing new code is ``ptraddr_t``.
-PCuABI-specific changes
------------------------
-
-When PCuABI is targeted by selecting the ``CONFIG_CHERI_PURECAP_UABI``
-option, user pointers are turned into capabilities by making the
-``__user`` annotation expand to ``__capability``. Unfortunately,
-``_user`` precedes ``*`` and using ``__capability`` as a prefix of ``*``
-is deprecated. It does work in most cases, but in more complex
-situations, such as double pointers, it becomes ambiguous and fails to
-compile.
-
-It is therefore occasionally necessary to have PCuABI-specific fixup
-blocks to solve that ambiguity by moving ``__capability`` as a suffix of
-``*``. It is typically done as follows::
-
- #ifdef CONFIG_CHERI_PURECAP_UABI
- void * __capability * __capability p;
- #else
- void __user * __user *p;
- #endif
-
Converting user pointers
========================
--
2.38.1