This patch series addresses the VM_READ_CAPS/VM_WRITE_CAPS flags issue: https://git.morello-project.org/morello/kernel/linux/-/issues/36
The io_uring subsystem uses buffers shared with userspace to read I/O events and report their results. The structs that populate the submission and completion queues can contain capabilities. Shared mappings don't have the Load/Store capabilities permission, to avoid leaking capabilities outside their original address space, so add two new VM flags that allow the kernel to set up such mappings.
While at it, also fix pte_modify to allow setting PTE_*_CAPS flags, add the new rc/wc smaps flags, and remove the automatic addition of PTE_*_CAPS to user mappings.
Note: this doesn't allow userspace to make arbitrary shared mappings with tag access; the new VM flags are for internal use only for the time being.
v3:
- Improved documentation, comments, and commit message
- Fixed condition in Patch 3, now tested properly with Morello GDB

v2:
- Removed Patch 1 from the series as it wasn't essential
- Added docs to Documentation/filesystems/proc.rst
- Removed VM_RW_CAPS
- Moved definition of VM_*_CAPS just after the definition of VM_MTE
- Added details for a TODO related to file-backed mappings
- Introduced Patch 3 that removes an assumption about shared mappings
Review branch: https://git.morello-project.org/tudcre01/linux/-/commits/vm_rw_caps_v3/
Thanks,
Tudor

Tudor Cretu (3):
  arm64: morello: Add VM_READ_CAPS and VM_WRITE_CAPS flags
  arm64: morello: Explicitly add VM_*_CAPS to private user mappings
  arm64: morello: Check against VM_WRITE_CAPS in access_remote_cap

 Documentation/filesystems/proc.rst    |  2 ++
 arch/arm64/Kconfig                    |  1 +
 arch/arm64/include/asm/mman.h         | 26 ++++++++++++++++++++++++--
 arch/arm64/include/asm/page.h         |  3 ++-
 arch/arm64/include/asm/pgtable-prot.h | 12 +++++-------
 arch/arm64/kernel/morello.c           |  7 +++----
 fs/proc/task_mmu.c                    |  4 ++++
 include/linux/mm.h                    |  8 ++++++++
 8 files changed, 49 insertions(+), 14 deletions(-)
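The rc/wc codes that this series adds to the smaps VmFlags line give a simple way to check from userspace whether a mapping has capability tag access. The following is a minimal sketch of such a check, not part of the series: the helper name vmflags_has() is hypothetical, and it only parses a line that the caller has already read from /proc/<pid>/smaps. The codes would only appear on a Morello kernel carrying these patches.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the series): return true if a "VmFlags:"
 * line taken from /proc/<pid>/smaps contains the given code, e.g. "rc"
 * or "wc", as a whole space-separated token.
 */
static bool vmflags_has(const char *line, const char *code)
{
	static const char prefix[] = "VmFlags:";
	size_t code_len = strlen(code);
	size_t tok_len;
	const char *p;

	/* Only lines that start with the VmFlags key are of interest. */
	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;

	p = line + sizeof(prefix) - 1;
	while (*p != '\0') {
		/* Skip the separators between codes. */
		while (*p == ' ' || *p == '\n')
			p++;
		if (*p == '\0')
			break;

		/* Compare the current token against the requested code. */
		tok_len = strcspn(p, " \n");
		if (tok_len == code_len && strncmp(p, code, code_len) == 0)
			return true;
		p += tok_len;
	}

	return false;
}
```

A caller would read /proc/self/smaps line by line (for instance with fgets()) and pass each line to vmflags_has(line, "rc") or vmflags_has(line, "wc").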
Some subsystems (e.g. io_uring) need to load/store capabilities in buffers shared with userspace. Shared mappings don't have load/store capability permissions by default, so add two new VM flags that allow the kernel to set up such mappings.
Note: this won't allow userspace to make arbitrary shared mappings with tag access as the flags are not exposed; the new VM flags are for internal use only.
Signed-off-by: Tudor Cretu <tudor.cretu@arm.com>
---
 Documentation/filesystems/proc.rst | 2 ++
 arch/arm64/Kconfig                 | 1 +
 arch/arm64/include/asm/mman.h      | 6 ++++++
 fs/proc/task_mmu.c                 | 4 ++++
 include/linux/mm.h                 | 8 ++++++++
 5 files changed, 21 insertions(+)
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 061744c436d9..6d9b22ebf3e9 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -553,6 +553,8 @@ encoded manner. The codes are the following:
     mg    mergable advise flag
     bt    arm64 BTI guarded page
     mt    arm64 MTE allocation tags are enabled
+    rc    arm64 Morello capability tag reads are enabled
+    wc    arm64 Morello capability tag writes are enabled
     um    userfaultfd missing tracking
     uw    userfaultfd wr-protect tracking
     ==    =======================================
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c784d8664a40..e8e6b0f21a91 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1971,6 +1971,7 @@ config ARM64_MORELLO
 	depends on CC_HAS_MORELLO
 	select ARCH_NO_SWAP
 	select ARCH_HAS_CHERI_CAPABILITIES
+	select ARCH_USES_HIGH_VMA_FLAGS
 	help
 	  The Morello architecture is an experimental extension to Armv8.2-A,
 	  which extends the AArch64 state with the principles proposed in
diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
index e3e28f7daf62..eb0b862121a2 100644
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@ -55,6 +55,12 @@ static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
 	if (vm_flags & VM_MTE)
 		prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
 
+	if (vm_flags & VM_READ_CAPS)
+		prot |= PTE_LOAD_CAPS;
+
+	if (vm_flags & VM_WRITE_CAPS)
+		prot |= PTE_STORE_CAPS;
+
 	return __pgprot(prot);
 }
 #define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index f46060eb91b5..4f56772da016 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -697,6 +697,10 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR
 		[ilog2(VM_UFFD_MINOR)]	= "ui",
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */
+#ifdef CONFIG_ARM64_MORELLO
+		[ilog2(VM_READ_CAPS)]	= "rc",
+		[ilog2(VM_WRITE_CAPS)]	= "wc",
+#endif
 	};
 	size_t i;
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9b7b730db4e9..b399d6ca1d83 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -357,6 +357,14 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MTE_ALLOWED	VM_NONE
 #endif
 
+#ifdef CONFIG_ARM64_MORELLO
+# define VM_READ_CAPS	VM_HIGH_ARCH_2	/* Permit capability tag loads */
+# define VM_WRITE_CAPS	VM_HIGH_ARCH_3	/* Permit capability tag stores */
+#else
+# define VM_READ_CAPS	VM_NONE
+# define VM_WRITE_CAPS	VM_NONE
+#endif
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUP	VM_NONE
 #endif
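The effect of the arch_vm_get_page_prot() hunk in this patch can be modelled outside the kernel as a plain translation from VMA flags to page-table permission bits. The sketch below is only an illustration under stated assumptions: the SK_* names and all four bit positions are invented for this example and are not claimed to match the real VM_HIGH_ARCH_* or Morello PTE encodings.

```c
#include <stdint.h>

/*
 * Assumed bit positions, for illustration only. The real values come
 * from VM_HIGH_ARCH_2/VM_HIGH_ARCH_3 and the Morello PTE definitions.
 */
#define SK_VM_READ_CAPS		(UINT64_C(1) << 34)
#define SK_VM_WRITE_CAPS	(UINT64_C(1) << 35)
#define SK_PTE_LOAD_CAPS	(UINT64_C(1) << 61)
#define SK_PTE_STORE_CAPS	(UINT64_C(1) << 62)

/*
 * Translate the capability-related VMA flags into the corresponding
 * page-table permission bits, mirroring the two checks added to
 * arch_vm_get_page_prot(). Unrelated flags are ignored.
 */
static uint64_t sk_caps_prot_bits(uint64_t vm_flags)
{
	uint64_t prot = 0;

	if (vm_flags & SK_VM_READ_CAPS)
		prot |= SK_PTE_LOAD_CAPS;

	if (vm_flags & SK_VM_WRITE_CAPS)
		prot |= SK_PTE_STORE_CAPS;

	return prot;
}
```

Because the two permissions are independent, a mapping can in principle allow capability loads without allowing capability stores, which is one reason for having two flags rather than a combined one.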
Don't add PTE_*_CAPS to default user mappings in pgtable-prot.h. Instead, add VM_{READ,WRITE}_CAPS to all private user mappings. This ensures that the rc/wc smaps flags are set. The instances where the VM_*_CAPS flags are added by default are:
- standard mappings created by the kernel itself
- the stack and brk area by updating the VM_DATA_DEFAULT_FLAGS flags
- private user mappings created by userspace through mmap()
This is a partial revert of: "arm64: morello: Enable access to capabilities in memory" (reverting the addition of the PTE flags to the private user mappings).
Signed-off-by: Tudor Cretu <tudor.cretu@arm.com>
---
 arch/arm64/include/asm/mman.h         | 20 ++++++++++++++++++--
 arch/arm64/include/asm/page.h         |  3 ++-
 arch/arm64/include/asm/pgtable-prot.h | 12 +++++-------
 3 files changed, 25 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
index eb0b862121a2..a0efa2b22440 100644
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@ -23,15 +23,31 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
 
 static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
 {
+	unsigned long ret = 0;
+
 	/*
 	 * Only allow MTE on anonymous mappings as these are guaranteed to be
 	 * backed by tags-capable memory. The vm_flags may be overridden by a
 	 * filesystem supporting MTE (RAM-based).
 	 */
 	if (system_supports_mte() && (flags & MAP_ANONYMOUS))
-		return VM_MTE_ALLOWED;
+		ret |= VM_MTE_ALLOWED;
+
+	/*
+	 * Allow capability tag access for private mappings as they don't pose
+	 * the risk of leaking capabilities outside their original address-space.
+	 *
+	 * TODO [Morello]: There are certain situations where it is not possible
+	 * to enable capability access in file-backed mappings, even private.
+	 * This is notably the case for DAX, where backing pages are directly
+	 * mapped, and the underlying storage is unlikely to support capability
+	 * tags. Might need to explicitly allow or explicitly disallow certain
+	 * filesystems.
+	 */
+	if (system_supports_morello() && ((flags & MAP_TYPE) == 0x02 /* MAP_PRIVATE */))
+		ret |= VM_READ_CAPS | VM_WRITE_CAPS;
 
-	return 0;
+	return ret;
 }
 #define arch_calc_vm_flag_bits(flags) arch_calc_vm_flag_bits(flags)
 
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 993a27ea6f54..98fad7027e05 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -47,7 +47,8 @@ int pfn_is_map_memory(unsigned long pfn);
 
 #endif /* !__ASSEMBLY__ */
 
-#define VM_DATA_DEFAULT_FLAGS	(VM_DATA_FLAGS_TSK_EXEC | VM_MTE_ALLOWED)
+#define VM_DATA_DEFAULT_FLAGS	(VM_DATA_FLAGS_TSK_EXEC | VM_MTE_ALLOWED | \
+				 VM_READ_CAPS | VM_WRITE_CAPS)
 
 #include <asm-generic/getorder.h>
 
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 5463f9bc0602..ab746df170e3 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -90,11 +90,9 @@ extern bool arm64_use_ng_mappings;
 #define PAGE_NONE		__pgprot(((_USER_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_UXN)
 /* shared+writable pages are clean by default, hence PTE_RDONLY|PTE_WRITE */
 #define PAGE_SHARED		__pgprot(_USER_PAGE_DEFAULT | PTE_RDONLY | PTE_UXN | PTE_WRITE)
-#define PAGE_SHARED_RO		__pgprot(_USER_PAGE_DEFAULT | PTE_RDONLY | PTE_UXN)
-#define PAGE_SHARED_RO_EXEC	__pgprot(_USER_PAGE_DEFAULT | PTE_RDONLY)
 #define PAGE_SHARED_EXEC	__pgprot(_USER_PAGE_DEFAULT | PTE_RDONLY | PTE_WRITE)
-#define PAGE_READONLY		__pgprot(_USER_PAGE_DEFAULT | PTE_RDONLY | PTE_MAYBE_LS_CAPS | PTE_UXN)
-#define PAGE_READONLY_EXEC	__pgprot(_USER_PAGE_DEFAULT | PTE_RDONLY | PTE_MAYBE_LS_CAPS)
+#define PAGE_READONLY		__pgprot(_USER_PAGE_DEFAULT | PTE_RDONLY | PTE_UXN)
+#define PAGE_READONLY_EXEC	__pgprot(_USER_PAGE_DEFAULT | PTE_RDONLY)
 #define PAGE_EXECONLY		__pgprot(_PAGE_DEFAULT | PTE_RDONLY | PTE_NG | PTE_PXN)
 
 #define __P000  PAGE_NONE
@@ -107,11 +105,11 @@ extern bool arm64_use_ng_mappings;
 #define __P111  PAGE_READONLY_EXEC
 
 #define __S000  PAGE_NONE
-#define __S001  PAGE_SHARED_RO
+#define __S001  PAGE_READONLY
 #define __S010  PAGE_SHARED
 #define __S011  PAGE_SHARED
-#define __S100  PAGE_SHARED_RO_EXEC	/* PAGE_EXECONLY if Enhanced PAN */
-#define __S101  PAGE_SHARED_RO_EXEC
+#define __S100  PAGE_READONLY_EXEC	/* PAGE_EXECONLY if Enhanced PAN */
+#define __S101  PAGE_READONLY_EXEC
 #define __S110  PAGE_SHARED_EXEC
 #define __S111  PAGE_SHARED_EXEC
 
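The decision that this patch moves into arch_calc_vm_flag_bits() depends only on the mapping type requested through mmap(). The following userspace sketch models just that decision. It is an illustration, not kernel code: the SK_* flag values and the function name are hypothetical, the has_morello parameter stands in for system_supports_morello(), and the MAP_TYPE fallback of 0x0f is the Linux value, used only if the libc headers do not expose the macro.

```c
#include <stdbool.h>
#include <sys/mman.h>

/* Fallback for strict-ISO builds where libc hides MAP_TYPE (Linux value). */
#ifndef MAP_TYPE
#define MAP_TYPE 0x0f
#endif

/* Hypothetical stand-ins for VM_READ_CAPS and VM_WRITE_CAPS. */
#define SK_VM_READ_CAPS		0x1UL
#define SK_VM_WRITE_CAPS	0x2UL

/*
 * Model of the capability part of arch_calc_vm_flag_bits(): grant both
 * capability flags only when the mmap() flags request a private mapping.
 */
static unsigned long sk_calc_caps_flag_bits(bool has_morello,
					    unsigned long map_flags)
{
	unsigned long ret = 0;

	/* MAP_TYPE masks out everything except the sharing type. */
	if (has_morello && (map_flags & MAP_TYPE) == MAP_PRIVATE)
		ret |= SK_VM_READ_CAPS | SK_VM_WRITE_CAPS;

	return ret;
}
```

Comparing the masked value for equality, rather than testing the MAP_PRIVATE bit on its own, means a sharing type that has additional type bits set is not treated as private.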
Remove the hardcoded assumption that all shared mappings are untagged. A mapping's store capability permission is now given explicitly through VM_WRITE_CAPS, so check against that flag instead.
Signed-off-by: Tudor Cretu <tudor.cretu@arm.com>
---
 arch/arm64/kernel/morello.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kernel/morello.c b/arch/arm64/kernel/morello.c
index ccbf26e77919..2a57104c2bd7 100644
--- a/arch/arm64/kernel/morello.c
+++ b/arch/arm64/kernel/morello.c
@@ -187,14 +187,13 @@ static int access_remote_cap(struct task_struct *tsk, struct mm_struct *mm,
 
 	if (write) {
 		/*
-		 * Disallow writing a valid (tagged) capability to an untagged
-		 * mapping (currently all shared mappings are untagged, this may
-		 * change in the future).
+		 * Disallow writing a valid (tagged) capability to a mapping
+		 * without store capability permission.
 		 *
 		 * Reading/writing an untagged capability is always allowed
 		 * (just like regular load and store instructions).
 		 */
-		if (user_cap->tag && (vma->vm_flags & VM_SHARED)) {
+		if (user_cap->tag && !(vma->vm_flags & VM_WRITE_CAPS)) {
 			ret = -EOPNOTSUPP;
 			goto out_put;
 		}
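The difference between the old and the new condition in access_remote_cap() shows up in two cases: a shared mapping that has been given VM_WRITE_CAPS, and a private mapping that lacks it. The sketch below places the two predicates side by side in userspace C. The SK_* flag values and the function names are hypothetical and only model the two conditions from the hunk above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for VM_SHARED and VM_WRITE_CAPS. */
#define SK_VM_SHARED		0x1UL
#define SK_VM_WRITE_CAPS	0x2UL

/* Old condition: a tagged write is refused on every shared mapping. */
static int sk_old_write_check(bool tag, unsigned long vm_flags)
{
	if (tag && (vm_flags & SK_VM_SHARED))
		return -EOPNOTSUPP;
	return 0;
}

/*
 * New condition: a tagged write is refused on any mapping that lacks
 * store capability permission, whether it is shared or private.
 */
static int sk_new_write_check(bool tag, unsigned long vm_flags)
{
	if (tag && !(vm_flags & SK_VM_WRITE_CAPS))
		return -EOPNOTSUPP;
	return 0;
}
```

With these definitions, a tagged write to a mapping with SK_VM_SHARED | SK_VM_WRITE_CAPS is refused by the old check and accepted by the new one, while a tagged write to a private mapping without SK_VM_WRITE_CAPS is accepted by the old check and refused by the new one. Untagged writes pass both checks.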
On 25/11/2022 12:55, Tudor Cretu wrote:
> [...]
Looks all good to me, will give some time to anyone else who'd like to have a look before merging these.
Kevin
On 25/11/2022 13:54, Kevin Brodsky wrote:
> On 25/11/2022 12:55, Tudor Cretu wrote:
>> [...]
> Looks all good to me, will give some time to anyone else who'd like to have a look before merging these.
Now in next, thanks!
Kevin
linux-morello mailing list -- linux-morello@op-lists.linaro.org To unsubscribe send an email to linux-morello-leave@op-lists.linaro.org