On 1/12/24 20:24, Kevin Brodsky wrote:
On 08/01/2024 11:23, Amit Daniel Kachhap wrote:
In the CHERI architecture, not all address ranges can be represented by a capability, so add the necessary CHERI base and length alignment checks when generating a free unmapped virtual address or evaluating a fixed input address.
The PCuABI reservation interface stores the unusable alignment gaps at the start and end. These gaps should be considered when finding the free unmapped address space.
In the case of fixed valid capability addresses, the requested address range should completely overlap with the reservation range. In the case of fixed null-capability addresses, they are verified not to overlap with any existing reservation range.
Signed-off-by: Amit Daniel Kachhap <amitdaniel.kachhap@arm.com>
 include/linux/mm.h |  8 ++++
 mm/mmap.c          | 97 +++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 95 insertions(+), 10 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a4b7381b4977..5b79120c999a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3239,6 +3239,10 @@ static inline unsigned long vm_start_gap(struct vm_area_struct *vma)
 {
 	unsigned long vm_start = vma->vm_start;
 
+#ifdef CONFIG_CHERI_PURECAP_UABI
It would be nice to get rid of all this #ifdef'ing; it's really a lot of noise. In most cases we already test the mm flag, so it's just a matter of ensuring the conditional code can build. This should be quite straightforward if we introduce a few more helpers, for instance to get the reservation start and end (the fallback implementation being the mapping start and end).
OK, I will try adding more helpers.
+	if (test_bit(MMF_PCUABI_RESERVE, &vma->vm_mm->flags))
This test is done so many times that it probably deserves a helper.
+		vm_start = vma->reserv_start;
+#endif
 	if (vma->vm_flags & VM_GROWSDOWN) {
 		vm_start -= stack_guard_gap;
 		if (vm_start > vma->vm_start)
@@ -3251,6 +3255,10 @@ static inline unsigned long vm_end_gap(struct vm_area_struct *vma)
 {
 	unsigned long vm_end = vma->vm_end;
 
+#ifdef CONFIG_CHERI_PURECAP_UABI
+	if (test_bit(MMF_PCUABI_RESERVE, &vma->vm_mm->flags))
+		vm_end = vma->reserv_start + vma->reserv_len;
+#endif
 	if (vma->vm_flags & VM_GROWSUP) {
 		vm_end += stack_guard_gap;
 		if (vm_end < vma->vm_end)
diff --git a/mm/mmap.c b/mm/mmap.c
index 7f2246cbc969..6027da2c248b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -48,6 +48,7 @@
 #include <linux/sched/mm.h>
 #include <linux/ksm.h>
+#include <linux/cap_addr_mgmt.h>
 #include <linux/uaccess.h>
 #include <asm/cacheflush.h>
 #include <asm/tlb.h>
@@ -1561,6 +1562,11 @@ static unsigned long unmapped_area(struct vm_unmapped_area_info *info)
 
 	/* Adjust search length to account for worst case alignment overhead */
 	length = info->length + info->align_mask;
+#if defined(CONFIG_CHERI_PURECAP_UABI)
+	/* Cheri Representable length is sufficient for alignment */
I still don't understand how we know this is the case. The gap we find needs to be of length cheri_representable_length(info->length), *but also* its alignment needs to be at least CHERI_REPRESENTABLE_ALIGNMENT(info->length). There is no guarantee that mas_empty_area() will return a sufficiently aligned gap, and that's why the original code increases the requested length; the following line then ensures that the returned address is indeed aligned (discarding the bottom part of the gap if it's not sufficiently aligned):

	gap += (info->align_offset - gap) & info->align_mask;

Yes, you are right here. There is also a bug in the current code: info->length itself should be equal to cheri_representable_length(info->length). I will update this in the next iteration.
+	if (test_bit(MMF_PCUABI_RESERVE, &current->mm->flags))
+		length = cheri_representable_length(info->length);
+#endif
 
 	if (length < info->length)
 		return -ENOMEM;
@@ -1612,6 +1618,11 @@ static unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info)
 	MA_STATE(mas, &current->mm->mm_mt, 0, 0);
 
 	/* Adjust search length to account for worst case alignment overhead */
 	length = info->length + info->align_mask;
+#if defined(CONFIG_CHERI_PURECAP_UABI)
+	/* Cheri Representable length is sufficient for alignment */
+	if (test_bit(MMF_PCUABI_RESERVE, &current->mm->flags))
+		length = cheri_representable_length(info->length);
+#endif
 
 	if (length < info->length)
 		return -ENOMEM;
@@ -1637,6 +1648,10 @@ static unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info)
 		tmp = mas_prev(&mas, 0);
 		if (tmp && vm_end_gap(tmp) > gap) {
 			high_limit = tmp->vm_start;
+#if defined(CONFIG_CHERI_PURECAP_UABI)
+			if (test_bit(MMF_PCUABI_RESERVE, &current->mm->flags))
+				high_limit = tmp->reserv_start;
+#endif
 			mas_reset(&mas);
 			goto retry;
 		}
@@ -1688,20 +1703,46 @@ generic_get_unmapped_area(struct file *filp, user_uintptr_t user_ptr,
 	struct vm_unmapped_area_info info;
 	unsigned long addr = (ptraddr_t)user_ptr;
 	const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
+	unsigned long align_len = len;
 
-	if (len > mmap_end - mmap_min_addr)
+#if defined(CONFIG_CHERI_PURECAP_UABI)
+	if (test_bit(MMF_PCUABI_RESERVE, &mm->flags))
+		align_len = cheri_representable_length(len);
+#endif
+	if (align_len > mmap_end - mmap_min_addr)
 		return -ENOMEM;
 
-	if (flags & MAP_FIXED)
+	if (flags & MAP_FIXED) {
+#if defined(CONFIG_CHERI_PURECAP_UABI)
+		if (test_bit(MMF_PCUABI_RESERVE, &mm->flags) && cheri_tag_get(user_ptr)) {
AFAICT this cheri_tag_get(user_ptr) is the only reason we need the previous patch, propagating the capability to get_unmapped_area(). This doesn't really seem to be worth it, and more importantly it doesn't work if we don't change in-kernel callers of vm_mmap() to use capabilities. So consistently with what I suggested on patch 3 in RFC v2, I don't think we should consider capabilities at all here. In the MAP_FIXED case, let's check that the range is either entirely inside a reservation, or otherwise that we can create a (representable) reservation for it. It is up to the syscall handler to do the actual capability checking.
Yes, the tag determines here whether the address range is within the reservation bound or not. I felt this approach was generic in nature, but yes, this can certainly be done in the syscall handler itself, and patch 4 can be dropped entirely.
+			/* Ensure that this range is within the reservation bound */
+			vma = find_vma(mm, addr);
+			if (!vma || !reserv_vma_valid_address(vma, addr, len))
+				return -ERESERVATION;
+			return addr;
+		} else if (!test_bit(MMF_PCUABI_RESERVE, &mm->flags))
+			return addr;
+#else
 		return addr;
+#endif
+	}
 
 	if (addr) {
 		addr = PAGE_ALIGN(addr);
+#if defined(CONFIG_CHERI_PURECAP_UABI)
+		if (test_bit(MMF_PCUABI_RESERVE, &mm->flags))
+			addr = round_up(addr, CHERI_REPRESENTABLE_ALIGNMENT(len));
round_down() surely?
This function is generic_get_unmapped_area(), where free memory is searched in increasing address order, so round_up() is used. In generic_get_unmapped_area_topdown(), it will be the other way around.
+#endif
 		vma = find_vma_prev(mm, addr, &prev);
-		if (mmap_end - len >= addr && addr >= mmap_min_addr &&
-		    (!vma || addr + len <= vm_start_gap(vma)) &&
+		if (mmap_end - align_len >= addr && addr >= mmap_min_addr &&
+		    (!vma || addr + align_len <= vm_start_gap(vma)) &&
 		    (!prev || addr >= vm_end_gap(prev)))
 			return addr;
+#if defined(CONFIG_CHERI_PURECAP_UABI)
+		else if (flags & MAP_FIXED)
+			/* This non-tagged fixed address overlaps with other reservation */
+			return -ERESERVATION;
+#endif
I don't think this is ever hit, considering that we always return above if flags & MAP_FIXED.
This is a fallback for when a non-tagged fixed address is not free: it overlaps an existing reservation and hence cannot be used.
Thanks,
Amit Daniel
Kevin
[...]