[linux-morello] Re: [PATCH v2] lib/vsprintf: Add support for architectural capabilities

15 Nov 2022

On Thu, Nov 10, 2022 at 02:29:04PM +0100, Kevin Brodsky wrote:
...
On 01/11/2022 00:27, Beata Michalska wrote:
...
Introduce new printk's pointer format extension to support capabilities,
with one being denoted by specifying the 'l' length modifier alongside
the 'p' type character => "%lp". In addition to the raw capability
format, adding '#' flag will generate the simplified one, being slightly
more verbose. In both cases, the actual result is subject to
pointer hashing (address only) unless either 'x' qualifier or
no_hash_pointers command line parameter is being provided.
For more details see the docs, updated here as well.
While at it, add some testcases to validate new format alongside
checkpatchpl new warning, to raise the awareness of potential misuse.
Signed-off-by: Beata Michalska beata.michalska@arm.com
v2:

Rephrase/rearrange docs + adding link to the CHERI guidelines on printf
formats (also dropping 'hybrid-mode' mentions)
Code readability/formatting improvements (as per review comments)
Adding '0x' prefix for relevant simplified format components
Removing missing permissions from the final output in simplified format
Use pointer_string directly when no hashing is expected
Added support for vbin_printf/bstr_printf (trace_printk is now able to print
capabilities)

Review branch available at:
 https://git.morello-project.org/Bea/linux/-/tree/morello/cap_printk

Documentation/core-api/printk-formats.rst |  68 +++++++
 lib/test_printf.c                         | 133 ++++++++++++
 lib/vsprintf.c                            | 233 +++++++++++++++++++++-
 scripts/checkpatch.pl                     |  13 +-
 4 files changed, 443 insertions(+), 4 deletions(-)

diff --git a/Documentation/core-api/printk-formats.rst b/Documentation/core-api/printk-formats.rst
index 5e89497ba314..bd97a9750e54 100644
--- a/Documentation/core-api/printk-formats.rst
+++ b/Documentation/core-api/printk-formats.rst
@@ -202,6 +202,74 @@ Example::
 
   printk("test: difference between pointers: %td\n", ptr2 - ptr1);
 
+Capabilities
+------------



+::


%lp[x]	1:ffffc00000010005:0123456789abcdef
%#lp[x]	0x0123456789abcdef [rwxRWE,0x0000000000000000-0xffffffffffffffff]

Nit: leading tab to be consistent with the other formats. Same remark
for the other code blocks, the indentation should be done using hard
tabs where possible.
Fixed in v3.
...
...




+For printing CHERI architectural capabilities (when supported, otherwise
+the result of using it is undefined). Formatting based on:
+https://github.com/CTSRD-CHERI/cheri-c-programming/wiki/Displaying-Capabilit...
+with a small detour of dropping support for basic format in favour of the raw
+one, with the assumption that the actual address can be printed with existing
+printk formats (subject to appropriate casting when dealing with capabilities
+and non-capability formats, which otherwise renders undefined behavior).



+The ``l`` modifier distinguishes capability pointers from standard (integer)
+pointers, giving the following capability format being printed:



+::
The nice thing about this syntax is that "::" expands to ":" at the end
of a paragraph, so you get exactly the same result by saying:
[...] being printed::
<code>
Thanks for that (applied in v3)
...
...


   Capability tag:High bits:Low bits




        <tag>:<127:64>:<63:0>





+Specifying ``#`` modifier results in printing capabilities in a simplified format:



+::


<address> [<permissions>,<base>-<top>] (<attr>)



+where:



+::


   - permissions - none or any combination of:


           - 'r' : CHERI_PERM_LOAD


           - 'w' : CHERI_PERM_STORE


           - 'x' : CHERI_PERM_EXECUTE


           - 'R' : CHERI_PERM_LOAD_CAP


           - 'W' : CHERI_PERM_STORE_CAP


           - 'E' : ARM_CAP_PERMISSION_EXECUTIVE


   - base  - lower bound


   - top   - upper bound


   - attr  - none or any combination of:


           - 'invalid' : capability's tag is clear


           - 'sentry'  : capability is a sealed entry


           - 'sealed'  : capability is sealed with a type other than the


                         sealed entry object type





+The above applies to formats with either `'no_hash_pointers'` parameter being
Did you mean ``...`` instead of `'...'`?
I did -> fixed in v3.
...
...
+enabled or ``x`` modifier being provided - otherwise subject to hashing
+(address only, with remaining bits being disregarded)



+Refer to ``'Unmodified Addresses'`` for potential security implications.
You can refer to title sections directly as they create a hyperlink
target implicitly [1]: `Unmodified Addresses`_
Ditto
...
[1]
https://docutils.sourceforge.io/docs/user/rst/quickref.html#implicit-hyperli...
...



+Currently only Morello architecture is being supported.



+Note:
+An attempt to use this format with a non-capability type is undefined behaviour.
+There are no safety nests, so extra consciousness is advised! Handle with care.
s/nests/nets/ (though no nests either!), also I'm not sure
"consciousness" is really what you mean here, "caution" maybe?
Indeed - fixed in v3.
...
You may want to choose between US and UK spelling as well
(behaviour/behavior) :)
...
+Same applies when capabilities are not supported, as the use of either the ``#``
+flag or the ``l`` length modifier is undefined behaviour when combined with
+%p.



Struct Resources
diff --git a/lib/test_printf.c b/lib/test_printf.c
index 07309c45f327..71d5b8d1c9b5 100644
--- a/lib/test_printf.c
+++ b/lib/test_printf.c
@@ -24,6 +24,10 @@
 
 #include <linux/property.h>
 
+#ifdef __CHERI__
+#include <cheriintrin.h>
+#endif



#include "../tools/testing/selftests/kselftest_module.h"
 
 #define BUF_SIZE 256
@@ -755,6 +759,134 @@ errptr(void)
 #endif
 }
 
+static void __init
+capability_pointer(void)
+{
+#ifdef __CHERI__



There's generally no empty line at the start of a block, same remark
concerning the first loop below.
...

enum action {
/* No action - use as is */


CAP_AXN_NONE,


/* Clear capability tag */


CAP_TAG_CLEAR,


/* Derive null-capability */


CAP_NULL_DERIVE


};

struct __setup {

Could be anonymous I think.
...

char *plain_fmt;


char *fmt;


char *expected;



Technically const char *.
The array itself is aready const.
Still I've added the const qualifiers in v3.
...
...

int   action;


} const setup[] = {
{ "%lp",  "%lpx", "1:da00400059ab89ab:"PTR_STR, CAP_AXN_NONE },


{ "%#lp", "%#lpx",


  "0x"PTR_STR" [rwRW,0xffff0123456789ab-0xffff0123456799ab]",


  CAP_AXN_NONE


},


{ "%lp",  "%lpx",  "0:da00400059ab89ab:"PTR_STR, CAP_TAG_CLEAR },


{ "%#lp", "%#lpx",


  "0x"PTR_STR" [rwRW,0xffff0123456789ab-0xffff0123456799ab] (invalid)",


  CAP_TAG_CLEAR


},


{ "%lp",  "%lpx",  PTR_STR, CAP_NULL_DERIVE },


{ "%#lp", "%#lpx", PTR_STR, CAP_NULL_DERIVE }



Nit: I think it would more readable to have the same layout on every line.
Should be more readable in v3.
...
...

};

char buf[PLAIN_BUF_SIZE];
const char * const cap_fmt[] = {"%lp", "%#lp"};

Maybe cap_fmt_non_hashed to clarify why we have it in addition to the
setup array?
Done.
...
...

int nchars;

void * __capability cap = cheri_ddc_get();

/* Expecting basic permissions to be set */
size_t perms = CHERI_PERM_GLOBAL | CHERI_PERM_LOAD | CHERI_PERM_LOAD_CAP |
       CHERI_PERM_STORE | CHERI_PERM_STORE_CAP | CHERI_PERM_SEAL;



Curious about Seal, why specify it when the simplified format ignores it?
That was acrually a leftover - thanks for pointing that out.
...
...


size_t bounds = 4096;

cap = cheri_address_set(cap, (ptraddr_t)PTR);
cap = cheri_perms_and(cap, perms);
/*
* Basic checks so that the actual output can be safely compared


* to the expected one


*/


if (!cheri_tag_get(cap) || cheri_perms_get(cap) != perms
|| cheri_is_sealed(cap)) {


pr_warn("Failed to create capability for testing - skipping");


skipped_tests += no_hash_pointers ? 4 + ARRAY_SIZE(setup) :


				    4 + ARRAY_SIZE(setup)/2;


return;


}

cap = cheri_bounds_set_exact(cap, bounds);

/* Verify hashing */
if (no_hash_pointers) {
pr_warn("Skipping capability hashing tests\n");


skipped_tests += 4;



I'm not sure where that value (4) comes from.
It was 2 tests per format, though it's actually 1 per format (with 2
conditions). Reworked in v3.
...
...

goto non_hashed;


}

for (int i = 0; i < ARRAY_SIZE(cap_fmt); ++i) {

nchars = snprintf(buf, PLAIN_BUF_SIZE, cap_fmt[i], cap);



/*


 * Should be ADDR_WIDTH in this case , but hey, let's not be picky


 * about it, at least not here ... not yet ...


 */


if (nchars != PTR_WIDTH) {


	/*


	 * This also covers the case when the value has not been


	 * actually hashed as otherwise nchars would be greater than



Had trouble parsing that sentence, I think s/otherwise/in that case/
would help.
Was trying to avoid sequence of  'the case ... in that case'
Reworked in v3.
...
...

	 * PTR_WIDTH


	 */


	pr_warn("Something went wrong with hashing capability value\n");


	goto failed;


}



if (strncmp(buf, PTR_VAL_NO_CRNG, PTR_WIDTH) == 0) {


	pr_warn("crng possibly not yet initialized - capability value buffer contains \"%s\"",


			PTR_VAL_NO_CRNG);


} else if (strncmp(buf, ZEROS, strlen(ZEROS)) != 0) {


	pr_warn("Unexpected format for supposedly hashed capability value\n");


	goto failed;


}


}


+non_hashed:

for (int i = 0; i < ARRAY_SIZE(setup); ++i) {
void * __capability current_cap = cap;



Nit: it would be a bit more idiomatic to leave current_cap uninitialised
here and do it explicitly in each case of the switch below.
Done
...
...


switch (setup[i].action) {


case CAP_TAG_CLEAR:


	current_cap = cheri_tag_clear(current_cap);


	break;


case CAP_NULL_DERIVE:


	/* Null-derived capability */


	current_cap = (void *__capability)


		(__cheri_addr uintcap_t)((void *)cheri_base_get(current_cap));



The base and address happen to be the same here, but better to use what
you actually mean.
Also to avoid quite convoluted casts, and to make the "null-derivation"
more explicit, I think the best is to set the address of the null
capability, i.e. something like:
cheri_address_set(0, cheri_address_get(cap))
Done
...
...

	break;


default:


	break;


}



if (no_hash_pointers)


	test(setup[i].expected, setup[i].plain_fmt, current_cap);


else


	++skipped_tests;



test(setup[i].expected, setup[i].fmt, current_cap);


}

return;


+failed:

failed_tests++;

+#endif /* CONFIG_ARM64_MORELLO */
+}



static void __init
 test_pointer(void)
 {
@@ -781,6 +913,7 @@ test_pointer(void)
   errptr();
   fwnode_pointer();
   fourcc_pointer();

capability_pointer();

}
 
 static void __init selftest(void)
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 40d26a07a133..af08619ebd42 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -51,6 +51,10 @@
 #include <asm/byteorder.h>	/* cpu_to_le16 */
 #include <asm/unaligned.h>
 
+#ifdef __CHERI__
+#include <cheriintrin.h>
+#endif



#include <linux/string_helpers.h>
 #include "kstrtox.h"
 
@@ -435,7 +439,10 @@ enum format_type {
   FORMAT_TYPE_UINT,
   FORMAT_TYPE_INT,
   FORMAT_TYPE_SIZE_T,

FORMAT_TYPE_PTRDIFF


FORMAT_TYPE_PTRDIFF,

+#ifdef __CHERI__

FORMAT_TYPE_CAPABILITY,

+#endif
 };
 
 struct printf_spec {
@@ -2488,6 +2495,181 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,
   }
 }
 
+#ifdef __CHERI__
+/*


Support for printing capabilities with %[#]lp[x] format.



It stands slightly in contradiction to kernel extensions



for pointer printk formats as it is using the 'l' length



modifier instead of relying on extended format specifiers.



Currently supported formats:







lp[x]  - Hex value of full capability bits preceded with capability tag:



	 <tag>:<127:64>:<63:0>





	 (* outcome subject to kernel pointer hashing )





#lp[x] - Simplified format as:



	 <address> [permissions=[rwxRWE],<base>-<top>] (attr=[invalid,sentry,sealed])





	 (* outcome subject to kernel pointer hashing )









Details at:



Documentation/core-api/printk-formats.rst



(keep both updated case making any changes)



s/case/when/?
Ditto
...
...








only the address undergoes hashing, remaining bits are being disregarded










*/

+static noinline_for_stack
+char *capability(const char *fmt, char *buf, char *end, void * __capability cap,

 struct printf_spec spec)



+{
+/*


Write to the buffer only when there is a space for it, otherwise



just advance the buffer marker to account for the space needed to



store full value here


*/

+#define update_buf_single(buf, end, c)	\

do { if (buf < end) *buf++ = c; else ++buf; } while (0)

/*
* For null-derived capabilities switch to basic format


* (address only).


* Same applies when hashing is active.


*/


if ((!cheri_tag_get(cap) &&
!__builtin_cheri_copy_from_high(cap))



Nit: without the #ifdef these two lines can be merged now, bit more
readable.
Done
...
...

|| (likely(!no_hash_pointers) && *fmt != 'x')) {



/* Avoid adding prefix */


spec.flags &= ~SPECIAL;


return pointer(fmt, buf, end, (__cheri_fromcap void *)(cap), spec);



No need for parentheses around cap.
Noted
...
...

}

if (spec.flags & SPECIAL) { /* Simplified format for capabilities */
int orig_field_width = spec.field_width;


cheri_perms_t perms = cheri_perms_get(cap);


ptraddr_t base, top;


const char *start = buf;


char *attr_start;


unsigned int i;


int attrib = 0;


int orig_flags;



/* Note: order matters to match the format expected */


struct {


	cheri_perms_t cperm; char id;


} static const __perms[] = {


	{ CHERI_PERM_LOAD,              'r' },


	{ CHERI_PERM_STORE,             'w' },


	{ CHERI_PERM_EXECUTE,           'x' },


	{ CHERI_PERM_LOAD_CAP,          'R' },


	{ CHERI_PERM_STORE_CAP,         'W' },



+#ifdef CONFIG_ARM64_MORELLO

	{ ARM_CAP_PERMISSION_EXECUTIVE,	'E' }



+#endif

};



/*


 * Things get slightly confusing here when it comes to


 * precision. The standard format specification claims that


 * precision might cause truncation of the actual output,


 * contrary to width.


 * CHERI on the other hand suggests, that precision used


 * with the 'p' specifier should determine the width of


 * addresses though following current formatting for pointer types,


 * unless explicitly stated, the width will be enforced, leaving


 * precision being ineffective. Furthermore, according to the guide,


 * in theory precision should not affect the way the Raw and Simplified


 * formats are being handled (the non-address related parts).


 * Stick with that one.


 * Note: Using precision with pointers/capabilities with


 * active hashing might lead to slightly unexpected output.


 * That's the result of applying the precision while


 * respecting the field width.


 *


 * Width in this case refers to the width of final string generated for


 * the simplified format


 */


spec.field_width = -1;


orig_flags = spec.flags;


spec.flags &= ~(ZEROPAD | LEFT);



buf = pointer_string(buf, end, (__cheri_fromcap void *)cap, spec);



We should avoid __cheri_fromcap / __cheri_tocap until GCC has caught up
and implemented them.
Done.
...
...


update_buf_single(buf, end, ' ');


update_buf_single(buf, end, '[');


for (i = 0; i < ARRAY_SIZE(__perms); ++i) {


	if (perms & __perms[i].cperm)


		update_buf_single(buf, end, __perms[i].id);


}


update_buf_single(buf, end, ',');



base = cheri_base_get(cap);


buf = pointer_string(buf, end, (void *)base, spec);


update_buf_single(buf, end, '-');



top =  base + cheri_length_get(cap);


buf = pointer_string(buf, end, (void *)top, spec);


update_buf_single(buf, end, ']');



/* Attributes */


/* Reset precision to output full attribute identifiers here */


spec.precision = -1;



/*


 * Keep track of the attribute start section to format


 * it properly in case there are attributes to be reported.


 * Otherwise simply rolling on a buffer might lead to overflowing


 * past the terminating character if the buffer is big enough.


 */


attr_start = buf;


buf += 2;



if (!cheri_tag_get(cap)) {


	buf = string_nocheck(buf, end, "invalid", spec);


	++attrib;


}


if (cheri_is_sentry(cap)) {


	if (attrib++)


		update_buf_single(buf, end, ',');


	buf = string_nocheck(buf, end, "sentry", spec);



}


if (cheri_is_sealed(cap)) {


	if (attrib++)


		update_buf_single(buf, end, ',');


	buf = string_nocheck(buf, end, "sealed", spec);


}



if (attrib) {


	update_buf_single(attr_start, end, ' ');


	update_buf_single(attr_start, end, '(');


	update_buf_single(buf, end, ')');


} else {


	buf = attr_start; /* Rollback on space and opening bracket */


}



/* Restore the originally requested width */


spec.field_width = orig_field_width;


spec.flags |= orig_flags;



return widen_string(buf, buf - start, end, spec);


}

/*
* Raw format: hex dump of full capability


*/


update_buf_single(buf, end, cheri_tag_get(cap) ? '1' : '0');
update_buf_single(buf, end, ':');
buf = pointer_string(buf, end,
		 (void *) __builtin_cheri_copy_from_high(cap),


		 spec);


update_buf_single(buf, end, ':');
return pointer_string(buf, end, (__cheri_fromcap void *)(cap),
		  spec);



The indentation is slightly strange here and just above, it doesn't
match the ( above, but still has trailing spaces. In fact this line all
fits in 80 chars.
Also no need for parentheses around cap here.
Done.
...
...



+#undef update_buf_single
+}
+#endif /* __CHERI__ */



/*

Helper function to decode printf style format.
Each call decode a token from the format and return the

@@ -2624,6 +2806,10 @@ int format_decode(const char *fmt, struct printf_spec *spec)
 
   case 'p':
   	spec->type = FORMAT_TYPE_PTR;
+#ifdef __CHERI__

if (qualifier == 'l')


	spec->type = FORMAT_TYPE_CAPABILITY;



+#endif
   	return ++fmt - start;
 
   case '%':
@@ -2717,6 +2903,7 @@ set_precision(struct printf_spec *spec, int prec)



``%n`` is unsupported



``%p*`` is handled by pointer()






``%lp[x]`` and ``%#lp[x]`` is handled by capability()



See pointer() or Documentation/core-api/printk-formats.rst for more
extensive description.



@@ -2818,7 +3005,15 @@ int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
   			*str = '%';
   		++str;
   		break;



+#ifdef __CHERI__

case FORMAT_TYPE_CAPABILITY:


	str = capability(fmt, str, end,


			 va_arg(args, void * __capability),


			 spec);


	while (isalnum(*fmt))


		fmt++;


	break;



+#endif
   	case FORMAT_TYPE_INVALID:
   		/*
   		 * Presumably the arguments passed gcc's type
@@ -3138,7 +3333,28 @@ int vbin_printf(u32 *bin_buf, size_t size, const char *fmt, va_list args)
   		while (isalnum(*fmt))
   			fmt++;
   		break;
+#ifdef __CHERI__

case FORMAT_TYPE_CAPABILITY:


	/*


	 * Capabilities need to be handled now: subject to


	 * LoadCap/StoreCap permission which would make this


	 * unsafe if storing the capability in the buffer


	 * directly. Otherwise breaking down the capability


	 * to extract the information requested (format)


	 * would be rather counter-productive



As discussed offline - it does make sense to expand capabilities here,
but not really for that reason (the capability load/store is ultimately
done through DDC and we can rely on it having all relevant permissions).
Alignment in the destination buffer is probably the best reason, and
possibly the fact that capabilities cannot be stored there if the buffer
is truly arbitrary.
Rephrased in v3.
Thanks.
...
...

	 */


	str = capability(fmt, str, end,


			 va_arg(args, void * __capability),


			 spec);


	if (str + 1 < end)


		*str++ = '\0';


	else


		end[-1] = '\0'; /* Must be null terminated */


	while (isalnum(*fmt))


		fmt++;




	break;



+#endif
   	default:
   		switch (spec.type) {
 
@@ -3319,7 +3535,20 @@ int bstr_printf(char *buf, size_t size, const char *fmt, const u32 *bin_buf)
   			fmt++;
   		break;
   	}
+#ifdef __CHERI__

case FORMAT_TYPE_CAPABILITY: {


	long length = strlen(args);



	memcpy(str, args, min(length, (end - str)));



Don't you need to check if str < end like in the other cases? Also,
shouldn't length match the size you pass to memcpy() (in the two lines
below), in other words shouldn't it be just like what FORMAT_TYPE_PTR does?
Yes, my bad -> fixed in v3.
---
BR
B.
...
Kevin
...

	str += length;


	args += length + 1;



	while (isalnum(*fmt))


		fmt++;


	break;


}



+#endif
   	case FORMAT_TYPE_PERCENT_CHAR:
   		if (str < end)
   			*str = '%';
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 577e02998701..867e0a01ae9f 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -6760,7 +6760,7 @@ sub process {
   			my $fmt = get_quoted_string($lines[$count - 1], raw_line($count, 0));
   			$fmt =~ s/%%//g;

		while ($fmt =~ /(\%[\*\d\.]*p(\w)(\w*))/g) {




		while ($fmt =~ /(\%[\*\d\.#l]*p(\w)(\w*))/g) {
		$specifier = $1;
		$extension = $2;
		$qualifier = $3;



@@ -6768,7 +6768,8 @@ sub process {
   				    ($extension eq "f" &&
   				     defined $qualifier && $qualifier !~ /^w/) ||
   				    ($extension eq "4" &&

			     defined $qualifier && $qualifier !~ /^cc/)) {




			     defined $qualifier && $qualifier !~ /^cc/) ||


			    ($specifier =~ /lp/ && $extension ne "x")) {
			$bad_specifier = $specifier;
			last;
		}



@@ -6780,6 +6781,14 @@ sub process {
   					     "Using vsprintf specifier '%px' potentially exposes the kernel memory layout, if you don't really need the address please consider using '%p'.\n" . "$here\n$stat_real\n");
   				}
   			}


		if ($fmt =~ /\%[\*\d\.#]*lp/) {


			my $stat_real = get_stat_real($linenr, $lc);



			WARN("VSPRINTF_POINTER_EXTENSION",


				"Using vsprintf specifier '\%[#]lp[x]' is undefined behaviour when used with non-capability types. Make sure you got it right.\n" . "$here\n$stat_real\n");


		}


	if ($bad_specifier ne "") {
		my $stat_real = get_stat_real($linenr, $lc);
		my $ext_type = "Invalid";




    

2026

2025

2024

2023

2022

[linux-morello] Re: [PATCH v2] lib/vsprintf: Add support for architectural capabilities

Signed-off-by: Beata Michalska beata.michalska@arm.com

Struct Resources