A subtle ABI change was introduced in compat64 by "clone: Alter clone to accept capabilities". Indeed, since there is no compat handler for clone, the native one is also used for compat64. This is where the way we convert compat64 syscall arguments to native is showing its limits: if a syscall wrapper expects a capability-sized argument, compat_ptr() is used to convert the user-provided 64-bit value to a capability.
In general, this is what we want, and in fact it is the case for the parent_tidptr and child_tidptr arguments of clone, which are ordinary pointers to user data. However, the newsp and tls arguments are special: they specify the value to set registers to. We should not alter these values in any way: in PCuABI, they are capabilities and we set CSP/CTPIDR accordingly, but in hybrid, they are still 64-bit values and we should only set the lower 64 bits of CSP/CTPIDR. This is not the case in compat64 as compat_ptr() is called to turn these 64-bit values into capabilities.
The most correct solution would be to introduce a generic compat clone wrapper, but this is rather painful as clone has 4 possible prototypes depending on the architecture. Given that the issue is completely specific to the hybrid ABI, overriding the compat64 syscall wrapper in sys_compat64.c feels like a reasonable compromise.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com ---
I've realised this inconsistency while thinking about the initialisation of capability registers ("New CHERI API and separation root capabilities" series), as well as reviewing the clone3 series. We are already doing the right thing for clone3, time to align clone.
arch/arm64/kernel/sys_compat64.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/arch/arm64/kernel/sys_compat64.c b/arch/arm64/kernel/sys_compat64.c index 819b895ec21d..b3c4cf9f3af8 100644 --- a/arch/arm64/kernel/sys_compat64.c +++ b/arch/arm64/kernel/sys_compat64.c @@ -9,6 +9,7 @@
#include <linux/compat.h> #include <linux/compiler.h> +#include <linux/sched/task.h> #include <linux/syscalls.h>
#include <asm/syscall.h> @@ -83,6 +84,33 @@ #define __arm64_compatentry_compat_sys_setitimer __arm64_compatentry_sys_setitimer #define __arm64_compatentry_compat_sys_getrusage __arm64_compatentry_sys_getrusage
+/* + * This is exactly the same definition as the native clone, except that newsp + * and tls are defined as unsigned long, not user_uintptr_t. When the native ABI + * is PCuABI, this prevents capabilities from being implicitly created for the + * stack/TLS in compat64 by the syscall wrapper. This ensures alignment with the + * hybrid ABI (i.e. CSP/CTPIDR are set to the 64-bit values passed to clone()). + */ +COMPAT_SYSCALL_DEFINE5(arm64_clone, unsigned long, clone_flags, unsigned long, newsp, + int __user *, parent_tidptr, + unsigned long, tls, + int __user *, child_tidptr) +{ + struct kernel_clone_args args = { + .flags = (lower_32_bits(clone_flags) & ~CSIGNAL), + .pidfd = parent_tidptr, + .child_tid = child_tidptr, + .parent_tid = parent_tidptr, + .exit_signal = (lower_32_bits(clone_flags) & CSIGNAL), + .stack = newsp, + .tls = tls, + }; + + return kernel_clone(&args); +} + +#define __arm64_compatentry_sys_clone __arm64_compatentry_compat_sys_arm64_clone + asmlinkage long sys_ni_syscall(void);
asmlinkage long __arm64_compatentry_sys_ni_syscall(const struct pt_regs *__unused)
On 20-12-2022 14:50, Kevin Brodsky wrote:
A subtle ABI change was introduced in compat64 by "clone: Alter clone to accept capabilities". Indeed, since there is no compat handler for clone, the native one is also used for compat64. This is where the way we convert compat64 syscall arguments to native is showing its limits: if a syscall wrapper expects a capability-sized argument, compat_ptr() is used to convert the user-provided 64-bit value to a capability.
In general, this is what we want, and in fact it is the case for the parent_tidptr and child_tidptr arguments of clone, which are ordinary pointers to user data. However, the newsp and tls arguments are special: they specify the value to set registers to. We should not alter these values in any way: in PCuABI, they are capabilities and we set CSP/CTPIDR accordingly, but in hybrid, they are still 64-bit values and we should only set the lower 64 bits of CSP/CTPIDR. This is not the case in compat64 as compat_ptr() is called to turn these 64-bit values into capabilities.
Hi Kevin,
The approach looks good, but I have only one question. Where specifically newsp and tls arguments would become valid capabilities from compat_ptr?
As far as I understand it, compat_ptr is used only when "clone_args_get_user_ptr" is used, but newsp and tls are obtained using "clone_args_get", so the wrapper would never convert it to a capability. Unless I am missing something...
Thanks, Tudor
The most correct solution would be to introduce a generic compat clone wrapper, but this is rather painful as clone has 4 possible prototypes depending on the architecture. Given that the issue is completely specific to the hybrid ABI, overriding the compat64 syscall wrapper in sys_compat64.c feels like a reasonable compromise.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com
I've realised this inconsistency while thinking about the initialisation of capability registers ("New CHERI API and separation root capabilities" series), as well as reviewing the clone3 series. We are already doing the right thing for clone3, time to align clone.
arch/arm64/kernel/sys_compat64.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/arch/arm64/kernel/sys_compat64.c b/arch/arm64/kernel/sys_compat64.c index 819b895ec21d..b3c4cf9f3af8 100644 --- a/arch/arm64/kernel/sys_compat64.c +++ b/arch/arm64/kernel/sys_compat64.c @@ -9,6 +9,7 @@ #include <linux/compat.h> #include <linux/compiler.h> +#include <linux/sched/task.h> #include <linux/syscalls.h> #include <asm/syscall.h> @@ -83,6 +84,33 @@ #define __arm64_compatentry_compat_sys_setitimer __arm64_compatentry_sys_setitimer #define __arm64_compatentry_compat_sys_getrusage __arm64_compatentry_sys_getrusage +/*
- This is exactly the same definition as the native clone, except that newsp
- and tls are defined as unsigned long, not user_uintptr_t. When the native ABI
- is PCuABI, this prevents capabilities from being implicitly created for the
- stack/TLS in compat64 by the syscall wrapper. This ensures alignment with the
- hybrid ABI (i.e. CSP/CTPIDR are set to the 64-bit values passed to clone()).
- */
+COMPAT_SYSCALL_DEFINE5(arm64_clone, unsigned long, clone_flags, unsigned long, newsp,
int __user *, parent_tidptr,
unsigned long, tls,
int __user *, child_tidptr)
+{
- struct kernel_clone_args args = {
.flags = (lower_32_bits(clone_flags) & ~CSIGNAL),
.pidfd = parent_tidptr,
.child_tid = child_tidptr,
.parent_tid = parent_tidptr,
.exit_signal = (lower_32_bits(clone_flags) & CSIGNAL),
.stack = newsp,
.tls = tls,
- };
- return kernel_clone(&args);
+}
+#define __arm64_compatentry_sys_clone __arm64_compatentry_compat_sys_arm64_clone
- asmlinkage long sys_ni_syscall(void);
asmlinkage long __arm64_compatentry_sys_ni_syscall(const struct pt_regs *__unused)
On 30/01/2023 12:17, Tudor Cretu wrote:
A subtle ABI change was introduced in compat64 by "clone: Alter clone to accept capabilities". Indeed, since there is no compat handler for clone, the native one is also used for compat64. This is where the way we convert compat64 syscall arguments to native is showing its limits: if a syscall wrapper expects a capability-sized argument, compat_ptr() is used to convert the user-provided 64-bit value to a capability.
In general, this is what we want, and in fact it is the case for the parent_tidptr and child_tidptr arguments of clone, which are ordinary pointers to user data. However, the newsp and tls arguments are special: they specify the value to set registers to. We should not alter these values in any way: in PCuABI, they are capabilities and we set CSP/CTPIDR accordingly, but in hybrid, they are still 64-bit values and we should only set the lower 64 bits of CSP/CTPIDR. This is not the case in compat64 as compat_ptr() is called to turn these 64-bit values into capabilities.
Hi Kevin,
The approach looks good, but I have only one question. Where specifically newsp and tls arguments would become valid capabilities from compat_ptr?
As far as I understand it, compat_ptr is used only when "clone_args_get_user_ptr" is used, but newsp and tls are obtained using "clone_args_get", so the wrapper would never convert it to a capability. Unless I am missing something...
I think you are: this is clone, not clone3 :) As to where the automatic compat_ptr() conversion I mention in the first paragraph occurs, that's specifically in __SC_DELOUSE() in <linux/compat.h>.
Thanks for the review!
Kevin
On 30-01-2023 12:59, Kevin Brodsky wrote:
On 30/01/2023 12:17, Tudor Cretu wrote:
A subtle ABI change was introduced in compat64 by "clone: Alter clone to accept capabilities". Indeed, since there is no compat handler for clone, the native one is also used for compat64. This is where the way we convert compat64 syscall arguments to native is showing its limits: if a syscall wrapper expects a capability-sized argument, compat_ptr() is used to convert the user-provided 64-bit value to a capability.
In general, this is what we want, and in fact it is the case for the parent_tidptr and child_tidptr arguments of clone, which are ordinary pointers to user data. However, the newsp and tls arguments are special: they specify the value to set registers to. We should not alter these values in any way: in PCuABI, they are capabilities and we set CSP/CTPIDR accordingly, but in hybrid, they are still 64-bit values and we should only set the lower 64 bits of CSP/CTPIDR. This is not the case in compat64 as compat_ptr() is called to turn these 64-bit values into capabilities.
Hi Kevin,
The approach looks good, but I have only one question. Where specifically newsp and tls arguments would become valid capabilities from compat_ptr?
As far as I understand it, compat_ptr is used only when "clone_args_get_user_ptr" is used, but newsp and tls are obtained using "clone_args_get", so the wrapper would never convert it to a capability. Unless I am missing something...
I think you are: this is clone, not clone3 :) As to where the automatic compat_ptr() conversion I mention in the first paragraph occurs, that's specifically in __SC_DELOUSE() in <linux/compat.h>.
Thanks, I missed this part! All good from me!
Tudor
Thanks for the review!
Kevin
Hi Kevin,
On Tue, Dec 20, 2022 at 02:50:52PM +0000, Kevin Brodsky wrote:
A subtle ABI change was introduced in compat64 by "clone: Alter clone to accept capabilities". Indeed, since there is no compat handler for clone, the native one is also used for compat64. This is where the way we convert compat64 syscall arguments to native is showing its limits: if a syscall wrapper expects a capability-sized argument, compat_ptr() is used to convert the user-provided 64-bit value to a capability.
That's a nice catch -> should this also bear a "Fixes" tag ?
In general, this is what we want, and in fact it is the case for the parent_tidptr and child_tidptr arguments of clone, which are ordinary pointers to user data. However, the newsp and tls arguments are special: they specify the value to set registers to. We should not alter these values in any way: in PCuABI, they are capabilities and we set CSP/CTPIDR accordingly, but in hybrid, they are still 64-bit values and we should only set the lower 64 bits of CSP/CTPIDR. This is not the case in compat64 as compat_ptr() is called to turn these 64-bit values into capabilities.
The most correct solution would be to introduce a generic compat clone wrapper, but this is rather painful as clone has 4 possible prototypes depending on the architecture. Given that the issue is completely specific to the hybrid ABI, overriding the compat64 syscall wrapper in sys_compat64.c feels like a reasonable compromise.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com
I've realised this inconsistency while thinking about the initialisation of capability registers ("New CHERI API and separation root capabilities" series), as well as reviewing the clone3 series. We are already doing the right thing for clone3, time to align clone.
arch/arm64/kernel/sys_compat64.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/arch/arm64/kernel/sys_compat64.c b/arch/arm64/kernel/sys_compat64.c index 819b895ec21d..b3c4cf9f3af8 100644 --- a/arch/arm64/kernel/sys_compat64.c +++ b/arch/arm64/kernel/sys_compat64.c @@ -9,6 +9,7 @@ #include <linux/compat.h> #include <linux/compiler.h> +#include <linux/sched/task.h> #include <linux/syscalls.h> #include <asm/syscall.h> @@ -83,6 +84,33 @@ #define __arm64_compatentry_compat_sys_setitimer __arm64_compatentry_sys_setitimer #define __arm64_compatentry_compat_sys_getrusage __arm64_compatentry_sys_getrusage +/*
- This is exactly the same definition as the native clone, except that newsp
- and tls are defined as unsigned long, not user_uintptr_t. When the native ABI
- is PCuABI, this prevents capabilities from being implicitly created for the
- stack/TLS in compat64 by the syscall wrapper. This ensures alignment with the
- hybrid ABI (i.e. CSP/CTPIDR are set to the 64-bit values passed to clone()).
- */
+COMPAT_SYSCALL_DEFINE5(arm64_clone, unsigned long, clone_flags, unsigned long, newsp,
int __user *, parent_tidptr,
unsigned long, tls,
int __user *, child_tidptr)
+{
- struct kernel_clone_args args = {
.flags = (lower_32_bits(clone_flags) & ~CSIGNAL),
.pidfd = parent_tidptr,
.child_tid = child_tidptr,
.parent_tid = parent_tidptr,
.exit_signal = (lower_32_bits(clone_flags) & CSIGNAL),
.stack = newsp,
.tls = tls,
- };
- return kernel_clone(&args);
+}
+#define __arm64_compatentry_sys_clone __arm64_compatentry_compat_sys_arm64_clone
asmlinkage long sys_ni_syscall(void);
So overall this solution makes sense and it does the right thing, though I am somewhat not entirely convinced this is the way to go about it. It has this nice advantage of fixing up things upfront and propagating the correct values down the stream, but this could be also achieved by adding small tweaks to the original syscall handler although I do appreciate it would be less 'clean' than the suggested solution. Alternatively we could amend copy_thread (which is already compat/morello aware) and handle the difference there ?
--- BR B.
asmlinkage long __arm64_compatentry_sys_ni_syscall(const struct pt_regs *__unused)
2.38.1
linux-morello mailing list -- linux-morello@op-lists.linaro.org To unsubscribe send an email to linux-morello-leave@op-lists.linaro.org
On Mon, Jan 30, 2023 at 06:03:22PM +0000, Beata Michalska wrote:
Hi Kevin,
On Tue, Dec 20, 2022 at 02:50:52PM +0000, Kevin Brodsky wrote:
A subtle ABI change was introduced in compat64 by "clone: Alter clone to accept capabilities". Indeed, since there is no compat handler for clone, the native one is also used for compat64. This is where the way we convert compat64 syscall arguments to native is showing its limits: if a syscall wrapper expects a capability-sized argument, compat_ptr() is used to convert the user-provided 64-bit value to a capability.
That's a nice catch -> should this also bear a "Fixes" tag ?
In general, this is what we want, and in fact it is the case for the parent_tidptr and child_tidptr arguments of clone, which are ordinary pointers to user data. However, the newsp and tls arguments are special: they specify the value to set registers to. We should not alter these values in any way: in PCuABI, they are capabilities and we set CSP/CTPIDR accordingly, but in hybrid, they are still 64-bit values and we should only set the lower 64 bits of CSP/CTPIDR. This is not the case in compat64 as compat_ptr() is called to turn these 64-bit values into capabilities.
The most correct solution would be to introduce a generic compat clone wrapper, but this is rather painful as clone has 4 possible prototypes depending on the architecture. Given that the issue is completely specific to the hybrid ABI, overriding the compat64 syscall wrapper in sys_compat64.c feels like a reasonable compromise.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com
I've realised this inconsistency while thinking about the initialisation of capability registers ("New CHERI API and separation root capabilities" series), as well as reviewing the clone3 series. We are already doing the right thing for clone3, time to align clone.
arch/arm64/kernel/sys_compat64.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/arch/arm64/kernel/sys_compat64.c b/arch/arm64/kernel/sys_compat64.c index 819b895ec21d..b3c4cf9f3af8 100644 --- a/arch/arm64/kernel/sys_compat64.c +++ b/arch/arm64/kernel/sys_compat64.c @@ -9,6 +9,7 @@ #include <linux/compat.h> #include <linux/compiler.h> +#include <linux/sched/task.h> #include <linux/syscalls.h> #include <asm/syscall.h> @@ -83,6 +84,33 @@ #define __arm64_compatentry_compat_sys_setitimer __arm64_compatentry_sys_setitimer #define __arm64_compatentry_compat_sys_getrusage __arm64_compatentry_sys_getrusage +/*
- This is exactly the same definition as the native clone, except that newsp
- and tls are defined as unsigned long, not user_uintptr_t. When the native ABI
- is PCuABI, this prevents capabilities from being implicitly created for the
- stack/TLS in compat64 by the syscall wrapper. This ensures alignment with the
- hybrid ABI (i.e. CSP/CTPIDR are set to the 64-bit values passed to clone()).
- */
+COMPAT_SYSCALL_DEFINE5(arm64_clone, unsigned long, clone_flags, unsigned long, newsp,
int __user *, parent_tidptr,
unsigned long, tls,
int __user *, child_tidptr)
+{
- struct kernel_clone_args args = {
.flags = (lower_32_bits(clone_flags) & ~CSIGNAL),
.pidfd = parent_tidptr,
.child_tid = child_tidptr,
.parent_tid = parent_tidptr,
.exit_signal = (lower_32_bits(clone_flags) & CSIGNAL),
.stack = newsp,
.tls = tls,
- };
- return kernel_clone(&args);
+}
+#define __arm64_compatentry_sys_clone __arm64_compatentry_compat_sys_arm64_clone
asmlinkage long sys_ni_syscall(void);
So overall this solution makes sense and it does the right thing, though I am somewhat not entirely convinced this is the way to go about it. It has this nice advantage of fixing up things upfront and propagating the correct values down the stream, but this could be also achieved by adding small tweaks to the original syscall handler although I do appreciate it would be less 'clean' than the suggested solution. Alternatively we could amend copy_thread (which is already compat/morello aware) and handle the difference there ?
Continuing on that thought ... We could add the below to native handler : this way it would somewhat align to clone3 (to some extent)
struct kernel_clone_args args = { ... .stack = in_compat_syscall() ? (ptraddr_t)newsp : newsp; .tls = in_compat_syscall() ? (ptraddr_t)tls : tls; }
or: bool compat_mode = in_compat_syscall(); struct kernel_clone_args args = { ... .stack = compat_mode ? (ptraddr_t)newsp : newsp; .tls = compat_mode ? (ptraddr_t)tls : tls; }
This would keep things in one place and get things more evidently distinguished when it comes to those two (still) user pointers. It does stick out a bit more (as I believe). What do you think?
---
BR B.
asmlinkage long __arm64_compatentry_sys_ni_syscall(const struct pt_regs *__unused)
2.38.1
linux-morello mailing list -- linux-morello@op-lists.linaro.org To unsubscribe send an email to linux-morello-leave@op-lists.linaro.org
linux-morello mailing list -- linux-morello@op-lists.linaro.org To unsubscribe send an email to linux-morello-leave@op-lists.linaro.org
On 30/01/2023 23:05, Beata Michalska wrote:
On Mon, Jan 30, 2023 at 06:03:22PM +0000, Beata Michalska wrote:
Hi Kevin,
On Tue, Dec 20, 2022 at 02:50:52PM +0000, Kevin Brodsky wrote:
A subtle ABI change was introduced in compat64 by "clone: Alter clone to accept capabilities". Indeed, since there is no compat handler for clone, the native one is also used for compat64. This is where the way we convert compat64 syscall arguments to native is showing its limits: if a syscall wrapper expects a capability-sized argument, compat_ptr() is used to convert the user-provided 64-bit value to a capability.
That's a nice catch -> should this also bear a "Fixes" tag ?
I wasn't sure it was warranted but sure I can add that.
In general, this is what we want, and in fact it is the case for the parent_tidptr and child_tidptr arguments of clone, which are ordinary pointers to user data. However, the newsp and tls arguments are special: they specify the value to set registers to. We should not alter these values in any way: in PCuABI, they are capabilities and we set CSP/CTPIDR accordingly, but in hybrid, they are still 64-bit values and we should only set the lower 64 bits of CSP/CTPIDR. This is not the case in compat64 as compat_ptr() is called to turn these 64-bit values into capabilities.
The most correct solution would be to introduce a generic compat clone wrapper, but this is rather painful as clone has 4 possible prototypes depending on the architecture. Given that the issue is completely specific to the hybrid ABI, overriding the compat64 syscall wrapper in sys_compat64.c feels like a reasonable compromise.
Signed-off-by: Kevin Brodsky kevin.brodsky@arm.com
I've realised this inconsistency while thinking about the initialisation of capability registers ("New CHERI API and separation root capabilities" series), as well as reviewing the clone3 series. We are already doing the right thing for clone3, time to align clone.
arch/arm64/kernel/sys_compat64.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/arch/arm64/kernel/sys_compat64.c b/arch/arm64/kernel/sys_compat64.c index 819b895ec21d..b3c4cf9f3af8 100644 --- a/arch/arm64/kernel/sys_compat64.c +++ b/arch/arm64/kernel/sys_compat64.c @@ -9,6 +9,7 @@ #include <linux/compat.h> #include <linux/compiler.h> +#include <linux/sched/task.h> #include <linux/syscalls.h> #include <asm/syscall.h> @@ -83,6 +84,33 @@ #define __arm64_compatentry_compat_sys_setitimer __arm64_compatentry_sys_setitimer #define __arm64_compatentry_compat_sys_getrusage __arm64_compatentry_sys_getrusage +/*
- This is exactly the same definition as the native clone, except that newsp
- and tls are defined as unsigned long, not user_uintptr_t. When the native ABI
- is PCuABI, this prevents capabilities from being implicitly created for the
- stack/TLS in compat64 by the syscall wrapper. This ensures alignment with the
- hybrid ABI (i.e. CSP/CTPIDR are set to the 64-bit values passed to clone()).
- */
+COMPAT_SYSCALL_DEFINE5(arm64_clone, unsigned long, clone_flags, unsigned long, newsp,
int __user *, parent_tidptr,
unsigned long, tls,
int __user *, child_tidptr)
+{
- struct kernel_clone_args args = {
.flags = (lower_32_bits(clone_flags) & ~CSIGNAL),
.pidfd = parent_tidptr,
.child_tid = child_tidptr,
.parent_tid = parent_tidptr,
.exit_signal = (lower_32_bits(clone_flags) & CSIGNAL),
.stack = newsp,
.tls = tls,
- };
- return kernel_clone(&args);
+}
+#define __arm64_compatentry_sys_clone __arm64_compatentry_compat_sys_arm64_clone
asmlinkage long sys_ni_syscall(void);
So overall this solution makes sense and it does the right thing, though I am somewhat not entirely convinced this is the way to go about it. It has this nice advantage of fixing up things upfront and propagating the correct values down the stream, but this could be also achieved by adding small tweaks to the original syscall handler although I do appreciate it would be less 'clean' than the suggested solution. Alternatively we could amend copy_thread (which is already compat/morello aware) and handle the difference there ?
Continuing on that thought ... We could add the below to native handler : this way it would somewhat align to clone3 (to some extent)
struct kernel_clone_args args = { ... .stack = in_compat_syscall() ? (ptraddr_t)newsp : newsp; .tls = in_compat_syscall() ? (ptraddr_t)tls : tls; }
or: bool compat_mode = in_compat_syscall(); struct kernel_clone_args args = { ... .stack = compat_mode ? (ptraddr_t)newsp : newsp; .tls = compat_mode ? (ptraddr_t)tls : tls; }
This would keep things in one place and get things more evidently distinguished when it comes to those two (still) user pointers. It does stick out a bit more (as I believe). What do you think?
On the whole you're probably right, it's a little less clean as we still create capabilities unnecessarily in compat64, but adding an ad hoc handler in sys_compat64.c is far from ideal. I'm not keen on changing copy_thread() but changing the native clone handler does make some sense w.r.t. clone3, it remains fairly symmetric.
We can't quite use a ternary as both alternatives need to be of the same type, but I will do something like that. I'll add your Co-developed-by and wait for your S-o-b if you're happy with that.
Kevin
linux-morello@op-lists.linaro.org