On 15/05/2023 17:05, Tudor Cretu wrote:
For the io_pin_pages() in rsrc.c, they could be used for both read abd write operations as far as I can tell...
That seems to be he case, in fact AFAICT this can only be used together with IORING_OP_READ_FIXED and IORING_OP_WRITE_FIXED.
I've reached the same conclusion.
Overall I feel like we need to have a deeper think about this. There are multiple situations to consider, and multiple places where the checking could happen. __io_import_iovec() summarises the multiple situations quite well.
From what I've gathered:
- For IORING_REGISTER_BUFFERS and friends, it's fairly straightforward:
they can only be used together with IORING_OP_{READ,WRITE}_FIXED. Crucially, these requests are still required to specify addr and len, within the registered region. One option here would be to check only when processing the operation, not at the point of registration, since the latter can only be used through the former.
I think doing the check only when processing the operation is fine. But we should leave a comment at the registration point on why there is no need to check.
- For IORING_REGISTER_PBUF_RING and friends, the situation is different.
They can be used with operations specifying IOSQE_BUFFER_SELECT (4 allowed, see above), but here the operation does not specify the addr / len. We could still do the checking when processing the operation, since io_buffer_select() returns the original user pointer. It could be better to do it at the point of registration instead though (which is possible permission-wise because we know the operation will write to the buffer, as per above).
I think just doing the check at the registration point sounds good.
I'm sure I've missed something, so I'd be very grateful if someone could dig deeper into this and figure out all the angles. We should come up with a consistent approach, for instance checking as much as possible as soon as possible (when registering) or checking only at the point of processing the operation, not at the point of registration.
I've gone through the code and I confirm that your summary above is exhaustive and... exceptionally concise, well done!
I agree that a consistent approach would be nice, but I don't really think it's worth complicating the code in the case of IORING_REGISTER_BUFFERS or double-checking in the case of IORING_REGISTER_PBUF_RING. As long as all the possible registration-operation paths are checked and the decisions are documented, it should be enough. After all, these are exceptions to the standard... This is not a strong opinion, of course.
Thanks for going through that code too! I'd tend to agree, some asymmetry is probably better than shoehorning checks in this case.
Having had another look though, I think there is more complexity than just what I've mentioned. Grepping buf_index in io_uring/ gives a better idea of all users of registered buffers, and there are indeed more cases: - IORING_OP_SEND_ZC with IORING_RECVSEND_FIXED_BUF. This one should be no different from IORING_OP_WRITE_FIXED, with the same check inserted. - IORING_OP_PROVIDE_BUFFERS. I suppose it's equivalent to IORING_REGISTER_PBUF_RING and we should therefore perform the check there? - IORING_OP_URING_CMD with IORING_URING_CMD_FIXED. Undocumented so even more fun! Should be the same as IORING_OP_{READ,WRITE}_FIXED, though which exactly, I'm not sure... io_uring_cmd_import_fixed() looks relevant, though I'm not sure it's the only way to use a fixed buffer in this case.
Something that makes it quite hard to understand it all is that although buf_group has been put in a union with buf_index to distinguish the two cases, the kernel often uses buf_index to mean buf_group... And of course the man page being out of date in multiple ways is not helping.
Hopefully that's all of them found now, but a second look is definitely appreciated.
Kevin