On 17-05-2023 09:38, Kevin Brodsky wrote:
On 15/05/2023 17:05, Tudor Cretu wrote:
For the io_pin_pages() in rsrc.c, they could be used for both read and write operations as far as I can tell...
That seems to be he case, in fact AFAICT this can only be used together with IORING_OP_READ_FIXED and IORING_OP_WRITE_FIXED.
I've reached the same conclusion.
Overall I feel like we need to have a deeper think about this. There are multiple situations to consider, and multiple places where the checking could happen. __io_import_iovec() summarises the multiple situations quite well.
From what I've gathered:
- For IORING_REGISTER_BUFFERS and friends, it's fairly straightforward:
they can only be used together with IORING_OP_{READ,WRITE}_FIXED. Crucially, these requests are still required to specify addr and len, within the registered region. One option here would be to check only when processing the operation, not at the point of registration, since the latter can only be used through the former.
I think doing the check only when processing the operation is fine. But we should leave a comment at the registration point on why there is no need to check.
- For IORING_REGISTER_PBUF_RING and friends, the situation is different.
They can be used with operations specifying IOSQE_BUFFER_SELECT (4 allowed, see above), but here the operation does not specify the addr / len. We could still do the checking when processing the operation, since io_buffer_select() returns the original user pointer. It could be better to do it at the point of registration instead though (which is possible permission-wise because we know the operation will write to the buffer, as per above).
I think just doing the check at the registration point sounds good.
I'm sure I've missed something, so I'd be very grateful if someone could dig deeper into this and figure out all the angles. We should come up with a consistent approach, for instance checking as much as possible as soon as possible (when registering) or checking only at the point of processing the operation, not at the point of registration.
I've gone through the code and I confirm that your summary above is exhaustive and... exceptionally concise, well done!
I agree that a consistent approach would be nice, but I don't really think it's worth complicating the code in the case of IORING_REGISTER_BUFFERS or double-checking in the case of IORING_REGISTER_PBUF_RING. As long as all the possible registration-operation paths are checked and the decisions are documented, it should be enough. After all, these are exceptions to the standard... This is not a strong opinion, of course.
Thanks for going through that code too! I'd tend to agree, some asymmetry is probably better than shoehorning checks in this case.
Having had another look though, I think there is more complexity than just what I've mentioned. Grepping buf_index in io_uring/ gives a better idea of all users of registered buffers, and there are indeed more cases:
Great finds! I've gone through liburing as well this time, and IORING_OP_SEND_ZC with IORING_RECVSEND_FIXED_BUF and IORING_OP_URING_CMD with IORING_URING_CMD_FIXED seem to be the only cases besides IORING_OP_{READ,WRITE}_FIXED that can use buffers registered with IORING_REGISTER_BUFFERS/IORING_REGISTER_BUFFERS2/IORING_REGISTER_BUFFERS_UPDATE.
- IORING_OP_SEND_ZC with IORING_RECVSEND_FIXED_BUF. This one should be
no different from IORING_OP_WRITE_FIXED, with the same check inserted.
This looks correct from what I can tell.
- IORING_OP_PROVIDE_BUFFERS. I suppose it's equivalent to
IORING_REGISTER_PBUF_RING and we should therefore perform the check there?
I've noticed this one yesterday, and I agree, is part of the "IORING_REGISTER_PBUF_RING and friends" group and there should be a check there.
- IORING_OP_URING_CMD with IORING_URING_CMD_FIXED. Undocumented so even
more fun! Should be the same as IORING_OP_{READ,WRITE}_FIXED, though which exactly, I'm not sure... io_uring_cmd_import_fixed() looks relevant, though I'm not sure it's the only way to use a fixed buffer in this case.
IIUC, io_uring_cmd_import_fixed() is the only (intended) way to use the fixed buffer for IORING_OP_URING_CMD requests. struct io_uring_cmd doesn't have the buffer exposed, only struct io_kiocb does. And the only place where the buffer (`imu` field from struct io_kiocb) is accessed is from io_uring_cmd_import_fixed().
Tudor
Something that makes it quite hard to understand it all is that although buf_group has been put in a union with buf_index to distinguish the two cases, the kernel often uses buf_index to mean buf_group... And of course the man page being out of date in multiple ways is not helping.
Hopefully that's all of them found now, but a second look is definitely appreciated.
Kevin