On Sun, 01 Aug 2021 14:43:37 +0100, Arnd Bergmann arnd@kernel.org wrote:
On Sat, Jul 31, 2021 at 11:31 AM Marc Zyngier maz@kernel.org wrote:
On Fri, 30 Jul 2021 11:05:30 +0100, Arnd Bergmann arnd@kernel.org wrote:
On Fri, Jul 30, 2021 at 10:52 AM Marc Zyngier maz@kernel.org wrote:
The EOI for the level interrupt is the message being enqueued after the guest is done processing the interrupt. I can see four ways this could go:
- If the interrupt has not been made pending again yet, nothing happens
until it eventually becomes pending again (this is the obvious case)
- If it is already pending, chip->irq_eoi() causes the new event descriptor
to be queued on the eventq, and it signals the host about new buffers being available. The /host/ samples the level of the line, notices it is pending and puts the resulting buffer back on the 'used' queue even before returning from the guest 'notify' function. The guest virtio core code keeps processing the 'used' buffers and gets back into the interrupt handler.
- Same as 2., but the host handles the virtqueue ->notify asynchronously,
and the guest has already checked the 'used' queue before the host adds back the buffer to tell it that the line is still active. Now the guest gets notified again after it returns from the virtqueue interrupt, in order to process the new 'used' buffer.
- The gpio line actually goes into inactive state until after the new event is queued in chip->irq_eoi(), but becomes active immediately afterwards. Now the host gets interrupted and handles this by queuing the event reply but cannot interrupt the guest because it is still processing the original virtqueue event. However, since the event is queued, it will be processed the same way as 2. or 3. by the virtio core.
Do you see a problem with scenarios 2, 3 or 4, or with another case that I may have missed?
Thanks, that's really useful. I don't think you missed much. What the documentation is missing though is an actual interrupt controller specification. Although it describes the protocol in great length, at no point does it explain what the irq_response does in terms of interrupt life cycle (there is no interrupt life cycle at all). Same goes for the various states that an interrupt can get. As someone who spends way too much time reading interrupt controller specs, this is a first class defect.
Ok, makes sense.
Another point I have just realised is that this spec confuses interrupt mask with interrupt enable. It describes masking interrupts as an effect of setting the trigger type, but that's completely unusable. There is a strong expectation from SW that a masked interrupt doesn't loose signals. If the interrupt is masked by setting its trigger to NONE, then an edge interrupt coming at this point will be lost. No cookie for you.
I'm fine with this odd way of *enabling* the interrupt, but masking cannot lose any signal, ever.
Viresh's previous first version actually had this, I asked him to simplify it in order to reduce the number of possible states, as it seemed to me that it would be better not to have eight possible states when you have the various combinations of enabled/disabled, unmasked/masked and armed/not-armed.
What exactly is armed/not-armed? Is that equivalent to pending?
The current version folds unmasked/masked into armed/not-armed by masking the interrupt whenever the reply has been queued to the guest, until it gets queued again on the eventq.
Huh. I understood that it was mask end enabled that were folded together. I'm lost.
This avoids the problem that using the controlq to mask explicitly mask the interrupt cannot be done from atomic context since it has to wait_for_completion() until the controlq message is returned. Queing up the eventq message to unmask the even can be done in atomic context, since this is a posted operation.
What are the requirements for a ->irq_mask() operation? Can this be called in atomic context? If it can, is it allowed to be posted (e.g. by queuing a new command without waiting for the reply)?
irq_mask() can definitely be called in atomic context. But in this case the driver must implement the bus_lock madness and commit changes to the backend on unlock. See how most of the I2C GPIO implementations work.
Obviously, this has an effect on performance and complexity. It would be a lot better if there was a way to perform the mask/unmask operations synchronously, without the need of a buffer. For example, virtio-console has the so called 'emergency write' feature, which allows for write operations without any buffer.
The later would allow the implementation of a "fire and forget" mask/unmask that would still be synchronous.
Alternatively, would it be possible to simply emulate the 'irq_mask()' operation in software, by not delivering the next interrupt to the handler until the driver calls irq_unmask()? Once it's been delivered, it would just not get requeued in this case.
That'd be a poor-man's solution, as you would still get host notifications, which could have an impact in the case of a screaming interrupt.
Thanks,
M.