Hi All,
We are facing a regression on our hardware platform, the Kunpeng9xx series, since commit 8fcc4ae6faf8 ("arm64: acpi: Make apei_claim_sea() synchronize with APEI's irq work") was applied. This commit was merged into mainline in v5.8-rc1.
The regression occurs with a user-mode SEA (a synchronous external abort taken in user space) when running our customer-delivered firmware, which reports an ARM processor error record for the SEA so that the kernel can record the error.
Our analysis found the cause: do_sea() returns directly for a user-mode SEA if apei_claim_sea() handled the ARM processor error record, but the APEI GHES driver does not effectively process ARM processor error records for cache errors, because it does no memory failure handling for them. Currently ghes_handle_memory_failure() is called only for memory error records.
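To make the control flow easier to follow, here is a minimal user-space sketch of the behaviour described above. It is only a model, not kernel code: every name in it (model_do_user_sea(), model_ghes_process(), the enum and struct, and the with_fix flag) is hypothetical and introduced for illustration. It assumes that a claimed user-mode SEA returns without signalling the task, and that an unclaimed one falls back to signalling it.

```c
#include <stdbool.h>

/* Hypothetical, simplified record kinds; not the kernel's CPER definitions. */
enum model_record {
	MODEL_REC_NONE,           /* firmware reported no record */
	MODEL_REC_MEMORY_ERROR,   /* memory error record */
	MODEL_REC_ARM_PROC_ERROR, /* ARM processor error record, e.g. cache error */
};

struct model_outcome {
	bool claimed;               /* a firmware record was found and processed */
	bool memory_failure_queued; /* memory failure handling was requested */
	bool task_signalled;        /* fallback path signalled the faulting task */
};

/*
 * Models the GHES record processing: memory error records always get
 * memory failure handling, ARM processor error records only get it
 * when the proposed fix (with_fix) is applied.
 */
static bool model_ghes_process(enum model_record rec, bool with_fix)
{
	switch (rec) {
	case MODEL_REC_MEMORY_ERROR:
		return true;
	case MODEL_REC_ARM_PROC_ERROR:
		return with_fix;
	default:
		return false;
	}
}

/* Models the user-mode SEA path after commit 8fcc4ae6faf8. */
static struct model_outcome model_do_user_sea(enum model_record rec,
					       bool with_fix)
{
	struct model_outcome out = { false, false, false };

	if (rec != MODEL_REC_NONE) {
		/* Record claimed: return directly, no signal is sent. */
		out.claimed = true;
		out.memory_failure_queued = model_ghes_process(rec, with_fix);
		return out;
	}

	/* Nothing claimed: fall back to signalling the task. */
	out.task_signalled = true;
	return out;
}
```

In this model, model_do_user_sea(MODEL_REC_ARM_PROC_ERROR, false) comes back claimed, with neither memory failure handling nor a signal, which is the gap described above. With with_fix set to true, the same call requests memory failure handling.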
The following patch fixes the regression by doing the memory failure handling for ARM processor errors reported for a user-mode SEA. https://lore.kernel.org/linux-acpi/94a38a33-a949-3cce-d617-e1476912596e@huaw...
V5 of the patch was sent out after more than 4 months of review.
Can the ARM folks please acknowledge the patch, and perhaps also the need for a meeting to explain the issue and the resolution and quickly resolve any open questions?
Thanks,
Xiaofei