Hi Lorenzo
Any comment about this issue? The patch has been waiting for James Morse to review. It seems he is not in this group. Could you help to invite him to join this discussion, or ask for his opinion in private? This issue is urgent for us. Thank you very much.
Thanks Xiaofei Tan
Today's Topics:
- Regression of synchronous external aborts occur in user-mode (tanxiaofei)
Message: 1
Date: Tue, 27 Apr 2021 12:08:57 +0000
From: tanxiaofei tanxiaofei@huawei.com
To: "linaro-open-discussions@op-lists.linaro.org" linaro-open-discussions@op-lists.linaro.org
Cc: "lorenzo.pieralisi@arm.com" lorenzo.pieralisi@arm.com, Shiju Jose shiju.jose@huawei.com, Jonathan Cameron jonathan.cameron@huawei.com, "Guohanjun (Hanjun Guo)" guohanjun@huawei.com
Subject: [Linaro-open-discussions] Regression of synchronous external aborts occur in user-mode
Message-ID: a7c49b5d41384d9b877edb2c2909de41@huawei.com
Content-Type: text/plain; charset="us-ascii"
Hi All,
We are facing a regression on our hardware platform, the Kunpeng9xx series, after commit 8fcc4ae6faf8 ("arm64: acpi: Make apei_claim_sea() synchronize with APEI's irq work") was applied. This commit was merged into mainline in v5.8-rc1.
The regression occurs with user-mode SEAs (synchronous external aborts that occur in user space) when our customer-delivered firmware reports an ARM processor error record for the SEA so that the kernel can record the error.
Our analysis identified the cause: do_sea() returns directly for a user-mode SEA if apei_claim_sea() handled the ARM processor error record, but the APEI GHES driver does not effectively process ARM processor error records for cache errors, because it does not perform memory failure handling for them. Currently ghes_handle_memory_failure() is called only for memory error records.
The following patch fixes the regression by performing memory failure handling for ARM processor errors reported for a user-mode SEA. https://lore.kernel.org/linux-acpi/94a38a33-a949-3cce-d617-e1476912596e@huaw...
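The control flow described above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not kernel code: the `model_*` functions, the `section_type` enum, and the `with_fix` flag are hypothetical names invented here, and the real do_sea(), apei_claim_sea() and GHES paths are considerably more involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the error record section types. */
enum section_type { SEC_MEMORY_ERROR, SEC_ARM_PROCESSOR_ERROR };

struct error_record {
    enum section_type type;
    bool cache_error;       /* only meaningful for ARM processor errors */
};

/* Outcome of one simulated user-mode SEA. */
struct sea_outcome {
    bool claimed;           /* the firmware-reported record was claimed */
    bool memory_failure;    /* memory failure handling was performed */
    bool signal_sent;       /* the abort path fell through to signalling */
};

/*
 * Model of record processing in the GHES driver. Without the fix, only
 * memory error records lead to memory failure handling. With the fix
 * (assumed behaviour of the proposed patch), ARM processor error records
 * for cache errors do as well.
 */
bool model_ghes_process(const struct error_record *rec, bool with_fix)
{
    if (rec->type == SEC_MEMORY_ERROR)
        return true;
    if (with_fix && rec->type == SEC_ARM_PROCESSOR_ERROR && rec->cache_error)
        return true;
    return false;
}

/*
 * Model of the user-mode SEA path: if firmware supplied a record and it
 * was claimed, return directly; otherwise fall through to signalling.
 * rec == NULL models "no record reported by firmware".
 */
struct sea_outcome model_do_sea(const struct error_record *rec, bool with_fix)
{
    struct sea_outcome out = { false, false, false };

    if (rec != NULL) {
        out.claimed = true;
        out.memory_failure = model_ghes_process(rec, with_fix);
        return out;         /* returns directly once the record is claimed */
    }

    out.signal_sent = true;
    return out;
}
```

In this model, calling `model_do_sea()` with an ARM processor cache error record and `with_fix` set to false yields an outcome that is claimed but has neither memory failure handling nor a signal, which is the gap reported above. With `with_fix` set to true, the same record leads to memory failure handling.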
The v5 patch was sent out after more than 4 months of review.
Could the ARM folks please acknowledge the patch, and perhaps the need for a meeting to explain the issue and resolution and to quickly resolve any open questions?
Thanks Xiaofei
Subject: Digest Footer
Linaro-open-discussions mailing list Linaro-open-discussions@op-lists.linaro.org https://op-lists.linaro.org/mailman/listinfo/linaro-open-discussions
End of Linaro-open-discussions Digest, Vol 7, Issue 9
On Mon, May 10, 2021 at 09:35:55PM +0800, Xiaofei Tan wrote:
Hi Lorenzo
Any comment about this issue? The patch has been waiting for James Morse to review. It seems he is not in this group. Could you help to invite him to join this discussion, or ask for his opinion in private? This issue is urgent for us. Thank you very much.
You should ping him again on LAKML - that's the best course of action.
Thanks, Lorenzo
Hi Lorenzo
I have pinged him several times. Hmm, I can try again.
Thanks Xiaofei Tan
On 2021/5/10 21:51, Lorenzo Pieralisi wrote:
You should ping him again on LAKML - that's the best course of action.
Thanks, Lorenzo