On Thu, 8 Oct 2020, Alex Bennée wrote:
Hi Alex,
On 08/10/2020 11:54, Alex Bennée wrote:
Stefano Stabellini <stefano.stabellini@xilinx.com> writes:
On Fri, 2 Oct 2020, Alex Bennée wrote:
Hi Julien,
Another data point, this time on real HW:
Xen 4.15-unstable (c/s Wed Sep 30 12:25:05 2020 +0100 git:e680dde6fd) EFI loader
<snip>
(XEN)
(XEN) Xen stack trace from sp=00000000002ffcf0:
(XEN)    00000000002ffd20 000000000020404c 00000000002a5448 0000000000000001
(XEN)    00000000002ffd40 f11e576000204150 00000000002ffd50 00000000002c417c
(XEN)    00000000002b25c0 0000000000000001 00000000002b2620 000080042ffde5d0
(XEN)    00000000002ffd90 00000000002bdbe0 000080042ffde5d0 0000000000000001
(XEN)    00000000002ddc68 0000000000000000 0000000000000004 00000000ffffffc8
(XEN)    00000000002ffdd0 00000000002bf09c 0000000000000004 0000000000000004
(XEN)    00000000002b0580 000000000033e430 0101010101010101 0000000000000000
(XEN)    00000000002ffdf0 00000000002cbaec 0000000000000004 0000000000000004
(XEN)    000000003f8790a0 00000000002001b8 00000000aab35000 00000000aa935000
(XEN)    00000000b2273000 0000000000000000 0000000000400000 00000000b22ad018
(XEN)    00000000aabe7138 0000000000000001 0000000000000001 8000000000000002
(XEN)    0000000000000000 00000000002e1948 00000000b2273000 0000000000009000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000300000000
(XEN)    0000000000000000 00000040ffffffff 00000000ffffffff 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN)    [<000000000025ed4c>] strlen+0x10/0x84 (PC)
(XEN)    [<00000000002030f8>] dt_device_is_compatible+0x48/0x84 (LR)
(XEN)    [<000000000020404c>] dt_match_node+0x70/0x10c
(XEN)    [<00000000002c417c>] device_init+0x90/0xdc
(XEN)    [<00000000002bdbe0>] iommu_hardware_setup+0x5c/0x188
(XEN)    [<00000000002bf09c>] iommu_setup+0x30/0x188
(XEN)    [<00000000002cbaec>] start_xen+0xac4/0xc88
(XEN)    [<00000000002001b8>] arm64/head.o#primary_switched+0x10/0x30

This seems to be the same bug in GRUB noticed by Masami, where GRUB is
adding strings to device tree at runtime without '\0' at the end. Julien
posted a potential (untested) fix for GRUB:
What version of GRUB are you using?
2.04-11, so the current Debian Testing package with your patch applied. I'd avoided building GRUB from scratch so as not to screw up the system, but if it's just a drop-in .efi binary then I can build from the vanilla tree.
I tried to repro the issue but I cannot see any problems with the EFI+Grub+DeviceTree setup that I have here.
My suggestion would be to rebuild Grub and play around with the finalize_params_xen_boot function Julien patched. I would try increasing additional_size to 0x2000 to see if it helps.
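To make the two failure modes concrete (a string added without its '\0', and too little headroom in the buffer), here is a minimal sketch in standalone C. It is not GRUB code and does not reflect what finalize_params_xen_boot actually does; append_string_prop and its parameters are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from GRUB: appends the C string `s` to a
 * device-tree string property held in `buf`, including the terminating
 * NUL. `cap` is the buffer capacity and `used` the bytes already filled.
 * Returns the new used length, or 0 if the buffer lacks room.
 */
static size_t append_string_prop(char *buf, size_t cap, size_t used,
                                 const char *s)
{
    /* +1: the '\0' is part of the property value and must be counted. */
    size_t need = strlen(s) + 1;

    if (used > cap || need > cap - used)
        return 0;               /* not enough headroom reserved */

    memcpy(buf + used, s, need);
    return used + need;
}
```

If the length were computed as strlen(s) alone, the consumer would later run strlen() past the end of the property, which would be consistent with the trace above.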
I would also add a printk to dt_device_is_compatible: it would be very verbose, but it would tell you exactly which one is the offending string. However, if it is a DTB size miscalculation by Grub, that is not going to help much.
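A rough sketch of what such a debug print could look like, written as standalone C so it builds on its own. This is an assumption-laden illustration, not Xen's dt_device_is_compatible: dump_compatible is a hypothetical helper, and printf() stands in for printk(). It walks the property with memchr() so that it never reads past the property length, and flags an entry that has no terminator.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical debug helper (not Xen code): prints every entry of a
 * `compatible` property of `len` bytes belonging to `node`. Returns the
 * number of entries, or -1 if the last entry has no terminating NUL.
 */
static int dump_compatible(const char *node, const char *data, size_t len)
{
    size_t off = 0;
    int count = 0;

    while (off < len) {
        /* memchr stays within data[0..len), unlike strlen(). */
        const char *nul = memchr(data + off, '\0', len - off);
        size_t n = nul ? (size_t)(nul - (data + off)) : len - off;

        printf("%s: compatible[%d] = \"%.*s\"%s\n", node, count,
               (int)n, data + off, nul ? "" : " (UNTERMINATED)");

        if (nul == NULL)
            return -1;          /* offending string found */
        off += n + 1;
        count++;
    }
    return count;
}
```

Calling it once per node just before the compatibility check would name the node and the offending entry, at the cost of very verbose boot output.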