Hi all,
This is a fusion of Mike's notes and my own. Please add anything I missed! People may well be misidentified (sorry about that). It was a very good, active discussion. Thanks to all involved, and Mike in particular for organizing it and taking live notes.
General request
- Slides for all topics next time to introduce topics as not everyone on call will have necessary background (and those that do might need reminding!) Hanjun is sending Mike his slides (uncore DVFS) to add to the collaborate page.
IORT - Reserved memory regions (RMR)
====================================
* Shameer gave summary
  - IORT Revision E (https://developer.arm.com/documentation/den0049/latest/) introduced new node type.
  - A way to describe memory regions that should have unity mapping in the SMMU.
  - Use case is a PCIe RAID card that has FW that uses a pool of host memory (hidden from OS).
* Status
  - Patches out for ACPICA
    - Question raised by ACPICA reviewers on whether spec is final
    - Spec appears final (Lorenzo to check) but may be minor unrelated fix in doc to come (Sami).
  - Patches out for kernel on relevant lists.
  - Mail from Steven Price (Arm) (Sami Mujawar, who was on the call, also involved): interested for EFI framebuffer use case.
* Open questions
  - Equivalent from AMD has flag to indicate that unity mapping only needed until driver has taken over (end of kernel boot assumed). Avoids an issue of holes in address space for VMs.
    - Huawei not raising this as a requirement, but Lorenzo observed it is interesting and deserves discussion.
  - Kexec interaction needs discussion. Steve is looking at this and will bring it to the list.
  - Lorenzo brought up issue of IORT spec using PCI BDF (stream ID?) which may be reenumerated.
    - Noted x86 doesn't do this but ARM traditionally does.
    - There is a DSM that tells the kernel not to reenumerate the PCI bus, which ACPI obeys.
    - Jonathan's suggestion was a potential opportunity to cache the original stream ID before doing the reenumeration in kernel.
    - Lorenzo observed we may need a universal solution for all OSes on this. Lorenzo took AI to go away and think about it before next call.
  - Stalling issues on patch? Probably only Kexec, though should be careful around possible future regressions on the BDF issue (not a blocker).
  - Related DT story. Huawei server team not interested as no DT support and can't test. Lorenzo suggested looping in Thierry Reding and referencing a patch set (probably https://lore.kernel.org/linux-iommu/20200904130000.691933-1-thierry.reding@g...). Huawei more than happy to have others add the DT support :)
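
As a rough illustration of what "unity mapping" means in Shameer's summary, here is a toy Python model (not kernel code). Every name in it is hypothetical, and the 4 KiB page size is an assumption. It only shows the idea that, for each reserved region, the device-visible address translates to the same physical address, while anything outside the region would fault.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page granule for this toy model


@dataclass(frozen=True)
class ReservedRegion:
    """Hypothetical stand-in for one firmware-described reserved region."""
    base: int    # physical base address, page aligned
    length: int  # length in bytes, multiple of PAGE_SIZE


class ToyStreamTable:
    """Hypothetical per-device translation table: IOVA page -> PA page."""

    def __init__(self):
        self.pages = {}

    def map_page(self, iova, pa):
        self.pages[iova // PAGE_SIZE] = pa // PAGE_SIZE

    def translate(self, iova):
        """Return the physical address, or None where a real IOMMU would fault."""
        page = self.pages.get(iova // PAGE_SIZE)
        if page is None:
            return None
        return page * PAGE_SIZE + iova % PAGE_SIZE


def install_unity_mappings(table, regions):
    """Map every page of every reserved region so that IOVA == PA."""
    for region in regions:
        if region.base % PAGE_SIZE or region.length % PAGE_SIZE:
            raise ValueError("region must be page aligned")
        for addr in range(region.base, region.base + region.length, PAGE_SIZE):
            table.map_page(addr, addr)
```

With such a table in place before the device is allowed to do DMA, firmware that keeps using its pool of host memory sees no change in addresses.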
AI summary:
* Kexec discussion - on list.
* Use of BDF discussion - revisit here next time.
* DT alignment. Don't want different solutions for each firmware type.
* Lorenzo / Sami to check IORT revision E is final.
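
To give the BDF discussion something concrete to revisit, here is a small Python sketch of the caching idea (illustrative only, not a proposal for the actual kernel change). The class and function names are hypothetical. The only fixed fact used is the standard 16-bit PCI requester ID layout (bus in bits 15:8, device in bits 7:3, function in bits 2:0); how that ID relates to the stream ID in the IORT tables is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


def requester_id(bus: int, dev: int, fn: int) -> int:
    """16-bit PCI requester ID: bus[15:8], device[7:3], function[2:0]."""
    return (bus << 8) | (dev << 3) | fn


@dataclass
class ToyPciDevice:
    """Hypothetical device record; bus may change on reenumeration."""
    bus: int
    dev: int
    fn: int
    firmware_rid: Optional[int] = None  # ID as firmware saw it, cached once


def cache_firmware_ids(devices):
    """Record each device's firmware-assigned ID before buses are renumbered."""
    for d in devices:
        if d.firmware_rid is None:
            d.firmware_rid = requester_id(d.bus, d.dev, d.fn)


def id_for_firmware_tables(d):
    """ID to use when matching firmware tables; falls back to the live ID."""
    if d.firmware_rid is not None:
        return d.firmware_rid
    return requester_id(d.bus, d.dev, d.fn)
```

The design point is simply ordering: the cache has to be filled before any renumbering happens, otherwise the firmware-assigned value is already lost. This does nothing for other OSes, which is the universal-solution concern Lorenzo raised.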
SVA
===
Zangfei gave summary:
- Huawei has devices that are not PCIe but are presented as such.
- They support stall mode for SVA (spec violation).
- Resistance from kernel maintainers to maintaining a white list for any quirk. Fine to fix it once (JPB), but not to keep doing so.
- Note that stall mode is not yet supported at all (JPB to send out this cycle).
- If a longer term fix is needed and can't be done via PCISIG etc., then need to convince PCI and SMMU maintainers. Noted that quirk is very little code.
* Other SVA topics.
  - Mentioned virtual SVA (no actual problems, just expressing interest!)
  - Would need Eric Auger; it wasn't on the topic list so Eric was not on the call.
AI: Nothing planned until after JPB has upstreamed stall mode. Hard to have discussion before that.
DVFS
====
guohanjun
Solutions exist for
* CPU DVFS (voltage + frequency scaling)
* PCIe device power states etc
No standard way of controlling Uncore voltage and frequency for ACPI based systems.
3 options:
1. MMIO / kernel driver.
2. PSCI via trusted firmware and system management controller.
3. ACPI (wrapping up an op region and SCMI)
Clarifications / discussions.
* Vincent G: Power states, or voltage / frequency of interest? Ans: Voltage / frequency.
* Considered SCMI? Ans: Works only for DT as SCMI under ACPI is wrapped up in AML so looks like an ACPI interface.
* Sudeep H: Necessary to trace CPU freq? Yes.
* Sudeep H: Why not do it in firmware entirely? Ans: Not just CPU. For example, a PCI device accessing memory may well need the ring bus to be fast.
* Vincent G: Bandwidth affected? Yes. VG: mobile does this by specifying a BW requirement (via SCMI).
* Sudeep H: Observed need to expose it via ACPI spec (option 3 above).
* Sudeep H: Does PCI also need fine-grain control? We might need to add to the spec.
* Sudeep H: What are the requirements? guohanjun: For now we just need frequency scaling.
* Jonathan C: Noted PCI power state is not enough. It's workload dependent.
* Sudeep H: We need to gather all the info, need to talk in ASWG about DVFS.
* Jonathan C: For now direct control probably makes sense. Whilst it would be nice to have a detailed enough system description in a standard way to make general software, that is a big spec job.
* Jonathan C: Seems like true standard SW will not happen any time soon.
AI: RFC to linux-pm / linux-acpi, Rafael, and those in this discussion to ask about interest in adding per-device DVFS to the ACPI spec. Possibly pursue a code-first ACPI approach.
If I've mislisted or "volunteered" anyone for AIs they didn't agree to, then please correct that.
Thanks all for contributions. I for one found it a very useful call!
Jonathan