On Mon, Nov 23, 2020 at 3:13 PM Lorenzo Pieralisi via Linaro-open-discussions linaro-open-discussions@op-lists.linaro.org wrote:
On Thu, Nov 05, 2020 at 12:36:21PM +0000, Jonathan Cameron via Linaro-open-discussions wrote:
Hi all,
This is a fusion of Mike's notes and my own. Please add anything I
missed!
People may well be misidentified (sorry about that). Was very good
active discussion.
Thanks to all involved and Mike in particular for organizing it and
taking live notes.
General requests
- Slides for all topics next time to introduce them, as not everyone
on the call will have the necessary
background (and those that do might need reminding!).
- Hanjun is sending Mike his slides (uncore DVFS) to add to the
collaborate page.
IORT - Reserved memory regions (RMR)
- Shameer gave summary
- IORT Revision E (https://developer.arm.com/documentation/den0049/latest/) introduced a new node type.
- A way to describe memory regions that should have unity mapping in
the SMMU.
- Use case is a PCIe RAID card that has FW that uses a pool of host
memory (hidden from OS).
- Status
- Patches out for ACPICA
- Question raised by ACPICA reviewers on whether spec is final
- Spec appears final (Lorenzo to check) but there may be a minor unrelated
fix to the doc still to come (Sami).
- Patches out for kernel on relevant lists.
- Mail from Steven Price (Arm), who is interested in this for the EFI
framebuffer use case (Sami Mujawar, who was on the call, is
also involved).
- Open questions
- Equivalent from AMD has a flag to indicate that the unity mapping is only
needed until the driver has taken over
(end of kernel boot assumed). Avoids an issue of holes in the address
space for VMs.
- Huawei is not raising this as a requirement, but Lorenzo observed that it is
interesting and deserves discussion.
- Kexec interaction needs discussion. Steve is looking at this and will
bring it to the list.
- Lorenzo brought up issue of IORT spec using PCI BDF (stream ID?)
which may be reenumerated.
- Noted x86 doesn't do this but ARM traditionally does.
- There is a DSM that tells the kernel not to reenumerate the PCI bus
which ACPI obeys.
- Jonathan's suggestion was that there is potentially an opportunity to cache
the original stream ID before doing the
reenumeration in the kernel.
- Lorenzo observed we may need a universal solution for all OSes on
this.
Lorenzo took AI to go away and think about it before next call.
- Issues stalling the patch? Probably only Kexec, though we should be
careful around possible future
regressions on the BDF issue (not a blocker).
- Related DT story. Huawei server team not interested as they have no DT support
and can't test.
Lorenzo suggested looping in Thierry Reding and referencing a patch set (probably
https://lore.kernel.org/linux-iommu/20200904130000.691933-1-thierry.reding@g... )
Huawei more than happy to have others add the DT support :)
AI summary:
- Kexec discussion - on list.
- Use of BDF discussion - revisit here next time.
I have an update on this topic and the RMR flags to free up IOVA.
Is December 2nd confirmed as next session?
It is if we have an agenda, which we now do, I added your topic as confirmed.
If possible 3PM GMT works better for me;
Linaro has a standing internal meeting with its members at 3.00 pm on December 2nd, how is 4.30 pm GMT or retain the 2.00 pm slot?
the NUMA topic raised by Jonathan is another interesting topic for debate. Other than that we can slot in the topics that weren't discussed last time:
https://collaborate.linaro.org/display/LOD/Linaro+Open+Discussions+Home
even though those require a bit of preparation so the sooner we finalize the schedule the better.
Please let me know, thanks.
I think we should prepare any topic slides and have the call, does anyone have any additional agenda items?
Lorenzo
- DT alignment. Don't want different solutions for each firmware type.
- Lorenzo / Sami to check IORT revision E is final.
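
To make the BDF concern in the notes above a bit more concrete, here is a minimal sketch (not taken from the IORT patches) of how a PCI requester ID built from bus/device/function is translated to a stream ID through a simplified IORT-style ID mapping, output = output_base + (input - input_base). The mapping values, the reserved-region table and all function names are hypothetical. It only illustrates why a firmware table keyed on the boot-time ID stops matching once the bus is renumbered, and how caching the original ID before reenumeration would keep the lookup working.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


def requester_id(bus: int, dev: int, fn: int) -> int:
    """Pack a PCI bus/device/function triple into a 16-bit requester ID."""
    assert 0 <= bus < 256 and 0 <= dev < 32 and 0 <= fn < 8
    return (bus << 8) | (dev << 3) | fn


@dataclass(frozen=True)
class IdMapping:
    """Simplified ID mapping entry; the values used below are made up."""
    input_base: int
    num_ids: int       # number of input IDs covered by this entry
    output_base: int

    def translate(self, rid: int) -> Optional[int]:
        """Return the output (stream) ID, or None if rid is out of range."""
        if self.input_base <= rid < self.input_base + self.num_ids:
            return self.output_base + (rid - self.input_base)
        return None


# Hypothetical root-complex mapping: every requester ID, offset by 0x10000.
RC_MAP = IdMapping(input_base=0x0, num_ids=0x10000, output_base=0x10000)

# Hypothetical firmware description: the device sat at 01:00.0 at boot and
# its stream ID is tied to a (base, size) region that needs a unity mapping.
fw_sid = RC_MAP.translate(requester_id(0x01, 0x00, 0))
RMR_BY_STREAM_ID: Dict[int, Tuple[int, int]] = {fw_sid: (0x8000_0000, 0x10_0000)}


def rmr_for(sid: int) -> Optional[Tuple[int, int]]:
    """Look up the reserved region for a stream ID, if any."""
    return RMR_BY_STREAM_ID.get(sid)


# The OS renumbers the bus: the same device now appears at 04:00.0.
new_sid = RC_MAP.translate(requester_id(0x04, 0x00, 0))

# Looking up with the new ID misses; the ID cached before renumbering hits.
cached_sid = fw_sid
print(hex(fw_sid), hex(new_sid), rmr_for(new_sid), rmr_for(cached_sid))
```

Whether such a cache belongs in the kernel or needs a firmware-level answer for all OSes is exactly the open question recorded above; the sketch takes no position on that.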
SVA
Zhangfei gave summary:
- Huawei has devices that are not PCIe but are presented as such.
- They support stall mode for SVA (spec violation)
- Resistance from kernel maintainers to maintaining a white list for
any quirk. Fine to fix
it once (JPB), but not to keep doing so.
- Note that stall mode not yet supported at all (JPB to send out this
cycle).
- If a longer-term fix is needed and can't be done via PCISIG etc., then need to
convince the
PCI and SMMU maintainers. Noted that the quirk is very little code.
- Other SVA topics.
- Mentioned virtual SVA (no actual problems, just expressing
interest!)
- Would need Eric Auger; this wasn't on the topic list so Eric was not on the call.
AI: Nothing planned until after JPB has upstreamed stall mode. Hard to
have discussion before that.
DVFS
guohanjun
Solutions exist for
- CPU DVFS (voltage + frequency scaling)
- PCIe device power states etc
No standard way of controlling Uncore voltage and frequency for ACPI
based systems.
3 options:
- MMIO / kernel driver.
- PSCI via trusted firmware and system management controller.
- ACPI (wrapping up an op region and SCMI)
Clarifications / discussions.
- Vincent G: Power states, or voltage/frequency of interest? Ans:
Voltage/frequency.
- Considered SCMI? Ans: Works only for DT, as SCMI under ACPI is
wrapped up in AML
so looks like an ACPI interface.
- Sudeep H: Necessary to trace CPU freq? Yes.
- Sudeep H: Why not do it in firmware entirely? Ans: Not just CPU.
For example a PCI device accessing
memory may well need the ring bus to be fast.
- Vincent G: Bandwidth affected? Yes. VG: mobile does this by
specifying a BW requirement (via SCMI).
- Sudeep H: Observed need to expose it via ACPI spec (option 3 above).
- Sudeep H: Does PCI also need fine-grain control? We might need to add
to the spec.
- Sudeep H: What are the requirements? guohanjun: For now we just need
frequency scaling.
- Jonathan C: Noted PCI power state is not enough. It's workload
dependent.
- Sudeep H: We need to gather all the info, need to talk in ASWG about
DVFS
- Jonathan C: For now direct control probably makes sense. Whilst it
would be nice to have
a detailed enough system description in a standard way to make
general software, that is a
big spec job.
- Jonathan C: Seems like true standard SW will not happen any time soon.
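
As a rough illustration of the bandwidth-requirement approach Vincent mentioned, the sketch below has each device vote for the bandwidth it needs and then picks the lowest uncore operating point that covers the sum. Everything in it is hypothetical: the operating points, the bandwidth figures and the function name are invented for the example, and no ACPI, SCMI or kernel interface is implied.

```python
from typing import Dict, List, Tuple

# Hypothetical uncore operating points, sorted by frequency:
# (frequency in MHz, bandwidth the fabric can sustain at that point in GB/s).
UNCORE_OPPS: List[Tuple[int, float]] = [
    (800, 20.0),
    (1200, 32.0),
    (1600, 45.0),
    (2000, 60.0),
]


def pick_uncore_freq(votes: Dict[str, float]) -> int:
    """Return the lowest frequency whose bandwidth covers all votes combined.

    If the combined request exceeds every operating point, fall back to the
    highest available frequency.
    """
    total = sum(votes.values())
    for freq_mhz, bandwidth in UNCORE_OPPS:
        if bandwidth >= total:
            return freq_mhz
    return UNCORE_OPPS[-1][0]


# Example: a NIC and a RAID controller each state what they currently need.
print(pick_uncore_freq({"nic": 25.0, "raid": 10.0}))
```

The point of the sketch is only that the request is workload dependent rather than tied to a PCI power state, which matches Jonathan's observation above.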
AI: RFC to linux-pm / linux-acpi, Rafael and those in this discussion
to ask about
interest in adding per-device DVFS to the ACPI spec. Possibly pursue
the code-first ACPI
approach.
If I've mislisted or "volunteered" anyone for AIs they didn't agree to
then please
correct that.
Thanks all for contributions. I for one found it a very useful call!
Jonathan
-- Linaro-open-discussions mailing list https://collaborate.linaro.org/display/LOD/Linaro+Open+Discussions+Home https://op-lists.linaro.org/mailman/listinfo/linaro-open-discussions
On Mon, 23 Nov 2020 16:44:56 +0000 Mike Holmes mike.holmes@linaro.org wrote:
[...]
If possible 3PM GMT works better for me;
Linaro has a standing internal meeting with its members at 3.00 pm on December 2nd, how is 4.30 pm GMT or retain the 2.00 pm slot?
I have another call at 4.30 but any time before that works for me. If that's a problem for Lorenzo, perhaps we can shift the day slightly or go earlier in the day?
I think we should prepare any topic slides and have the call, does anyone have any additional agenda items?
I'll chase up our end over the next few days,
Thanks,
Jonathan
On Mon, Nov 23, 2020 at 05:44:15PM +0000, Jonathan Cameron wrote:
[...]
If possible 3PM GMT works better for me;
Linaro has a standing internal meeting with its members at 3.00 pm on December 2nd, how is 4.30 pm GMT or retain the 2.00 pm slot?
I have another call at 4.30 but any time before that works for me. If that's a problem for Lorenzo, perhaps we can shift the day slightly or go earlier in the day?
The only slot I can do in the afternoon is 3PM to 5PM on Wednesdays.
Morning is perfectly fine but it may be too early for some people.
I really can't make 2PM, I am sorry. Maybe we can try in the morning?
Please let me know, thanks a lot, Lorenzo