Dear LAVA Community,
I am reaching out to report a regression (or significant change in
behavior) regarding how the LAVA dispatcher handles high-frequency terminal
output and escape sequences during flashing operations.
Background:
After upgrading from LAVA 2022.06 to LAVA 2026.02, our
flashing jobs, which use mfgtoolcli (NXP) to write images to i.MX6/8
devices, have begun failing consistently with a deploy-flasher timeout.
The Issue:
The flashing tool outputs a high-frequency progress bar using carriage
returns (\r) and ANSI escape codes (e.g., \e[1F, \e[2K).
1. Log Bloat: In 2022.06, these sequences were handled gracefully. In
the current version, every escape sequence is captured as a new log entry,
resulting in logs exceeding 5 MB for a single flash.
2. Dispatcher Lag: The dispatcher appears to become a bottleneck while
processing this flood of data. This processing lag causes the internal
action timer to hit the default 500-second limit, even when the physical
flashing process succeeds locally on the worker.
3. Timeout Overrides: We’ve observed that deploy-flasher often ignores
the timeout values specified in the Job YAML, defaulting to 500 seconds
unless explicitly overridden in the Device Dictionary.
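To make the log-bloat point concrete, here is a minimal Python sketch of how the progress output collapses once the escape sequences and carriage-return redraws are removed. It is an illustration only, not LAVA code: the function name collapse_progress and the sample string are hypothetical, and the function does not emulate cursor movement (for example \e[1F), it simply keeps the last redraw of each line.

```python
import re

# ANSI CSI sequences: ESC [ parameters, intermediates, final byte.
# Covers ESC[1F (cursor up), ESC[2K (erase line), ESC[?25l (hide cursor).
CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def collapse_progress(raw: str) -> list[str]:
    """Roughly what remains visible once all redraws are finished.

    Hypothetical helper for illustration; not part of LAVA or uuu.
    """
    text = CSI_RE.sub("", raw)
    final_lines = []
    for line in text.split("\n"):
        # Each "\r" starts a redraw of the same line, so only the last
        # non-empty segment is kept.
        segments = [s for s in line.split("\r") if s.strip()]
        if segments:
            final_lines.append(segments[-1].strip())
    return final_lines


# Made-up sample resembling the tool's output: three redraws, then "Done".
sample = (
    "\x1b[?25l\x1b[1F\x1b[2K 1  10%\r"
    "\x1b[2K 1  50%\r"
    "\x1b[2K 1 100%\nDone\n"
)
print(collapse_progress(sample))
```

For this sample, three redraws of the progress line reduce to a single entry plus the final "Done", which is the behavior we would expect the log to reflect.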
Error Examples:
finish programming rootfs-a>> [1F
[2K 1 100%
[=============================================== ... continues >]
Done
deploy-flasher timed out after 500 seconds
>> uuu (Universal Update Utility) for nxp imx chips -- libuuu_1.4.243-0-ged48c51>> >>
Wait for Known USB Device Appear...
[?25l [1F [1F [1F [1F [1F [1F1:241312 1/ 0 [
]
>> [1F [1F [1F [1F [1F [1F
[1F [1F [1F [1F [1F [1F [1F [1F [1F [1F [1F [1F [1F [1F [1F [
Steps we have attempted without success:
- Setting TERM=dumb and piping to cat (the tool continues to output
sequences).
- Using tr -d '\r' (reduces lines but the data volume still triggers the
timeout).
- Increasing timeouts in the Job YAML (often ignored by the sub-action).
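One further option would be a wrapper that runs the flashing command and forwards only occasional progress updates to the dispatcher. The sketch below is a hedged illustration under stated assumptions, not a confirmed fix: the names throttle_progress and run_filtered are hypothetical, the progress line is assumed to contain a percentage such as "42%", and the command passed in would be whatever flasher command is already configured.

```python
import re
import subprocess
from typing import Iterable, Iterator

# ANSI CSI sequences such as ESC[1F, ESC[2K, ESC[?25l.
CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")
# Assumption: progress lines carry a percentage like "42%".
PERCENT_RE = re.compile(r"(\d{1,3})%")


def throttle_progress(chunks: Iterable[str], step: int = 10) -> Iterator[str]:
    """Yield cleaned lines; pass a progress line only once per `step` percent."""
    last_bucket = -1
    for chunk in chunks:
        clean = CSI_RE.sub("", chunk)
        for piece in re.split(r"[\r\n]+", clean):
            piece = piece.strip()
            if not piece:
                continue
            match = PERCENT_RE.search(piece)
            if match is None:
                yield piece  # ordinary message, always forwarded
                continue
            bucket = int(match.group(1)) // step
            if bucket != last_bucket:
                last_bucket = bucket
                yield piece


def run_filtered(cmd: list[str], step: int = 10) -> int:
    """Run `cmd`, print its throttled output, and return its exit code."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,  # universal newlines: "\r" also ends a line
    )
    assert proc.stdout is not None
    for line in throttle_progress(proc.stdout, step):
        print(line, flush=True)
    return proc.wait()
```

With step=10, a full 0 to 100% run forwards at most eleven progress lines per file instead of one per redraw. Whether this is enough to avoid the timeout in 2026.02 is something we have not verified.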
Question:
Has there been a change in the dispatcher's log-collection priority or
buffering logic that would cause this bottleneck? Are there recommended
"best practices" for handling tools that force interactive progress bars in
the newer LAVA architecture?
Best regards,
Pavan Kumar