Hi, I am trying to make use of environment variables defined on my lava-server inside interactive tests.
I have had some success with this, but I am not sure why one method works over the other.
For example, I do something like the following:
echo 'wget --user=some-name --password=${SECRET_PASSWORD}' > somefile.sh
And that works just fine. But when a script already exists on the DUT and just takes the password as a parameter, it doesn't work.
Example:
some_script.sh -p ${SECRET_PASSWORD} -u some-name
I have also tried accessing it directly in the script, similar to what somefile.sh would look like on the inside, but had no success with that.
I have tried exporting the environment variable as another one to see if that would work, but to no avail.
Is there something I'm missing? What can I do to achieve this without hardcoding the secret anywhere in my scripts?
Regards,
Michael
Hi,
I upgraded my LAVA instance to 2025.04 yesterday. I skipped one release,
so the upgrade was from 2024.09. Most of the jobs I run use
deploy to flasher with a custom script handling the board flashing. Up
until 2024.09 (I don't know about 2025.02), "downloads" URLs resulted
in all files being stored in a flat directory. My flashing scripts made use
of this "feature". In 2025.04 this has changed. This commit:
https://gitlab.com/lava/lava/-/commit/4d9f0ebdae9ca53baf6633f4a35e716183bd2…
makes the files be stored in separate directories. It feels like a step
in the right direction, but it broke my flashing scripts. As a quick
fix I added this at the beginning of the script:
# Flatten the new per-download directories back into the current directory.
# Note: this assumes each directory contains exactly one downloaded file.
DOWNLOAD_DIRS=$(find . -maxdepth 1 -mindepth 1 -type d -printf '%f ')
for d in $DOWNLOAD_DIRS; do
    FILE_TO_COPY=$(find "./$d" -maxdepth 1 -mindepth 1 -type f -printf '%f')
    # Hard-link the file back to its old (flat) location so the existing
    # flashing scripts keep working unchanged.
    ln "$PWD/$d/$FILE_TO_COPY" "$PWD/$FILE_TO_COPY"
done
This isn't great, but it solves the issue for now. After a short
discussion with Chase we came to the conclusion that the solution that looks
"more correct" is to implement support for "uniquify" for "download"
URLs. Chase sent a patch here:
https://gitlab.com/lava/lava/-/merge_requests/2795
This should be available in the next release.
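If I understand the patch correctly, a job should then be able to ask for the old flat layout with something along these lines (the key name and its placement are my guess until the MR is merged, so treat it as a sketch only):

- deploy:
    to: downloads
    uniquify: false   # assumed option: keep the pre-2025.04 flat layout
    images:
      kernel:
        url: https://example.com/Image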
I hope this message helps someone with similar issues :)
Best Regards,
Milosz
Hi,
I am trying to write a test to conditionally extract files from a zip file for use in some scripts, e.g. unzip the file, copy/move the files I need out, remove the zip file and directory created from unzipping it.
However, the zip file size exceeds the disk space of the DUT.
I have considered doing something like a MultiNode job, where the DUT is booted as well as a QEMU instance, and the file is downloaded and extracted
on the QEMU and moved onto the DUT, but I would prefer not to do that if possible.
Is there any way to download the zip file mid-test to the worker instead, extract the files I need, and move them onto the DUT? Ideally this would need no modifications to
the LAVA code, but if that is necessary then I can figure out an interim solution and come back to it later.
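For illustration, something along these lines is what I am imagining, assuming a deploy to "downloads" with a docker postprocess (as used elsewhere on this list) runs on the worker; the URL, image key and file names are placeholders:

- deploy:
    to: downloads
    images:
      archive:
        url: https://example.com/big-archive.zip
    postprocess:
      docker:
        image: debian:bookworm
        steps:
          - apt-get update && apt-get install -y unzip
          - unzip big-archive.zip needed-file-1 needed-file-2
          - rm big-archive.zip

The open question is then how to get the extracted files from the worker onto the DUT.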
Regards,
Michael
Hi,
I am trying to get LDAP working in my LAVA instance, and I have managed to get logging in working. The issue comes when I try to use the django-auth-ldap MIRROR_GROUPS setting.
I am already aware that LAVA does not expose this as a configurable setting, and I have already taken steps to add the relevant lines in lava_server/settings/common.py to make it work (initialize it to None, read it with values.get() in the update method, then run eval on the value in the LDAP if-section), but it still won't work. All the other required settings are clearly working, and I can even set USER_FLAGS_BY_GROUP just fine, but I would prefer to mirror certain groups that users are members of and assign permissions to the groups.
Do I need to pre-create the groups before logging in to LAVA or am I missing something else/doing something wrong?
Original common.py source: https://gitlab.com/lava/lava/-/blob/master/lava_server/settings/common.py
Hi All,
I am trying to set up LAVA and execute test cases. I am using an Intel x86-64 machine.
LAVA is able to boot Linux without any issues, but it gets stuck just before the login prompt.
I am able to boot the machine with the manual procedure.
Please find the log below.
[ OK ] Started Berkeley Internet Name Domain (DNS).
[ OK ] Started Postfix Mail Transport Agent.
[ OK ] Reached target Multi-User System.
[ OK ] Reached target Host and Network Name Lookups.
         Starting Record Runlevel Change in UTMP...
[ OK ] Finished Record Runlevel Change in UTMP.
Linux Update 6 intel-x86-64 ttyUSB0
<Login prompt print expected here ("intel-x86-64 login:"), but not displayed on the log>
bootloader-interrupt timed out after 280 seconds
end: 2.3.2 bootloader-interrupt (duration 00:04:40) [common]
case: bootloader-interrupt
case_id: 185
definition: lava
duration: 280.00
extra: ...
level: 2.3.2
namespace: common
result: fail
health check yaml file snippet:
- boot:
timeout:
minutes: 5
method: ipxe
commands: nfs
failure_retry: 3
prompts:
- '[a-zA-Z0-9\-\_]+@[a-zA-Z0-9\-\_]+:.*?#'
auto_login:
login_prompt: "intel-x86-64 login:"
username: root
password_prompt: "password:"
password: root
login_commands:
- cat /proc/cmdline
- cd /
- ls
- test:
failure_retry: 3
timeout:
minutes: 5
definitions:
- repository:
metadata:
format: Lava-Test Test Definition 1.0
name: smoke-tests-basic
description: "Basic system test command for Linaro Ubuntu images"
run:
steps:
- printenv
Thanks
Rajesh
We have a requirement for some test cases that the artifacts need to be modified (negative scenarios) in order
to verify certain behaviour. We are currently using postprocess under the deploy action for this purpose.
As part of postprocess, we need to mount partitions of the disk image and make some changes based on the requirement.
However, we are unable to mount the partitions using the losetup command.
docker:
image: debian:bookworm
steps:
- apt-get update
- loop_dev=$(losetup -P -f --show image.wic)
- echo $loop_dev
The job failed with the error below:
losetup -P -f --show image.wic
losetup: cannot find an unused loop device
It looks like the docker container needs root privileges; it works when we run it locally with "privileged: true".
Is there any way we can pass the "privileged: true" option to the docker container through LAVA job definitions?
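For reference, the unprivileged workaround we are currently looking at swaps losetup for libguestfs, which does not need loop devices or a privileged container; this is only an untested sketch (package list, partition number and file paths are examples):

postprocess:
  docker:
    image: debian:bookworm
    steps:
      - apt-get update && apt-get install -y libguestfs-tools linux-image-amd64
      # guestfish edits the image through its own appliance instead of mounting it
      - guestfish --rw -a image.wic -m /dev/sda2 upload modified.conf /etc/app/app.conf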
Any inputs regarding this matter are highly appreciated.
Hello,
I'm trying to verify secure boot on LAVA lab, I've verified the positive case of secure boot and confirmed the secure boot is enabled from the boot logs (kernel-start-message = "EFI stub: UEFI Secure Boot is enabled.").
Next, I would like to verify the negative test case as well, by tampering with the secure boot image, where I expect the boot to fail.
Is there a way in LAVA to make the job complete (e.g. by waiting for a specific message in the boot logs, like "Can't find image information.") even though the image fails to boot on the DUT?
Ref. https://lava.ciplatform.org/scheduler/job/1236667#L81
Hi,
I am not able to run multiple jobs in parallel on my single-node LAVA server: while one job is running, the second job waits until the first is completed. I have tried to increase the job limit of my worker to 3 via the LAVA admin UI, but I still cannot run 2 jobs in parallel. Could you please let me know how I can increase the number of parallel jobs?
Hi,
I have been looking into how to secure LAVA; so far it has mostly been learning how it works. There's the obvious HTTP/HTTPS side, and the documentation mentions firewalls for remote workers (which we'll be using anyway). We already have measures in place for physical security and plans on how to improve them, so no worries there.
Is there anything not covered in the documentation that should be considered, and how could we go about putting that into place?
Best regards,
Michael
I saw the file 'lava-dispatcher/actions/boot/gdb.py' in the LAVA source code, which seems to define a gdb method for the boot action. But I didn't find a description of it in the LAVA documentation; the document I checked is `https://docs.lavasoftware.org/lava/actions-boot.html?highlight=boot#boot-action-reference`.
Reading the source code may lead to misunderstandings and is time-consuming. I hope someone who knows this method can explain to me what it does. Thank you very much!
I want to add some key:value pairs to the metadata of the JSON that is POSTed by the job notification, but I cannot figure out how. I tried adding metadata in the job definition and in the test definition, but the returned JSON did not have my data in the metadata field; it was just [].
Then I read the LAVA source code and found that the metadata is produced by the method `get_metadata_dict()` in the class `TestJob`. This method is called from `create_job_data` in `TestJob`, which creates the JSON data that will be posted to the callback URL. `get_metadata_dict()` checks whether there is an attribute `testdata`, but I cannot find where this attribute is assigned.
Could someone show me a way to do this? Thank you very much! 🙏
Hello,
We are testing our custom Debian image and have added banner messages before and after login; since this change, the LAVA job has been failing on auto-login.
Please find example banner messages:
|_ _/
| |
| |
_| |
|_____\
Info:
Example.....: ABC
Example......: ABC
IP:
etho.......:123
Info:
1. some info
Some Info:
1. ###
Some more info: ####
login :
It looks like LAVA gets confused by the large banner message printed before the login prompt, so it gets stuck at auto-login and times out. Please suggest whether we can make LAVA ignore this banner, log in and execute the test case.
Please find the job details:
device_type: qemu
job_name: banner
timeouts:
job:
minutes: 30
action:
minutes: 25
connection:
minutes: 5
priority: medium
visibility: public
context:
arch: amd64
lava_test_results_dir: "/home/lava-%s"
actions:
- deploy:
timeout:
minutes: 15
to: tmpfs
images:
rootfs:
url: file:///home/image
image_arg: XXXX
- boot:
timeout:
minutes: 15
method: qemu
media: tmpfs
prompts:
- 'login:'
auto_login:
login_prompt: "login:"
username: abc
password_prompt: "Password:"
password: "abc"
- test:
timeout:
minutes: 15
definitions:
- repository: git(a)test.git
from: git
path: test.yaml
name: basic
Thanks,
Sweta
Hello,
I'm currently using version 2024.09, and when running the test monitor I encounter the following error at the top of my job: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')). I don't notice anything unusual in the test logs. Could this be a bug in the test monitor?
Best Regards,
Andy
Hi,
I think there is something terribly wrong with job timeouts in the 2024.05
release. The total job timeout seems to be ignored. This was working fine
in the 2024.01 release. I'll try to debug it, but IMHO this is a pretty
critical bug that should be addressed immediately. I'll report it in
GitLab as well.
Best Regards,
Milosz
Hello,
I am spinning up a Debian image using QEMU and want to copy a custom package from my LAVA server to the Debian VM for testing. I can see that support for tar is planned https://docs.lavasoftware.org/lava/actions-test.html#inline-test-definition… . But is there any way I can easily include my custom package in the tarball created by the deploy action so that it is shipped to my VM?
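For context, the route I am experimenting with is simply committing the package into the test definition repository, since (as far as I understand) the whole repository ends up inside the overlay tarball that is deployed to the VM; the repository URL and file names below are placeholders:

- test:
    definitions:
    - repository: https://example.com/my-tests.git   # contains packages/custom.deb
      from: git
      path: install-custom.yaml                      # a test whose steps run: dpkg -i ./packages/custom.deb
      name: install-custom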
Hello,
I have multiple machines which I test with LAVA jobs using a primary SSH connection. As the OS on my devices gets reinstalled every now and then, I have to manually update the SSH public key before executing a job so that LAVA can connect to the DUT over passwordless SSH.
Is there a way I can include a script in the device dictionary to copy the keys automatically before requesting an SSH connection? As I understand it, there is no password option. Kindly suggest.
Hello,
I want to spin up an image with TPM enabled using QEMU, and I need to execute the swtpm socket command before the image boots via QEMU. Please suggest whether I could execute this command or script in the qemu.jinja2 file or any other file.
Thanks,
Sweta
Hi,
I want to clone a GitLab repo with an ID and token during job execution. To hide the token, I am passing the value in my job. How can I export the value of GIT_TOKEN in the environment file of LAVA, so that whenever a job is executed it reads the value from that environment file? I tried using export, and tried adding "GIT_TOKEN=abcd" to the env.yaml file, but I guess that is not the correct format; I couldn't find in the documentation what kind of values we can add to env.yaml. I also added the GIT_TOKEN value to /etc/profile on the LAVA server, but the job still didn't pick it up. Could you please suggest how we can clone the repo in a LAVA job without exposing its password? I understand that the root SSH key would work, but I want to avoid using the root SSH key. Kindly suggest.
- test:
timeout:
minutes: 15
definitions:
- repository: https://gitid:$GIT_TOKEN@gitlab.com/lava-tests.git
from: git
path: tests/cisscan/cis.yaml
branch: pipeline
name: cis-benchmark
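For reference, my understanding is that env.yaml expects a YAML mapping with "overrides" (and optionally "purge") rather than shell-style assignments, so what I plan to try next looks like the sketch below; the token is a placeholder and I have not yet confirmed that the job picks it up this way:

overrides:
  GIT_TOKEN: "abcd"
purge: false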
Hi all,
I ran into some unexpected behavior when it comes to LAVA job IDs
reported by the "lavacli jobs submit" command for multinode tests.
What I get is a list of newline separated job IDs like:
lavacli jobs submit <my-job-description>
132.0
132.1
To my understanding this tries to express that 132.1 is a sub-job of
132. OK, but this is a problem if I now try to download the JUnit
report for the second job (132.1).
It seems 132.1 is not a valid job ID (API-wise); it has to be
translated to 133 to work properly.
This is still doable, but I'm unsure whether I can rely on 133 really being
the 132.1 job in case there are several job submissions at the same
time.
Is that a bug in lavacli? What is the reasoning for returning a sub-job
ID which seems to be invisible to the rest of the LAVA infrastructure /
API?
I'm currently running lavacli 1.2-1, as shipped by Debian 12.
Any input welcome...
Best regards,
Florian
Hello,
I am new to LAVA. I want to boot up my hardware (a NET-I system, which is x86_64), install a Debian bookworm image on it and execute test cases. My hardware is connected to my LAVA server via a serial device. Below is the high-level job that I am planning to use. Where should I specify the serial device information and the other details? Could you please share an example job and device .jinja2 that show what information needs to be provided for the LAVA server to properly connect to the hardware? Also, is u-boot correct for my use case, or do you have other suggestions?
device_type: your_device_type
job_name: debian_boot_test
timeouts:
job:
minutes: 15
action:
minutes: 5
connection:
minutes: 2
actions:
- deploy:
timeout:
minutes: 10
to: tftp
images:
rootfs:
url: <Debian Image>
- boot:
method: u-boot
commands: your_device_boot_commands
prompts:
- 'login:'
parameters:
boot_options: root=/dev/ram0
- test:
timeout:
minutes: 5
definitions:
- repository:
metadata:
format: Lava-Test Test Definition 1.0
name: debian-check-test
description: "Test to verify Debian OS"
os:
- debian
run:
steps:
- echo 'Checking if OS is Debian'
- 'if [ "$(lsb_release -is)" = "Debian" ]; then echo "OS is Debian"; else echo "OS is not Debian"; fi'
from: inline
name: debian-check-test
path: inline/debian-check-test.yaml
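From the device dictionaries posted elsewhere on this list, my current guess is that the serial and power details do not go into the job at all but into the device dictionary, roughly like the sketch below; every template name, command and port here is a placeholder for my setup, not something I have working:

{% extends 'x86.jinja2' %}
{% set power_on_command = 'pdu-control on 3' %}
{% set power_off_command = 'pdu-control off 3' %}
{% set hard_reset_command = 'pdu-control reboot 3' %}
{% set connection_list = ['uart0'] %}
{% set connection_commands = {'uart0': 'telnet worker-host 7001'} %}
{% set connection_tags = {'uart0': ['primary', 'telnet']} %}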
Hello,
I would like to change the paths below to use /mnt:
"lava_test_dir": "/lava-%s",
"lava_test_results_dir": "/lava-%s",
I am using an immutable image and therefore don't have permission to write to /, so I cannot create a directory such as /lava-34 and therefore cannot execute tests. We have a /mnt partition which we can mount and use to run the test cases on our OS, but for that to work I need to change the path. I tried changing it in /usr/lib/python3/dist-packages/lava_dispatcher/deployment_data.py, but jobs fail once I change it. I have only changed the debian section, as the image we are using is a Debian image.
Could you please help me if there is any option to make these changes while executing the job, or in some other file for the devices?
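For reference, other jobs on this list seem to override these paths from the job context instead of editing deployment_data.py, so the next thing I plan to try is a sketch like this (assuming the context keys are honoured for my device type):

context:
  lava_test_dir: '/mnt/lava-%s'
  lava_test_results_dir: '/mnt/lava-%s'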
Thanks,
Sweta
Hi,
I am trying to set up LAVA on a single master/worker node.
When I submit a job it runs, but no device gets selected and I get no output in the UI. When I checked lava-scheduler, it is failing with the error below:
root@debian:/etc/apache2/sites-available# systemctl status lava-scheduler
× lava-scheduler.service - LAVA scheduler
Loaded: loaded (/lib/systemd/system/lava-scheduler.service; enabled; preset: enabled)
Active: failed (Result: start-limit-hit) since Tue 2024-07-02 18:52:36 IST; 7min ago
Duration: 1.472s
Process: 10345 ExecStart=/usr/bin/lava-server manage lava-scheduler --level $LOGLEVEL --log-file $LOGFILE $EVENT_URL $IPV6 (code=exited, status=0/SUCCES>
Main PID: 10345 (code=exited, status=0/SUCCESS)
CPU: 1.431s
Jul 02 18:52:35 debian systemd[1]: lava-scheduler.service: Deactivated successfully.
Jul 02 18:52:35 debian systemd[1]: lava-scheduler.service: Consumed 1.431s CPU time.
Jul 02 18:52:36 debian systemd[1]: lava-scheduler.service: Scheduled restart job, restart counter is at 5.
Jul 02 18:52:36 debian systemd[1]: Stopped lava-scheduler.service - LAVA scheduler.
Jul 02 18:52:36 debian systemd[1]: lava-scheduler.service: Consumed 1.431s CPU time.
Jul 02 18:52:36 debian systemd[1]: lava-scheduler.service: Start request repeated too quickly.
Jul 02 18:52:36 debian systemd[1]: lava-scheduler.service: Failed with result 'start-limit-hit'.
Jul 02 18:52:36 debian systemd[1]: Failed to start lava-scheduler.service - LAVA scheduler.
root@debian:/etc/apache2/sites-available# tail -f /var/log/lava-server/lava-scheduler.log
{% extends 'qemu.jinja2' %}
^^^^^^^^^^^^^^^^^^^^^^^^^
[Previous line repeated 977 more times]
File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1494, in is_up_to_date
return self._uptodate()
^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/jinja2/loaders.py", line 212, in uptodate
return os.path.getmtime(filename) == mtime
^^^^^^^^^^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded
Also, when I access /admin/lava_scheduler_app/device/ I get "500 Internal Server Error:
maximum recursion depth exceeded".
I have registered my worker node as debian, the hostname as debian, and the device type is qemu. Could you please tell me what I am missing?
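For what it's worth, the trace repeating {% extends 'qemu.jinja2' %} makes me wonder whether my device dictionary file is itself named qemu.jinja2 and is therefore extending itself. My understanding (an assumption on my part) is that the device file normally has its own name and only extends the device-type template, e.g.:

{# /etc/lava-server/dispatcher-config/devices/qemu-01.jinja2 #}
{% extends 'qemu.jinja2' %}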
Just as the subject says. I am using lava-test-case to confirm whether this particular command has finished running and succeeded, because the test has previously always ended prematurely and not done what I needed it to do. When I run the command manually outside of a LAVA test job it works just fine: no errors, and it doesn't end early. It does take a while to complete, but I don't think that's the issue, as I use wget to download some large files and those have taken several minutes longer than this one command is supposed to.
How do I get LAVA to wait for this command to end? Or change how it checks for command failure?
I get "Received signal: <ENDTC>" almost immediately after running the command, and subsequently the result=fail signal. I really don't know why it won't work, so if there is a change I can make to the files or a config somewhere, I would like to know.
Best regards,
Michael
Hi,
I was wondering if it is possible to write a multinode test to boot a device, run a script to do some setup, have it reboot one way or another, and then run tests.
There is intention for pipeline integration down the line, so if that isn't possible with multinode, I have considered setting it up so that one regular test runs the script and turns the DUT off, and then another test boots and actually tests the device. This (obviously) has its drawbacks, but it's something of a backup.
If I understand correctly, where certain actions are placed in a multinode test actually matters, so it is (in my mind) possible to have a boot->test to boot a DUT and use test commands to run a script, and after that another boot->test to boot the device (or just log in again) and actually run the tests. So the final action setup would be boot->test->boot->test, where the first two are for running the script and rebooting, and the latter two are for the actual testing.
If there are other ideas that are better or equally viable then I'm open to those too, but multinode was the first that came to mind because of secondary connections and the like.
Best regards,
Michael
Hello
I am experiencing trouble connecting to apt.lavasoftware.org from a VPS
located in the "Paris France SD6" datacenter hosted by gandi.net.
The problematic VPS IP address is 46.226.105.174.
Testing from another VPS, still hosted at gandi.net but located in a
different datacenter (Luxembourg), presents no issues. Both VPSes run
Debian 12 Bookworm.
The symptoms I'm experiencing seem to suggest the TLS handshake gets
truncated, as suggested by the 'gnutls-cli' output reported below.
-------------------------------------------------------------------------------
# gnutls-cli -d9999 -p 443 apt.lavasoftware.org
...
|<5>| REC[0x56498f5d7740]: Sent Packet[1] Handshake(22) in epoch 0 and length: 402
|<11>| HWRITE: wrote 1 bytes, 0 bytes left.
|<11>| WRITE FLUSH: 402 bytes in buffer.
|<11>| WRITE: wrote 402 bytes, 0 bytes left.
|<3>| ASSERT: ../../lib/buffers.c[get_last_packet]:1185
|<10>| READ: Got 0 bytes from 0x3
|<10>| READ: read 0 bytes from 0x3
|<3>| ASSERT: ../../lib/buffers.c[_gnutls_io_read_buffered]:593
|<3>| ASSERT: ../../lib/record.c[recv_headers]:1195
|<3>| ASSERT: ../../lib/record.c[_gnutls_recv_in_buffers]:1321
|<3>| ASSERT: ../../lib/buffers.c[_gnutls_handshake_io_recv_int]:1467
|<3>| ASSERT: ../../lib/handshake.c[_gnutls_recv_handshake]:1600
|<3>| ASSERT: ../../lib/handshake.c[handshake_client]:3075
|<13>| BUF[HSK]: Emptied buffer
*** Fatal error: The TLS connection was non-properly terminated.
-------------------------------------------------------------------------------
Could you kindly check whether my VPS is perhaps in a range of IP addresses
blacklisted by the service provider that hosts lavasoftware.org?
Thanks
j
Hi all,
I have been having issues doing solely boot and test over serial (the specific issue(s) are not diagnosed, but presumably something to do with telnet), but that's not the discussion here.
I want to try booting the device with the serial method and then deploying/running tests using SSH, since it seems like SSH can't power the DUT on and off. If that is wrong, then please let me know so I can try that.
I have two proposed ideas:
1) boot with serial -> deploy test (however works) -> run test with serial -> power off
2) boot with serial -> deploy test (however works) -> run test with ssh -> power off
I started with a multinode job (I still don't quite understand multinode yet) and this is my current job definition; could you please help point out the issues with it, as it doesn't work:
#multinode job for controller deployment
job_name: controller deploy test
protocols:
lava-multinode:
roles:
host:
context:
lava_test_results_dir: /tmp/lava-%s
device_type: controller
timeout:
minutes: 10
count: 1
guest:
context:
lava_test_results_dir: /tmp/lava-%s
request: lava-start
count: 3
expect_role: host
timeout:
minutes: 10
connection: ssh
host_role: host
timeouts:
job:
minutes: 15
action:
minutes: 5
connection:
minutes: 2
priority: medium
visibility: public
actions:
- deploy:
role:
- host
timeout:
minutes: 10
to: tftp
authorize: ssh
kernel:
url: file:///kernel.img
type: uimage
ramdisk:
url: file:///ramdisk.gz
compression: gz
dtb:
url: file:///u-boot.dtb
- deploy:
role:
- guest
timeout:
minutes: 10
to: ssh
protocols:
lava-multinode:
- action: prepare-scp-overlay
request: lava-wait
messageID: ipv4
message:
ipaddr: $ipaddr
- boot:
role:
- host
timeout:
minutes: 5
method: minimal
prompts: ["# "]
auto_login:
login_prompt: "login: "
username: root
- boot:
role:
- guest
timeout:
minutes: 5
prompts: ["# $"]
parameters:
hostID: ipv4 # messageID
host_key: ipaddr # message key
method: ssh
connection: ssh
- test:
role:
- host
timeout:
minutes: 15
definitions:
- repository:
metadata:
format: Lava-Test Test Definition 1.0
name: install-ssh
description: "install step"
scope:
- functional
run:
steps:
# messageID matches, message_key as the key.
- lava-send ipv4 ipaddr=$(lava-echo-ipv4 eth0)
- lava-send lava_start
- lava-sync clients
from: inline
name: ssh-inline
path: inline/ssh-install.yaml
- test:
role:
- guest
timeout:
minutes: 5
definitions:
- repository: https://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@github.com/MichaelPed/lava…
from: git
path: smoke-tests/smoke.yaml
name: smoke-tests
# run the inline last as the host is waiting for this final sync.
- repository:
metadata:
format: Lava-Test Test Definition 1.0
name: client-ssh
description: "client complete"
scope:
- functional
run:
steps:
- df -h
- free
- lava-sync clients
from: inline
name: ssh-client
path: inline/ssh-client.yaml
And this is my device dictionary, if you could please let me know what needs changing:
{% extends 'controller.jinja2' %}
{% set power_off_command = 'python /power.py 12 off' %}
{% set soft_reboot_command = 'reboot'%}
{% set hard_reset_command = 'python /power.py 12 reboot' %}
{% set power_on_command = 'python /power.py 12 on' %}
{% set connection_list = ['uart0'] %}
{% set connection_commands = {'uart0': 'telnet 10.60.2.209 7001'} %}
{% set connection_tags = {'uart0': ['primary', 'telnet']} %}
{% set bootm_kernel_addr = '/kernel.img' %}
If you guys have any questions please ask away! Also, if there is somehow something other than a multinode job that will do what I want then please let me know!
Best regards,
Michael
Hello! I want to run QEMU on the DUT and attach gdb to that QEMU from another console to simulate something like a bit flip in memory. What should I do? I just need the new console to run some gdb scripts, like `rust-gdb vmlinux -ex 'target remote /tmp/gdb_socket' -ex 'set *0x40001000 ^= (1<<5)' -ex 'c'`, so the new console does not need to be interactive; being able to run a few commands is enough.
Thanks!
Hello everyone,
I would like to integrate and run a bare-metal custom test suite (pytest) on
LAVA. Currently I am executing it manually from a host machine (where pytest is
configured and the pytest suite runs); the commands are sent to the
DUT and the results are collected on the host machine.
I am familiar with the lava interactive method, but it looks like it will not
suit this requirement (run pytest on the host and collect logs from the DUT).
Is LAVA suitable for this requirement? If someone could give me advice on
this, that would be a great support.
Regards
Nagendra S
Hi,
I want to test whether the system can still boot normally after many restarts (10,000 times).
Because every boot requires the auto_login of the boot action, it seems that a loop needs to be implemented in the YAML, and this loop needs to include the boot action. Are there any relevant examples I can refer to?
There are roughly two types of test logic (a sketch of one unrolled iteration follows the two outlines below):
1:
while (a<10000)
{
-boot
auto_login
- test
basic io test
software reboot in target board
a=a+1
}
2:
while (a<10000)
{
-boot
auto_login
- test
basic io test
a=a+1
hardware reboot in worker
}
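To make the second variant concrete, here is a sketch of what one unrolled iteration might look like; the method, prompts and reset behaviour are assumptions on my side, and since job YAML has no loop construct I imagine the full job would have to be generated by a script that repeats this pair of actions 10,000 times:

- boot:
    method: minimal
    reset: true            # assumed: power-cycle from the worker (hardware reboot)
    timeout:
      minutes: 5
    prompts:
    - 'root@.*#'
    auto_login:
      login_prompt: "login:"
      username: root
- test:
    timeout:
      minutes: 10
    definitions:
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: basic-io
          description: "basic io test after reboot"
        run:
          steps:
          - lava-test-case basic-io --shell ./basic_io_test.sh   # placeholder test script
      from: inline
      name: basic-io
      path: inline/basic-io.yaml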
Thanks
Hi,
I am trying to use transfer_overlay, as the filesystem on my DUT is read-only, although the /data/ and /tmp/ directories are writeable to some extent.
I have a working wget method, and it works when I use it manually to download the overlay tarball over HTTP, both on my DUT and on the worker device.
However, the LAVA test itself always fails with the error 'Network Unreachable'.
What are the possible reasons for this?
Best regards,
Michael
Hi Team
Currently, for developing and running LAVA tests, we always run on the LAVA master machine through lavacli. We would like to skip running from the LAVA master machine while developing, testing and triaging the tests. Only tests that are already known to work should run through CI/CD on the LAVA master; for development and triaging we want to use only the LAVA worker and the board. In short, we only want to use the LAVA master through CI/CD.
Is there any way to run tests directly from the LAVA worker instead of going through the LAVA master? I tried "lava-test-shell" and "lava-dispatcher", but they didn't work for me. Can I use the LAVA API for developing and triaging the test scripts?
Can you please recommend the best practice or setup for developing, testing and triaging the tests written for LAVA, so that anyone can run, develop or modify tests quickly, make sure they are successful, and only then run them through the LAVA master machine?
Regards,
Swapnil Tilaye
Hi all,
over the last days I was trying to bring up a setup where we simply try
to boot up a kernel and initrd combination that is provided by Debian
"as is". Sounds trivial at first but I ran into a problem with the
initrd:
The initrd contains some microcode, shipped as prepended uncompressed
archive. The microcode is added because the intel-microcode package was
installed during image generation.
When such an initrd enters the LAVA machinery, where the initrd is
first unpacked and later re-packed the file gets corrupted in a way
that the kernel is unable to unpack/use it. The end result was that NFS
boot did not work as the mounting of the rootfs did not take place.
Interestingly there is no error message, not within the unpack/repack
step in LAVA and the kernel does not complain later during the boot
sequence either.
To disable the LAVA unpack/repack sequence I had to modify my deploy
action. Setting install_overlay and install_modules [2] to false seems
to qualify as workaround:
- deploy:
timeout:
minutes: 15
to: tftp
kernel:
url: <kernel-url>
ramdisk:
url: <initrd-url>
install_overlay: false
install_modules: false
Looking at the initramfs-tools implementation [1] I think that
something similar is missing in LAVA.
Does that make sense? Ideas / comments?
Best regards,
Florian
[1] https://gitlab.com/lava/lava/-/blob/master/lava_dispatcher/actions/deploy/a…
[2] https://salsa.debian.org/kernel-team/initramfs-tools/-/blob/master/unmkinit…
--
Siemens AG, Technology
Linux Expert Center
Hi,
We are seeing the "Proxy Error" message below while loading large logs
from a LAVA job.
Is there any solution for loading a large log file from a LAVA job?
[image: image.png]
Hi All,
My board is configured at 55000 baud for serial communication and can work
only at this baud rate. Can we use LAVA with this configuration to get console
output in the test method? By default LAVA supports telnet with standard baud
rates. Is there a way to use 55000 baud for the UART and still get output in LAVA?
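For reference, the device dictionaries posted elsewhere on this list set the UART speed with a template variable, so I was hoping a non-standard value would simply be accepted there (whether ser2net and the template actually allow 55000 baud is exactly my question):

{% set baud_rate = 55000 %}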
Regards
Nagendra S
Hi,
In a previous thread where I was getting help to do a serial connection, I also needed help with the deploy action towards the end, so I'm just continuing that here.
I have been trying to get a deploy action working for a couple of days now, and have been unsuccessful so far. The obvious first hurdle is the image URLs for the various things needed for a tftp deploy action. I have compiled a U-Boot kernel, a DTB came with it, and I managed to sort out the few errors I got with those (for now at least). I'm not sure modules are necessary, but I can figure that out later. The biggest issue I'm having is with the ramdisk.
There are many resources online for creating a ramdisk, but that is just a space in RAM where things are stored temporarily (which I don't mind). This does not work for LAVA, though, as it seems to require a file. I have tried compressing the created directory and passing that, but because it is still just a directory, LAVA won't accept it after I start the job.
So I'm assuming I have to acquire some kind of image somehow. How can I do this? Is there an online source I can download from? Or, if I have to make it myself, how?
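For context, the closest I have come is packing a populated root filesystem directory into a gzipped newc cpio archive, which is the usual initramfs format (whether LAVA then also needs it wrapped as a U-Boot uImage is part of my question):

cd my-ramdisk-root
find . | cpio -o -H newc | gzip -9 > ../ramdisk.gz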
Best regards,
Michael
The process I expect is:
1. test board restart
2. catch the bootup log by serial port
3. deploy the test suite (test script) by ssh
4. run test case by ssh or uart (ssh is better)
5. These are concentrated in one yaml and all logs are captured.
Now I restart the test board with the code below; it works, and the board reboots successfully.
device:
actions:
deploy:
methods:
{% if flasher_reset_commands %}
flasher:
commands: {{ flasher_reset_commands }}
{% endif %}
test yaml
- deploy:
timeout:
minutes: 30
to: flasher
images:
kernel:
url: http://10.19.207.190/static/docs/v2/contents.html#contents-first-steps-using
But if I add ssh to the deploy methods in the device file (the test YAML remains unchanged), the device is reported as BAD.
The device file is as below:
{# device_type: orinnew #}
{% extends 'base.jinja2' %}
{% block body %}
actions:
deploy:
methods:
{% if flasher_reset_commands %}
flasher:
commands: {{ flasher_reset_commands }}
{% endif %}
ssh:
options:
{{ ssh_options }}
# primary connections get this from the device dictionary.
# secondary connections get this from the lava-multinode protocol support.
host: "{{ ssh_host|default('') }}"
port: {{ ssh_port|default(22) }}
user: "{{ ssh_user|default('root') }}"
identity_file: "{{ ssh_identity_file }}"
boot:
connections:
serial:
methods:
ssh:
minimal:
{% endblock body %}
1. Where did I go wrong to cause this problem?
2. Every time I paste code there are formatting problems. How can I paste code to this mailing list with an effect similar to Markdown?
Hey all,
I am currently trying to add a Raspberry Pi device to my LAVA installation, however it keeps saying that I have an invalid device configuration.
This is the device dictionary
{% extends 'bcm2711-rpi-4-b.jinja2' %}
{% set connection_list = ['uart0']%}
{% set connnection_commands = {'uart0': 'telnet 10.60.2.209 7002'}%}
{% set connection_tags = {'uart0': ['primary','telnet']}%}
{% set power_off_command = 'python /hbus_power.py off' %}
{% set hard_reset_command = 'python /hbus_power reboot' %}
{% set power_on_command = 'python /hbus_power on' %}
Is there anything I'm doing wrong with this? or any other ideas as to what could be wrong?
Hi,
I have started trying to connect my DUT and run tests via serial. I'm not sure if it's a me issue, or a documentation issue, but I can't seem to puzzle it together.
My DUT is connected to my worker via serial, and I've configured my ser2net.yaml file (the docs say it's a .config, but I found no such thing) with both the provided example and something based on what was already in there. The problems arise in the job definition and device dictionary. I've made a (pretty much) blank device template which extends base-uboot and does nothing else.
The device dictionary defines power_off_command, power_on_command, soft_reboot_command, and the connection_(list, commands, and tags). There are also the baud rate and console device.
However, I've also seen some device templates use a root device instead or alongside console device, what's the difference?
Whenever I use my worker IP for the telnet command, I get connection refused, but if I use the DUT IP the connection times out after 2 minutes.
I can connect to the device using minicom, so I don't think the serial connection is bad, and I have no problems using SSH on the device and other systems connected to my network.
The only information I have on the device is that it's ARMv5TEJ, it has a custom kernel and runs on otherwise custom hardware.
Below is my device dict:
{% extends 'test-template.jinja2' %}
{% set power_off_command = 'python /power.py off' %}
{% set soft_reboot_command = 'reboot' %}
{% set power_on_command = 'python /power.py on' %}
{% set connection_list = ['uart0'] %}
{% set connection_commands = {'uart0': 'telnet dispatcher01 7001'} %}
{% set connection_tags = {'uart0': ['primary', 'telnet']} %}
{% set console_device = 'ttyUSB0' %}
{% set baud_rate = 115200 %}
My questions are:
- Difference between root_device and console_device in device dictionary/template?
- Do I set the telnet command to use my worker IP or my DUT IP?
- Is there something other than telnet I'm meant to or can use?
- How am I actually meant to configure ser2net.yaml? (my current attempt is sketched after this list)
- How do I define the test job for serial to execute commands on the DUT?
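For the ser2net question, what I currently have in ser2net.yaml is based on the upstream ser2net 4.x examples and looks roughly like the sketch below; the device path, TCP port and baud rate are from my setup, and I am not sure this is what LAVA expects:

connection: &uart0
  accepter: telnet(rfc2217),tcp,7001
  connector: serialdev,/dev/ttyUSB0,115200n81,local
  options:
    kickolduser: true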
Regards,
Michael
Hi all,
Recently I updated the device dictionary for my devices on the LAVA server machine.
I added {% set user_commands = {'start_tpm': {'do': 'cmd1',
'undo': 'cmd2'}} %}
in the following file: /etc/lava-server/dispatcher-config/devices/qemu-01.jinja2
After updating, I was successfully able to use these user-defined commands in my job definitions via the command action wherever I needed them.
But a strange thing happens after making the above changes: I am unable to start a job, and it throws "Infrastructure error: Cannot find command '' in $PATH" when I use the minimal boot action in my job.
I cross-checked this in 2 ways:
1. I removed the user command changes from the device dictionary on my LAVA server, and my job starts and completes successfully even with a minimal boot action in the job.
2. I changed my boot action from (method: minimal) to (method: qemu, media: tmpfs) and retained the user command changes in my device template. In this case too, the job started and completed successfully.
So is there some conflict between adding commands to the device dictionary and using the minimal boot action in a job?
I am not sure how these two are related. My use case badly requires the minimal boot action in my job as well as some user-defined commands in the device template. Please have a look at my job definition and run log: https://lava.ciplatform.org/scheduler/job/1089996. I need the minimal boot action to finish the boot action only if I get a particular shutdown-message, or else I want it to time out and be incomplete.
Can anyone please help me how to resolve this issue ?
Hi All
I am trying to sort out using queries to make charts, and I am running into two problems.
1. Even after adding conditions, I cannot run queries because the button is disabled. I can get around this by changing it to a live query, but I would prefer to be able to use the actual button.
2. I can't view a chart.
When I try, I just get a loading screen that never ends. According to the console, the error is either undefined properties or functions that don't exist.
My query selects all jobs with a test suite named 0_smoke-tests, and my chart is Type: pass/fail, Representation: bar, X-axis attribute: none.
I am wondering if these are bugs or if I am doing something wrong.
If anyone can help that would be greatly appreciated
Thanks
Daniel
Hi,
As stated in the subject, I was wondering if it is at all possible to use a repo on Azure DevOps instead of a GitHub one.
I have been using a GitHub repo thus far, and it works just fine, but I need to move my files over to DevOps.
According to the docs here: https://docs.lavasoftware.org/lava/actions-test.html only git is supported at the moment, with URL and tar planned. Would URL work for this if support came in at some point, or would Azure need its own support?
Or should git work, and is there something wrong with the repo URL I am using? The particular error I get is that the password could not be read (I'm using a PAT in the link, like I did for GitHub).
Best Regards.
Michael
Hello Team,
I was trying to flash a binary onto the device and run the tests on the
device using the LAVA overlay.
The board flashes successfully, but while booting the device the boot log
contains the string "*** Invalid partition 3 ***"; LAVA catches the keyword
and throws an error message.
For the other devices it does not catch this error with the same boot log;
only on one device am I observing this issue.
Please let me know how to ignore the keyword and move on to run the tests.
matched a bootloader error message: 'Invalid partition' (17)
boot log:
U-Boot SPL 2022.04 (Nov 30 2023 - 20:10:15 +0000)
power_bd71837_init
DDRINFO: start DRAM init
DDRINFO: DRAM rate 1600MTS
DDRINFO: ddrphy calibration done
DDRINFO: ddrmix config done
Normal Boot
Trying to boot from MMC1
hab fuse not enabled
Authenticate image from DDR location 0x401fcdc0...
NOTICE: BL31: v2.6(release):lf-5.15.32-2.0.0-0-gc6a19b1a3-dirty
NOTICE: BL31: Built : 06:37:22, Jun 7 2022
U-Boot 2022.04 (Nov 30 2023 - 20:10:15 +0000)
CPU: i.MX8MMD rev1.0 1600 MHz (running at 1200 MHz)
CPU: Industrial temperature grade (-40C to 105C) at 44C
Reset cause: POR
Model:
DRAM: 224 MiB
board_init
Core: 61 devices, 21 uclasses, devicetree: separate
MMC: FSL_SDHC: 0
Loading Environment from nowhere... OK
In: serial
Out: serial
Err: serial
SEC0: RNG instantiated
BuildInfo: - ATF c6a19b1f
acmod value is 1!
pulse number is 0
flash target is MMC:0
Fastboot: Normal
Normal Boot
Hit any key to stop autoboot: 2
end: 3.4.2 bootloader-interrupt (duration 00:00:05) [common]
start: 3.4.3 bootloader-commands (timeout 00:02:52) [common]
Setting prompt string to ['=>']
bootloader-commands: Wait for prompt ['=>'] (timeout 00:02:52)
Setting prompt string to ['=>']
Sending with 5 millisecond of delay
setenv factorymode 1
u-boot=> setenv factorymode 1
bootloader-commands: Wait for prompt ['=>'] (timeout 00:02:51)
setenv factorymode 1
Sending with 5 millisecond of delay
boot
u-boot=> boot
boot
** Invalid partition 3 **
Couldn't find partition mmc 0:3
Can't set block device
uEnv not found in mmcpart3, checking mmcpart1
Failed to load '/boot/system/uEnv'
uEnv not found in mmcpart 1 either!
Booting from mmc ...
## Error: "w" not defined
matched a bootloader error message: 'Invalid partition' (17)
end: 3.4.3 bootloader-commands (duration 00:00:02) [common]
case: bootloader-commands
case_id: 204071
definition: lava
duration: 1.52
extra: ...
level: 3.4.3
namespace: common
result: fail
Best Regards
Pavan Kumar
Hello everyone,
I am trying to implement a test case scenario in which I need to confirm the partition scheme and environment variables after a reboot triggered by the watchdog I am using. So basically I need to be able to successfully log in to the image after the reboot triggered by the watchdog.
It feels like the link between the test job and the Linux image is lost after the reboot done by the watchdog; whatever actions come after that reboot (boot or test actions) do not work.
Below is the template of my Job definition:
job details...
device_type: qemu
#### (timeouts)
###(priority)
##(notify, context etc.)
actions
- deploy
images: ###
firmware: ###
- boot
auto_login:
login_prompt: "###"
username: "##"
password_prompt: "###"
password: "##"
- test:
definitions: ###
repository: ####
metadata: ###
run:
steps:
- swupdate -i ####
- reboot
-------------------------------------------------
At this stage I introduced a service which causes a kernel panic during the reboot (the reboot explicitly issued by me above in the job). So after 'X' seconds of watchdog timeout, a second reboot is triggered by the watchdog. During that reboot by the watchdog, the kernel booted successfully and all the services came up fine, but the test stopped at the login stage.
I think it did not get the "login_prompt" and "password_prompt" from the boot action I wrote after the above test action (i.e. after the reboot done by the watchdog).
- boot
auto_login:
login_prompt: "###"
username: "##"
password_prompt: "###"
password: "##"
So is there a way to add boot and test actions such that the test job can continue after a reboot done by a watchdog?
Note: when I explicitly provide "reboot" in the steps section of the test action, the link does not break and I am able to reboot, log in and run the test action steps successfully as many times as I want.
This query is specifically for cases in which the reboot happens outside the test writer's scope (i.e. a reboot triggered by a watchdog).
Hello everyone,
I would like to open a thread of discussion to understand LAVA test framework support for some of the use cases where I'm facing issues.
While testing a reboot scenario in CIP (https://gitlab.com/cip-project/cip-core/isar-cip-core) where the reboot is triggered by a watchdog, LAVA is unable to perform a successful reboot.
Following are the steps:
device_type: qemu
job_name: qemu x86_64 software update testing
timeouts:
job:
minutes: 20
action:
minutes: 10
actions:
power-off:
seconds: 60
priority: high
visibility: public
notify:
criteria:
status: finished
recipients:
- to:
method: email
email: sai.sathujoda(a)toshiba-tsip.com
context:
arch: x86_64
lava_test_dir: '/home/lava-%s'
# ACTION BLOCK
actions:
- deploy:
timeout:
minutes: 15
to: tmpfs
images:
system:
image_arg: '-drive file={system},discard=unmap,if=none,id=disk,format=raw -m 1G -serial mon:stdio -cpu qemu64 -smp 4 -machine q35,accel=tcg -global ICH9-LPC.noreboot=off -device ide-hd,drive=disk -nographic'
url: ######.wic.xz
compression: xz
firmware:
image_arg: '-drive if=pflash,format=raw,unit=0,readonly=on,file={firmware}'
url: ######
# BOOT BLOCK
- boot:
timeout:
minutes: 5
method: qemu
media: tmpfs
prompts: ["root@demo:~#"]
auto_login:
login_prompt: "demo login:"
username: "root"
password_prompt: "Password:"
password: "root"
# TEST_BLOCK
- test:
timeout:
minutes: 5
definitions:
- repository:
metadata:
format: Lava-Test Test Definition 1.0
name: sample-test
description: "check reboot version"
run:
steps:
- lava-test-case uname --shell uname -a
- cd /home
- wget --no-check-certificate ####
- lsblk
- swupdate -i cip-core-*
- reboot
from: inline
name: sample-test-1
path: inline/sample-test.yaml
- boot:
timeout:
minutes: 5
method: qemu
media: tmpfs
prompts: ["root@demo:"]
auto_login:
login_prompt: "demo login:"
username: "root"
password_prompt: "Password:"
password: "root"
- test:
timeout:
minutes: 5
definitions:
- repository:
metadata:
format: Lava-Test Test Definition 1.0
name: sample-test
description: "check partition switch"
run:
steps:
- lsblk
from: inline
name: sample-test-2
path: inline/sample-test.yaml
context:
arch: x86_64
lava_test_results_dir: '/home/lava-%s'
A reboot is triggered by the watchdog following the reboot done in the test action, due to the failure case. The reboot triggered by the watchdog fails with a timeout error at the login stage, which can be interpreted as the last boot action in the above job definition failing to see the configured login prompt.
I have already received some opinions from the LAVA user community that LAVA does not support the board being rebooted outside its control (whether by a watchdog or by a package).
However, CIP extensively uses LAVA as its test framework to regularly perform many kinds of tests on CIP-supported hardware.
Testing the watchdog is an important use case in CIP, and LAVA is supposed to be a test framework that can help to test many types of hardware.
We as CIP project members would like to understand the LAVA community's future plans to support this use case.
Thanks and Regards,
Sai Ashrith
Hi,
During the last couple of weeks I have had at least 2 occasions where
lava-publisher stopped working. There is nothing in the logs that
suggests a failure; the only symptom was that no events were published.
Restarting the service fixes the issue. Is this a known bug? I'm
running the 2023.10 release.
Best Regards,
Milosz