Hi lava users,
sometimes I see a kernel panic in memory allocator stress tests. How can I evaluate this? Is it possible to make a test case or job fail if a kernel panic occurs?
Matthias
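For reference, recent LAVA versions can fail a boot action when a given string appears on the console, via the failure_message parameter; a minimal sketch, assuming a u-boot device (method, commands and prompts are placeholders):

- boot:
    method: u-boot
    commands: nfs
    failure_message: 'Kernel panic - not syncing'
    prompts:
      - 'root@device:'

A panic that happens later, while the test shell is running, usually shows up as the job ending Incomplete once the connection times out.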
When 'systemd' is one of the ptests to run, we need to export the
directory that contains the systemd test data, to avoid failures in
some of the systemd ptest cases.
This patch exports the needed test data directory.
Signed-off-by: Khouloud Touil <ktouil(a)baylibre.com>
---
automated/linux/ptest/ptest.yaml | 1 +
automated/linux/ptest/set-systemd-test-data.sh | 18 ++++++++++++++++++
2 files changed, 19 insertions(+)
create mode 100755 automated/linux/ptest/set-systemd-test-data.sh
diff --git a/automated/linux/ptest/ptest.yaml b/automated/linux/ptest/ptest.yaml
index 6205c11..85f6893 100644
--- a/automated/linux/ptest/ptest.yaml
+++ b/automated/linux/ptest/ptest.yaml
@@ -21,5 +21,6 @@ params:
run:
steps:
- cd ./automated/linux/ptest
+ - . ./set-systemd-test-data.sh $TESTS
- PYTHONIOENCODING=UTF-8 ./ptest.py -o ./result.txt -t ${TESTS} -e ${EXCLUDE}
- ../../utils/send-to-lava.sh ./result.txt
diff --git a/automated/linux/ptest/set-systemd-test-data.sh b/automated/linux/ptest/set-systemd-test-data.sh
new file mode 100755
index 0000000..0a26b4b
--- /dev/null
+++ b/automated/linux/ptest/set-systemd-test-data.sh
@@ -0,0 +1,18 @@
+#!/bin/bash
+#------------------------
+# Some systemd tests need systemd test data
+# to pass.
+# This script checks whether 'systemd' is among
+# the requested ptests and exports SYSTEMD_TEST_DATA.
+#------------------------
+tests="$*"
+test_dirs="/usr/lib/systemd/ptest/tests/test /usr/lib64/systemd/ptest/tests/test /usr/lib32/systemd/ptest/tests/test"
+for t in $tests; do
+    if [ "$t" = "systemd" ]; then
+        for dir in $test_dirs; do
+            if [ -e "$dir" ]; then
+                export SYSTEMD_TEST_DATA=$dir
+            fi
+        done
+    fi
+done
\ No newline at end of file
--
2.17.1
Hi lava users,
I have tested lava-os-build on my ptxdist root images and I got this output:
root@MBa6x:/lava-557/bin ./lava-os-build
[ASCII-art banner output truncated] root@MBa6x:/lava-557/bin
Maybe it's better to look for other files first, like /etc/os-release (or /usr/lib/os-release) for systemd systems.
This file is a more standardized way to detect which OS/build is running, by evaluating the keys NAME, PRETTY_NAME etc. [0]
What is the exact function of lava-os-build? Is it only for diagnostic purposes, to find out which OS is running? Does something else rely on lava-os-build? What could break if I change that script?
Matthias
[0] https://www.freedesktop.org/software/systemd/man/os-release.html
Hi,
Are the Ansible playbooks for setting up LAVA available somewhere? There is
an old migrated issue on GitLab [1] which is closed, but the link to an
implementation in there is dead. Is that playbook only available
internally at Linaro? Is there anything you could share?
It looks like many people are moving to Docker at the moment, but that's
not an option for us (at least not for dispatchers), as we need LXC for
Android testing.
Cheers,
Karsten
[1] https://git.lavasoftware.org/lava/lava/issues/27
Hi all,
The LAVA docs mention that it's possible to enable template caching via
settings.conf [1]. However, doing so leads to a key error when starting the
LAVA gunicorn server and in `lava-server manage check --deploy`:
...
File "/usr/lib/python3/dist-packages/lava_server/settings/distro.py",
line 185, in <module>
("django.template.loaders.cached.Loader",
TEMPLATES[0]["OPTIONS"]["loaders"])
KeyError: 'loaders'
The LAVA git history shows that this dictionary key was removed in commit
6705ec870[2], "Refresh the setting files".
Is this just a bug in LAVA, is Django template caching not supported
anymore with LAVA, or are other Debian/pip packages required to be
installed to enable the cache?
I'm currently testing that on Debian buster, which ships Django version
1.11.
Cheers,
Karsten
[1]
https://validation.linaro.org/static/docs/v2/advanced-installation.html?hig…
[2]
https://git.lavasoftware.org/lava/lava/commit/6705ec870c1f30ea0b7f78f05a322…
Hi lava-users,
I upgraded my lava-server package (from backports) to 2019.03 on Debian stretch. After this I get a Django error:
ERROR 2019-04-04 08:24:16,694 exception Invalid HTTP_HOST header: '<my-host-name>'. You may need to add '<my-host-name>' to ALLOWED_HOSTS.
The browser reports a Bad Request (400)
So I added <my-host-name> and "*" to ALLOWED_HOSTS in /usr/lib/python3/dist-packages/lava_scheduler_app/settings.py and /usr/lib/python3/dist-packages/django/conf/global_settings.py, but the error is still present. Are these the wrong files, or what could be the problem?
Matthias
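For reference: files under /usr/lib/python3/dist-packages are overwritten on every upgrade and are not meant to be edited. On a packaged install the supported override point is /etc/lava-server/settings.conf (a JSON file); a minimal sketch, assuming your LAVA version reads ALLOWED_HOSTS from there:

{
  "ALLOWED_HOSTS": ["<my-host-name>"]
}

Then restart the lava-server-gunicorn service so the setting is picked up.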
Job definition & Detailed log in attachment
From: Xiaomingwang (Steve)
Sent: April 4, 2019 10:29
To: lava-users(a)lists.lavasoftware.org
Cc: Yewenzhong (jackyewenzhong) <jack.yewenzhong(a)huawei.com>; Chenchun (coston) <chenchun7(a)huawei.com>; liucaili (A) <liucaili2(a)huawei.com>
Subject: Fwd: Lava Issue: API-Error-server.scheduler.all_devices
From: liucaili (A)
Sent: April 4, 2019 10:24
To: Xiaomingwang (Steve) <xiaomingwang1(a)huawei.com>
Subject: Lava Issue: API-Error-server.scheduler.all_devices
Dear Sir/Madam,
If possible could you please help us analyze the problems encountered in recent Lava tests?
Job definition & Detailed log in the attachment.
lava-dispatcher version: 2018.11+stretch.
The key information is as follows:
python /usr/local/bin/module_deploy.py --DUT cloudgame4_01
Traceback (most recent call last):
File "/usr/local/bin/module_deploy.py", line 37, in <module>
main(args)
File "/usr/local/bin/module_deploy.py", line 30, in main
module = get_module(dut)
File "/usr/local/bin/module_deploy.py", line 14, in get_module
devices_list = server.scheduler.all_devices()
File "/usr/lib/python2.7/xmlrpclib.py", line 1243, in __call__
return self.__send(self.__name, args)
File "/usr/lib/python2.7/xmlrpclib.py", line 1602, in __request
verbose=self.__verbose
File "/usr/lib/python2.7/xmlrpclib.py", line 1283, in request
return self.single_request(host, handler, request_body, verbose)
File "/usr/lib/python2.7/xmlrpclib.py", line 1316, in single_request
return self.parse_response(response)
File "/usr/lib/python2.7/xmlrpclib.py", line 1493, in parse_response
return u.close()
File "/usr/lib/python2.7/xmlrpclib.py", line 800, in close
raise Fault(**self._stack[0])
xmlrpclib.Fault: <Fault -32603: "Internal Server Error (contact server administrator for details): 'NoneType' object has no attribute 'pk'">
Is it a lava bug? We look forward to your reply. Thank you for your assistance.
Best Regards,
Caili Liu
Hi,
I want to measure some output values of test cases. I have a shell script which runs stress-ng, parses its output and exports some measurement variables.
How can I record these variables? I tried to use lava-test-case -shell myscript.sh -result pass -measurement $VAR1 -unit unit but no values are recorded. Maybe this is the wrong approach, but how can I do this the correct way?
Matthias
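For reference, the helper takes long options, and a measurement is recorded together with --result rather than --shell; a minimal sketch of test definition steps, assuming myscript.sh exports VAR1 when sourced:

run:
  steps:
    - . ./myscript.sh
    - lava-test-case stress-ng-metric --result pass --measurement "$VAR1" --units ops

The script has to be sourced (or call lava-test-case itself), because variables exported by a child process are not visible to the following step.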
Hello lava users,
how can I run a multinode job on specific devices instead of (a possible set of) device_types? I plan to run CAN or RS485 communication tests which require a wired connection between DUTs. Is it possible to submit a job to these specific DUTs, or should I have device types with only one device instance?
Greetings,
Matthias
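One possibility is device tags: the lab admin assigns a tag to each wired DUT, and a MultiNode job can then request those tags per role; a sketch, with hypothetical tag names and a device type called my-board:

protocols:
  lava-multinode:
    roles:
      can-a:
        device_type: my-board
        count: 1
        tags:
          - can-wired-a
      can-b:
        device_type: my-board
        count: 1
        tags:
          - can-wired-b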
Hi Sirs,
Does LAVA support MCU-like systems, which usually use a debugger to download the image to the MCU's internal flash and return results over UART?
Regards,
Hake
How can you have more than one LAVA user have the same token secret
(e.g. for a notify callback.)?
Example use case:
- LAVA job with notify callbacks using token names
- submitted as user "bob", token names of "bob" map to actual token secrets
- job fails
- user "lab-admin" fixes some lab issues, re-submits job
- job passes, but callbacks fail because tokens are associated with user "bob"
Since the re-submitted job runs as user "lab-admin", the same token
names and corresponding secrets don't exist.
Naively, user "lab-admin" tries to copy the token secrets from user
"bob" keeping the same token names, but this fails saying that "secret
already exists".
Why can't different users have the same secrets?
I haven't looked at the code, but this limitation kind of suggests that
the secret itself is a key in the db, which would prevent multiple
tokens from having the same secret.
Kevin
Hello lava users,
I use a Linux image, rootfs and dtb in a test job, retrieved via http from a file server location. These files are nightly builds from Jenkins. The file links are static, so I can use new images with the same test job definition.
Jenkins also generates a build information text file with commit IDs and so on.
How can I integrate the content of this file into the job metadata or job result log? Using metadata: build-readme: <link> in the job definition is not possible, because the link is stable over different builds but the file content changes.
The DUT has no access to the file location because it is in an isolated network. Is there a way to cat the file content on the worker into the metadata or job result?
Greetings,
Matthias
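One option: a test action running on the worker side (e.g. in an LXC) can reach the file server and print the file, so its content at least lands in the job log; a sketch using an inline test definition (the URL is a placeholder):

- test:
    definitions:
      - from: inline
        name: record-build-info
        path: inline/record-build-info.yaml
        repository:
          metadata:
            format: Lava-Test Test Definition 1.0
            name: record-build-info
            description: capture the Jenkins build info in the job log
          run:
            steps:
              - wget -q http://fileserver/build-info.txt -O build-info.txt
              - cat build-info.txt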
Hello lava users,
my LAVA 2019.01 shows me a device with Health == Bad. The Transitions log gives this message:
March11, 3:02p.m.
lava-health
Good → Bad (Invalid device configuration)
Where can I find information about a misconfigured device? The log files did not provide (in my eyes) sufficient information.
As Health checks I use the smoke-tests.
Kind Regards,
Matthias
I use lxc-mocker in a Docker container, and everything is OK.
But if the packages are not installed in advance, the next command will hang in the web job:
lxc-create -t ubuntu -n test_lava -- --release xenial --mirror http://mirror.bytemark.co.uk/debian --packages systemd,systemd-sysv --arch amd64
I enter the container and see:
root@df67cf292479:/# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 17:20 pts/0 00:00:00 /usr/bin/python3 /usr/bin/lava-slave --level DEBUG --log-file - --master tcp://192.168.1.10:5556 --socket-a
root 8 0 0 17:20 pts/1 00:00:00 /bin/bash
root 13 1 0 17:20 pts/0 00:00:00 lava-run [job: 837]
root 19 13 0 17:20 pts/0 00:00:00 /bin/bash /usr/bin/lxc-create -q -t ubuntu -n test_lava -- --release xenial --mirror http://mirror.bytemark.c
root 261 19 1 17:20 pts/0 00:00:00 apt-get -q install -y systemd systemd-sysv
root 473 8 0 17:21 pts/1 00:00:00 ps -ef
Seems "apt-get install" hang, but if I pre-install the systemd,etc, then everything will ok, the lxc-create will not hang. The same for lxc-attach mocker etc, all will hang if do "apt-get install" & no pakcage preinstall.
Please suggest, thanks.
I started playing with the official lava images today and wanted to
share my work in progress, in case others are doing something similar or
have feedback. My goal is to deploy a lava lab locally. My architecture
is a single host (for now) that will host both the lava server and one
dispatcher. Once it's all working, I'll start deploying a qemu worker
followed by some actual boards (hopefully).
So far, I have the following docker-compose.yml:
version: '3'
services:
database:
image: postgres:9.6
environment:
POSTGRES_USER: lavaserver
POSTGRES_PASSWORD: mysecretpassword
PGDATA: /var/lib/postgresql/data/pgdata
volumes:
- ${PWD}/pgdata:/var/lib/postgresql/data/pgdata
server:
image: lavasoftware/amd64-lava-server:2018.11
ports:
- 80:80
volumes:
- ${PWD}/etc/lava-server/settings.conf:/etc/lava-server/settings.conf
- ${PWD}/etc/lava-server/instance.conf:/etc/lava-server/instance.conf
depends_on:
- database
dispatcher:
image: lavasoftware/amd64-lava-dispatcher:2018.11
environment:
- "DISPATCHER_HOSTNAME=--hostname=dispatcher.lava.therub.org"
- "LOGGER_URL=tcp://server:5555"
- "MASTER_URL=tcp://server:5556"
With that file, settings.conf, and instance.conf in place, I run 'mkdir
pgdata; docker-compose up' and the 3 containers come up and start
talking to each other. The only thing exposed to the outside world is
lava-server's port 80 at the host's IP, which gives the lava homepage as
expected. The first time they come up, the database isn't up fast enough
(it has to initialize the first time) and lava-server fails to connect.
If you cancel and run again it will connect the second time.
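One way to paper over that startup race, given that compose file format 3 dropped the ability for depends_on to wait on health checks, is a restart policy on the server container; a sketch (assuming the server process exits when the database is unreachable):

  server:
    image: lavasoftware/amd64-lava-server:2018.11
    restart: on-failure
    depends_on:
      - database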
A few things to note here. First, it doesn't seem like a persistent DB
volume is possible with the existing lava-server container, because the
DB is initialized at container build time rather than run-time, so
there's not really a way to mount in a volume for the data. Anyway,
postgres already solves this. In fact, I found their container
documentation and entrypoint interface to be well done, so it may be a
nice model to follow: https://hub.docker.com/_/postgres/
The server mostly works as listed above. I copied settings.conf and
instance.conf out of the original container and into ./etc/lava-server/ and
modified as needed.
The dispatcher then runs and points to the server.
It's notable that docker-compose by default sets up a docker network, allowing
references to "database", "server", "dispatcher" to resolve within the
containers.
Once up, I ran the following to create my superuser:
docker-compose exec server lava-server manage users add --staff --superuser --email dan.rue(a)linaro.org --passwd foo drue
Now, for things I've run into and surprises:
- When I used a local database, I could log in. With the database in a
separate container, I can't. Not sure why yet.
- I have the dreaded CSRF problem, which is unlikely to be related to
docker, but the two vars in settings.conf didn't seem to help. (I'm
terminating https outside of the container context, and then proxying
into the container over http)
- I was surprised there were no :latest containers published
- I was surprised the containers were renamed to include the
architecture name in the container name. My understanding is
that's the 'old' way to do it. The better way is to transparently
detect arch using manifests. Again, see postgres/ as an example.
- my pgdata/ directory gets chown'd when I run postgres container. I see
the container has some support for running under a different uid,
which I might try.
- If the entrypoint of server supported some variables like
LAVA_DB_PASSWORD, LAVA_DB_SERVER, SESSION_COOKIE_SECURE, etc, I
wouldn't need to mount in things like instance.conf, settings.conf.
I pushed my config used here to
https://github.com/danrue/lava.home.therub.org. Git clone and then run
'docker-compose up' should just work.
Anyway, thanks for the official images! They're a great start and will
hopefully really simplify deploying lava. My next step is to debug some
of the issues I mentioned above, and then start looking at dispatcher
config (hopefully it's just a local volume mount).
Dan
Hi,
In most cases, we don't need multiple node job as we can control AOSP
DUT from lxc via adb over USB. However, here is the use case.
CTS/VTS tradefed-shell --shards option supports to split tests and run
them on multiple devices in parallel. To leverage the feature in LAVA,
we need multinode job, right? And in multinode job, master-node lxc
needs access to DUTs from salve nodes via adb over tcpip, right?
Karsten shared a job example here[1]. This probably is the most
advanced usage of LAVA, and probably also not encouraged? To make it
more clear, the connectivity should look like this.
master.lxc <----adb over usb----> master.dut
master.lxc <----adb over tcpip ---> slave1.dut
master.lxc <----adb over tcpip ---> slave2.dut
....
I see two options for adb over tcpip.
Option #1: WiFi. adb over wifi can be enabled easily by issuing adb
cmds from lxc. I am not using it for two reasons.
* WiFi isn't reliable for long cts/vts test run.
* In Cambridge lab, WiFi sub-network isn't accessible from lxc
network. Because of security concerns, there is no plan to change
that.
Option #2: Wired Ethernet. On devices like hikey, we need to run
'pre-os-command' in boot action to power off OTG port so that USB
Ethernet dongle works. Once OTG port is off, lxc has no access to the
DUT, then test definition should be executed on DUT, right? I am also
having the following problems to do this.
* Without context overriding, overlay tarball will be applied to
'/system' directory and test job reported "/system/bin/sh:
/lava-247856/bin/lava-test-runner: not found"[2].
* With the following job context, LAVA still runs
'/lava-24/bin/lava-test-runner /lava-24/0' and it hangs there. It is
tested in my local LAVA instance, test job definition and test log
attached. Maybe my understanding on the context overriding is wrong, I
thought LAVA should execute '/system/lava-24/bin/lava-test-runner
/system/lava-24/0' instead. Any suggestions would be appreciated.
context:
lava_test_sh_cmd: '/system/bin/sh'
lava_test_results_dir: '/system/lava-%s'
I checked on the DUT directly, '/system/lava-%s' exist, but I cannot
really run lava-test-runner. The shebang line seems problematic.
--- hacking ---
hikey:/system/lava-24/bin # ./lava-test-runner
/system/bin/sh: ./lava-test-runner: No such file or directory
hikey:/system/lava-24/bin # cat lava-test-runner
#!/bin/bash
#!/bin/sh
....
# /system/bin/sh lava-test-runner
lava-test-runner[18]: .: /lava/../bin/lava-common-functions: No such
file or directory
--- ends ---
I had a discussion with Milosz. He proposed the third option which
probably will be the most reliable one, but it is not supported in
LAVA yet. Here is the idea. Milosz, feel free to explain more.
**Option #3**: Add support for accessing to multiple DUTs in single node job.
* Physically, we need the DUTs connected via USB cable to the same dispatcher.
* In single node job, LAVA needs to add the DUTs specified(somehow) or
assigned randomly(lets say both device type and numbers defined) to
the same lxc container. Test definitions can take over from here.
Can this be done in LAVA? Can I request this feature? Any
suggestions on the possible implementations?
Thanks,
Chase
[1] https://review.linaro.org/#/c/qa/test-definitions/+/29417/4/automated/andro…
[2] https://staging.validation.linaro.org/scheduler/job/247856#L1888
Dear All,
I'm currently trying to check and implement a complete validation of my
PSCI solution.
The standard behavior for PSCI is to manage shutdown, reset and low
power modes.
I'm wondering what the best way is to manage this through LAVA.
So two questions:
- Is there a proper way to check the reboot behavior on a target (soft
reboot)? Using a shell command is not possible, as there is no return from
reboot.
- Shutdown? I'd like to test the shutdown command and trigger an
automatic wake-up after x seconds. My wish is to check that no watchdog
reset occurred during that time (which is the only way to know if the shutdown
was working properly). So it would be a behavior similar to the reboot
command.
Thanks for your support,
BR
Lionel
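For the soft-reboot case, one pattern is to chain a second boot action after the test that issues the reboot, so LAVA watches the console for a clean second boot and fails the job otherwise; a sketch, assuming a u-boot device (method, prompts and the reboot step are placeholders):

actions:
  - deploy:
      to: tftp
  - boot:
      method: u-boot
      commands: nfs
      prompts:
        - 'root@target:'
  - test:
      definitions:
        - from: inline
          name: trigger-reboot
          path: inline/trigger-reboot.yaml
          repository:
            metadata:
              format: Lava-Test Test Definition 1.0
              name: trigger-reboot
            run:
              steps:
                - reboot
  - boot:
      method: u-boot
      commands: nfs
      prompts:
        - 'root@target:'

Note that whether the second boot power-cycles the board first depends on the device configuration, so check that before treating this as a pure soft-reboot test.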
Hi Remi
Thanks for the quick reply. Attached please find the document(the raw job log and the job definition).
Hello,
that may in fact be a bug in LAVA. To help me reproduce the error, could you send:
* the raw job log (click on "actions / plain logs" in the job page)
* the job definition
Thanks
On Thu, 21 Feb 2019 at 04:05, Chenchun (coston) <chenchun7(a)huawei.com> wrote:
Dear Sir/Madam,
could you please help us analyze the problems encountered in recent Lava tests?
Detailed log in the attachment.
lava-dispatcher version: 2018.11+stretch.
The key information is as follows: Bug error: argument of type 'NoneType' is not iterable
Chase Qi's preliminary assessment is that it is a LAVA bug. We look forward to your reply.
Thank you for your assistance.
Best Regards,
Caili Liu
_______________________________________________
Lava-users mailing list
Lava-users(a)lists.lavasoftware.org
https://lists.lavasoftware.org/mailman/listinfo/lava-users
--
Rémi Duraffort
LAVA Team, Linaro
Hello everyone,
I know from the LAVA documentation how to add metadata to jobs and test suites. When I look at test results, I see that test cases have metadata, too. E.g. https://validation.linaro.org/results/testcase/9759970 shows the following metadata:
case: linux-linaro-ubuntu-lscpu
definition: 0_smoke-tests-lxc
result: pass
Is there a possibility to add custom metadata to test cases?
Mit freundlichen Grüßen / Best regards
Tim Jaacks
DEVELOPMENT ENGINEER
Garz & Fricke GmbH
Tempowerkring 2
21079 Hamburg
Direct: +49 40 791 899 - 55
Fax: +49 40 791899 - 39
tim.jaacks(a)garz-fricke.com
www.garz-fricke.com
SOLUTIONS THAT COMPLETE!
Sitz der Gesellschaft: D-21079 Hamburg
Registergericht: Amtsgericht Hamburg, HRB 60514
Geschäftsführer: Matthias Fricke, Manfred Garz
Hi.
To build a customized performance regression test job, I need to be able
to set the LAVA job status as I want.
When I run tests like kselftest, the LAVA dispatcher runs the test program on the DUT.
And after the LAVA test job finishes, the job state is always 'Complete',
whatever result (pass/fail) each test returns.
The attached image file shows the kselftest results from a LAVA test job.
I want to make the LAVA job status 'Fail' or 'Canceled' when a certain test
returns fail.
I would like to ask your advice.
Best regards
Seoji Kim
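For reference, the LAVA test shell ships a helper, lava-test-raise, which aborts the job immediately and ends it as Incomplete; a minimal sketch of a test step using it (the script name is a placeholder):

run:
  steps:
    - ./run-kselftest.sh || lava-test-raise "kselftest failed"

This changes the job health rather than keeping 'Complete' with failed test cases, which may be closer to what you want.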
Dear Sir/Madam,
could you please help us analyze the problems encountered in recent Lava tests?
Detailed log in the attachment.
lava-dispatcher version: 2018.11+stretch.
The key information is as follows: Bug error: argument of type 'NoneType' is not iterable
Chase Qi's preliminary assessment is that it is a LAVA bug. We look forward to your reply.
Thank you for your assistance.
Best Regards,
Caili Liu
Hi all,
Yesterday, we faced a weird issue with LAVA.
A job was running and returned an error saying "metadata is too long".
Right after that, the worker that was running the job went offline, and the
lava-master
raised an "unknown exception", making it crash.
In the attachment, you will find the full job error saying metadata is too
long,
the full job log, and the lava-master.log when the exception occured.
Hope this helps.
Axel
I use the following command to start the LAVA dispatcher:
docker run -idt --net=host --privileged -v /dev:/dev -v /var/lib/lava/dispatcher/tmp:/var/lib/lava/dispatcher/tmp -e "DISPATCHER_HOSTNAME=--hostname=myname" -e "LOGGER_URL=tcp://master_ip:5555" -e "MASTER_URL=tcp://master_ip:5556" --name test_lava lavasoftware/lava-dispatcher:2019.01
In the container, I start tftp and nfs, but NFS can never be started successfully. I use "service nfs-kernel-server start" to start it, and before that I ran "rpcbind".
The start output below seems OK:
# service nfs-kernel-server start
[ ok ] Exporting directories for NFS kernel daemon....
[ ok ] Starting NFS kernel daemon: nfsd mountd.
But the following shows NFS is still not running:
# service nfs-kernel-server status
nfsd not running
Any suggestion?
Hello,
We've been using uboot-ums for WaRP7 but we've been having intermittent failures when it tries to run dd to flash the image.
While we still need to look deeper into the root cause of this issue, we'd like to make the flashing phase a little more reliable.
I have few questions, coming from different angles:
* LAVA uses dd command to flash the image. Is there a way to specify the usage of bmap-tools?
* let's say dd times out (this is what usually happen). Is there a mechanism to restart the actions (deploy and boot) in case of timeout?
If you have any other suggestion, let me know!
Cheers
--
Diego Russo | Staff Software Engineer | Mbed Linux OS
ARM Ltd. CPC1, Capital Park, Cambridge Road, Fulbourn, CB21 5XE, United Kingdom
http://www.diegor.co.uk - https://os.mbed.com/linux-os/
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
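On the second question: LAVA action blocks accept a failure_retry parameter that re-runs the whole action when it fails; a sketch for the deploy (the retry count and image URL are placeholders, and I have not verified this against the u-boot-ums strategy specifically):

- deploy:
    to: u-boot-ums
    failure_retry: 3
    image:
      url: http://fileserver/warp7.img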
Hi,
I have a test step that requires user input to a defined prompt.
Is there a way I can automate this in LAVA?
I can see how we do this for Boot Actions, and I've looked at interactive jobs that communicate with U-Boot, but these don't seem to fit my use case.
Thanks.
Pete
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
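For reference, recent LAVA versions support interactive test actions that wait for a prompt on the existing connection and send a reply; a minimal sketch (the prompt string and reply are placeholders for your use case):

- test:
    interactive:
      - name: answer-prompt
        prompts: ["Enter value:"]
        script:
          - command: "42"
            name: send-value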
Hi All,
I am new to Android testing, I am using standard board *i.mx6/rpi3* and
flashed Android on it, I can connect with these board using "adb" tool.
I want to do some system test using LAVA framework, and I don't want to
reflash image every time(wants to test on the existing system). Can someone
please let me know how I can do Android testing on existing flashed image?
Thanks,
Ankit
Dear Lava users,
Our embedded SW offers 3 boot modes, selectable from u-boot.
When booting, u-boot offers the possibility to select the boot mode:
Select the boot mode
1: <boot mode 1>
2: <boot mode 2>
3: <boot mode 3>
Enter choice: 1: <boot mode 1>
<boot mode 1> is a default value, used after a counter has expired.
All this is done using the extlinux feature.
We have scripts that allow selecting the boot mode, using Kermit. Now, we'd like to integrate this boot mode selection in a LAVA job, and our current solution is not compatible.
In Lava we may boot the kernel, modify the extlinux configuration and reboot, but do you know a direct way (with interactive mode maybe) to select the boot mode from u-boot?
Best regards,
Denis
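For reference, recent LAVA versions have interactive test actions that wait for a string on the existing connection and send a reply; a sketch matched to your menu (the chosen mode is a placeholder, and whether this can run early enough to catch the U-Boot countdown depends on how the preceding boot action is configured):

- test:
    interactive:
      - name: select-boot-mode
        prompts: ["Enter choice:"]
        script:
          - command: "2"
            name: choose-mode-2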
Hello everyone,
I am having problems with timeouts when using the LAVA multinode protocol. Assume the following minimal pipeline with two nodes (device = DUT, remote = some kind of external hardware interfacing with the DUT):
- deploy:
role: device
- boot:
role: device
- deploy:
role: remote
- boot:
role: remote
- test:
role: remote
- test:
role: device
What I would expect: The device is booted first, then the remote is booted. Afterwards, the tests run on both nodes, being able to interact with each other.
The pipeline model seems to be implemented in a way that each node has its own pipeline. This kind of makes sense, because the tests of course have to run simultaneously.
However, in my case booting the device takes a lot more time than booting the remote. This makes the 'test' stage on the remote run a lot earlier than the 'test' stage on the device.
My problem: How do I define a useful timeout value for the 'test' stage on the remote? Obviously I have to take the boot time difference between the two nodes into account. This seems counter-intuitive to me, since the timeout value should affect the actual test only. What happens if I use an image on the device which takes even a lot more time to boot? Or if I insert more testcases on the device which do not need the remote before? In both cases I would have to adjust the timeout value for the remote 'test' stage.
Is this a design compromise? Or are there any possibilities of synchronizing pipeline stages on different nodes? I am thinking of some mechanism like "do not start 'test' stage on remote before 'boot' stage on device has finished".
Mit freundlichen Grüßen / Best regards
Tim Jaacks
DEVELOPMENT ENGINEER
Garz & Fricke GmbH
Tempowerkring 2
21079 Hamburg
Direct: +49 40 791 899 - 55
Fax: +49 40 791899 - 39
tim.jaacks(a)garz-fricke.com
www.garz-fricke.com
WE MAKE IT YOURS!
Sitz der Gesellschaft: D-21079 Hamburg
Registergericht: Amtsgericht Hamburg, HRB 60514
Geschäftsführer: Matthias Fricke, Manfred Garz, Marc-Michael Braun
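For reference, the MultiNode API has synchronisation calls that can express exactly that: if both roles call lava-sync with the same message name at the start of their test stage, the remote blocks until the device has also booted and reached its test stage; a minimal sketch (the message name is an assumption):

run:
  steps:
    - lava-sync device-ready
    - ./run-actual-tests.sh

How the time spent blocking is accounted against the test action timeout then depends on the MultiNode protocol timeout settings, so this may not remove the problem entirely.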
Hi,
Is there a way to specify an arbitrary parameter, that is board
specific, in device dictionary? The parameter should be available from
test shell. The use case is as follows: DUT is connected to test
supporting hardware (chamelium board in this case). Tests access
supporting hardware using IP address. IP address is 'static' and the
supporting hw is assumed to be always turned on. Supporting hardware
is dedicated to specific DUT and can't be shared (because of HW
connections) between other boards of the same type (similar to energy
probes). Tests run directly on DUT and access supporting hardware from
there.
I found 'exported parameters' in device dictionary docs:
https://master.lavasoftware.org/static/docs/v2/lava-scheduler-device-dictio….
But they only list device_ip, device_mac and storage_info. Is there a
way to extend this list? If not, is there any other way to provide
device specific information to test shells?
milosz
Hi everyone,
I'm writing this email after discussion with Neil.
I'm working at NXP and he told me Linaro wanted to run functional tests on
imx8m
with the new u-boot support.
He told me it requires full open access, no license click-through or
passwords.
Philippe Mazet is more qualified to answer this type of question as I only
use Android atm. He will follow up the discussion.
Here you have the Yocto source code:
https://source.codeaurora.org/external/imx/imx-manifest/tree/README?h=imx-l…
You can get the latest GA release with this:
repo init -u https://source.codeaurora.org/external/imx/imx-manifest -b
imx-linux-sumo -m imx-4.14.78-1.0.0_ga.xml
You can build these sources like this:
DISTRO=fsl-imx-wayland MACHINE=imx8mqevk source ./fsl-setup-release.sh -b
build
But, this will redirect you to a license click-through.
However, you can bypass the license click-through, like "auto accept it"
with this command:
EULA=1 DISTRO=fsl-imx-wayland MACHINE=imx8mqevk source
./fsl-setup-release.sh -b build
We use this bypass to automate builds.
So, let us know if this would be suitable for Linaro's needs.
Best regards,
Axel
Hello Lava-users,
Does LAVA support deploying SWUpdate images directly to the target, if the
target supports swupdate image deployment
(http://sbabic.github.io/swupdate/)?
Thanks,
Hemanth.
Hi all,
We're planning to use the TI AM4378 IDK in the board farm for testing (http://www.ti.com/tool/TMDSIDK437X), but it seems there is no device-type for this board. I tried to search these links:
- https://git.lavasoftware.org/lava/lava/tree/master/lava_scheduler_app/tests…
- https://git.linaro.org/lava/lava-lab.git/tree/shared/device-types
Did I just miss it, or is it missing from the device-types? If it is missing, are there any templates available, or what template could be used as a base for this device?
Br
Olli Väinölä
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
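For what it's worth, if the device-type really is missing, a new template can usually start from the U-Boot base template; a minimal, untested sketch (the file name, e.g. ti-am4378-idk.jinja2, and the console/prompt values are assumptions to adapt):

{% extends 'base-uboot.jinja2' %}
{% set console_device = console_device | default('ttyO0') %}
{% set baud_rate = baud_rate | default(115200) %}
{% set bootloader_prompt = bootloader_prompt | default('=>') %}

The beaglebone-black template (another TI Sitara board) is probably the closest working reference.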
Hi All,
I have a Rpi3 board with Android installed on it; it can be accessed using
adb from my Linux PC.
I am successfully able to use Lava-LXC for testing a device.
Can someone please share the steps for using LAVA to test an Android
device and what the setup will look like? If anyone could share a test job
with some basic Android tests, that would be very helpful.
Thanks,
Ankit
Hello Lava users,
I'm trying to build a query to monitor all the test jobs done in a given period. Let's say all the jobs done in January, just to check the robustness of my setup (incomplete jobs rate).
In the query interface, I can add a condition on the start_time of a job, but only with a "greater than" operator, whereas I also want to add a condition on start_time "less than".
Do you have any hint to do this?
Best regards,
Denis
Hello,
I have the following setup: a WaRP7 which exposes a network connection over USB gadget driver (http://trac.gateworks.com/wiki/linux/OTG#g_etherGadget)
A possible test case is to have some process running on the LAVA dispatcher (within a LXC container) which targets the WaRP7 over this network interface.
Through LXC I'm able to pass this interface through from the host to the container and use it within the container (via /etc/lxc/default.conf).
If a test requires the reboot of the WaRP7, the usb0 interface disappears from the container. When the WaRP7 boots again the usb0 interface is available on the host (but not in the container).
Things I tried or thought about:
* I tried synchronizing boots of both the WaRP7 and the LXC container, but it seems not possible to "reboot" (restart) a container within the same job execution.
* Is it possible to "restart" a container during a job execution?
* Outside LAVA it is possible to run a command (lxc-device --name diegor-test -- add usb0) which re-passes the interface through from Linux to the LXC container.
* Is it possible to run the above command at job execution time on the LAVA dispatcher?
How can I solve this situation?
Cheers
--
Diego Russo
Staff Software Engineer - diego.russo(a)arm.com
Direct Tel. no: +44 1223 405920
Main Tel. no: +44 1223 400400
ARM Ltd. CPC1, Capital Park, Cambridge Road, Fulbourn, CB21 5XE, United Kingdom
http://www.diegor.co.uk - http://twitter.com/diegorhttp://www.linkedin.com/in/diegor
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
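For reference, LAVA's udev integration can add USB devices back into the LXC when they reappear, driven by the device_info list in the device dictionary; a sketch (the ID is a placeholder):

{% set device_info = [{'board_id': '0123456789'}] %}

This tracks USB devices by serial/vendor/product ID; whether the usb0 network interface created by the gadget then follows into the container automatically is worth testing, otherwise the lxc-device add step would still be needed.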
Hello,
while searching for possibilities of testing external interfaces of my DUTs I found this presentation:
https://wiki.linaro.org/Internal/lava/Lava-lmp?action=AttachFile&do=get&tar…
Is the LAVA LMP project still active? If yes, how can I find information on this?
Mit freundlichen Grüßen / Best regards
Tim Jaacks
DEVELOPMENT ENGINEER
Garz & Fricke GmbH
Tempowerkring 2
21079 Hamburg
Direct: +49 40 791 899 - 55
Fax: +49 40 791899 - 39
tim.jaacks(a)garz-fricke.com
www.garz-fricke.com<http://www.garz-fricke.com/>
WE MAKE IT YOURS!
Sitz der Gesellschaft: D-21079 Hamburg
Registergericht: Amtsgericht Hamburg, HRB 60514
Geschäftsführer: Matthias Fricke, Manfred Garz, Marc-Michael Braun
I'm sorry, as surely this is an FAQ but I've spent quite a bit of time
troubleshooting and reading. This is very similar to Kevin's thread from
May subject 'u-boot devices broken after 2018.4 upgrade, strange u-boot
interaction'. In that thread's case, the issue was that interrupt_char
was being set to "\n". My symptoms are the same, but interrupt_char is
set to " " or "d".
I'm running LAVA from the latest released containers (2018.11), and
trying to use a beaglebone-black with a more recent u-boot than exists
in validation.l.o. qemu works fine.
The problem seems to be that LAVA thinks there's a prompt when there
isn't, and so it sends commands too quickly. Here's example output from
the serial console (job link[2]):
U-Boot 2017.07 (Aug 31 2017 - 15:35:58 +0000)
CPU : AM335X-GP rev 2.1
I2C: ready
DRAM: 512 MiB
No match for driver 'omap_hsmmc'
No match for driver 'omap_hsmmc'
Some drivers were not found
MMC: OMAP SD/MMC: 0, OMAP SD/MMC: 1
Net: cpsw, usb_ether
Press SPACE to abort autoboot in 10 seconds
=>
=> setenv autoload no
=> setenv initrd_high 0xffffffff
=> setenv fdt_high 0xffffffff
=> dhcp
link up on port 0, speed 100, full duplex
BOOTP broadcast 1
BOOTP broadcast 2
BOOTP broadcast 3
DHCP client bound to address 10.100.0.55 (1006 ms)
=> 172.28.0.4
Unknown command '172.28.0.4' - try 'help'
=> tftp 0x82000000 57/tftp-deploy-t7xus3ey/kernel/vmlinuz
link up on port 0, speed 100, full duplex
*** ERROR: `serverip' not set
...
When I drive u-boot manually, after I hit SPACE (or 'd'; both work), u-boot
*deletes* the character and then prints '=> ' (is that delete the root
cause?). When LAVA runs, it shows an extra => and starts typing as seen
above. dhcp takes a second or two, and so the subsequent command starts
to get lost (in the above log we see an IP, because 'setenv serverip'
got lost).
If I set boot_character_delay to like 1000, it works because it gives
enough time for dhcp to finish before typing the next character, but
obviously makes the job very slow, and still not reliable.
I'm out of ideas.. help?
P.S. Two interesting things I've learned recently:
1) boot_character_delay must be specified in the device_type file. It's
ignored when specified in the device file (surprising, as I see it
listed in some people's device files[3]).
2) If you install ser2net from sid, you can set max-connections and do
some _very handy_ voyeurism on the serial console while lava does its
thing (hat tip Kevin Hilman for that one).
Thanks,
Dan
[1] https://lists.lavasoftware.org/pipermail/lava-users/2018-May/001064.html
[2] https://lava.therub.org/scheduler/job/57
[3] https://git.linaro.org/lava/lava-lab.git/tree/lkft.validation.linaro.org/ma…
--
Linaro - Kernel Validation
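For reference on P.S. item 1, the delay goes into the device-type template, e.g. (value to tune):

{% set boot_character_delay = 100 %}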
I submit 100 test cases to LAVA before leaving the office and expect to get all the results the next morning.
If everything is OK, I get 100 results the next morning and check each issue.
Then if, e.g., the 10th case hits an OOM, I wish cases 11 to 100 could continue to run during the night; so I want to reboot the device after the 10th case, where I find the OOM, and after the reboot continue with cases 11 to 100.
I know the OOM itself is not automation related, but if I cannot resume cases 11 to 100 during that night, I have to resubmit them the next morning after I am back in the office. Maybe the 100th case also has a bug; I wish it could still send a result, so that in the morning I could assign another engineer to fix it quickly, instead of waiting until I am back in the office to remove the 10th failing case, resubmit, and wait another 8 hours before the 100th case finally runs. 8 hours lost; we do not want the process to be so inefficient. This is our aim.
------------------------------------------------------------------
From: lava-users-request <lava-users-request(a)lists.lavasoftware.org>
Sent: Friday, January 25, 2019 16:55
To: lava-users <lava-users(a)lists.lavasoftware.org>
Subject: Lava-users Digest, Vol 5, Issue 37
Send Lava-users mailing list submissions to
lava-users(a)lists.lavasoftware.org
To subscribe or unsubscribe via the World Wide Web, visit
https://lists.lavasoftware.org/mailman/listinfo/lava-users
or, via email, send a message with subject or body 'help' to
lava-users-request(a)lists.lavasoftware.org
You can reach the person managing the list at
lava-users-owner(a)lists.lavasoftware.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Lava-users digest..."
Today's Topics:
1. reboot during test (cnspring2002)
2. Re: AOSP multiple node job (Neil Williams)
3. Re: reboot during test (Neil Williams)
4. Re: AOSP multiple node job (Chase Qi)
----------------------------------------------------------------------
Message: 1
Date: Thu, 24 Jan 2019 20:11:39 +0800
From: cnspring2002 <cnspring2002(a)aliyun.com>
To: lava-users(a)lists.lavasoftware.org
Subject: [Lava-users] reboot during test
Message-ID: <C0C4B61B-7B5A-4DAB-B644-849E77F0119B(a)aliyun.com>
Content-Type: text/plain; charset=us-ascii
Dear all,
In the test stage, I have a case that triggers a firmware OOM when it runs. So I trigger the device to reboot, and then none of the remaining cases can run. I cannot define multiple boots in the job in advance, because I cannot predict which case will cause the OOM. What do you suggest doing?
------------------------------
Message: 2
Date: Thu, 24 Jan 2019 15:06:56 +0000
From: Neil Williams <neil.williams(a)linaro.org>
To: Chase Qi <chase.qi(a)linaro.org>
Cc: lava-users(a)lists.lavasoftware.org
Subject: Re: [Lava-users] AOSP multiple node job
Message-ID:
<CAC6CAR3v_i-vOv4BVe56RHR4PahMihexpNa6v4w44UNb+_PVuw(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
On Thu, 24 Jan 2019 at 11:41, Chase Qi <chase.qi(a)linaro.org> wrote:
>
> Hi,
>
> In most cases, we don't need multiple node job as we can control AOSP
> DUT from lxc via adb over USB. However, here is the use case.
>
> CTS/VTS tradefed-shell --shards option supports to split tests and run
> them on multiple devices in parallel. To leverage the feature in LAVA,
> we need multinode job, right?
If more than one device needs to have images deployed and booted
specifically for this test job, then yes. MultiNode is required. To be
sure that each device is at the same stage (as deploy and boot timings
can vary), the test job will need to wait for all test jobs to be
synchronised to the same point in each test job - synchronisation is
currently restricted to POSIX shells.
> And in multinode job, master-node lxc
> needs access to DUTs from slave nodes via adb over tcpip, right?
Not necessarily. From the LXC, the device can be controlled using USB.
There is no need for devices to have a direct connection to each other
just to use MultiNode. The shards implementation may require that
though.
> Karsten shared a job example here[1]. This probably is the most
> advanced usage of LAVA
All MultiNode is a complex usage of LAVA but VLANd used by the
networking teams is more complex than your use case.
>, and probably also not encouraged? To make it
> more clear, the connectivity should look like this.
There is a problem in this model: Every DUT will have its own LXC and
that device will be connected to the LXC using USB.
> master.lxc <----adb over usb----> master.dut
> master.lxc <----adb over tcpip ---> slave1.dut
> master.lxc <----adb over tcpip ---> slave2.dut
Do not separate the LXC from the DUT - the LXC and its DUT are a single node.
Master DUT has a master LXC.
Slave1 DUT has a Slave1 LXC
Slave2 DUT has a Slave2 LXC.
Depending on the boards in use, you may be able to configure each DUT,
including the master DUT, to have TCP/IP networking. That then allows
the processes running in the Master node to access the slave nodes.
(The following model is based on a theoretical device which doesn't
have the crippling USB OTG problem of the hikey - but the hikey can
work in this model if the IP addresses are determined statically and
therefore are available to each slave LXC.)
0: A program executing in the Master LXC which uses USB to send
commands to the master DUT which allow the Master LXC to retrieve the
IP address of the master DUT.
1: That program in the Master LXC then uses the MultiNode API
(lava-send) to declare that IP address to all the slave nodes. This is
equivalent to how existing jobs declare the IP address of the device
when using secondary connections.
2: Each slave node waits for the master-ip-addr message and sets that
value in a program executing in the slave LXC. The slave LXC is
connected to the slave DUT directly using USB so can use this to set
the master IP address, if that is required.
3: Each slave node now runs a program in each slave LXC to connect to
the slave DUT over USB and extract the slave DUT IP address
4: Each slave node then broadcasts that slave-<ID>-ip-addr message, so
the first slave sends slave-1-ip-addr containing an IP address, slave
2 sends slave-2-ip-addr containing a different IP address.
5: The master node is waiting for all of these messages to be sent and
retrieves the values in turn. This information is now available to a
program executing inside the master LXC. This program could use USB to
set these values in the master DUT, if that is required.
6: During this time, all the slave nodes are waiting for the master
node to broadcast another message saying that the work on the master
is complete.
7: Once the master sends the complete message, each slave node picks
up this message from the MultiNode API and the script executing in the
slave LXC then ends the Lava Test Definition and the slave test job
completes.
8: The master can then do some other stuff and then complete.
https://staging.validation.linaro.org/scheduler/job/246447/multinode_defini…
https://staging.validation.linaro.org/scheduler/job/246230/multinode_defini…
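For reference, the messaging in the steps above maps onto the MultiNode helpers; a minimal sketch of the pattern as test definition steps (message names and variables are assumptions):

# master LXC test definition
run:
  steps:
    - lava-send master-ip-addr ipaddr=$MASTER_IP
    - lava-wait slave-1-ip-addr
    - lava-send test-complete

# slave 1 LXC test definition
run:
  steps:
    - lava-wait master-ip-addr
    - lava-send slave-1-ip-addr ipaddr=$SLAVE_IP
    - lava-wait test-complete

With several slaves, lava-wait-all can be used on the master to collect a message sent by every node in a role.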
Don't obsess about the LXC either. With upcoming changes for docker
support, we could remove the presence of the LXC entirely. The LXC
with android devices only exists as a unit of isolation for the
benefit of the dispatcher. It has useful side effects but the reason
for the LXC to exist is to separate the fastboot operations from the
dispatcher operations.
For hikey and it's broken USB OTG support:
0: Each slave test job turns off the USB OTG support once the slave
LXC has deployed all the test image files and determined that the
slave DUT has booted correctly. If not, use lava-test-raise.
1: Next, each slave LXC uses the IP address of its own slave DUT to
check connectivity. If this fails, use lava-test-raise.
2: Each slave LXC uses the MultiNode API to declare the IP address of
the slave DUT (because the slave node has determined that this IP is
working).
3: The master node is waiting for these messages and these are picked
up by the master LXC test definition.
4: The master LXC test definition issues commands to the master DUT -
now depending on how the sharding works, this could be over USB (turn
the USB OTG off later) or over TCP/IP (turn off the master USB OTG at
the start of this test definition).
5: The master DUT has enough information to drive the sharding across
the slave DUTs. The slave LXCs are waiting for the master to finish
the sharding. (lava-wait)
6: When the master LXC determines that the master DUT has finished the
sharding, then the master LXC sends a message to all the slave nodes
that the test is complete.
7: Each slave node picks up the completion message in the slave LXC
and the test definition finishes.
8: The master node can continue to do other tasks or can also complete
its test definition.
> ....
>
> I see two options for adb over tcpip.
>
> Option #1: WiFi. adb over wifi can be enabled easily by issuing adb
> cmds from lxc. I am not using it for two reasons.
Agreed, this doesn't need to rely on WiFi.
>
> * WiFi isn't reliable for long cts/vts test run.
> * In Cambridge lab, WiFi sub-network isn't accessible from lxc
> network. Because of security concerns, there is no plan to change
> that.
>
> Option #2: Wired Ethernet. On devices like hikey, we need to run
> 'pre-os-command' in boot action to power off OTG port so that USB
> Ethernet dongle works. Once OTG port is off, lxc has no access to the
> DUT, so the test definition should be executed on the DUT, right? I am also
> having the following problems to do this.
Before the OTG is switched, all data from the DUT needs to be
retrieved (and set) using the USB connection.
What information you need to set depends on how the sharding works.
The problem, as I see it, is that the slave DUTs have no way to
declare their IP address to the slave LXC once the OTG port is
switched. Therefore, you will need to put in a request for the boards
to have static IP addresses declared in the device dictionary. Then
the OTG can be switched and things become easier because the LXC knows
the IP address and can simply declare that to the MultiNode API so
that the master LXC can know which IP matches which node. There are
already a number of hikey devices with the static_ip device tag and
you can specify this device tag in your MultiNode test definition.
>
> * Without context overriding, overlay tarball will be applied to
> '/system' directory and test job reported "/system/bin/sh:
Why are you talking about /system ??? MultiNode only operates in a
POSIX shell - the POSIX shell is in the LXC and each DUT has a
dedicated LXC. In this use case, MultiNode API calls are only going to
be made from each LXC. The master LXC sends some information and then
receives information from test definitions running in each of the
slave LXCs.
The overlay is to be deployed to the LXC, not the DUT because this is
an Android system. What the android system does is determined either
by commands run inside the slave LXC to deploy files (before the OTG
switch) or commands run inside the master LXC (with knowledge of the
IP address from the MultiNode API) to execute commands on the DUT over
TCP/IP.
Use the LXC to deploy the files and boot the device, then to declare
information about each particular node. Once that is done, whatever
thing is controlling the test needs to just use TCP/IP to communicate
and use the MultiNode API to send messages and allow some nodes to
wait for other nodes whilst the test proceeds.
> /lava-247856/bin/lava-test-runner: not found"[2].
> * With the following job context, LAVA still runs
> '/lava-24/bin/lava-test-runner /lava-24/0' and it hangs there. It is
> tested in my local LAVA instance, test job definition and test log
> attached. Maybe my understanding on the context overriding is wrong, I
> thought LAVA should execute '/system/lava-24/bin/lava-test-runner
> /system/lava-24/0' instead. Any suggestions would be appreciated.
>
> context:
> lava_test_sh_cmd: '/system/bin/sh'
> lava_test_results_dir: '/system/lava-%s'
>
> I checked on the DUT directly, '/system/lava-%s' exist, but I cannot
> really run lava-test-runner. The shebang line seems problematic.
>
> --- hacking ---
> hikey:/system/lava-24/bin # ./lava-test-runner
> /system/bin/sh: ./lava-test-runner: No such file or directory
> hikey:/system/lava-24/bin # cat lava-test-runner
> #!/bin/bash
>
> #!/bin/sh
>
> ....
> # /system/bin/sh lava-test-runner
> lava-test-runner[18]: .: /lava/../bin/lava-common-functions: No such
> file or directory
> --- ends ---
>
> I had a discussion with Milosz. He proposed the third option which
> probably will be the most reliable one, but it is not supported in
> LAVA yet. Here is the idea. Milosz, feel free to explain more.
>
> **Option #3**: Add support for accessing to multiple DUTs in single node job.
>
> * Physically, we need the DUTs connected via USB cable to the same dispatcher.
I don't see that this solves anything and it adds a lot of unnecessary
lab configuration - entirely duplicating the point of having ethernet
connections to the boards. Assign static IP addresses to each board
and when the test job starts, each dedicated LXC can declare the
static information according to whichever board was assigned to
whichever node.
The DUTs only need to be visible to programs running on the master
node and that can be done by declaring static IP addresses using the
MultiNode API.
> * In single node job, LAVA needs to add the DUTs specified(somehow) or
> assigned randomly(lets say both device type and numbers defined) to
> the same lxc container. Test definitions can take over from here.
No - the LXC is used to issue commands to deploy test images to the
DUT. The LXC is a transparent part of the dispatcher, it is not just
for test definitions. The LXC cannot be used for multiple test jobs,
it is part of the one dispatcher.
>
> Can this be done in LAVA? Can I request this feature? Any
> suggestions on the possible implementations?
>
>
> Thanks,
> Chase
>
> [1] https://review.linaro.org/#/c/qa/test-definitions/+/29417/4/automated/andro…
> [2] https://staging.validation.linaro.org/scheduler/job/247856#L1888
> _______________________________________________
> Lava-users mailing list
> Lava-users(a)lists.lavasoftware.org
> https://lists.lavasoftware.org/mailman/listinfo/lava-users
--
Neil Williams
=============
neil.williams(a)linaro.org
http://www.linux.codehelp.co.uk/
------------------------------
Message: 3
Date: Thu, 24 Jan 2019 15:09:58 +0000
From: Neil Williams <neil.williams(a)linaro.org>
To: cnspring2002 <cnspring2002(a)aliyun.com>
Cc: lava-users(a)lists.lavasoftware.org
Subject: Re: [Lava-users] reboot during test
Message-ID:
<CAC6CAR3fV+b1p5T2EVxUCPP7d=Erno-20MNVxPaZPVf3tbK3Yg(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
On Thu, 24 Jan 2019 at 12:13, cnspring2002 <cnspring2002(a)aliyun.com> wrote:
>
> Dear all,
>
> In the test stage, I have a case that triggers a firmware OOM when it runs. So I trigger the device to reboot, and then none of the remaining cases can run. I cannot define multiple boots in the job in advance, because I cannot predict which case will cause the OOM. What do you suggest doing?
The out of memory killer is a fatal device error. The test job is not
going to be able to continue because the failure mode is
unpredictable.
The cause of the OOM needs to be determined through standard triage,
not automation. (Although automation may help create a data matrix of
working and failing combinations and test operations.)
--
Neil Williams
=============
neil.williams(a)linaro.org
http://www.linux.codehelp.co.uk/
------------------------------
Message: 4
Date: Fri, 25 Jan 2019 15:45:56 +0800
From: Chase Qi <chase.qi(a)linaro.org>
To: Neil Williams <neil.williams(a)linaro.org>
Cc: lava-users(a)lists.lavasoftware.org
Subject: Re: [Lava-users] AOSP multiple node job
Message-ID:
<CADzYPRFJiX8qKt_NyHZCi0qs5iotx0wg0OMN9o7SOi84sYYTow(a)mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi Neil,
Thanks a lot for your guidance. It is really good to see you back :)
On Thu, Jan 24, 2019 at 11:07 PM Neil Williams <neil.williams(a)linaro.org> wrote:
>
> On Thu, 24 Jan 2019 at 11:41, Chase Qi <chase.qi(a)linaro.org> wrote:
> >
> > Hi,
> >
> > In most cases, we don't need multiple node job as we can control AOSP
> > DUT from lxc via adb over USB. However, here is the use case.
> >
> > CTS/VTS tradefed-shell --shards option supports to split tests and run
> > them on multiple devices in parallel. To leverage the feature in LAVA,
> > we need multinode job, right?
>
> If more than one device needs to have images deployed and booted
> specifically for this test job, then yes. MultiNode is required. To be
> sure that each device is at the same stage (as deploy and boot timings
> can vary), the test job will need to wait for all test jobs to be
> synchronised to the same point in each test job - synchronisation is
> currently restricted to POSIX shells.
>
> > And in multinode job, master-node lxc
> > needs access to DUTs from slave nodes via adb over tcpip, right?
>
> Not necessarily. From the LXC, the device can be controlled using USB.
> There is no need for devices to have a direct connection to each other
> just to use MultiNode. The shards implementation may require that
> though.
CTS/VTS sharding splits a run into a given number of independent chunks,
to run on multiple devices connected to the same host. The host
will be the master LXC in our case.
>
> > Karsten shared a job example here[1]. This probably is the most
> > advanced usage of LAVA
>
> All MultiNode is a complex usage of LAVA but VLANd used by the
> networking teams is more complex than your use case.
>
> >, and probably also not encouraged? To make it
> > more clear, the connectivity should look like this.
>
> There is a problem in this model: Every DUT will have its own LXC and
> that device will be connected to the LXC using USB.
>
> > master.lxc <----adb over usb----> master.dut
> > master.lxc <----adb over tcpip ---> slave1.dut
> > master.lxc <----adb over tcpip ---> slave2.dut
>
> Do not separate the LXC from the DUT - the LXC and its DUT are a single node.
>
> Master DUT has a master LXC.
> Slave1 DUT has a Slave1 LXC
> Slave2 DUT has a Slave2 LXC.
>
> Depending on the boards in use, you may be able to configure each DUT,
> including the master DUT, to have TCP/IP networking. That then allows
> the processes running in the Master node to access the slave nodes.
>
Yes, that is what I am trying to do. The connectivity topology I
wrote above is the goal, not the initial state, given LAVA's design. The master LXC
needs access to all the DUT nodes, either via USB or tcpip.
> (The following model is based on a theoretical device which doesn't
> have the crippling USB OTG problem of the hikey - but the hikey can
> work in this model if the IP addresses are determined statically and
> therefore are available to each slave LXC.)
>
> 0: A program executing in the Master LXC which uses USB to send
> commands to the master DUT which allow the Master LXC to retrieve the
> IP address of the master DUT.
>
> 1: That program in the Master LXC then uses the MultiNode API
> (lava-send) to declare that IP address to all the slave nodes. This is
> equivalent to how existing jobs declare the IP address of the device
> when using secondary connections.
>
> 2: Each slave node waits for the master-ip-addr message and sets that
> value in a program executing in the slave LXC. The slave LXC is
> connected to the slave DUT directly using USB so can use this to set
> the master IP address, if that is required.
>
> 3: Each slave node now runs a program in each slave LXC to connect to
> the slave DUT over USB and extract the slave DUT IP address
>
> 4: Each slave node then broadcasts that slave-<ID>-ip-addr message, so
> the first slave sends slave-1-ip-addr containing an IP address, slave
> 2 sends slave-2-ip-addr containing a different IP address.
>
> 5: The master node is waiting for all of these messages to be sent and
> retrieves the values in turn. This information is now available to a
> program executing inside the master LXC. This program could use USB to
> set these values in the master DUT, if that is required.
>
> 6: During this time, all the slave nodes are waiting for the master
> node to broadcast another message saying that the work on the master
> is complete.
>
> 7: Once the master sends the complete message, each slave node picks
> up this message from the MultiNode API and the script executing in the
> slave LXC then ends the Lava Test Definition and the slave test job
> completes.
>
> 8: The master can then do some other stuff and then complete.
>
> https://staging.validation.linaro.org/scheduler/job/246447/multinode_defini…
>
> https://staging.validation.linaro.org/scheduler/job/246230/multinode_defini…
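A minimal sketch of steps 0, 1 and 5 above as a LAVA test definition
fragment for the master node; the message names, the getprop property
used to read the IP address, and the /tmp/lava_multi_node_cache.txt
location where lava-wait stores received key=value pairs are
assumptions to verify against your images and LAVA version:
--- sketch ---
# master node (runs in the master LXC)
run:
  steps:
    # step 0: read the master DUT's IP address over USB (illustrative property)
    - MASTER_IP=$(adb shell getprop dhcp.wlan0.ipaddress)
    # step 1: declare it to all the slave nodes
    - lava-send master-ip-addr ipaddr="${MASTER_IP}"
    # step 5: collect each slave's address in turn
    - lava-wait slave-1-ip-addr
    - lava-wait slave-2-ip-addr
    # received values land in the MultiNode cache file (assumed path)
    - cat /tmp/lava_multi_node_cache.txt
--- ends ---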
>
> Don't obsess about the LXC either. With upcoming changes for docker
> support, we could remove the presence of the LXC entirely. The LXC
> with android devices only exists as a unit of isolation for the
> benefit of the dispatcher. It has useful side effects but the reason
> for the LXC to exist is to separate the fastboot operations from the
> dispatcher operations.
>
> For hikey and its broken USB OTG support:
>
> 0: Each slave test job turns off the USB OTG support once the slave
> LXC has deployed all the test image files and determined that the
> slave DUT has booted correctly. If not, use lava-test-raise.
>
> 1: Next, each slave LXC uses the IP address of its own slave DUT to
> check connectivity. If this fails, use lava-test-raise.
>
> 2: Each slave LXC uses the MultiNode API to declare the IP address of
> the slave DUT (because the slave node has determined that this IP is
> working).
>
> 3: The master node is waiting for these messages and these are picked
> up by the master LXC test definition.
>
> 4: The master LXC test definition issues commands to the master DUT -
> now depending on how the sharding works, this could be over USB (turn
> the USB OTG off later) or over TCP/IP (turn off the master USB OTG at
> the start of this test definition).
>
> 5: The master DUT has enough information to drive the sharding across
> the slave DUTs. The slave LXCs are waiting for the master to finish
> the sharding. (lava-wait)
>
> 6: When the master LXC determines that the master DUT has finished the
> sharding, then the master LXC sends a message to all the slave nodes
> that the test is complete.
>
> 7: Each slave node picks up the completion message in the slave LXC
> and the test definition finishes.
>
> 8: The master node can continue to do other tasks or can also complete
> its test definition.
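A matching sketch of the slave side of steps 1, 2 and 6/7 above,
assuming the slave's static IP address is exposed to the test shell as
SLAVE_IP and that the message names agree with whatever the master
waits for (both are assumptions):
--- sketch ---
# slave node (runs in the slave LXC, after the OTG switch)
run:
  steps:
    # step 1: verify connectivity to this node's DUT; abort the job on failure
    - ping -c 3 "${SLAVE_IP}" || lava-test-raise "slave-dut-unreachable"
    # step 2: declare this DUT's working IP address
    - lava-send slave-1-ip-addr ipaddr="${SLAVE_IP}"
    # steps 6/7: block until the master declares the sharding complete
    - lava-wait master-complete
--- ends ---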
>
>
> > ....
> >
> > I see two options for adb over tcpip.
> >
> > Option #1: WiFi. adb over wifi can be enabled easily by issuing adb
> > cmds from lxc. I am not using it for two reasons.
>
> Agreed, this doesn't need to rely on WiFi.
>
> >
> > * WiFi isn't reliable for long cts/vts test run.
> > * In Cambridge lab, WiFi sub-network isn't accessible from lxc
> > network. Because of security concerns, there is no plan to change
> > that.
> >
> > Option #2: Wired Ethernet. On devices like hikey, we need to run
> > 'pre-os-command' in boot action to power off OTG port so that USB
> > Ethernet dongle works. Once the OTG port is off, the lxc has no access
> > to the DUT, so the test definition should be executed on the DUT, right?
> > I am also having the following problems doing this.
>
> Before the OTG is switched, all data from the DUT needs to be
> retrieved (and set) using the USB connection.
>
> What information you need to set depends on how the sharding works.
>
> The problem, as I see it, is that the slave DUTs have no way to
> declare their IP address to the slave LXC once the OTG port is
> switched. Therefore, you will need to put in a request for the boards
That is the problem I had, and that is why I was trying to run the test
definition on the Android DUT directly to enable adb over TCP/IP and
declare the IP address. As you mentioned below, it is the wrong direction.
> to have static IP addresses declared in the device dictionary. Then
> the OTG can be switched and things become easier because the LXC knows
> the IP address and can simply declare that to the MultiNode API so
> that the master LXC can know which IP matches which node. There are
> already a number of hikey devices with the static_ip device tag and
> you can specify this device tag in your MultiNode test definition.
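For reference, a hedged sketch of how such a tag can be requested per
role in a MultiNode job definition; the role names, device type and
counts are illustrative:
--- sketch ---
protocols:
  lava-multinode:
    roles:
      master:
        device_type: hi6220-hikey
        count: 1
        tags:
        - static-ip
      slave:
        device_type: hi6220-hikey
        count: 2
        tags:
        - static-ip
--- ends ---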
Brilliant, and a brand new idea to me. I didn't realize the static-ip
tag was the solution. I have managed to enable and test adb over TCP/IP
this way (in my local instance). I have attached my test job definition
here in case it is of any help to other LAVA users. The following
definitions are essential.
tags:
- static-ip

reboot_to_fastboot: false

- test:
    namespace: tlxc
    timeout:
      minutes: 10
    protocols:
      lava-lxc:
      - action: lava-test-shell
        request: pre-os-command
        timeout:
          minutes: 2
Thanks,
Chase
>
> >
> > * Without context overriding, the overlay tarball will be applied to
> > the '/system' directory and the test job reported "/system/bin/sh:
>
> Why are you talking about /system ??? MultiNode only operates in a
> POSIX shell - the POSIX shell is in the LXC and each DUT has a
> dedicated LXC. In this use case, MultiNode API calls are only going to
> be made from each LXC. The master LXC sends some information and then
> receives information from test definitions running in each of the
> slave LXCs.
>
> The overlay is to be deployed to the LXC, not the DUT because this is
> an Android system. What the android system does is determined either
> by commands run inside the slave LXC to deploy files (before the OTG
> switch) or commands run inside the master LXC (with knowledge of the
> IP address from the MultiNode API) to execute commands on the DUT over
> TCP/IP.
>
> Use the LXC to deploy the files and boot the device, then to declare
> information about each particular node. Once that is done, whatever
> is controlling the test just needs to use TCP/IP to communicate
> and use the MultiNode API to send messages and allow some nodes to
> wait for other nodes whilst the test proceeds.
>
> > /lava-247856/bin/lava-test-runner: not found"[2].
> > * With the following job context, LAVA still runs
> > '/lava-24/bin/lava-test-runner /lava-24/0' and it hangs there. It was
> > tested in my local LAVA instance; test job definition and test log are
> > attached. Maybe my understanding of the context overriding is wrong; I
> > thought LAVA should execute '/system/lava-24/bin/lava-test-runner
> > /system/lava-24/0' instead. Any suggestions would be appreciated.
> >
> > context:
> > lava_test_sh_cmd: '/system/bin/sh'
> > lava_test_results_dir: '/system/lava-%s'
> >
> > I checked on the DUT directly, '/system/lava-%s' exists, but I cannot
> > really run lava-test-runner. The shebang line seems problematic.
> >
> > --- hacking ---
> > hikey:/system/lava-24/bin # ./lava-test-runner
> > /system/bin/sh: ./lava-test-runner: No such file or directory
> > hikey:/system/lava-24/bin # cat lava-test-runner
> > #!/bin/bash
> >
> > #!/bin/sh
> >
> > ....
> > # /system/bin/sh lava-test-runner
> > lava-test-runner[18]: .: /lava/../bin/lava-common-functions: No such
> > file or directory
> > --- ends ---
> >
> > I had a discussion with Milosz. He proposed a third option which
> > will probably be the most reliable one, but it is not supported in
> > LAVA yet. Here is the idea. Milosz, feel free to explain more.
> >
> > **Option #3**: Add support for accessing multiple DUTs in a single node job.
> >
> > * Physically, we need the DUTs connected via USB cable to the same dispatcher.
>
> I don't see that this solves anything and it adds a lot of unnecessary
> lab configuration - entirely duplicating the point of having ethernet
> connections to the boards. Assign static IP addresses to each board
> and when the test job starts, each dedicated LXC can declare the
> static information according to whichever board was assigned to
> whichever node.
>
> The DUTs only need to be visible to programs running on the master
> node and that can be done by declaring static IP addresses using the
> MultiNode API.
>
> > * In a single node job, LAVA needs to add the DUTs specified (somehow) or
> > assigned randomly (let's say both device type and numbers are defined) to
> > the same lxc container. Test definitions can take over from here.
>
> No - the LXC is used to issue commands to deploy test images to the
> DUT. The LXC is a transparent part of the dispatcher; it is not just
> for test definitions. The LXC cannot be used for multiple test jobs;
> it is part of the one dispatcher.
>
> >
> > Can this be done in LAVA? Can I request the feature? Any
> > suggestions on the possible implementations?
> >
> >
> > Thanks,
> > Chase
> >
> > [1] https://review.linaro.org/#/c/qa/test-definitions/+/29417/4/automated/andro…
> > [2] https://staging.validation.linaro.org/scheduler/job/247856#L1888
> > _______________________________________________
> > Lava-users mailing list
> > Lava-users(a)lists.lavasoftware.org
> > https://lists.lavasoftware.org/mailman/listinfo/lava-users
>
>
>
> --
>
> Neil Williams
> =============
> neil.williams(a)linaro.org
> http://www.linux.codehelp.co.uk/
Dear all,
In the test stage, I have a case where, when it runs, the firmware hits an OOM, so I trigger the device to reboot; then none of the remaining cases can run. I cannot use multiple boot actions when defining the job in advance, because I cannot predict which case will trigger the OOM. What do you suggest doing?
Hello list,
Apologies if this question has been asked already. I have a test framework which spits out a junit file.
What’s the best way to import data from the junit file into LAVA?
Cheers
--
Diego Russo | Staff Software Engineer | Mbed Linux OS
ARM Ltd. CPC1, Capital Park, Cambridge Road, Fulbourn, CB21 5XE, United Kingdom
http://www.diegor.co.uk - https://os.mbed.com/linux-os/
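(One possible pattern for the junit question above - not an official
importer: walk the junit XML and report each testcase through the
lava-test-case helper. The xmlstarlet invocation, the report.xml path
and the assumption that test names contain no whitespace are all
illustrative:)
--- sketch ---
run:
  steps:
    # emit one "name failure-count" line per testcase
    - xmlstarlet sel -t -m '//testcase' -v '@name' -o ' ' -v 'count(failure)' -n report.xml > cases.txt
    # map each line to a LAVA test case result
    - while read -r name failed; do if [ "$failed" = "0" ]; then lava-test-case "$name" --result pass; else lava-test-case "$name" --result fail; fi; done < cases.txt
--- ends ---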
Hello,
we need to perform tests that require reboots of the DUT between their executions. A few examples are checking rootfs upgrades or checking that configuration changes persist.
I have few questions:
* Does LAVA support those cases?
* If yes, does LAVA support multiple reboots?
* If yes, how can I write tests so that a different set of tests runs at each boot?
* Example: 1) do an upgrade 2) reboot the device 3) check if the upgrade was successful
* How can I structure my pipeline? (See the sketch after this message.)
Thanks
--
Diego Russo | Staff Software Engineer | Mbed Linux OS
ARM Ltd. CPC1, Capital Park, Cambridge Road, Fulbourn, CB21 5XE, United Kingdom
http://www.diegor.co.uk - https://os.mbed.com/linux-os/
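(A rough sketch of how such a job could be structured - LAVA pipeline
jobs can chain several boot and test actions, so the reboot happens
between the two test stages. The QEMU device type, URLs, prompts and
test definition names are illustrative placeholders:)
--- sketch ---
actions:
- deploy:
    to: tmpfs
    images:
      rootfs:
        image_arg: -drive format=raw,file={rootfs}
        url: http://example.com/images/rootfs.img.gz
        compression: gz
- boot:
    method: qemu
    media: tmpfs
    prompts:
    - 'root@qemu:'
- test:
    definitions:
    - repository: http://example.com/tests.git
      from: git
      path: do-upgrade.yaml        # stage 1: perform the upgrade
      name: do-upgrade
- boot:                            # reboot into the (upgraded) system
    method: qemu
    media: tmpfs
    prompts:
    - 'root@qemu:'
- test:
    definitions:
    - repository: http://example.com/tests.git
      from: git
      path: check-upgrade.yaml     # stage 2: verify the change persisted
      name: check-upgrade
--- ends ---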