Hi.
To build a customized performance regression test job, I need to control
the LAVA job status.
When I run a test such as kselftest, the LAVA dispatcher runs the test program on the DUT,
and after the LAVA test job finishes, the job state is always 'Complete',
whatever result (pass/fail) each test returns.
The attached image shows the kselftest results from a LAVA test job.
I want the LAVA job status to become 'Fail' or 'Canceled' when a certain test
returns fail.
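What I imagine is a wrapper like this in my test shell (just a sketch; run_case is my hypothetical helper, and I am assuming lava-test-raise from the LAVA overlay is the right way to end a job as Incomplete rather than Complete):

```shell
#!/bin/sh
# Sketch: wrap each kselftest binary so a failure both records a fail
# result and raises a LAVA error, ending the job as Incomplete.
# lava-test-case / lava-test-raise come from the LAVA overlay;
# run_case itself is a hypothetical helper of mine.
run_case() {
    name="$1"
    shift
    if "$@"; then
        lava-test-case "$name" --result pass
    else
        lava-test-case "$name" --result fail
        lava-test-raise "$name-failed"   # aborts the job
    fi
}
```

Each kselftest would then run as, e.g., `run_case timers ./timers-test` (illustrative names).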
I would like to ask your advice.
Best regards
Seoji Kim
Dear Sir/Madam,
Could you please help us analyze the problems encountered in recent LAVA tests?
A detailed log is in the attachment.
lava-dispatcher version: 2018.11+stretch.
The key information is the error: "argument of type 'NoneType' is not iterable".
Chase Qi's preliminary assessment is that this is a LAVA bug. We look forward to your reply.
Thank you for your assistance.
Best Regards,
Caili Liu
Hi all,
Yesterday, we faced a weird issue with LAVA.
A job was running and returned an error saying "metadata is too long".
Right after that, the worker that was running the job went offline, and the
lava-master raised an "unknown exception", making it crash.
In the attachment, you will find the full job error saying the metadata is too
long, the full job log, and the lava-master.log from when the exception occurred.
Hope this helps.
Axel
I use the following command to start the LAVA dispatcher:
docker run -idt --net=host --privileged -v /dev:/dev -v /var/lib/lava/dispatcher/tmp:/var/lib/lava/dispatcher/tmp -e "DISPATCHER_HOSTNAME=--hostname=myname" -e "LOGGER_URL=tcp://master_ip:5555" -e "MASTER_URL=tcp://master_ip:5556" --name test_lava lavasoftware/lava-dispatcher:2019.01
In the container, I start tftp and NFS, but NFS never starts successfully. I use "service nfs-kernel-server start" to start it, and before that I ran "rpcbind".
The startup output below looks OK:
# service nfs-kernel-server start
[ ok ] Exporting directories for NFS kernel daemon....
[ ok ] Starting NFS kernel daemon: nfsd mountd.
But checking the status shows that NFS is still not running:
# service nfs-kernel-server status
nfsd not running
Any suggestion?
Hello,
We've been using uboot-ums for WaRP7, but we've been having intermittent failures when it tries to run dd to flash the image.
While we still need to look into the root cause of this issue properly, we'd like to make the flashing phase a little more reliable.
I have a few questions, coming from different angles:
* LAVA uses dd command to flash the image. Is there a way to specify the usage of bmap-tools?
* let's say dd times out (this is what usually happens). Is there a mechanism to restart the actions (deploy and boot) in case of timeout?
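On the second point, what I am imagining is something along these lines; this is just a sketch, and whether failure_retry also covers our timeout case is exactly my question:

```yaml
# Sketch only: LAVA actions accept a failure_retry parameter which
# re-runs the action on failure; I am unsure it applies to timeouts.
actions:
- deploy:
    failure_retry: 3
    # ... our existing u-boot-ums deploy parameters ...
- boot:
    failure_retry: 3
    # ... our existing boot parameters ...
```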
If you have any other suggestion, let me know!
Cheers
--
Diego Russo | Staff Software Engineer | Mbed Linux OS
ARM Ltd. CPC1, Capital Park, Cambridge Road, Fulbourn, CB21 5XE, United Kingdom
http://www.diegor.co.uk - https://os.mbed.com/linux-os/
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Hi,
I have a test step that requires user input to a defined prompt.
Is there a way I can automate this in LAVA ?
I can see how we do this for Boot Actions, and I've looked at interactive jobs that communicate with U-Boot, but these don't seem to fit my use case.
Thanks.
Pete
Hi All,
I am new to Android testing. I am using standard boards (i.MX6/RPi3) with
Android flashed on them, and I can connect to these boards using the adb tool.
I want to do some system tests using the LAVA framework, and I don't want to
reflash the image every time (I want to test on the existing system). Can someone
please let me know how I can do Android testing on an existing flashed image?
Thanks,
Ankit
Dear Lava users,
Our embedded SW offers 3 boot modes, selectable from u-boot.
When booting, u-boot offers the possibility to select the boot mode:
Select the boot mode
1: <boot mode 1>
2: <boot mode 2>
3: <boot mode 3>
Enter choice: 1: <boot mode 1>
<boot mode 1> is the default value, used after a countdown has expired.
All this is done using the extlinux feature.
We have scripts that select the boot mode using Kermit. Now we'd like to integrate this boot mode selection into a LAVA job, but our current solution is not compatible.
In LAVA we could boot the kernel, modify the extlinux configuration and reboot, but do you know a direct way (with interactive mode, maybe) to select the boot mode from u-boot?
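For instance, I wonder whether an interactive test action along these lines could answer the menu directly (an untested sketch; the prompt string is taken from the menu above, and the field names are my reading of the interactive-jobs docs):

```yaml
# Untested sketch of an interactive action answering the u-boot menu.
- test:
    interactive:
    - name: select-boot-mode
      prompts: ["Enter choice:"]
      script:
      - command: "2"              # pick <boot mode 2>
        name: choose-boot-mode-2
```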
Best regards,
Denis
Hello everyone,
I am having problems with timeouts when using the LAVA multinode protocol. Assume the following minimal pipeline with two nodes (device = DUT, remote = some kind of external hardware interfacing with the DUT):
- deploy:
    role: device
- boot:
    role: device
- deploy:
    role: remote
- boot:
    role: remote
- test:
    role: remote
- test:
    role: device
What I would expect: The device is booted first, then the remote is booted. Afterwards, the tests run on both nodes, being able to interact with each other.
The pipeline model seems to be implemented in a way that each node has its own pipeline. This kind of makes sense, because the tests of course have to run simultaneously.
However, in my case booting the device takes a lot more time than booting the remote. This makes the 'test' stage on the remote run a lot earlier than the 'test' stage on the device.
My problem: How do I define a useful timeout value for the 'test' stage on the remote? Obviously I have to take the boot time difference between the two nodes into account. This seems counter-intuitive to me, since the timeout value should affect the actual test only. What happens if I use an image on the device which takes even longer to boot? Or if I insert more test cases on the device which do not need the remote beforehand? In both cases I would have to adjust the timeout value for the remote 'test' stage.
Is this a design compromise? Or are there any possibilities of synchronizing pipeline stages on different nodes? I am thinking of some mechanism like "do not start 'test' stage on remote before 'boot' stage on device has finished".
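What I am imagining is something like the following sketch using the MultiNode API (assuming lava-send/lava-wait can be used to gate the stages like this):

```yaml
# Sketch: the device node signals when its boot has finished, and the
# remote node blocks on that message before starting its real test work.
# Device node, first step of its test definition (runs right after boot):
run:
  steps:
  - lava-send device-booted

# Remote node, first step of its test definition:
run:
  steps:
  - lava-wait device-booted
```

Of course the wait itself would still consume part of the remote's test timeout, which is the part I am unsure about.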
Mit freundlichen Grüßen / Best regards
Tim Jaacks
DEVELOPMENT ENGINEER
Garz & Fricke GmbH
Tempowerkring 2
21079 Hamburg
Direct: +49 40 791 899 - 55
Fax: +49 40 791899 - 39
tim.jaacks(a)garz-fricke.com
www.garz-fricke.com
WE MAKE IT YOURS!
Sitz der Gesellschaft: D-21079 Hamburg
Registergericht: Amtsgericht Hamburg, HRB 60514
Geschäftsführer: Matthias Fricke, Manfred Garz, Marc-Michael Braun
Hi,
Is there a way to specify an arbitrary board-specific parameter in the
device dictionary? The parameter should be available from the
test shell. The use case is as follows: the DUT is connected to test
supporting hardware (a chamelium board in this case). Tests access the
supporting hardware using an IP address. The IP address is 'static' and the
supporting hardware is assumed to be always turned on. The supporting hardware
is dedicated to a specific DUT and can't be shared (because of the hardware
connections) between other boards of the same type (similar to energy
probes). Tests run directly on the DUT and access the supporting hardware from
there.
I found 'exported parameters' in device dictionary docs:
https://master.lavasoftware.org/static/docs/v2/lava-scheduler-device-dictio….
But they only list device_ip, device_mac and storage_info. Is there a
way to extend this list? If not, is there any other way to provide
device specific information to test shells?
milosz
Hi everyone,
I'm writing this email after discussion with Neil.
I'm working at NXP and he told me Linaro wanted to run functional tests on
imx8m with the new u-boot support.
He told me it requires full open access, with no license click-through or
passwords.
Philippe Mazet is more qualified to answer this type of question as I only
use Android atm. He will follow up the discussion.
Here you have the Yocto source code:
https://source.codeaurora.org/external/imx/imx-manifest/tree/README?h=imx-l…
You can get the latest GA release with this:
repo init -u https://source.codeaurora.org/external/imx/imx-manifest -b imx-linux-sumo -m imx-4.14.78-1.0.0_ga.xml
You can build these sources like this:
DISTRO=fsl-imx-wayland MACHINE=imx8mqevk source ./fsl-setup-release.sh -b build
But this will redirect you to a license click-through.
However, you can bypass the click-through ("auto-accept" it)
with this command:
EULA=1 DISTRO=fsl-imx-wayland MACHINE=imx8mqevk source ./fsl-setup-release.sh -b build
We use this bypass to automate builds.
So, let us know if this would be suitable for Linaro's needs.
Best regards,
Axel
Hello Lava-users,
Do we have support in LAVA for deploying SWUpdate images directly to the
target, if the target supports swupdate image deployment
(http://sbabic.github.io/swupdate/)?
Thanks,
Hemanth.
Hi all,
We're planning to use the TI AM4378 IDK in our board farm for testing (http://www.ti.com/tool/TMDSIDK437X), but it seems there is no device-type for this board. I tried to search in these links:
- https://git.lavasoftware.org/lava/lava/tree/master/lava_scheduler_app/tests…
- https://git.linaro.org/lava/lava-lab.git/tree/shared/device-types
Did I just miss it, or is it missing from the device-types? If it is missing, are there any templates available, or what template could be used as a base for this device?
Br
Olli Väinölä
Hi All,
I have an RPi3 board with Android installed on it; it can be accessed using
adb from my Linux PC.
I am successfully able to use LAVA-LXC for testing a device.
Can someone please share the steps for how I can use LAVA to test an Android
device and what the setup will look like? If anyone could share a test job
with some basic Android tests, that would be very helpful.
Thanks,
Ankit
Hello Lava users,
I'm trying to build a query to monitor all the test jobs run in a given period, say all the jobs run in January, just to check the robustness of my setup (incomplete job rate).
In the query interface, I can add a condition on the start_time of a job, but only with a "greater than" operator, whereas I also want to add a "less than" condition on start_time.
Do you have any hint for doing this?
Best regards,
Denis
Hello,
I have the following setup: a WaRP7 which exposes a network connection over USB gadget driver (http://trac.gateworks.com/wiki/linux/OTG#g_etherGadget)
A possible test case is to have some process running on the LAVA dispatcher (within a LXC container) which targets the WaRP7 over this network interface.
Through LXC I'm able to pass this interface through from the host to the container and use it within the container (via /etc/lxc/default.conf).
If a test requires the reboot of the WaRP7, the usb0 interface disappears from the container. When the WaRP7 boots again the usb0 interface is available on the host (but not in the container).
Things I tried or thought about:
* I tried synchronizing the boots of both the WaRP7 and the LXC container, but it seems it is not possible to "reboot" (restart) a container within the same job execution.
* Is it possible to "restart" a container during a job execution?
* Outside LAVA, it is possible to run a command (lxc-device --name diegor-test -- add usb0) which re-passes the interface through from Linux to the LXC container.
* Is it possible to run the above command at job execution time on the LAVA dispatcher?
How can I solve this situation?
Cheers
--
Diego Russo
Staff Software Engineer - diego.russo(a)arm.com
Direct Tel. no: +44 1223 405920
Main Tel. no: +44 1223 400400
ARM Ltd. CPC1, Capital Park, Cambridge Road, Fulbourn, CB21 5XE, United Kingdom
http://www.diegor.co.uk - http://twitter.com/diegorhttp://www.linkedin.com/in/diegor
Hello,
While searching for possibilities for testing the external interfaces of my DUTs, I found this presentation:
https://wiki.linaro.org/Internal/lava/Lava-lmp?action=AttachFile&do=get&tar…
Is the LAVA LMP project still active? If yes, how can I find information on this?
Mit freundlichen Grüßen / Best regards
Tim Jaacks
DEVELOPMENT ENGINEER
Garz & Fricke GmbH
Tempowerkring 2
21079 Hamburg
Direct: +49 40 791 899 - 55
Fax: +49 40 791899 - 39
tim.jaacks(a)garz-fricke.com
www.garz-fricke.com<http://www.garz-fricke.com/>
WE MAKE IT YOURS!
Sitz der Gesellschaft: D-21079 Hamburg
Registergericht: Amtsgericht Hamburg, HRB 60514
Geschäftsführer: Matthias Fricke, Manfred Garz, Marc-Michael Braun
I'm sorry, as surely this is an FAQ, but I've spent quite a bit of time
troubleshooting and reading. This is very similar to Kevin's thread from
May, subject 'u-boot devices broken after 2018.4 upgrade, strange u-boot
interaction'[1]. In that thread's case, the issue was that interrupt_char
was being set to "\n". My symptoms are the same, but interrupt_char is
set to " " or "d".
I'm running LAVA from the latest released containers (2018.11), and
trying to use a beaglebone-black with a more recent u-boot than exists
in validation.l.o. qemu works fine.
The problem seems to be that LAVA thinks there's a prompt when there
isn't, and so it sends commands too quickly. Here's example output from
the serial console (job link[2]):
U-Boot 2017.07 (Aug 31 2017 - 15:35:58 +0000)
CPU : AM335X-GP rev 2.1
I2C: ready
DRAM: 512 MiB
No match for driver 'omap_hsmmc'
No match for driver 'omap_hsmmc'
Some drivers were not found
MMC: OMAP SD/MMC: 0, OMAP SD/MMC: 1
Net: cpsw, usb_ether
Press SPACE to abort autoboot in 10 seconds
=>
=> setenv autoload no
=> setenv initrd_high 0xffffffff
=> setenv fdt_high 0xffffffff
=> dhcp
link up on port 0, speed 100, full duplex
BOOTP broadcast 1
BOOTP broadcast 2
BOOTP broadcast 3
DHCP client bound to address 10.100.0.55 (1006 ms)
=> 172.28.0.4
Unknown command '172.28.0.4' - try 'help'
=> tftp 0x82000000 57/tftp-deploy-t7xus3ey/kernel/vmlinuz
link up on port 0, speed 100, full duplex
*** ERROR: `serverip' not set
...
When I drive u-boot manually, after I hit SPACE (or 'd'; both work), u-boot
*deletes* the character and then prints '=> ' (is that delete the root
cause?). When LAVA runs, it shows an extra '=>' and starts typing, as seen
above. dhcp takes a second or two, and so the subsequent command starts
to get lost (in the above log we see a bare IP because the 'setenv serverip'
part got lost).
If I set boot_character_delay to something like 1000, it works because it gives
enough time for dhcp to finish before typing the next character, but that
obviously makes the job very slow, and it is still not reliable.
I'm out of ideas.. help?
P.S. Two interesting things I've learned recently:
1) boot_character_delay must be specified in the device-type file; it's
ignored when specified in the device file (surprising, as I see it
listed in some people's device files[3]).
2) If you install ser2net from sid, you can set max-connections and do
some _very handy_ voyeurism on the serial console while LAVA does its
thing (hat tip to Kevin Hilman for that one).
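For reference on point 1, this is the form that worked for me in the device-type template (the value is in milliseconds; 1000 is the slow-but-working value mentioned above):

```jinja
{# In the device-type jinja2 template, not the device dictionary
   (where it is ignored, as noted in point 1).
   1000 ms works around the dhcp delay but makes the job very slow. #}
{% set boot_character_delay = 1000 %}
```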
Thanks,
Dan
[1] https://lists.lavasoftware.org/pipermail/lava-users/2018-May/001064.html
[2] https://lava.therub.org/scheduler/job/57
[3] https://git.linaro.org/lava/lava-lab.git/tree/lkft.validation.linaro.org/ma…
--
Linaro - Kernel Validation
We have 100 test cases; we submit them to LAVA before leaving the office and expect to get all the results the next morning.
If everything is OK, we get 100 results the next morning and check every issue.
But if, for example, the 10th case hits an OOM, I want cases 11 to 100 to keep running during the night. So I want to reboot the device after the 10th case, once I find it hit the OOM, and then continue with cases 11 to 100 after the reboot.
I know the OOM itself is not automation-related, but if I cannot resume cases 11 to 100 during the night, I have to resubmit them the next morning after I get back to the office. Maybe the 100th case also has a bug; I wish it could report a result overnight, so that in the morning I could assign someone to fix it quickly, instead of waiting until I am back in the office to remove the failing 10th case, resubmit, and wait another 8 hours before the 100th case finally executes. We do not want the process to be that inefficient; that is our aim.
------------------------------------------------------------------
From: lava-users-request <lava-users-request(a)lists.lavasoftware.org>
Sent: Friday, 25 January 2019, 16:55
To: lava-users <lava-users(a)lists.lavasoftware.org>
Subject: Lava-users Digest, Vol 5, Issue 37
Send Lava-users mailing list submissions to
lava-users(a)lists.lavasoftware.org
To subscribe or unsubscribe via the World Wide Web, visit
https://lists.lavasoftware.org/mailman/listinfo/lava-users
or, via email, send a message with subject or body 'help' to
lava-users-request(a)lists.lavasoftware.org
You can reach the person managing the list at
lava-users-owner(a)lists.lavasoftware.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Lava-users digest..."
Today's Topics:
1. reboot during test (cnspring2002)
2. Re: AOSP multiple node job (Neil Williams)
3. Re: reboot during test (Neil Williams)
4. Re: AOSP multiple node job (Chase Qi)
----------------------------------------------------------------------
Message: 1
Date: Thu, 24 Jan 2019 20:11:39 +0800
From: cnspring2002 <cnspring2002(a)aliyun.com>
To: lava-users(a)lists.lavasoftware.org
Subject: [Lava-users] reboot during test
Message-ID: <C0C4B61B-7B5A-4DAB-B644-849E77F0119B(a)aliyun.com>
Content-Type: text/plain; charset=us-ascii
Dear all,
In the test stage, I have a case where, when it runs, one firmware hits an OOM, so I trigger the device to reboot; then all the remaining cases cannot run. I cannot use multiple boot actions when defining the job in advance, because I cannot predict which case will cause the OOM. What do you suggest doing?
------------------------------
Message: 2
Date: Thu, 24 Jan 2019 15:06:56 +0000
From: Neil Williams <neil.williams(a)linaro.org>
To: Chase Qi <chase.qi(a)linaro.org>
Cc: lava-users(a)lists.lavasoftware.org
Subject: Re: [Lava-users] AOSP multiple node job
Message-ID:
<CAC6CAR3v_i-vOv4BVe56RHR4PahMihexpNa6v4w44UNb+_PVuw(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
On Thu, 24 Jan 2019 at 11:41, Chase Qi <chase.qi(a)linaro.org> wrote:
>
> Hi,
>
> In most cases, we don't need multiple node job as we can control AOSP
> DUT from lxc via adb over USB. However, here is the use case.
>
> CTS/VTS tradefed-shell --shards option supports to split tests and run
> them on multiple devices in parallel. To leverage the feature in LAVA,
> we need multinode job, right?
If more than one device needs to have images deployed and booted
specifically for this test job, then yes. MultiNode is required. To be
sure that each device is at the same stage (as deploy and boot timings
can vary), the test job will need to wait for all test jobs to be
synchronised to the same point in each test job - synchronisation is
currently restricted to POSIX shells.
> And in multinode job, master-node lxc
> needs access to DUTs from slave nodes via adb over tcpip, right?
Not necessarily. From the LXC, the device can be controlled using USB.
There is no need for devices to have a direct connection to each other
just to use MultiNode. The shards implementation may require that
though.
> Karsten shared a job example here[1]. This probably is the most
> advanced usage of LAVA
All MultiNode is a complex usage of LAVA but VLANd used by the
networking teams is more complex than your use case.
>, and probably also not encouraged? To make it
> more clear, the connectivity should look like this.
There is a problem in this model: every DUT will have its own LXC and
that device will be connected to the LXC using USB.
> master.lxc <----adb over usb----> master.dut
> master.lxc <----adb over tcpip ---> slave1.dut
> master.lxc <----adb over tcpip ---> slave2.dut
Do not separate the LXC from the DUT - the LXC and its DUT are a single node.
Master DUT has a master LXC.
Slave1 DUT has a Slave1 LXC
Slave2 DUT has a Slave2 LXC.
Depending on the boards in use, you may be able to configure each DUT,
including the master DUT, to have TCP/IP networking. That then allows
the processes running in the Master node to access the slave nodes.
(The following model is based on a theoretical device which doesn't
have the crippling USB OTG problem of the hikey - but the hikey can
work in this model if the IP addresses are determined statically and
therefore are available to each slave LXC.)
0: A program executing in the Master LXC which uses USB to send
commands to the master DUT which allow the Master LXC to retrieve the
IP address of the master DUT.
1: That program in the Master LXC then uses the MultiNode API
(lava-send) to declare that IP address to all the slave nodes. This is
equivalent to how existing jobs declare the IP address of the device
when using secondary connections.
2: Each slave node waits for the master-ip-addr message and sets that
value in a program executing in the slave LXC. The slave LXC is
connected to the slave DUT directly using USB so can use this to set
the master IP address, if that is required.
3: Each slave node now runs a program in each slave LXC to connect to
the slave DUT over USB and extract the slave DUT IP address
4: Each slave node then broadcasts that slave-<ID>-ip-addr message, so
the first slave sends slave-1-ip-addr containing an IP address, slave
2 sends slave-2-ip-addr containing a different IP address.
5: The master node is waiting for all of these messages to be sent and
retrieves the values in turn. This information is now available to a
program executing inside the master LXC. This program could use USB to
set these values in the master DUT, if that is required.
6: During this time, all the slave nodes are waiting for the master
node to broadcast another message saying that the work on the master
is complete.
7: Once the master sends the complete message, each slave node picks
up this message from the MultiNode API and the script executing in the
slave LXC then ends the Lava Test Definition and the slave test job
completes.
8: The master can then do some other stuff and then complete.
https://staging.validation.linaro.org/scheduler/job/246447/multinode_defini…
https://staging.validation.linaro.org/scheduler/job/246230/multinode_defini…
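As a rough sketch (not tested here), steps 1 to 5 could look like this in the two Lava Test Definitions, with the command that discovers the DUT IP left as a placeholder:

```yaml
# Master LXC test definition (steps 0, 1 and 5):
run:
  steps:
  - lava-send master-ip-addr ipaddr=$(cat /tmp/master-dut-ip)  # placeholder
  - lava-wait-all slave-ip-addr   # collect the message from every slave

# Each slave LXC test definition (steps 2 to 4):
run:
  steps:
  - lava-wait master-ip-addr
  - lava-send slave-ip-addr ipaddr=$(cat /tmp/slave-dut-ip)    # placeholder
```

Using a single message ID with lava-wait-all collapses the per-slave IDs of step 4 into one; the values are still delivered per sending node.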
Don't obsess about the LXC either. With upcoming changes for docker
support, we could remove the presence of the LXC entirely. The LXC
with android devices only exists as a unit of isolation for the
benefit of the dispatcher. It has useful side effects but the reason
for the LXC to exist is to separate the fastboot operations from the
dispatcher operations.
For hikey and its broken USB OTG support:
0: Each slave test job turns off the USB OTG support once the slave
LXC has deployed all the test image files and determined that the
slave DUT has booted correctly. If not, use lava-test-raise.
1: Next, each slave LXC uses the IP address of its own slave DUT to
check connectivity. If this fails, use lava-test-raise.
2: Each slave LXC uses the MultiNode API to declare the IP address of
the slave DUT (because the slave node has determined that this IP is
working).
3: The master node is waiting for these messages and these are picked
up by the master LXC test definition.
4: The master LXC test definition issues commands to the master DUT -
now depending on how the sharding works, this could be over USB (turn
the USB OTG off later) or over TCP/IP (turn off the master USB OTG at
the start of this test definition).
5: The master DUT has enough information to drive the sharding across
the slave DUTs. The slave LXCs are waiting for the master to finish
the sharding. (lava-wait)
6: When the master LXC determines that the master DUT has finished the
sharding, then the master LXC sends a message to all the slave nodes
that the test is complete.
7: Each slave node picks up the completion message in the slave LXC
and the test definition finishes.
8: The master node can continue to do other tasks or can also complete
its test definition.
> ....
>
> I see two options for adb over tcpip.
>
> Option #1: WiFi. adb over wifi can be enabled easily by issuing adb
> cmds from lxc. I am not using it for two reasons.
Agreed, this doesn't need to rely on WiFi.
>
> * WiFi isn't reliable for long cts/vts test run.
> * In Cambridge lab, WiFi sub-network isn't accessible from lxc
> network. Because of security concerns, there is no plan to change
> that.
>
> Option #2: Wired Ethernet. On devices like hikey, we need to run
> 'pre-os-command' in boot action to power off OTG port so that USB
> Ethernet dongle works. Once OTG port is off, lxc has no access to the
> DUT, then test definition should be executed on DUT, right? I am also
> having the following problems to do this.
Before the OTG is switched, all data from the DUT needs to be
retrieved (and set) using the USB connection.
What information you need to set depends on how the sharding works.
The problem, as I see it, is that the slave DUTs have no way to
declare their IP address to the slave LXC once the OTG port is
switched. Therefore, you will need to put in a request for the boards
to have static IP addresses declared in the device dictionary. Then
the OTG can be switched and things become easier because the LXC knows
the IP address and can simply declare that to the MultiNode API so
that the master LXC can know which IP matches which node. There are
already a number of hikey devices with the static_ip device tag and
you can specify this device tag in your MultiNode test definition.
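As a sketch, requesting such devices in the MultiNode protocol block could look something like this (device-type and counts are illustrative):

```yaml
# Sketch: request devices carrying the static_ip device tag per role.
protocols:
  lava-multinode:
    roles:
      master:
        device_type: hikey   # illustrative
        count: 1
      slave:
        device_type: hikey   # illustrative
        count: 2
        tags:
        - static_ip
```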
>
> * Without context overriding, overlay tarball will be applied to
> '/system' directory and test job reported "/system/bin/sh:
Why are you talking about /system ??? MultiNode only operates in a
POSIX shell - the POSIX shell is in the LXC and each DUT has a
dedicated LXC. In this use case, MultiNode API calls are only going to
be made from each LXC. The master LXC sends some information and then
receives information from test definitions running in each of the
slave LXCs.
The overlay is to be deployed to the LXC, not the DUT because this is
an Android system. What the android system does is determined either
by commands run inside the slave LXC to deploy files (before the OTG
switch) or commands run inside the master LXC (with knowledge of the
IP address from the MultiNode API) to execute commands on the DUT over
TCP/IP.
Use the LXC to deploy the files and boot the device, then to declare
information about each particular node. Once that is done, whatever
thing is controlling the test needs to just use TCP/IP to communicate
and use the MultiNode API to send messages and allow some nodes to
wait for other nodes whilst the test proceeds.
> /lava-247856/bin/lava-test-runner: not found"[2].
> * With the following job context, LAVA still runs
> '/lava-24/bin/lava-test-runner /lava-24/0' and it hangs there. It is
> tested in my local LAVA instance, test job definition and test log
> attached. Maybe my understanding on the context overriding is wrong, I
> thought LAVA should execute '/system/lava-24/bin/lava-test-runner
> /system/lava-24/0' instead. Any suggestions would be appreciated.
>
> context:
> lava_test_sh_cmd: '/system/bin/sh'
> lava_test_results_dir: '/system/lava-%s'
>
> I checked on the DUT directly, '/system/lava-%s' exist, but I cannot
> really run lava-test-runner. The shebang line seems problematic.
>
> --- hacking ---
> hikey:/system/lava-24/bin # ./lava-test-runner
> /system/bin/sh: ./lava-test-runner: No such file or directory
> hikey:/system/lava-24/bin # cat lava-test-runner
> #!/bin/bash
>
> #!/bin/sh
>
> ....
> # /system/bin/sh lava-test-runner
> lava-test-runner[18]: .: /lava/../bin/lava-common-functions: No such
> file or directory
> --- ends ---
>
> I had a discussion with Milosz. He proposed the third option which
> probably will be the most reliable one, but it is not supported in
> LAVA yet. Here is the idea. Milosz, feel free to explain more.
>
> **Option #3**: Add support for accessing to multiple DUTs in single node job.
>
> * Physically, we need the DUTs connected via USB cable to the same dispatcher.
I don't see that this solves anything and it adds a lot of unnecessary
lab configuration - entirely duplicating the point of having ethernet
connections to the boards. Assign static IP addresses to each board
and when the test job starts, each dedicated LXC can declare the
static information according to whichever board was assigned to
whichever node.
The DUTs only need to be visible to programs running on the master
node and that can be done by declaring static IP addresses using the
MultiNode API.
> * In single node job, LAVA needs to add the DUTs specified(somehow) or
> assigned randomly(lets say both device type and numbers defined) to
> the same lxc container. Test definitions can take over from here.
No - the LXC is used to issue commands to deploy test images to the
DUT. The LXC is a transparent part of the dispatcher, it is not just
for test definitions. The LXC cannot be used for multiple test jobs,
it is part of the one dispatcher.
>
> Can this be done in LAVA? Can I request this feature? Any
> suggestions on the possible implementations?
>
>
> Thanks,
> Chase
>
> [1] https://review.linaro.org/#/c/qa/test-definitions/+/29417/4/automated/andro…
> [2] https://staging.validation.linaro.org/scheduler/job/247856#L1888
> _______________________________________________
> Lava-users mailing list
> Lava-users(a)lists.lavasoftware.org
> https://lists.lavasoftware.org/mailman/listinfo/lava-users
--
Neil Williams
=============
neil.williams(a)linaro.org
http://www.linux.codehelp.co.uk/
------------------------------
Message: 3
Date: Thu, 24 Jan 2019 15:09:58 +0000
From: Neil Williams <neil.williams(a)linaro.org>
To: cnspring2002 <cnspring2002(a)aliyun.com>
Cc: lava-users(a)lists.lavasoftware.org
Subject: Re: [Lava-users] reboot during test
Message-ID:
<CAC6CAR3fV+b1p5T2EVxUCPP7d=Erno-20MNVxPaZPVf3tbK3Yg(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
On Thu, 24 Jan 2019 at 12:13, cnspring2002 <cnspring2002(a)aliyun.com> wrote:
>
> Dear all,
>
> In test stage, I have a case, when it run, one firmware OOM. So I trigger the device to reboot, Then all left case cannot run . I can not use multiple boot when define job in advance because I can not predict which case will make OOM, what you suggest doing?
The out of memory killer is a fatal device error. The test job is not
going to be able to continue because the failure mode is
unpredictable.
The cause of the OOM needs to be determined through standard triage,
not automation. (Although automation may help create a data matrix of
working and failing combinations and test operations.)
--
Neil Williams
=============
neil.williams(a)linaro.org
http://www.linux.codehelp.co.uk/
------------------------------
Message: 4
Date: Fri, 25 Jan 2019 15:45:56 +0800
From: Chase Qi <chase.qi(a)linaro.org>
To: Neil Williams <neil.williams(a)linaro.org>
Cc: lava-users(a)lists.lavasoftware.org
Subject: Re: [Lava-users] AOSP multiple node job
Message-ID:
<CADzYPRFJiX8qKt_NyHZCi0qs5iotx0wg0OMN9o7SOi84sYYTow(a)mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi Neil,
Thanks a lot for your guidance. It is really good to see you back :)
On Thu, Jan 24, 2019 at 11:07 PM Neil Williams <neil.williams(a)linaro.org> wrote:
>
> On Thu, 24 Jan 2019 at 11:41, Chase Qi <chase.qi(a)linaro.org> wrote:
> >
> > Hi,
> >
> > In most cases, we don't need multiple node job as we can control AOSP
> > DUT from lxc via adb over USB. However, here is the use case.
> >
> > CTS/VTS tradefed-shell's --shards option supports splitting tests and running
> > them on multiple devices in parallel. To leverage the feature in LAVA,
> > we need a multinode job, right?
>
> If more than one device needs to have images deployed and booted
> specifically for this test job, then yes. MultiNode is required. To be
> sure that each device is at the same stage (as deploy and boot timings
> can vary), the test job will need to wait for all test jobs to be
> synchronised to the same point in each test job - synchronisation is
> currently restricted to POSIX shells.
>
> > And in a multinode job, the master-node lxc
> > needs access to the DUTs on slave nodes via adb over tcpip, right?
>
> Not necessarily. From the LXC, the device can be controlled using USB.
> There is no need for devices to have a direct connection to each other
> just to use MultiNode. The shards implementation may require that
> though.
CTS/VTS sharding splits a run into a given number of independent chunks,
which run on multiple devices connected to the same host. The host
will be the master LXC in our case.
>
> > Karsten shared a job example here[1]. This probably is the most
> > advanced usage of LAVA
>
> All MultiNode is a complex usage of LAVA but VLANd used by the
> networking teams is more complex than your use case.
>
> >, and probably also not encouraged? To make it
> > more clear, the connectivity should look like this.
>
> There is a problem in this model: Every DUT will have its own LXC and
> that device will be connected to the LXC using USB.
>
> > master.lxc <----adb over usb----> master.dut
> > master.lxc <----adb over tcpip ---> slave1.dut
> > master.lxc <----adb over tcpip ---> slave2.dut
>
> Do not separate the LXC from the DUT - the LXC and its DUT are a single node.
>
> Master DUT has a master LXC.
> Slave1 DUT has a Slave1 LXC
> Slave2 DUT has a Slave2 LXC.
>
> Depending on the boards in use, you may be able to configure each DUT,
> including the master DUT, to have TCP/IP networking. That then allows
> the processes running in the Master node to access the slave nodes.
>
Yes, that is what I am trying to do. The connectivity topology I wrote
above is the goal, not the initial state of the LAVA design. The master LXC
needs access to all the DUT nodes, either via USB or tcpip.
> (The following model is based on a theoretical device which doesn't
> have the crippling USB OTG problem of the hikey - but the hikey can
> work in this model if the IP addresses are determined statically and
> therefore are available to each slave LXC.)
>
> 0: A program executing in the Master LXC which uses USB to send
> commands to the master DUT which allow the Master LXC to retrieve the
> IP address of the master DUT.
>
> 1: That program in the Master LXC then uses the MultiNode API
> (lava-send) to declare that IP address to all the slave nodes. This is
> equivalent to how existing jobs declare the IP address of the device
> when using secondary connections.
>
> 2: Each slave node waits for the master-ip-addr message and sets that
> value in a program executing in the slave LXC. The slave LXC is
> connected to the slave DUT directly using USB so can use this to set
> the master IP address, if that is required.
>
> 3: Each slave node now runs a program in each slave LXC to connect to
> the slave DUT over USB and extract the slave DUT IP address
>
> 4: Each slave node then broadcasts that slave-<ID>-ip-addr message, so
> the first slave sends slave-1-ip-addr containing an IP address, slave
> 2 sends slave-2-ip-addr containing a different IP address.
>
> 5: The master node is waiting for all of these messages to be sent and
> retrieves the values in turn. This information is now available to a
> program executing inside the master LXC. This program could use USB to
> set these values in the master DUT, if that is required.
>
> 6: During this time, all the slave nodes are waiting for the master
> node to broadcast another message saying that the work on the master
> is complete.
>
> 7: Once the master sends the complete message, each slave node picks
> up this message from the MultiNode API and the script executing in the
> slave LXC then ends the Lava Test Definition and the slave test job
> completes.
>
> 8: The master can then do some other stuff and then complete.
>
> https://staging.validation.linaro.org/scheduler/job/246447/multinode_defini…
>
> https://staging.validation.linaro.org/scheduler/job/246230/multinode_defini…
>
> Don't obsess about the LXC either. With upcoming changes for docker
> support, we could remove the presence of the LXC entirely. The LXC
> with android devices only exists as a unit of isolation for the
> benefit of the dispatcher. It has useful side effects but the reason
> for the LXC to exist is to separate the fastboot operations from the
> dispatcher operations.
>
> For hikey and its broken USB OTG support:
>
> 0: Each slave test job turns off the USB OTG support once the slave
> LXC has deployed all the test image files and determined that the
> slave DUT has booted correctly. If not, use lava-test-raise.
>
> 1: Next, each slave LXC uses the IP address of its own slave DUT to
> check connectivity. If this fails, use lava-test-raise.
>
> 2: Each slave LXC uses the MultiNode API to declare the IP address of
> the slave DUT (because the slave node has determined that this IP is
> working).
>
> 3: The master node is waiting for these messages and these are picked
> up by the master LXC test definition.
>
> 4: The master LXC test definition issues commands to the master DUT -
> now depending on how the sharding works, this could be over USB (turn
> the USB OTG off later) or over TCP/IP (turn off the master USB OTG at
> the start of this test definition).
>
> 5: The master DUT has enough information to drive the sharding across
> the slave DUTs. The slave LXCs are waiting for the master to finish
> the sharding. (lava-wait)
>
> 6: When the master LXC determines that the master DUT has finished the
> sharding, then the master LXC sends a message to all the slave nodes
> that the test is complete.
>
> 7: Each slave node picks up the completion message in the slave LXC
> and the test definition finishes.
>
> 8: The master node can continue to do other tasks or can also complete
> its test definition.
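As a rough illustration, the message flow in the steps above maps onto the MultiNode API like this in each slave node's test definition (a sketch; the message names and variable are mine, not from the original jobs):

```yaml
# Slave node test definition fragment (sketch)
run:
  steps:
    # Step 2: declare this slave DUT's (static) IP address to the group
    - lava-send slave-1-ip-addr ipaddr=$SLAVE_IP
    # Steps 6-7: block until the master broadcasts that the shard run is done
    - lava-wait sharding-complete
```

The master side mirrors this with a lava-wait for each slave-N-ip-addr message, followed by a final `lava-send sharding-complete` once the sharded CTS/VTS run finishes.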
>
>
> > ....
> >
> > I see two options for adb over tcpip.
> >
> > Option #1: WiFi. adb over wifi can be enabled easily by issuing adb
> > cmds from lxc. I am not using it for two reasons.
>
> Agreed, this doesn't need to rely on WiFi.
>
> >
> > * WiFi isn't reliable for long cts/vts test run.
> > * In Cambridge lab, WiFi sub-network isn't accessible from lxc
> > network. Because of security concerns, there is no plan to change
> > that.
> >
> > Option #2: Wired Ethernet. On devices like hikey, we need to run
> > 'pre-os-command' in the boot action to power off the OTG port so that the
> > USB Ethernet dongle works. Once the OTG port is off, the lxc has no access
> > to the DUT, so the test definition should be executed on the DUT, right? I
> > am also having the following problems doing this.
>
> Before the OTG is switched, all data from the DUT needs to be
> retrieved (and set) using the USB connection.
>
> What information you need to set depends on how the sharding works.
>
> The problem, as I see it, is that the slave DUTs have no way to
> declare their IP address to the slave LXC once the OTG port is
> switched. Therefore, you will need to put in a request for the boards
That is the problem I had, and that is why I was trying to run the test
definition on the Android DUT directly to enable adb over tcpip and
declare the IP address. As you mentioned below, that is the wrong direction.
> to have static IP addresses declared in the device dictionary. Then
> the OTG can be switched and things become easier because the LXC knows
> the IP address and can simply declare that to the MultiNode API so
> that the master LXC can know which IP matches which node. There are
> already a number of hikey devices with the static_ip device tag and
> you can specify this device tag in your MultiNode test definition.
A brilliant and brand-new idea to me; I didn't realize the static-ip tag was
the solution. I have managed to enable and test adb over tcpip this
way (in my local instance). I have attached my test job definition here
in case it is of any help to other LAVA users. The following definitions
are essential:
tags:
- static-ip
reboot_to_fastboot: false
- test:
    namespace: tlxc
    timeout:
      minutes: 10
    protocols:
      lava-lxc:
      - action: lava-test-shell
        request: pre-os-command
        timeout:
          minutes: 2
Thanks,
Chase
>
> >
> > * Without context overriding, overlay tarball will be applied to
> > '/system' directory and test job reported "/system/bin/sh:
>
> Why are you talking about /system ??? MultiNode only operates in a
> POSIX shell - the POSIX shell is in the LXC and each DUT has a
> dedicated LXC. In this use case, MultiNode API calls are only going to
> be made from each LXC. The master LXC sends some information and then
> receives information from test definitions running in each of the
> slave LXCs.
>
> The overlay is to be deployed to the LXC, not the DUT because this is
> an Android system. What the android system does is determined either
> by commands run inside the slave LXC to deploy files (before the OTG
> switch) or commands run inside the master LXC (with knowledge of the
> IP address from the MultiNode API) to execute commands on the DUT over
> TCP/IP.
>
> Use the LXC to deploy the files and boot the device, then to declare
> information about each particular node. Once that is done, whatever
> thing is controlling the test needs to just use TCP/IP to communicate
> and use the MultiNode API to send messages and allow some nodes to
> wait for other nodes whilst the test proceeds.
>
> > /lava-247856/bin/lava-test-runner: not found"[2].
> > * With the following job context, LAVA still runs
> > '/lava-24/bin/lava-test-runner /lava-24/0' and it hangs there. It is
> > tested in my local LAVA instance, test job definition and test log
> > attached. Maybe my understanding on the context overriding is wrong, I
> > thought LAVA should execute '/system/lava-24/bin/lava-test-runner
> > /system/lava-24/0' instead. Any suggestions would be appreciated.
> >
> > context:
> > lava_test_sh_cmd: '/system/bin/sh'
> > lava_test_results_dir: '/system/lava-%s'
> >
> > I checked on the DUT directly, '/system/lava-%s' exist, but I cannot
> > really run lava-test-runner. The shebang line seems problematic.
> >
> > --- hacking ---
> > hikey:/system/lava-24/bin # ./lava-test-runner
> > /system/bin/sh: ./lava-test-runner: No such file or directory
> > hikey:/system/lava-24/bin # cat lava-test-runner
> > #!/bin/bash
> >
> > #!/bin/sh
> >
> > ....
> > # /system/bin/sh lava-test-runner
> > lava-test-runner[18]: .: /lava/../bin/lava-common-functions: No such
> > file or directory
> > --- ends ---
> >
> > I had a discussion with Milosz. He proposed the third option which
> > probably will be the most reliable one, but it is not supported in
> > LAVA yet. Here is the idea. Milosz, feel free to explain more.
> >
> > **Option #3**: Add support for accessing to multiple DUTs in single node job.
> >
> > * Physically, we need the DUTs connected via USB cable to the same dispatcher.
>
> I don't see that this solves anything and it adds a lot of unnecessary
> lab configuration - entirely duplicating the point of having ethernet
> connections to the boards. Assign static IP addresses to each board
> and when the test job starts, each dedicated LXC can declare the
> static information according to whichever board was assigned to
> whichever node.
>
> The DUTs only need to be visible to programs running on the master
> node and that can be done by declaring static IP addresses using the
> MultiNode API.
>
> > * In a single node job, LAVA needs to add the DUTs, specified (somehow) or
> > assigned randomly (let's say both device type and number are defined), to
> > the same lxc container. Test definitions can take over from here.
>
> No - the LXC is used to issue commands to deploy test images to the
> DUT. The LXC is a transparent part of the dispatcher, it is not just
> for test definitions. The LXC cannot be used for multiple test jobs,
> it is part of the one dispatcher.
>
> >
> > Can this be done in LAVA? Can I request the feature? Any
> > suggestions on the possible implementations?
> >
> >
> > Thanks,
> > Chase
> >
> > [1] https://review.linaro.org/#/c/qa/test-definitions/+/29417/4/automated/andro…
> > [2] https://staging.validation.linaro.org/scheduler/job/247856#L1888
> > _______________________________________________
> > Lava-users mailing list
> > Lava-users(a)lists.lavasoftware.org
> > https://lists.lavasoftware.org/mailman/listinfo/lava-users
>
>
>
> --
>
> Neil Williams
> =============
> neil.williams(a)linaro.org
> http://www.linux.codehelp.co.uk/
Dear all,
In the test stage I have a case that, when it runs, causes an OOM in one firmware. So I trigger the device to reboot, and then all the remaining cases cannot run. I cannot define multiple boots in the job in advance, because I cannot predict which case will cause the OOM. What do you suggest doing?
Hello list,
Apologies if this question has been asked already. I have a test framework which spits out a junit file.
What’s the best way to import data from the junit file into LAVA?
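As far as I know there is no built-in junit importer; a common workaround is to replay the junit results as `lava-test-case` calls from inside the test shell. A minimal sketch (the helper name and sample XML are mine, only `lava-test-case <name> --result pass|fail` is standard LAVA):

```python
import xml.etree.ElementTree as ET

def junit_to_lava_commands(junit_xml):
    """Translate junit <testcase> elements into lava-test-case commands."""
    root = ET.fromstring(junit_xml)
    commands = []
    for case in root.iter("testcase"):
        name = case.get("name", "unknown").replace(" ", "-")
        # A <failure> or <error> child means the case did not pass.
        failed = case.find("failure") is not None or case.find("error") is not None
        commands.append("lava-test-case %s --result %s"
                        % (name, "fail" if failed else "pass"))
    return commands

sample = ('<testsuite><testcase name="boot check"/>'
          '<testcase name="net check"><failure message="timeout"/></testcase>'
          '</testsuite>')
for cmd in junit_to_lava_commands(sample):
    print(cmd)  # lava-test-case boot-check --result pass, then net-check fail
```

Running the printed commands inside a LAVA test shell would then record each junit case as an individual LAVA test case.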
Cheers
--
Diego Russo | Staff Software Engineer | Mbed Linux OS
ARM Ltd. CPC1, Capital Park, Cambridge Road, Fulbourn, CB21 5XE, United Kingdom
http://www.diegor.co.uk - https://os.mbed.com/linux-os/
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Hello,
we need to perform tests that require reboots of the DUT between their executions. A few examples are checking rootfs upgrades or checking that configuration changes persist.
I have a few questions:
* Does LAVA support these cases?
* If yes, does LAVA support multiple reboots?
* If yes, how can I write tests so that a different set of tests runs on each boot?
* Example: 1) do an upgrade 2) reboot the device 3) check whether the upgrade was successful
* How can I structure my pipeline?
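For what it's worth, a LAVA v2 job can contain repeated boot and test action blocks, so steps 1-3 of the example could be sketched like this (illustrative only; the deploy/boot methods are assumptions and depend on the device type):

```yaml
actions:
- deploy:            # deploy the image under test (device-specific)
    to: tftp         # assumption: replace with your deploy method
- boot:              # first boot
    method: u-boot   # assumption: replace with your boot method
- test:              # 1) run the upgrade
    definitions: []  # placeholder for your upgrade test definition
- boot:              # 2) reboot the device
    method: u-boot
- test:              # 3) check that the upgrade was successful
    definitions: []  # placeholder for your verification tests
```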
Thanks
--
Diego Russo | Staff Software Engineer | Mbed Linux OS
ARM Ltd. CPC1, Capital Park, Cambridge Road, Fulbourn, CB21 5XE, United Kingdom
http://www.diegor.co.uk - https://os.mbed.com/linux-os/
>
>
> When LAVA starts or finishes the test job, the squad zmq SUB socket seems
> not to receive log messages from lava-server.
>
> It just prints out '[2019-01-22 08:26:24 +0000] [DEBUG]
> nexell.lava.server2: connected to tcp://192.168.1.20:2222:5500'.
>
The URL is really strange, as two port numbers are specified. It should be
"tcp://192.168.1.20:5500".
Rgds
--
Rémi Duraffort
LAVA Team, Linaro
Dear Lava users,
I'm trying to create a job that performs the following steps :
Deploy to flasher our custom tool (OK)
Boot in bootloader mode - u-boot (OK)
Test in interactive mode (OK)
From interactive mode start kernel (OK)
Launch a test from kernel (KO)
The last part fails because I don't know where to declare the kernel prompt. I cannot add a boot stage with an additional prompt, because I don't want to reboot; that would reset the configuration done in interactive mode.
Do you have any piece of advice or example of job showing how to proceed to manage the kernel prompt & autologin in such a job?
Best regards,
Denis
Hello,
A request from some Lava users internally.
We have 3 boot stages, TF-A, U-boot, and kernel.
Is there a way, in a Lava job, to test that these components' versions are the expected ones?
That would mean, as far as I understand, not testing the embedded software itself, but the Lava job log...
Best regards,
Denis
Two questions here:
1. For a master-only instance, how do we upgrade without data loss?
2. For a slave-only instance, how do we upgrade without data loss?
Please advise, thanks.
Hello,
the lava-dispatcher package installs a file in /etc/modules-load.d/lava-modules.conf, containing the following lines:
install ipv6 modprobe ipv6
install brltty /bin/false
On my debian 9.3 system, this file leads to failures when the systemd-modules-load service starts:
Job for systemd-modules-load.service failed because the control process exited with error code.
See "systemctl status systemd-modules-load.service" and "journalctl -xe" for details.
The details in journalctl say:
Jan 07 15:08:30 a048 systemd[1]: Starting Load Kernel Modules...
-- Subject: Unit systemd-modules-load.service has begun start-up
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit systemd-modules-load.service has begun starting up.
Jan 07 15:08:30 a048 systemd-modules-load[2088]: Failed to find module 'install ipv6 modprobe ipv6'
Jan 07 15:08:30 a048 systemd-modules-load[2088]: Failed to find module 'install brltty /bin/false'
Jan 07 15:08:30 a048 systemd[1]: systemd-modules-load.service: Main process exited, code=exited, status=1/FAILURE
Jan 07 15:08:30 a048 systemd[1]: Failed to start Load Kernel Modules.
Obviously the "install" syntax is supported only in /etc/modprobe.d:
https://manpages.debian.org/stretch/kmod/modprobe.d.5.en.html
But not in /etc/modules-load.d:
https://manpages.debian.org/stretch/systemd/modules-load.d.5.en.html
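For comparison, assuming the intent of those directives is to be kept, this is where each syntax would be valid (the modprobe.d file name here is illustrative):

```
# /etc/modprobe.d/lava.conf -- "install" directives belong in modprobe.d
install ipv6 modprobe ipv6
install brltty /bin/false

# /etc/modules-load.d/lava-modules.conf -- only bare module names are accepted
ipv6
```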
Is this a known issue?
Mit freundlichen Grüßen / Best regards
Tim Jaacks
DEVELOPMENT ENGINEER
Garz & Fricke GmbH
Tempowerkring 2
21079 Hamburg
Direct: +49 40 791 899 - 55
Fax: +49 40 791899 - 39
tim.jaacks(a)garz-fricke.com
www.garz-fricke.com
WE MAKE IT YOURS!
Sitz der Gesellschaft: D-21079 Hamburg
Registergericht: Amtsgericht Hamburg, HRB 60514
Geschäftsführer: Matthias Fricke, Manfred Garz, Marc-Michael Braun
Folks,
We need to run some tests which require a process off the DUT to be executed.
Examples are:
* port scan like process against the DUT
* cli tests which interact with the DUT
The workflow could be like:
* Deploy the DUT
* Boot the DUT
* Run a process off DUT (actual test)
* Collect test results
In our setup we assume there is always a network connection between the DUT and the LAVA dispatcher.
How can we achieve such workflow with LAVA?
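One pattern that may fit (a sketch; the role names and device types are assumptions) is a MultiNode job with the DUT in one role and a dispatcher-side LXC in another; the DUT declares its IP via lava-send, and the LXC role runs the port scan or CLI tests against it:

```yaml
protocols:
  lava-multinode:
    roles:
      dut:
        device_type: my-board   # assumption: your DUT device type
        count: 1
      host:
        device_type: lxc        # container on the dispatcher, runs the off-DUT tests
        count: 1
```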
Cheers
--
Diego Russo | Staff Software Engineer | Mbed Linux OS
ARM Ltd. CPC1, Capital Park, Cambridge Road, Fulbourn, CB21 5XE, United Kingdom
http://www.diegor.co.uk - https://os.mbed.com/linux-os/
Dear all,
I need to get job logs using lavacli. To do that, I use the following command: lavacli jobs logs <job_id>
I can perform this operation with a user that has a "superuser" profile.
We also have users with restricted permissions (they can only submit jobs using lavacli), and I would like them
to be able to get logs using lavacli.
So I have consulted the user permissions that can be set in the LAVA Django administration page, but I could not find
which option would allow a "basic user" to get job logs using lavacli.
Is it possible to allow a user with a "basic profile" to do that?
If yes, which user permission has to be set?
Best regards
Philippe Begnic
I just used the sample command "python zmq_client.py -j 357 --hostname tcp://127.0.0.1:5500 -t 1200"
and got this error:
Traceback (most recent call last):
File "zmq_client_1.py", line 155, in <module>
main()
File "zmq_client_1.py", line 139, in main
publisher = lookup_publisher(options.hostname, options.https)
File "zmq_client_1.py", line 109, in lookup_publisher
socket = server.scheduler.get_publisher_event_socket()
File "/usr/lib/python3.5/xmlrpc/client.py", line 1092, in __call__
return self.__send(self.__name, args)
File "/usr/lib/python3.5/xmlrpc/client.py", line 1432, in __request
verbose=self.__verbose
File "/usr/lib/python3.5/xmlrpc/client.py", line 1134, in request
return self.single_request(host, handler, request_body, verbose)
File "/usr/lib/python3.5/xmlrpc/client.py", line 1146, in single_request
http_conn = self.send_request(host, handler, request_body, verbose)
File "/usr/lib/python3.5/xmlrpc/client.py", line 1259, in send_request
self.send_content(connection, request_body)
File "/usr/lib/python3.5/xmlrpc/client.py", line 1289, in send_content
connection.endheaders(request_body)
File "/usr/lib/python3.5/http/client.py", line 1103, in endheaders
self._send_output(message_body)
File "/usr/lib/python3.5/http/client.py", line 934, in _send_output
self.send(msg)
File "/usr/lib/python3.5/http/client.py", line 877, in send
self.connect()
File "/usr/lib/python3.5/http/client.py", line 849, in connect
(self.host,self.port), self.timeout, self.source_address)
File "/usr/lib/python3.5/socket.py", line 694, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
File "/usr/lib/python3.5/socket.py", line 733, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
I wonder, for the following code, why http://tcp://127.0.0.1:5500/RPC2 is what finally gets passed to ServerProxy.
If I remove the http part, leaving tcp://127.0.0.1:5500/RPC2, it says "OSError: unsupported XML-RPC protocol".
If I remove the tcp part, leaving http://127.0.0.1:5500/RPC2, it hangs in get_publisher_event_socket.
I am using version 2018.11, please advise.
xmlrpc_url = "http://%s/RPC2" % (hostname)
if https:
    xmlrpc_url = "https://%s/RPC2" % (hostname)
server = xmlrpc.client.ServerProxy(xmlrpc_url)
try:
    socket = server.scheduler.get_publisher_event_socket()
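For what it's worth, the traceback is consistent with the ZMQ endpoint having been passed where the script expects the server's web hostname. A quick sketch of the URL construction (`build_url` here is a stand-in for the snippet above):

```python
from urllib.parse import urlsplit

# Stand-in for the URL construction in the snippet above.
def build_url(hostname, https=False):
    return ("https://%s/RPC2" if https else "http://%s/RPC2") % hostname

# Passing the ZMQ endpoint as the hostname yields a malformed URL:
url = build_url("tcp://127.0.0.1:5500")
print(url)  # http://tcp://127.0.0.1:5500/RPC2

# The HTTP client then takes everything up to the first ':' as the host,
# so it tries to resolve the literal name "tcp" -- which is what produces
# "socket.gaierror: Name or service not known" in the traceback.
print(urlsplit(url).hostname)  # tcp
```

So the `--hostname` value appears to need the web host (e.g. `localhost`), from which the script looks up the publisher socket via XML-RPC.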
I am trying to submit test jobs using SQUAD.
When SQUAD fetches the results from LAVA, the interval is too long.
It takes at least 1 hour, and I want to reduce the fetch time as much as possible.
This is what the SQUAD team answered:
Alternatively you can turn on ZMQ notifications in LAVA and run squad
listener. This will cause test results to be fetched immediately after
the test job finishes in LAVA. Enabling ZMQ publisher:
https://master.lavasoftware.org/static/docs/v2/advanced-installation.html#c….
SQUAD listener is a separate process. It doesn't need any additional
settings on top of what you already have.
So I want to try restarting the lava-publisher service.
I am running lava-server and the dispatcher using the Linaro lava docker image.
In the docker container, there is no service named lava-publisher.
How can I manage this?
Thanks.
Hello,
Our LAVA deployment has both RPi3 B and B+ boards, and we are interested only in the 32-bit version. The device type to use looks like:
https://git.lavasoftware.org/lava/lava/blob/master/lava_scheduler_app/tests…
Does this device type cover both B and B+? I mean, can we use it for B+ as well?
If yes, what's the best way to differentiate them in LAVA? Creating a new device type for the B+ (which would be the same as the above) or using tags?
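If the two boards behave the same for your tests, tags are usually the lighter option: set a tag on the B+ devices in the admin UI and request it in the job. A sketch (the tag name, and the exact device-type name, are assumptions on my part):

```yaml
# Job definition fragment (sketch): request a B+ board via a tag
device_type: bcm2837-rpi-3-b-32   # assumption: the 32-bit RPi3 type linked above
tags:
- rpi3bplus                       # illustrative tag set on the B+ devices
```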
Thanks
--
Diego Russo | Staff Software Engineer | Mbed Linux OS
ARM Ltd. CPC1, Capital Park, Cambridge Road, Fulbourn, CB21 5XE, United Kingdom
http://www.diegor.co.uk - https://os.mbed.com/linux-os/
Hi,
I've been using the lava-server docker image (hub.lavasoftware.org/lava/lava/lava-server:2018.10) for a couple of weeks, and I had some problems making the data persistent for postgres (mapping the volume from host to container). So I decided to take postgres out of the startup by modifying the entrypoint.sh file:
++++++++++++++ Added these lines after start_lava_server_gunicorn -function ++++++++++++++++++++++++++
if [[ ! -z $DJANGO_POSTGRES_SERVER ]]; then
    txt="s/LAVA_DB_SERVER=\"localhost\"/LAVA_DB_SERVER=\"$DJANGO_POSTGRES_SERVER\"/g"
    sed -i $txt /etc/lava-server/instance.conf
fi
if [[ ! -z $DJANGO_POSTGRES_PORT ]]; then
    txt="s/LAVA_DB_PORT=\"5432\"/LAVA_DB_PORT=\"$DJANGO_POSTGRES_PORT\"/g"
    sed -i $txt /etc/lava-server/instance.conf
fi
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
---------------------------- Commented out these lines -----------------------------
# Start all services
#echo "Starting postgresql"
#/etc/init.d/postgresql start
#echo "done"
#echo
#echo "Waiting for postgresql"
#wait_postgresql
#echo "[done]"
#echo
-------------------------------------------- ----------------------------------------------------
After that I created a new Dockerfile and built a new image:
+++++++++++++++++++++++++ Dockerfile +++++++++++++++++++++++
FROM hub.lavasoftware.org/lava/lava/lava-server:2018.10
COPY ./entrypoint.sh /root/
RUN chmod 755 /root/entrypoint.sh
ENTRYPOINT ["/root/entrypoint.sh"]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> sudo docker build -t lava-server:mod .
Just to get all up and running I made a docker-compose.yml file to kick postgres and lava-server up
+++++++++++++++++++++++++++ docker-compose.yml +++++++++++++++++++++++++
version: '3'
services:
  postgres:
    image: postgres:11
    restart: always
    environment:
      POSTGRES_DB: lavaserver
      POSTGRES_USER: lavaserver
      POSTGRES_PASSWORD: d3e5d13fa15f
  lava-server:
    depends_on:
      - postgres
    image: lava-server:mod
    environment:
      DJANGO_POSTGRES_PORT: 5432
      DJANGO_POSTGRES_SERVER: postgres
    ports:
      - "80:80"
I still feel that there is too much going on inside that lava-server:mod image, since it has the following software running:
* lava-master
* Gunicorn
* Logger
* Publisher
What do you think, should I still break it into smaller pieces? The pros would be that the services wouldn't die silently (they are started using '&', and docker could try to restart them), each log would appear in its own window (docker log CONTAINER), and the containers themselves would be more configurable via environment variables.
Br
Olli Väinölä
On Wed, 16 Jan 2019 at 17:33, Steve McIntyre <steve.mcintyre(a)linaro.org> wrote:
>
> Hi,
>
> In founding the LAVA Software Community Project, the team planned to
> open up LAVA development more. As already announced by Neil in
> September, we have already moved our infrastructure to a GitLab
> instance and LAVA developers and users can collaborate there. [1]
>
> The next step in our process is to also open our regular development
> design meetings to interested developers. The LAVA design meeting is
> where the team gets together to work out deep technical issues, and to
> agree on future development goals and ideas. We run these as a weekly
> video conference using Google Hangouts Meet [2], We now wish to
> welcome other interested developers to join us there too, to help us
> develop LAVA.
Steve, the only missing bit is which day and what time? :)
milosz
>
> Summaries of the meetings will be posted regularly to the lava-devel
> mailing list [3], and we encourage interested people to subscribe and
> discuss LAVA development there.
>
> [1] https://git.lavasoftware.org/
> [2] https://meet.google.com/qre-rgen-zwc
> [3] https://lists.lavasoftware.org/mailman/listinfo/lava-devel
>
> Cheers,
> --
> Steve McIntyre steve.mcintyre(a)linaro.org
> <http://www.linaro.org/> Linaro.org | Open source software for ARM SoCs
> _______________________________________________
> Lava-announce mailing list
> Lava-announce(a)lists.lavasoftware.org
> https://lists.lavasoftware.org/mailman/listinfo/lava-announce
Hi everyone,
I want to change the default path of MEDIA_ROOT and ARCHIVE_ROOT.
So in /etc/lava-server/settings.conf I wrote this:
"MEDIA_ROOT": "/data/lava/var/lib/lava-server/default/media",
"ARCHIVE_ROOT": "/data/lava/var/lib/lava-server/default/archive",
This seems to work, but the web UI doesn't display the job output, and when I
try to download the plain log, it returns an error saying it can't be found.
Could you tell me what I should do to make the web UI find the job outputs?
Best regards,
Axel
In the thread about Git Authentication a solution was proposed using git credentials and the fact that the dispatcher is running as root.
See: https://lists.lavasoftware.org/pipermail/lava-users/2018-December/001455.ht…
I've worked out that even though the dispatcher is running as root, the environment is purged based upon env.yaml that is sent over from the master.
I found that I had to add HOME=/root into env.yaml on the master for the git clone to pick up the files in the /root folder.
Hope this helps anyone else trying to do this.
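For reference, the resulting override in the master's env.yaml would look something like this (a sketch; only the HOME line is the change described above, and the `overrides` layout follows the dispatcher environment file format):

```yaml
overrides:
  HOME: /root
```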
Pete
Dear Remi,
I ran those commands you suggested in lava shell and it printed as below :
Python 3.5.3 (default, Jan 19 2017, 14:11:04)
[GCC 6.3.0 20170118] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from django.contrib.sites.models import Site
>>> Site.objects.all()
<QuerySet [<Site: 192.168.100.103>]>
>>> Site.objects.count()
1
>>>
Dear all,
I encountered a login issue. I changed the domain name and display name to an IP address, but now when I try to log in again I can't reach the login page, and it reports as below:
500 Internal Server Error
Site matching query does not exist.
Oops, something has gone wrong!
I then set 'DEBUG = True' in settings.conf, and it reported the following:
DoesNotExist at /accounts/login/
Site matching query does not exist.
Request Method: GET
Request URL: http://127.0.0.1:8000/accounts/login/?next=/scheduler/job/42
Django Version: 1.11.14
Exception Type: DoesNotExist
Exception Value: Site matching query does not exist.
Exception Location: /usr/lib/python3/dist-packages/django/db/models/query.py in get, line 380
Python Executable: /usr/bin/python3
Python Version: 3.5.3
Python Path:
['/',
 '/usr/bin',
 '/usr/lib/python35.zip',
 '/usr/lib/python3.5',
 '/usr/lib/python3.5/plat-x86_64-linux-gnu',
 '/usr/lib/python3.5/lib-dynload',
 '/usr/local/lib/python3.5/dist-packages',
 '/usr/local/lib/python3.5/dist-packages/icsectl-0.2-py3.5.egg',
 '/usr/lib/python3/dist-packages']
So I just want to know: how can I set it back?
Yours sincerely,
Su Chuan
Hello LAVA mailing list,
I see the RPI3 is supported by LAVA, but unfortunately that support doesn't fit the Mbed Linux OS (MBL) case. Our build system produces a WIC image which can be written to the SD card directly (instructions here: https://os.mbed.com/docs/linux-os/v0.5/getting-started/writing-and-booting-…).
This is needed for two main reasons:
* MBL has its own booting flow
* MBL expects a partition layout (not only the rootfs) with multiple partitions in order to work.
What's the best way to automate the deployment of MBL on the RPI3 via LAVA in order to run our tests?
Cheers
--
Diego Russo | Staff Software Engineer | Mbed Linux OS
ARM Ltd. CPC1, Capital Park, Cambridge Road, Fulbourn, CB21 5XE, United Kingdom
http://www.diegor.co.uk - https://os.mbed.com/linux-os/
https://fosdem.org/2019/
(No registration is necessary for FOSDEM - if you can get to Brussels,
you are welcome to just turn up on site. Avoid trying to park near the
site unless you get there very early, use public transport.)
https://fosdem.org/2019/practical/
https://fosdem.org/2019/stands/
If you haven't been to FOSDEM before, note that there will be a huge
number of people attending FOSDEM (well over 8,000 are expected), so
the only real chance of meeting up is to have a known meeting point.
Hopefully, you'll have a chance to get to some of the 700 sessions
which have been arranged over the 2 days.
Linaro will have a stand at FOSDEM 2019 in the AW building. The stand
will be on the ground floor, along the corridor from AW120 near
MicroPython and PINE64. Various Linaro and LAVA people will be around
the stand during the event, so this can act as a focal point for
anyone wanting to talk about Linaro and/or LAVA during FOSDEM.
https://fosdem.org/2019/schedule/room/aw1120/
The AW building is near the car park, across from Janson and the H building.
Various Linaro and LAVA people have been routinely attending FOSDEM
for a number of years. If you are able to get there, we will be happy
to see you.
--
Neil Williams
=============
neil.williams(a)linaro.org
http://www.linux.codehelp.co.uk/
Hi lava team,
I want to use LAVA to run shell scripts on several devices which are connected to the LAVA server via SSH.
Is there a way to deploy the overlay to a directory other than "/"?
Example:
The overlay is downloaded to /lava-123.tar.gz and extracted to /lava-123 by default.
What if my rootfs "/" is read-only and I want to download and extract the overlay to /tmp or /data, which are mounted read-write?
Best regards,
Christoph
Hi,
I have some test files stored in a private GIT repository. As I understand it there is no easy way to get the authentication passed through to the Test Definition, so I am looking for other options.
Is there a way that I can write the Test Definition to reference the files elsewhere?
I can bundle them into a tarball or zipfile and have them served via http without the need for authentication, but I cannot figure out how to describe this in the Test Definition.
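One pattern that avoids repository authentication entirely is to fetch the tarball in the install steps of the test definition. A sketch, assuming the Lava-Test Test Definition 1.0 format; fileserver.example.com, tests.tar.gz, and run.sh are all hypothetical names:

```yaml
metadata:
  format: Lava-Test Test Definition 1.0
  name: tarball-tests
  description: "Fetch the test files over plain HTTP instead of git"
install:
  steps:
    # Served without authentication from any web server you control.
    - wget http://fileserver.example.com/tests.tar.gz
    - tar -xzf tests.tar.gz
run:
  steps:
    - cd tests
    - lava-test-case smoke --shell ./run.sh
```

The definition itself can then live in a small public repository, or be supplied inline in the job definition, so nothing in the flow needs the private credentials.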
Thanks.
Pete
Hello,
What is the rationale for ignoring individual lava-test-case results when running a health check job?
For example, the following job failed one case: https://validation.linaro.org/scheduler/job/1902316
A failing test case can be an indication of device malfunction (e.g. out of disk space or hardware issues). Is it possible to force LAVA to fail a health check, and thus put the device in a "bad" state, if one of the test cases is not successful?
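If your LAVA version ships the lava-test-raise helper on the DUT (check with `command -v lava-test-raise` from a test shell), a health-check test can end the whole job with an error instead of merely recording a failed case; an Incomplete health check marks the device as bad. A hedged sketch, where the disk-space check and the 1 GiB threshold are just example conditions:

```shell
# Example health-check step: escalate a detected fault to a job error.
avail_kb=$(df --output=avail / | tail -n 1)
echo "available on /: ${avail_kb} KiB"
if [ "${avail_kb}" -lt 1048576 ]; then
    # lava-test-raise (assumption: present in your LAVA version) ends
    # the job as Incomplete, so the health check fails and the device
    # goes into the "bad" state.
    lava-test-raise "less than 1 GiB free on /"
fi
```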
Thanks,
Andrei Narkevitch
Cypress Semiconductors
This message and any attachments may contain confidential information from Cypress or its subsidiaries. If it has been received in error, please advise the sender and immediately delete this message.
After setting up a master and a worker, I added the worker to the master, and also added a device to the worker.
Then I submitted a job, but the web page just shows the job stuck in the "Submitted" state.
The same configuration seems to work on my local machine (with master and worker on the same machine).
I cannot find the root cause: the device matching the job's device-type is idle and its health is good.
For this kind of issue, is there any way to debug? I mean, is there any log where I can find why LAVA does not dispatch the job on its 20-second scheduling poll?
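For LAVA of that era (2018/2019 Debian packages), the scheduler's decisions normally land in the lava-master log on the master, and the worker's connection state in the lava-slave log on the worker; the paths below are the packaged defaults and may differ on your install. A sketch that peeks at whichever of the two is present on the current host:

```shell
# lava-master.log (on the master) records each scheduling pass and why
# a job was or was not assigned; lava-slave.log (on the worker) shows
# the ZMQ link to the master. Default Debian-package paths assumed.
for f in /var/log/lava-server/lava-master.log \
         /var/log/lava-dispatcher/lava-slave.log; do
    if [ -f "$f" ]; then
        echo "=== last activity in $f ==="
        tail -n 50 "$f"
    else
        echo "$f not found on this host (expected on the other machine)"
    fi
done
```

If both logs look healthy, also check that the device on the master is actually assigned to that worker, and that the worker's DISPATCHER_HOSTNAME matches the name the master knows it by.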
Hello,
please keep the mailing-list in copy when replying.
Thanks for your help. I will discuss two questions (1. change site
> issue; 2. secure boot mode) in this reply.
> 1) Change site issue: We changed both the 'domain name' and the
> 'display name' (example.com) to an IP address, and now we find that we
> can't log in. When we press the login link on the home page, it
> reports as below:
>
>
> 500 Internal Server Error: Site matching query does not exist.
>
> Oops, something has gone wrong!
>
That's really strange. Could you update /etc/lava-server/settings.conf to
set "DEBUG" to true, restart lava-server-gunicorn, refresh the page and
send back the stack trace.
Keep in mind that settings.conf is a json file, so check that the syntax is
valid json.
> 2) Secure boot mode: at present, we solved it by using the
> 'ramdisk' keyword and forcing it not to decompress the uTee file, but we
> don't know whether we will need this keyword for something else in the future, so maybe it's
> not a good solution. If we find a better solution after we review the
> source code, we will submit it!
>
Right.
Rgds
--
Rémi Duraffort
Dear all,
I'm trying to boot my devices under test in secure boot mode, which means I have to download the dtb, kernel, and uTee files and set the NFS server IP to mount the rootfs in U-Boot; so there's an extra uTee file to deploy. I checked the keywords in the deploy steps and found none for this action. We only found 'image', 'ramdisk', 'dtb', and 'rootfs' in job_templates.py under the lava_server directory. So what should I do to add this action to the deploy?