Leveling up snapd integration tests

Over the last several months there has been noticeable and growing pain associated with the evolving integration tests around snapd, and given the project goal of being a cross-distribution platform, we are very keen on solving this problem appropriately so that stability is guaranteed everywhere.

With that mindset a more focused effort was made over the last few weeks to produce a tool that can get the project out of those problems, and onto a runway of more pleasant stability. Despite the short amount of time, I’m very happy about the Spread project which resulted from this effort.

Spread is not Jenkins or Travis, and is not a language or library either. Spread is a tool that will very conveniently ship your code to one or more systems, in parallel, and then offer the right set of options so you can run whatever you need to run to make sure the logic is working, and drive it all from the local system. That implies you can run Spread inside Travis, Jenkins, or your terminal, in a similar way to how your unit tests work.

Here is a short list of interesting facts about Spread:

  • Full-system tests with on demand machine allocation.
  • Multi-backend with Linode and LXD (for local runs) out of the box for now.
  • Multi-language since it can run arbitrary remote code.
  • Agent-less and driven via embedded ssh (kudos to Go team).
  • Convenient harness with project+backend+suite+test prepare and restore scripts.
  • Variants feature for test duplication without copy & paste.
  • Great debugging support – add -debug and stop with a shell inside every failure.
  • Reuse of servers – server allocation is fast, but not allocating is faster.
  • Reasonable test outputs with the shell’s +x mode on failures.
  • … and so forth.

This is all well documented, so I’ll just provide one example here to offer a real taste of how the system feels like.

This is spread.yaml, put in the project root to define the basics:

project: spread

backends:
    lxd:
        systems:
            - ubuntu-16.04
            - ubuntu-14.04

path: /home/test

prepare: |
    echo Entering project...
restore: |
    echo Leaving project...

suites:
    tests/: 
        summary: Integration tests
        prepare: |
            echo Entering suite...
        restore: |
            echo Leaving suite...

The suite name is also the path under which the tests are found.

Then, this is tests/hello/task.yaml:

summary: Greet the world
prepare: |
    echo "Entering task..."
restore: |
    echo "Leaving task..."
environment:
    FOO/a: one
    FOO/b: two
execute: |
    echo "Hello world!"
    [ $FOO = one ] || exit 1

The outcome should be almost obvious (intended feature :-). The one curious detail here is the FOO/a and FOO/b environment variables. This is how to introduce variants, which means this one test will in fact become two: first with FOO=one, and then with FOO=two. Now consider that such environment variables can be defined at any level – project, backend, suite, and task – and imagine how easy it is to test small variations without any copy & paste. After cascading takes place (project→backend→suite→task) all environment variables using a given variant key will be present at once on the same execution.

Now let’s try to run this configuration, including the -debug flag so we get a shell on the failures. Note how with a single test we get four different jobs, two variants over two systems, with the variant b failing as instructed:

$ spread -debug

2016/06/11 19:09:27 Allocating lxd:ubuntu-14.04...
2016/06/11 19:09:27 Allocating lxd:ubuntu-16.04...
2016/06/11 19:09:41 Waiting for LXD container to have an address...
2016/06/11 19:09:43 Waiting for LXD container to have an address...
2016/06/11 19:09:44 Allocated lxd:ubuntu-14.04.
2016/06/11 19:09:44 Connecting to lxd:ubuntu-14.04...
2016/06/11 19:09:48 Allocated lxd:ubuntu-16.04.
2016/06/11 19:09:48 Connecting to lxd:ubuntu-16.04...
2016/06/11 19:09:52 Connected to lxd:ubuntu-14.04.
2016/06/11 19:09:52 Sending project data to lxd:ubuntu-14.04...
2016/06/11 19:09:53 Connected to lxd:ubuntu-16.04.
2016/06/11 19:09:53 Sending project data to lxd:ubuntu-16.04...

2016/06/11 19:09:54 Error executing lxd:ubuntu-14.04:tests/hello:b :
-----
+ echo Hello world!
Hello world!
+ [ two = one ]
+ exit 1
-----

2016/06/11 19:09:54 Starting shell to debug...

lxd:ubuntu-14.04 ~/tests/hello# echo $FOO
two
lxd:ubuntu-14.04 ~/tests/hello# cat /etc/os-release | grep ^PRETTY
PRETTY_NAME="Ubuntu 14.04.4 LTS"
lxd:ubuntu-14.04 ~/tests/hello# exit
exit

2016/06/11 19:09:55 Error executing lxd:ubuntu-16.04:tests/hello:b :
-----
+ echo Hello world!
Hello world!
+ [ two = one ]
+ exit 1
-----

2016/06/11 19:09:55 Starting shell to debug...

lxd:ubuntu-16.04 ~/tests/hello# echo $FOO
two
lxd:ubuntu-16.04 ~/tests/hello# cat /etc/os-release | grep ^PRETTY
PRETTY_NAME="Ubuntu 16.04 LTS"
lxd:ubuntu-16.04 ~/tests/hello# exit
exit


2016/06/11 19:10:33 Discarding lxd:ubuntu-14.04 (spread-129)...
2016/06/11 19:11:04 Discarding lxd:ubuntu-16.04 (spread-130)...
2016/06/11 19:11:05 Successful tasks
2016/06/11 19:11:05 Aborted tasks: 0
2016/06/11 19:11:05 Failed tasks: 2
    - lxd:ubuntu-14.04:tests/hello:b
    - lxd:ubuntu-16.04:tests/hello:b
error: unsuccessful run

This demonstrates many of the stated goals (parallelism, clarity, convenience, debugging, …) while running on a local system. Running on a remote system is just as easy by using an appropriate backend. The snapd project on GitHub, for example, is hooked up on Travis to run Spread and then ship its tests over to Linode. Here is a real run output with the initial tests being ported, and a basic smoke test.

If you like what you see, by all means please go ahead and make good use of it.

We’re all for more stability and sanity everywhere.

@gniemeyer

This entry was posted in Architecture, Assembly, C/C++, Cloud, Design, Go, Hardware, Haskell, Java, Lua, Project, Python, Ruby, Snippet, Test. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *