martin-fieber.de | TAP — Test Anything Protocol

TAP — Test Anything Protocol

Making contact

Years ago, when I first heard about the Test Anything Protocol (TAP), I gave it little thought. I wrote tests, I ran them, and I saw if they all passed or not. Whatever programming language I used, the testing harness — test runner, framework, data, the tests and surrounding tools — usually produced some result that could be easily interpreted by humans and machines, most of the time. For all I wanted from my tests back then, it was enough.

Over time, different needs and expectations arose. I used many test frameworks, switched between languages, and built tooling on top of test runner output. A lot can change in projects maintained over a long period of time.

And with that ever-changing landscape came the longing for stability and reusability: to not write yet another tool for the next test runner, to not reinterpret test output again, and to get the same output format for tests written in different languages.

Enter TAP.

TAP it!

TAP is a text-based interface between a TAP producer and a TAP consumer. It offers a unified way of reporting test results and also describes how tools should handle that output as a stream. This decouples test reporting from presentation and further processing.

TAP producers can be many things, from unit and integration testing frameworks to visual testing software, lint tooling, build systems, etc. Support exists in many languages, but there are also language-agnostic tools.

TAP consumers are programs that take the input and transform, process, or format it. This can range from adding color, full reformatting, transforming to different output formats such as JUnit or HTML, running analysis and statistics on it, integrating it into CI/CD like Jenkins, or even getting desktop notifications, if so desired.

A core philosophy is to not wait for the full output, but rather to process the data line by line, or chunk by chunk, as a stream. The reason for the "data chunks" is that TAP can contain YAML blocks, which may only make sense when read as a whole.

Specification

The full specification for the current version, TAP14, can be found on the official TAP website. Taken from that website, this is an example of what the output can look like for four tests: two pass, one fails, and one is reported as not ok but carries a TODO directive, which a consumer should not count as a failure.

TAP version 14
1..4
ok 1 - Input file opened
not ok 2 - First line of the input valid
  ---
  message: 'First line invalid'
  severity: fail
  data:
    got: 'Flirble'
    expect: 'Fnible'
  ...
ok 3 - Read the rest of the file
not ok 4 - Summarized correctly # TODO Not written yet
  ---
  message: "Can't make summary yet"
  severity: todo
  ...

The indented lines between --- and ... are YAML blocks, carrying diagnostics for the test point directly above them.

The specification also describes how consumers should handle new versions, with the goal of being forward-compatible.

In general, the official website is very readable and holistically describes TAP, how to implement the protocol, how to upgrade one TAP version to another, the ideas behind it, language and tooling support, and much more. It is well worth a read if the use of TAP is considered.

Getting giddy about a protocol

In a full test harness, the producer and the consumer do not need to be written in the same language. Tooling from one context can be used in another, so different projects share more tooling and I spend less time reworking systems to fit different needs.

That was what really sold me on TAP. It is minimal and readable, its tools can be reused, and it is language agnostic, being nothing more than a text-based interface. It is an interface that has served me very well over the years.

Whenever I have the option to use TAP, I will. Without further ado, let's take a look at some examples of how to produce TAP with different languages and tools.

Produce TAP

Getting TAP as a reporting format is a straightforward process, usually a matter of setting a "reporter" flag. I picked a collection of possible producers to showcase the setup.
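Before looking at real frameworks, it may help to see that a producer only has to print a handful of lines. The following Python sketch is an illustration of that idea, not a replacement for the tools below; run_as_tap is a hypothetical helper name, and the sketch assumes that descriptions contain no # characters, which would otherwise need escaping.

```python
import sys
from typing import Callable, Iterable, TextIO, Tuple


def run_as_tap(
    checks: Iterable[Tuple[str, Callable[[], bool]]],
    out: TextIO = sys.stdout,
) -> bool:
    """Run each (description, check) pair and report the results as TAP."""
    checks = list(checks)
    print("TAP version 14", file=out)
    print(f"1..{len(checks)}", file=out)  # the plan: how many test points follow

    all_ok = True
    for number, (description, check) in enumerate(checks, start=1):
        error = None
        try:
            passed = bool(check())
        except Exception as exc:  # a raising check counts as a failure
            passed = False
            error = exc

        status = "ok" if passed else "not ok"
        print(f"{status} {number} - {description}", file=out)
        if error is not None:
            # Report the exception as TAP comment lines, one "#" per line.
            for text in f"{type(error).__name__}: {error}".splitlines():
                print(f"# {text}", file=out)
        all_ok = all_ok and passed
    return all_ok


if __name__ == "__main__":
    run_as_tap([
        ("addition works", lambda: 1 + 1 == 2),
        ("strings are upper-cased", lambda: "foo".upper() == "FOO"),
    ])
```

Anything that can write lines like these to standard output is a TAP producer. The frameworks below do the same, with far better diagnostics.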

C++ with Catch2

Catch2 comes with TAP as a built-in reporter. To get a minimal example going with CMake, the following CMakeLists.txt will fetch Catch2 and set up an example project.

# CMakeLists.txt
# FetchContent_MakeAvailable requires CMake 3.14 or newer.
cmake_minimum_required(VERSION 3.14)

project(TAP LANGUAGES CXX)

include(FetchContent)

FetchContent_Declare(
  Catch2
  GIT_REPOSITORY https://github.com/catchorg/Catch2.git
  GIT_TAG v3.3.2
)

FetchContent_MakeAvailable(Catch2)

add_executable(Test Test.cpp)
target_link_libraries(Test
  PRIVATE Catch2::Catch2WithMain)

The Test.cpp looks like the following, taken from the Catch2 tutorial:

// Test.cpp
#include <catch2/catch_test_macros.hpp>

unsigned int Factorial(unsigned int number) {
  return number <= 1
    ? number
    : Factorial(number - 1) * number;
}

TEST_CASE("Factorials are computed", "[factorial]") {
  REQUIRE(Factorial(1) == 1);
  REQUIRE(Factorial(2) == 2);
  REQUIRE(Factorial(3) == 6);
  REQUIRE(Factorial(10) == 3628800);
}

Building the project with CMake through the command line.

$ cmake -B build/debug
$ cmake --build build/debug

The $ is used to indicate a command entered into the terminal.

And finally, running the test using TAP as a reporter.

$ ./build/debug/Test --reporter TAP

View example setup on GitHub.

Lua with LuaUnit

The popular Lua testing framework LuaUnit has TAP built-in as a reporter. To set up an environment to run tests with LuaRocks, I can refer to an earlier article of mine on setting up projects with LuaRocks.

With the environment ready and a project folder capable of running LuaRocks, a minimal example rockspec could look like the following.

-- test-1.0.0-1.rockspec
rockspec_format = "3.0"
package = "test"
version = "1.0.0-1"
source = {
  url = "..."
}
build = {
  type = "builtin",
  modules = { "test.lua" }
}
test_dependencies = {
  "luaunit >= 3.4"
}

Given a test file.

-- test.lua
local lu = require('luaunit')

TestCompare = {}

function TestCompare.test1()
  local A = { 1, 2 }
  local B = { 1, 2 }
  lu.assertEquals(A, B)
end

function TestCompare.test2()
  local A = { "a", "b" }
  local B = { "a", "b" }
  lu.assertEquals(A, B)
end

os.exit(lu.LuaUnit.run())

The LuaRocks test command can be executed, passing the output option for TAP to LuaUnit.

$ luarocks test -- -o tap

View example setup on GitHub.

JavaScript with node-tap

For JavaScript and Node.js, there is the amazing node-tap testing library. With Node.js installed, the following package.json describes a minimal project.

{
  "name": "tap-with-node",
  "version": "1.0.0",
  "scripts": {
    "test": "node test.js"
  },
  "devDependencies": {
    "tap": "^16.3.4"
  }
}

A small sample test file shows two tests passing and one failing.

// test.js
const tap = require('tap');

tap.pass('this is fine');
tap.fail('this is not');
tap.pass('this is also fine');

Installing the node-tap dependency for the project setup.

$ npm install

And running the tests. By default, node-tap will produce TAP output.

$ npm test

View example setup on GitHub.

Python with Tappy

Tappy can generate TAP output and integrates with Python's unittest module. The prerequisite is an environment that can run Python and has pip available, for example one set up with pyenv.

Tests can be written like any other in Python; here is a test file taken from the "unittest" documentation.

# test.py
import unittest


class TestStringMethods(unittest.TestCase):

  def test_upper(self):
    self.assertEqual('foo'.upper(), 'FOO')

  def test_isupper(self):
    self.assertTrue('FOO'.isupper())
    self.assertFalse('Foo'.isupper())

  def test_split(self):
    s = 'hello world'
    self.assertEqual(s.split(), ['hello', 'world'])
    # check that s.split fails when
    # the separator is not a string
    with self.assertRaises(TypeError):
      s.split(2)


if __name__ == '__main__':
  unittest.main()

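With Tappy installed through pip (the package is published on PyPI as tap.py), running its module entry point will discover the test file and produce TAP.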

$ python -m tap

View example setup on GitHub.

CSS with Stylelint

The CSS linting tool Stylelint supports TAP as an output formatter. Node.js with npm needs to be installed. For a minimal setup, the following package.json can describe the project.

{
  "name": "tap-with-stylelint",
  "version": "1.0.0",
  "scripts": {
    "lint": "stylelint '**/*.css' --formatter tap"
  },
  "devDependencies": {
    "stylelint": "^15.6.2",
    "stylelint-config-standard": "^33.0.0"
  }
}

The defined npm run lint command will find all CSS files in the project and run the linter on them. As an example, two different CSS files will demonstrate the test, one with a problem and the other without.

/* test1.css */
a {
  colr: #fff;
}

The property name color is misspelled as colr.

/* test2.css */
strong {
  color: blue;
}

All good in here.

Running the lint command will produce TAP for the lint report.

$ npm run lint

View example setup on GitHub.

PostgreSQL with pgTAP

pgTAP for PostgreSQL is a collection of database functions to truly write unit tests for a database while emitting TAP. It requires PostgreSQL 9.1 or higher and needs to be installed on the host that runs the database server.

After the installation of pgTAP, the pgtap extension can be added to a PostgreSQL database by running the following as the superuser for the database that should be tested.

CREATE EXTENSION pgtap;

The following example test is taken from the official documentation of pgTAP.

\unset ECHO
\set QUIET 1
-- Turn off echo and keep things quiet.

-- Format the output for nice TAP.
\pset format unaligned
\pset tuples_only true
\pset pager off

-- Revert all changes on failure.
\set ON_ERROR_ROLLBACK 1
\set ON_ERROR_STOP true

-- Load the TAP functions.
BEGIN;
\i pgtap.sql

-- Plan the tests.
SELECT plan(1);

-- Run the tests.
SELECT pass('My test passed, w00t!');

-- Finish the tests and clean up.
SELECT * FROM finish();
ROLLBACK;

And finally, running that test against a PostgreSQL database, producing TAP.

$ psql -d try -Xf test.sql

View example setup on GitHub.

Consume TAP

For consumers, I will point to the official documentation, as it lists a multitude of options for different languages, and quite a few producers also provide options for consumption. Additionally, there is always the option to search the internet for more. There always is more, I mean.

The best part here is that, thanks to TAP, consumers can work with any producer, no matter the language or context.
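As a rough illustration of that independence, here is a Python sketch of a tiny consumer that starts any producer as a subprocess and labels its output while it is still running. The names stream_tap and label are made up for this example, and the labelling is intentionally naive: it only looks at how a line starts and ignores directives such as TODO.

```python
import subprocess
import sys
from typing import Iterator, List


def stream_tap(command: List[str]) -> Iterator[str]:
    """Run a producer command and yield its standard output line by line."""
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as process:
        assert process.stdout is not None
        for line in process.stdout:
            yield line.rstrip("\n")


def label(line: str) -> str:
    """Prefix test points with PASS or FAIL and pass every other line through."""
    if line.startswith("not ok"):
        return "FAIL  " + line
    if line.startswith("ok"):
        return "PASS  " + line
    return "      " + line


if __name__ == "__main__" and len(sys.argv) > 1:
    # The producer command is taken from the command line of this script.
    for tap_line in stream_tap(sys.argv[1:]):
        print(label(tap_line))
```

Saved under a name of your choosing, for example consume.py, it could be called as python consume.py ./build/debug/Test --reporter TAP or python consume.py npm test, using the producers from the sections above. The consumer neither knows nor cares which language produced the lines.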

Conclusion

I have stronger feelings about TAP than I initially thought I would have, and this article can only give a small glimpse into them. It may not look like it at first, but this is a love letter to TAP and hopefully an encouragement for others to give it a shot.

All the producer examples can be found in the companion repository on GitHub.

Until then 👋🏻