Look Inside CatWalk

If you're new to Lazy Systematic Testing, then you may appreciate a quick overview of the approach, and how it may help you. Here, we explain what the fuss is all about, and how CatWalk is different from other tools.

Who is CatWalk for?

CatWalk is for Java programmers who work in agile software development teams. That is, you prefer to design by prototyping code, and you're not afraid to change the codebase frequently. You may avoid formal design and specification, but you would still like guarantees on the quality of your tested code.

You may rely on regression testing, in which you hand-craft tests for the current version of your code. But you find that writing tests that cover all aspects of your code becomes increasingly tedious, and you miss some cases. Furthermore, your test suites rapidly grow out of date, so you spend as much time fixing tests as you do fixing bugs in your code.

How does CatWalk help?

Test automation is a much-hyped concept, but sometimes refers only to running all your test scripts automatically. This hardly addresses any of the difficult tasks in software testing! CatWalk automates much more of the testing process, taking more of the burden away from the programmer:

CatWalk assumes that the code is a flawed, partly reliable expression of the design intent, yet good enough to start testing;
CatWalk automatically generates minimal test suites that cover the tested code, something that testers find hard to do by hand;
CatWalk makes better use of the human-in-the-loop, by prompting for confirmation of selected test outcomes, and automatically testing the rest.

Systematic Testing

Human test-authors can usually identify positive tests for what they expect to see, such as successfully issuing a Loan of a Book to a Borrower. But they are not good at identifying negative tests for things they do not expect to see, such as discharging the same Loan twice! What should even happen in this case?

CatWalk's systematic testing explores the constructors and methods of the test class exhaustively. Each unique test sequence consists of a constructor call, followed by one or more method invocations. Eventually, every possible interleaved combination of methods is explored - in both expected and unexpected orders.

CatWalk populates each sequence with different sets of input arguments, to see if these cause a different response in the test object. Since every sequence is unique, the same property is never tested twice! There are no redundant tests. But the cost of being so thorough is that the size of the test suite grows exponentially with the length of the sequences.
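To get a feel for this growth, here is a minimal sketch that enumerates every sequence of a given length for a class with three methods. The class name SequenceEnumeration and its helper are hypothetical, written for illustration only, and are not part of CatWalk. Input arguments are ignored here, so the real number of sequences would be larger still.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SequenceEnumeration {
    // Hypothetical method names for illustration; not CatWalk's API.
    static final String[] METHODS = {"push", "pop", "top"};

    /** Returns every sequence of one constructor call followed by `length` method calls. */
    static List<List<String>> sequencesOfLength(int length) {
        List<List<String>> current = new ArrayList<>();
        current.add(Collections.singletonList("new"));
        for (int i = 0; i < length; i++) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> path : current) {
                for (String method : METHODS) {
                    List<String> extended = new ArrayList<>(path);
                    extended.add(method);
                    next.add(extended);
                }
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        // Prints 3, 9, 27 and 81 sequences for lengths 1 to 4.
        for (int length = 1; length <= 4; length++) {
            System.out.println("length " + length + ": "
                    + sequencesOfLength(length).size() + " sequences");
        }
    }
}
```

With three methods, each extra call multiplies the number of sequences by three.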

Dynamic Test Pruning

CatWalk solves the test explosion by dynamic test pruning. This is possible because the tool generates, executes and evaluates test sequences in cycles of increasing length. Dead-end paths need not be expanded in the next cycle.

Baseline test explosion

You can see the difference between baseline testing and test pruning in a series of figures. Imagine a Stack with methods: push(), pop(), top(). Baseline testing executes all of the above paths. Some of these raise exceptions. If we extend such a path, anything after the first raised exception will never be reached.

Pruning paths ending in exceptions

But if we prune all paths just after these have raised exceptions, we grow fewer paths to test, as seen above. This is what is generated when using CatWalk's EXHAUSTIVE test strategy.
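The effect of pruning after exceptions can be sketched as follows. This is an illustration of the idea only, not CatWalk's implementation: the class ExceptionPruning is hypothetical, a java.util.ArrayDeque stands in for the Stack, and getFirst() plays the role of top(). Each cycle extends only those paths that ran to completion in the previous cycle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.NoSuchElementException;

class ExceptionPruning {
    static final String[] METHODS = {"push", "pop", "top"};

    /** Replays a path on a fresh stack; returns true if any call raises an exception. */
    static boolean raises(List<String> path) {
        Deque<Integer> stack = new ArrayDeque<>();
        try {
            for (String method : path) {
                switch (method) {
                    case "push": stack.push(1); break;
                    case "pop": stack.pop(); break;        // throws on an empty stack
                    case "top": stack.getFirst(); break;   // throws on an empty stack
                    default: throw new IllegalArgumentException(method);
                }
            }
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    /** Returns the number of tests generated in each cycle, up to maxLength calls. */
    static List<Integer> testsPerCycle(int maxLength) {
        List<Integer> counts = new ArrayList<>();
        List<List<String>> live = new ArrayList<>();
        live.add(new ArrayList<>()); // the bare constructor call
        for (int cycle = 1; cycle <= maxLength; cycle++) {
            List<List<String>> nextLive = new ArrayList<>();
            int generated = 0;
            for (List<String> prefix : live) {
                for (String method : METHODS) {
                    List<String> path = new ArrayList<>(prefix);
                    path.add(method);
                    generated++;
                    if (!raises(path)) {
                        nextLive.add(path); // only exception-free paths are extended
                    }
                }
            }
            counts.add(generated);
            live = nextLive;
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(testsPerCycle(4)); // [3, 3, 9, 21]
    }
}
```

Without pruning, the same four cycles would generate 3, 9, 27 and 81 tests.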

Algebraic Analysis

CatWalk goes further in its analysis. It classifies the operations of the test class as primitive, transformer, or observer operations:

Primitive operations place the object into a new, unexplored state;
Transformer operations return the object to a previously visited state;
Observer operations leave the state of the object unchanged.

This classification comes from the theory of algebraic datatypes. Note that some object constructors can be derived, and so count as transformers, whereas some methods may be primitive. In this BankAccount, the constructor and the methods open() and deposit() are primitive:

Algebraic analysis of a BankAccount
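One way to picture the three categories is to classify each call by its effect on the object's state. The sketch below does this for a stack, using a java.util.ArrayDeque and a snapshot of its contents as the state. The class AlgebraicClassifier is hypothetical, and classifying individual calls at run time is an assumption made for illustration; how CatWalk actually performs its analysis is not described here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

class AlgebraicClassifier {
    enum Kind { PRIMITIVE, TRANSFORMER, OBSERVER }

    private final Deque<Integer> stack = new ArrayDeque<>();
    private final Set<List<Integer>> visited = new HashSet<>();

    AlgebraicClassifier() {
        visited.add(snapshot()); // the state reached by the constructor
    }

    private List<Integer> snapshot() {
        return new ArrayList<>(stack);
    }

    /** Applies one call and classifies it by its effect on the object's state. */
    Kind apply(Consumer<Deque<Integer>> call) {
        List<Integer> before = snapshot();
        call.accept(stack);
        List<Integer> after = snapshot();
        if (after.equals(before)) {
            return Kind.OBSERVER;            // state unchanged
        }
        // Set.add returns true only if the state was not seen before.
        return visited.add(after) ? Kind.PRIMITIVE : Kind.TRANSFORMER;
    }

    public static void main(String[] args) {
        AlgebraicClassifier c = new AlgebraicClassifier();
        System.out.println(c.apply(s -> s.push(1)));    // PRIMITIVE
        System.out.println(c.apply(s -> s.getFirst())); // OBSERVER
        System.out.println(c.apply(s -> s.push(2)));    // PRIMITIVE
        System.out.println(c.apply(s -> s.pop()));      // TRANSFORMER
    }
}
```

Here pop() counts as a transformer because it returns the stack to the state it was in after the first push().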

Algebraic Test Pruning

Knowing the algebraic structure of the test class allows CatWalk to make better decisions about what constitutes a dead-end path. For example, a path whose last operation did not change the object's state need not be extended in the next test cycle, since we already have a shorter path ending in that state.

Pruning paths ending in observers

Pruning all paths that end in observers reduces the test suite size further. This is what is generated when using CatWalk's MUTATING test strategy, in which the path prefix consists only of methods that changed the object's state.

Pruning paths ending in transformers

Transformers return the object to a previously visited state. So pruning all paths that end in transformers reduces the test suite size even further. This is what is generated when using CatWalk's ALGEBRAIC test strategy, in which the path prefix consists only of methods that reached a new, unexplored state.
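The three levels of pruning can be compared side by side in a small sketch. The enum below borrows the strategy names EXHAUSTIVE, MUTATING and ALGEBRAIC from the text, but the class StrategyComparison and its methods are hypothetical and do not reflect CatWalk's API. A java.util.ArrayDeque stands in for a Stack with push(), pop() and top(), and the state is taken to be the stack's contents.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

class StrategyComparison {
    enum Strategy { EXHAUSTIVE, MUTATING, ALGEBRAIC }

    static final String[] METHODS = {"push", "pop", "top"};

    /** Replays a path on a fresh stack and decides whether it should be extended. */
    static boolean extendable(List<String> path, Strategy strategy) {
        Deque<Integer> stack = new ArrayDeque<>();
        Set<List<Integer>> visited = new HashSet<>();
        visited.add(new ArrayList<>(stack)); // state after construction
        boolean changed = false; // did the last call change the state?
        boolean fresh = false;   // did the last call reach a state new to this path?
        try {
            for (String method : path) {
                List<Integer> before = new ArrayList<>(stack);
                switch (method) {
                    case "push": stack.push(1); break;
                    case "pop": stack.pop(); break;
                    case "top": stack.getFirst(); break;
                    default: throw new IllegalArgumentException(method);
                }
                List<Integer> after = new ArrayList<>(stack);
                changed = !after.equals(before);
                fresh = visited.add(after);
            }
        } catch (NoSuchElementException e) {
            return false; // every strategy prunes after an exception
        }
        switch (strategy) {
            case MUTATING: return changed;
            case ALGEBRAIC: return fresh;
            default: return true;
        }
    }

    /** Returns the number of tests generated in each cycle, up to maxLength calls. */
    static List<Integer> testsPerCycle(Strategy strategy, int maxLength) {
        List<Integer> counts = new ArrayList<>();
        List<List<String>> live = new ArrayList<>();
        live.add(new ArrayList<>());
        for (int cycle = 1; cycle <= maxLength; cycle++) {
            List<List<String>> nextLive = new ArrayList<>();
            int generated = 0;
            for (List<String> prefix : live) {
                for (String method : METHODS) {
                    List<String> path = new ArrayList<>(prefix);
                    path.add(method);
                    generated++;
                    if (extendable(path, strategy)) {
                        nextLive.add(path);
                    }
                }
            }
            counts.add(generated);
            live = nextLive;
        }
        return counts;
    }

    public static void main(String[] args) {
        for (Strategy s : Strategy.values()) {
            System.out.println(s + " " + testsPerCycle(s, 4));
        }
        // EXHAUSTIVE [3, 3, 9, 21]
        // MUTATING [3, 3, 6, 9]
        // ALGEBRAIC [3, 3, 3, 3]
    }
}
```

In this toy example only the all-push() path survives under the algebraic rule, so each cycle generates just three tests.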

Metamorphic Test Pruning

Metamorphic testing refers to deriving test outcomes by self-similar comparison with other tests. CatWalk has built-in rules that allow it to derive the outcomes of longer tests from shorter tests. One of these rules is useful to reduce the size of test suites in classes that have several constructors.

Overlapping paths from many constructors

Even with algebraic pruning, different constructors may start separate paths which, individually, never cover the same state twice, but which may cover states that have already been visited on another path.

Pruning parallel paths after path mapping

When comparing the states reached on parallel paths, we may terminate any path that reaches a state already visited on a shorter path. This is what is generated when using CatWalk's METAMORPHIC test strategy, in which parallel paths are mapped onto existing paths.
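As a rough illustration of this idea, the sketch below shares one set of visited states across all paths and grows the paths cycle by cycle, so that shorter paths always claim a state first. The Counter class, with constructors Counter() and Counter(int) and a single increment() method, is hypothetical and is modelled here only by its integer state. This is a sketch of the pruning rule, not of CatWalk's built-in metamorphic rules.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ParallelPathPruning {
    /** Returns the paths still being extended after the given number of cycles. */
    static List<String> livePaths(int cycles) {
        // Each live path is mapped to the state it has reached.
        Map<String, Integer> live = new LinkedHashMap<>();
        live.put("new Counter()", 0);
        live.put("new Counter(2)", 2);
        // One visited set, shared across the paths from both constructors.
        Set<Integer> visited = new HashSet<>(live.values());
        for (int cycle = 1; cycle <= cycles; cycle++) {
            Map<String, Integer> next = new LinkedHashMap<>();
            for (Map.Entry<String, Integer> entry : live.entrySet()) {
                int state = entry.getValue() + 1; // effect of increment()
                if (visited.add(state)) {
                    next.put(entry.getKey() + ".increment()", state);
                }
                // Otherwise the state was already reached on a path that is no
                // longer than this one, so this path is terminated.
            }
            live = next;
        }
        return new ArrayList<>(live.keySet());
    }

    public static void main(String[] args) {
        System.out.println(livePaths(1)); // both paths survive
        System.out.println(livePaths(2)); // only the path from Counter(2) survives
    }
}
```

After two cycles, the path from Counter() reaches state 2, which Counter(2) reached directly, so it is terminated.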

Behavioural Response

CatWalk has another unique feature: it can determine whether the tested object responds in a qualitatively different way when subjected to different test sequences populated with different inputs. We call this the Behavioural Response (BR). Kinds of response include:

the test object returns a distinct result;
the test object raises a distinct exception;
the test object responds true or false;
the test object enters an unexplored state;
the test object revisits an old state;
...etc.

CatWalk uses this ability to decide whether a generated test sequence is worth keeping in the test suite. If many sequences provoke the same response, then we only need to keep one of them. This is really useful when determining the input partitions of every method: inputs in the same partition trigger the same response, while inputs in different partitions trigger different responses.
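The idea of keeping one input per kind of response can be sketched in a few lines. Everything in this example is hypothetical: the withdraw() method, its fixed balance of 100, and the exceptions it raises are assumptions made for illustration, and are not taken from the BankAccount in the figures or from CatWalk itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ResponsePartitions {
    /** Hypothetical withdrawal from an account with a fixed balance of 100. */
    static int withdraw(int amount) {
        final int balance = 100;
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        if (amount > balance) {
            throw new IllegalStateException("insufficient funds");
        }
        return balance - amount;
    }

    /** Labels the kind of response: a normal return, or the class of exception raised. */
    static String responseKind(int amount) {
        try {
            withdraw(amount);
            return "returns";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    /** Keeps the first candidate input that provokes each kind of response. */
    static Map<String, Integer> representatives(int[] candidates) {
        Map<String, Integer> kept = new LinkedHashMap<>();
        for (int amount : candidates) {
            kept.putIfAbsent(responseKind(amount), amount);
        }
        return kept;
    }

    public static void main(String[] args) {
        int[] candidates = {-5, 0, 1, 50, 100, 101, 500};
        // Prints {IllegalArgumentException=-5, returns=1, IllegalStateException=101}
        System.out.println(representatives(candidates));
    }
}
```

Seven candidate inputs collapse to three representatives, one for each input partition of the method.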

Exploring the CBR of a BankAccount

The Complete Behavioural Response (CBR) is an ideal test suite, which explores all input partitions of every constructor and method. This covers every state-contingent and input-contingent behaviour of the test class.