JWalk software testing tool suite

Lazy systematic unit testing for agile methods

Department of Computer Science

The JWalk Weblog, 2008

This is the JWalk weblog for 2008! It describes the further appearances of JWalk on the world stage, starting in early April at the TestBench Workshop of the first IEEE International Conference on Software Testing in Lillehammer. Chris Thomson and I were building on our reputation for comparing JWalk testing against manual JUnit test authoring, and showing how JWalk always wins hands-down in terms of test coverage and time saved. This culminated in a paper that I wrote with Neil Griffiths and Chris Thomson, presented at the TaicPart conference in Windsor, on the feedback-based development method that becomes possible when using a JWalk Editor tool.

This log also charts the development, over the summer months, of the experimental 0.9 version, in collaboration with Wenwen Zhao, which aimed to detect revisited object states, in order better to support full algebraic analysis. This eventually led to the next major release of the JWalk 1.0 toolkit. The year ends with a collection of student-created tools based on JWalk, none of which are quite of production quality, but some of which may well be the seeds of a proper GUI-based tool.

Tuesday, 23 December 2008

It's the departmental Christmas party today, at the Harlequin pub. I'm logging off now until the New Year. Happy JWalking, everybody!

Friday, 12 December 2008

Well, they all attempted something. The prettiest solutions were Java applications with Swing interfaces. One team (Erik Tittel, if you're there) worked out how to nest Swing panels so that the different test settings were logically organised: set the test class; set the other settings (strategy, modality, depth); then run the tests. Only one group successfully put the output data onto separate tabs for each test cycle. The students found Swing harder to use than they had expected: getting a single panel to size itself to a sensible width sometimes fails when it is part of a larger structure which imposes different sizing rules on all the subpanels that it controls.

Team Dave has, as expected, produced the entire kitchen, as well as the kitchen sink: it's a web-based JWalk sampler, running on two different servers, using Java, PHP, and all manner of glue and sticky-tape. Not only this, but they managed to lock up the whole DCS external webservice, by running JWalk tests on prototype classes to test depths of 300! I mean, what do you expect? If you test a class API of size 10 to depth 30, this can take 20 minutes (the number of test sequences is Σ C × M^i for i = 0 to 30, where C is the number of constructors and M is the number of public methods). A depth of 300 is ten times deeper, and the work grows exponentially with depth. The support team have rightly banned them from DCS services for a while. Oh well, they have their own external web-service, and I gave them an extension. Didn't someone once say: Dave, I'm sorry I can't let you do that?
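To get a feel for how quickly that sum grows, here is a small sketch that simply evaluates Σ C × M^i with arbitrary-precision arithmetic. The class and method names are my own invention for illustration, and the figure is the raw count of sequences before any pruning, so it overstates what JWalk would really execute.

```java
import java.math.BigInteger;

public class SequenceCount {
    /** Sum of C * M^i for i = 0..depth: one constructor call followed by i method calls. */
    public static BigInteger count(int constructors, int methods, int depth) {
        BigInteger c = BigInteger.valueOf(constructors);
        BigInteger m = BigInteger.valueOf(methods);
        BigInteger total = BigInteger.ZERO;
        BigInteger power = BigInteger.ONE; // M^0
        for (int i = 0; i <= depth; i++) {
            total = total.add(c.multiply(power));
            power = power.multiply(m); // M^(i+1) for the next round
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed split of a 10-member API: 1 constructor and 9 public methods.
        System.out.println("depth 30:  " + count(1, 9, 30).toString().length() + " digits");
        System.out.println("depth 300: " + count(1, 9, 300).toString().length() + " digits");
    }
}
```

Running main prints the number of decimal digits in each count, which makes the difference between depth 30 and depth 300 rather vivid.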

Friday, 5 December 2008

Today is the first day that I've seen the complete version of some of the students' work. As I expected, many of them found out they couldn't read or write oracle files when running JWalk as an applet in a browser. Some of them have worked out how to create a policy file that allows file reads and writes on the client. The disadvantage of this is that it is the end-users who have to realise this, and create similar policy files. Others have cheated slightly, by causing the JWalk applet to be launched in Java Webstart (rather than run in the browser and sandbox), in which case it has full permissions to write files.
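For anyone who has not met them, a client-side policy file of the kind the students discovered looks roughly like the fragment below. This is only a sketch: the codeBase URL and the oracles directory are made up for illustration, and each end-user would have to adapt both. Traditionally such a fragment goes in a file called .java.policy in the user's home directory.

```text
// Hypothetical example: let code downloaded from one site read and write
// files beneath an "oracles" folder in the user's home directory.
grant codeBase "http://example.org/jwalk/-" {
    permission java.io.FilePermission "${user.home}${/}oracles${/}-", "read,write";
};
```

Granting permissions to a single codeBase, rather than to all code, keeps the hole in the sandbox as small as possible.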

Only a couple of teams have tried to do something bigger than a JWalk sampler. Two have tried to build the JWalk Marker application. One completely bonkers team (Team Dave, if you're out there) has decided to host the JWalk sampler on their own external web-service, and they interact with the tester via HTTP and webforms, instead of using an applet! So, they have to re-parse the textual output of the JWalk process running on the server, in order to assemble this into something that presents itself well in active script drop-down boxes over the web. This must take the prize for the application that cobbles together the most, and the most different, technologies.

Thursday, 30 October 2008

The systems analysis and design teams have had a couple of weeks now to find out what they might build, out of the JWalk 1.0 toolkit. The main applications that the stakeholders have requested are:

  • a web-based sampler for JWalk, as an advertising tool
  • a Java-sensitive editor and JWalk-style testing application
  • a tool to mark student software assignments based on JWalk oracles

I don't know how many teams have realised that running JWalk over the web will require reading and writing files on the client's machine, something traditionally blocked by the Java sandbox! They will discover the existence of security policies, eventually! And of course George will probably want to prevent everything they foresee doing with the DCS web service.

Thursday, 16 October 2008

I've made it! The new version of the JWalk 1.0 toolkit is done. It's not going public for a while - partly, this is because I want to make sure it's been fully tested; and if I give this to a class of students, they're bound to find all the ways in which they can use and abuse the toolkit! I published their assignment today, which tells them everything about how to organise themselves as software engineering teams, but nothing yet about the content of the assignment. They have to apply their newly-acquired interviewing skills and get the project content out of me (as a stakeholder) and two others: George (who runs the DCS web service) and Guy (who teaches Java and might want to use a JWalk-based tool).

There's so much that is new or changed in this version of the toolkit. Firstly, the old six test modes have been abandoned. Now, you can alter the test strategy (one of: protocol, algebra or states) and the test modality (one of: inspect, explore, or validate) independently! So, for the first time, you will be able to do protocol validation, algebra inspection and state inspection as separate activities. Then, I've completely changed the Generator interface. Rather than keep the white-box framework, whereby you subclass ObjectGenerator, I've converted this to a black-box framework, where the tester creates any component satisfying the CustomGenerator interface and uploads this to JWalk's MasterGenerator. I'm using the master and delegate model inside the generators, so it makes the main generation algorithms much cleaner, and custom algorithms (such as generating instances of String types, or interface types) can be moved out to specific custom generators.
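The master and delegate idea can be sketched in a few lines. Please note that everything named below (ValueDelegate, Master, canCreate, nextValue, StringDelegate) is a stand-in invented for this illustration; it is not the actual signature of JWalk's CustomGenerator or MasterGenerator, which you should look up in the toolkit documentation.

```java
import java.util.ArrayList;
import java.util.List;

public class GeneratorSketch {
    /** Hypothetical delegate contract (NOT the real CustomGenerator interface). */
    interface ValueDelegate {
        boolean canCreate(Class<?> type);
        Object nextValue(Class<?> type);
    }

    /** Hypothetical master: consults uploaded delegates first, then its own built-in rule. */
    static class Master {
        private final List<ValueDelegate> delegates = new ArrayList<>();
        private int counter = 0;

        void addDelegate(ValueDelegate delegate) {
            delegates.add(delegate);
        }

        Object nextValue(Class<?> type) {
            for (ValueDelegate delegate : delegates) {
                if (delegate.canCreate(type)) {
                    return delegate.nextValue(type);
                }
            }
            // Built-in rule kept deliberately tiny: a fresh integer on each request.
            if (type == int.class || type == Integer.class) {
                return ++counter;
            }
            throw new IllegalArgumentException("No generator for " + type.getName());
        }
    }

    /** Example delegate: a custom algorithm for String values, moved out of the master. */
    static class StringDelegate implements ValueDelegate {
        private int next = 0;

        public boolean canCreate(Class<?> type) {
            return type == String.class;
        }

        public Object nextValue(Class<?> type) {
            return "s" + (next++);
        }
    }
}
```

The point of the black-box style is that the tester never subclasses the master; a delegate is just any object that satisfies the interface.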

Wednesday, 8 October 2008

This rebuilding exercise is huge! Adding the new facility for detecting object states has involved not only the creation of the new StateInspector class. The TestCase class has had to be revised, so that it could store an encoded representation of state, after execution. The TestSequence class has also been entirely re-written to provide fast algorithms to detect whether the sequence has re-entered a previously-visited state, or left the state unchanged. The pruning rule for algebraic exploration in AlgebraWalker has been revised to prune test sequences whose prefixes are re-entrant (before, we merely pruned prefixes which were observations). This now affects how concrete states are explored in the StateSpaceWalker, which can find new concrete states much more quickly than before.
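The two checks on a sequence (unchanged state, re-entered state) can be illustrated with a toy trace of state codes. This is a sketch under my own assumptions: StateTrace and its methods are invented names, the scan is a simple linear one, and it is not how the real TestSequence class is written.

```java
import java.util.ArrayList;
import java.util.List;

public class StateTrace {
    // State code recorded after the constructor and after each method call.
    private final List<Integer> codes = new ArrayList<>();

    public void record(int stateCode) {
        codes.add(stateCode);
    }

    /** True if the last step left the state code unchanged (an observation). */
    public boolean lastStepUnchanged() {
        int n = codes.size();
        return n >= 2 && codes.get(n - 1).equals(codes.get(n - 2));
    }

    /** True if the final state code matches any earlier state on this path. */
    public boolean reentrant() {
        int n = codes.size();
        return n >= 2 && codes.subList(0, n - 1).contains(codes.get(n - 1));
    }
}
```

A sequence whose trace is re-entrant need not be extended further as a prefix, because a shorter path to the same state already exists.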

Then, of course, all of the test prediction rules must change, to allow partial order reduction on test sequences containing transformer methods in the prefix, so we map onto smaller saved oracle sets. I've also decided to make the test oracles into plain text files, using the oracle encoding format as the text format. This means that it's easy to see what the oracle has actually learned, after testing. The previous JWalk 0.8 version used Java's native serialisation format, which is not so easy to read.

Friday, 25 September 2008

Now starts a huge coding challenge. I want to re-write the JWalk toolkit from the ground up, to incorporate the latest object state detection technology. I'm going to set myself the goal of finishing this rewrite by the middle of teaching week 3 (Wednesday 15 October), because I'm going to make building a JWalk-based application the challenge for my Systems Analysis and Design class this year. During this refactoring (really a re-boot of the whole JWalk idea), I think I'll remove some of the clutter caused by extra interfaces which only have one implementation in the JWalk 0.8 version. The exciting prospect of having the ability to detect revisited states will be that we can apply full algebraic rules for pruning and predicting test outcomes, which will reduce quite effectively the number of manual confirmations made by the human tester.

Tuesday, 23 September 2008

Yes, this idea actually works. I have built a case study trying out a StateInspector class that knows how to inspect arbitrary objects and encodes their states using hash codes. It can handle deep, medium, or shallow equality of states. It is so effective at distinguishing concrete object states, that at first I didn't believe it when a popped Stack entered a different state to that prior to the previous push. In fact, the StateInspector was correct to distinguish the two states, since one of them had a garbage value in the next slot, whereas the other had a null value in the same slot. So, you may have to follow design-for-test coding practices to benefit fully from the prospective ability to identify previously-visited concrete states.
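To show the Stack effect in miniature, here is a rough sketch of a depth-limited state hash using reflection. It is not the real StateInspector: the names stateHash and ArrayStack, the seed 17 and the handling of exhausted depth are all my own assumptions, chosen only to make the example self-contained.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;

public class StateHashSketch {
    /** Hash the state of obj, following references at most `depth` levels down. */
    public static int stateHash(Object obj, int depth) {
        if (obj == null) return 0;
        Class<?> type = obj.getClass();
        // Boxed primitives, Strings and other JDK types: trust their own hashCode.
        if (type.getName().startsWith("java.")) return obj.hashCode();
        // Depth exhausted: record only that an object of this type is present.
        if (depth <= 0) return type.getName().hashCode();
        int h = 17;
        if (type.isArray()) {
            for (int i = 0; i < Array.getLength(obj); i++) {
                h = 37 * h + stateHash(Array.get(obj, i), depth - 1); // order-sensitive
            }
            return h;
        }
        for (Class<?> c = type; c != null && !c.getName().startsWith("java."); c = c.getSuperclass()) {
            Field[] fields = c.getDeclaredFields();
            Arrays.sort(fields, Comparator.comparing(Field::getName)); // stable field order
            for (Field f : fields) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
                f.setAccessible(true);
                try {
                    h = 37 * h + stateHash(f.get(obj), depth - 1);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return h;
    }

    /** Toy stack: with clearOnPop false, pop leaves the old value behind in its slot. */
    static class ArrayStack {
        private final Object[] items = new Object[4];
        private int size = 0;
        private final boolean clearOnPop;

        ArrayStack(boolean clearOnPop) { this.clearOnPop = clearOnPop; }

        void push(Object x) { items[size++] = x; }

        Object pop() {
            Object top = items[--size];
            if (clearOnPop) items[size] = null; // design-for-test: no garbage left behind
            return top;
        }
    }
}
```

With this sketch, a fresh ArrayStack and one that has been pushed and popped hash differently at depth 2 unless clearOnPop is set, while at depth 1 (which stops before looking inside the array) the two are indistinguishable.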

Monday, 22 September 2008

I have just managed to grab some time to think about ways in which I could compute hash codes for object states more efficiently. I think it can be done using a variable-depth parameter denoting how deeply you wish to explore object states, then using Java's built-in hashing algorithms for primitive types, references and lists of values. Java has a suitable algorithm that encodes ordering as well as the basic values in its hash codes. It involves multiplying by 37, clearly a good prime number. It will be interesting to see whether we get any collisions of hash codes using this, rather than MD5, which generates a much larger spread. However, my intuition is that we won't see many collisions in the spaces explored by JWalk, since the input value synthesis algorithms produce unique values upon each request.

Monday, 1 September 2008

Wenwen Zhao, my MSc student, has now submitted her dissertation. In this she develops a unique string encoding for every method invocation expression, and from this computes a hashcode using the MD5 algorithm. On the one hand, this allows oracles to store a slightly richer representation of each search key; and on the other hand this allows JWalk to determine when an object revisits a previously-visited state. Wenwen has produced graphs of the increased numbers of redundant paths that are detected when the new state-revisiting algorithm is introduced. However, the cost of computing the strings is fairly high. The time taken to discover an abstract state space would be shorter (were it not for the cost of computing strings). This is a nice piece of work, which points the way to a future strategy for computing revisited states more efficiently. I've decided to call this version JWalk 0.9, but it won't go on public release, because it's not yet of production quality. There are too many issues to resolve with the extra memory consumption.
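For the curious, turning a string encoding into an MD5 hash code takes only a few lines with the standard java.security API. The sketch below is not Wenwen's code: the class name, the method name and the example encoding in main are invented, and her dissertation defines the real encoding format.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class InvocationDigest {
    /** Return the MD5 digest of the given encoding as 32 lowercase hex digits. */
    public static String md5Hex(String encoding) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(encoding.getBytes(StandardCharsets.UTF_8));
            // BigInteger(1, ...) treats the bytes as unsigned; %032x keeps leading zeros.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical encoding of a method invocation expression.
        System.out.println(md5Hex("new Stack().push(1).pop()"));
    }
}
```

Two sequences with the same encoding always get the same digest, so comparing 32-character keys can stand in for comparing the full strings. The cost noted above lies in building the strings, not in the hashing step.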

Saturday, 31 August 2008

Chris Thomson reports that the presentation of the JWalk paper at TaicPart has gone down well. We're coming to the end of a great walking holiday in Austria, so my mind wasn't really on the matter. But it's good to know that the show went well. Several delegates told Chris that "They finally understood what it was all about", having been interested in the notion before, but clearly not quite getting the point. He's sent me a few contacts to follow up. Thanks, Chris!

Saturday, 16 August 2008

Well, we're off on vacation until the end of the month. I'm taking the family to the Austrian Alps, just south of Salzburg, for a walking and rafting holiday. Because of the dates, it turns out that I cannot now attend TaicPart, which is a great shame, since I really like the conference venue in Windsor Great Park. Fortunately, Chris Thomson has kindly stepped in to present our paper. I had rather a frantic week preparing the slides and only managed to contact Chris at the last minute (he is on vacation right now) to let him know where to find the slide-set! If only TaicPart had been in September this year, like it was last year... but I suppose the organising committee could not get the venue for the same month as before.

Monday, 11 August 2008

Wenwen is now back, with less than one week before I disappear on vacation. We were able to discuss the issue of incorporating state detection into the main testing loop. So far, her dissertation is pretty good, so the end result may be quite interesting. She has an idea to compute hash codes from the string representation of states, and compare these instead of comparing the strings directly. This is somewhat reminiscent of the approach taken by Henkel and Diwan in their tool which guides the programmer to infer the abstract datatype algebra of a Java class (they use a "serialise-and-hash" approach). We can't rely on serialisation, since not every object we want to test declares this capability.

Friday, 4 July 2008

Wenwen has decided to return to China, to visit her sick relative! This is potentially bad news, as she still hasn't addressed the issue of how to incorporate her calculation of states into the main testing loop performed by JWalk. We need to know when states are revisited, in order to prune redundant paths in the next test cycle. (They are redundant, because shorter paths reaching the same states have already been explored). She promises to keep up her work while she's away.

Friday, 27 June 2008

Wenwen has continued to develop her ideas regarding how to capture information about object states during the execution of JWalk. Her approach seems to centre on producing a collection of inspector-classes that know how to extract the attribute state data of an object, and convert this into a storable string format, such that the oracle could save this information between test runs. I'm slightly concerned that this representation of states might be too expensive to compute.

Friday, 20 June 2008

Neil and I received some useful critical feedback on the first draft of the paper. We have to change the name, since "Round-trip" as a term has now been cornered by the folk who use forward/reverse engineering tools to maintain software and design models in step. The revised version of the paper contains more statistics on the use of the tool, in particular some coverage data and timing information about the cost of developing tests, when we compare JWalk's semi-automated approach with manual testing with JUnit. We brought Chris Thomson in again as our experienced JUnit test writer, and did another comparative test creation marathon on several classes. The paper only has space for two sets of results, so we used the same LibraryBook and subclass ReservableBook examples as in the description of the cyclic development method. I submitted a revised version of the paper, called "Feedback-based specification, coding and testing with JWalk". This has now been accepted by TaicPart. You can access a version of this from the Publications Page.

Friday, 9 May 2008

I have just submitted a draft paper, based on Neil Griffiths' integration of JWalk testing into a Java-sensitive editor, to the testing conference TaicPart (Testing in Academia and Industry - Practice and Research Techniques). The paper is currently called "Round-trip specification, coding and testing with JWalk" and it aims to demonstrate not just issues involving the integration of JWalk with a third-party piece of software, but also to describe the incremental coding, validation, inference of specification and systematic testing that is possible using JWalk.

Thursday, 10 April 2008

We're all in Lillehammer, at the IEEE International Conference on Software Testing! I've never been to Norway before. It's a beautiful country, but quite cold. Everywhere is still covered in late snow, but things are starting to thaw out. A gang of us went up to the old Winter Olympics ski-jump slope. You can still see the place where James May, from BBC 2's Top Gear programme, spray-painted a red line across the bottom of the landing area, as a target for his rocket-powered Mini to reach, when doing the automobile Olympics for the TV show.

The real reason I'm here is to take Mike Holcombe's place as co-organiser of the TestBench '08 Workshop, along with Marc Roper from Strathclyde. I have a workshop paper to present on JWalk compared against JUnit; and also a presentation to give on behalf of a PhD student, Sarah Salahuddin, who has hurt her back and cannot travel. I think the workshop went fairly well. I don't think the attendees could believe how much time JWalk saves, when testing. You can get a copy of the paper from the Publications Page.

Monday, 7 April 2008

I have just acquired a masters student, Wenwen Zhao, to work on something I really want to do, which is to add the capability to recognise when object states are revisited. This has the possibility of greatly reducing the number of oracle values requested interactively during a test cycle. Also, we might be able to identify and prune more redundant paths (that revisit old states).

Other ideas for the future include changing the way in which keys are stored for oracle values. As a precursor to equivalence-partition testing, the keys need to encode the values supplied as arguments to methods, rather than just the types. This would allow the same method sequence to be indexed multiple times for different arguments.
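The difference between the two kinds of key is easy to see in a toy example. The key formats below are invented purely for illustration (they are not JWalk's oracle encoding), as are the names typeKey and valueKey.

```java
import java.util.StringJoiner;

public class OracleKeySketch {
    /** Key built from argument types only: every call of the method shares it. */
    static String typeKey(String method, Object... args) {
        StringJoiner key = new StringJoiner(",", method + "(", ")");
        for (Object arg : args) {
            key.add(arg.getClass().getSimpleName()); // assumes non-null arguments
        }
        return key.toString();
    }

    /** Key built from argument values: calls with different arguments get different keys. */
    static String valueKey(String method, Object... args) {
        StringJoiner key = new StringJoiner(",", method + "(", ")");
        for (Object arg : args) {
            key.add(String.valueOf(arg));
        }
        return key.toString();
    }
}
```

With type-based keys, push(1) and push(2) collide on the same oracle entry; with value-based keys the same method sequence can be indexed once per distinct argument, which is what equivalence-partition testing would need.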

Monday, 25 March 2008

Having made it possible for third-party systems to access the contents of test reports (by downcasting), I have now made a number of classes public that used to be package-protected. This includes DynamicAnalysis and TestSequence, so that tools can inspect the sequences used to reach states, or to exercise the class under test. Note: Be aware that the test sequences used in cycle N are extended in cycle N+1. So, you should not attempt to modify the contents of a TestSequence, if you access it from a test report.

Monday, 16 March 2008

Third-party tools that invoke JWalk may now use Banner reports, that come as part of the event-stream fed back to the third-party, as signals to filter and partition the test reports. Whereas the standard Banner simply contains header text announcing the next test result, a new subclass CycleBanner now provides an API to access the test cycle, the test mode, and any current object state. Neil Griffiths asked for this, so that he could partition the test results under different tabbed panes in his integrated Java editor and JWalk testing tool. The third-party tool merely has to use report instanceof CycleBanner to know that it is safe to cast the report to this kind of banner. See Inside JWalk for further details on the CycleBanner API.
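The filtering pattern on the third-party side might look something like the sketch below. To keep it self-contained I have used stand-in classes: StubBanner, StubCycleBanner and getCycle are hypothetical names, not the real Banner and CycleBanner classes, whose actual API is described in Inside JWalk.

```java
public class BannerFilterSketch {
    /** Stand-in for a plain banner report carrying only header text. */
    static class StubBanner {
        final String text;
        StubBanner(String text) { this.text = text; }
    }

    /** Stand-in for the richer banner; the accessor name is an assumption. */
    static class StubCycleBanner extends StubBanner {
        private final int cycle;
        StubCycleBanner(String text, int cycle) {
            super(text);
            this.cycle = cycle;
        }
        int getCycle() { return cycle; }
    }

    /** Decide which tab the next reports belong to, given one event from the stream. */
    static String tabFor(Object report, String currentTab) {
        if (report instanceof StubCycleBanner) {
            StubCycleBanner banner = (StubCycleBanner) report; // safe after the instanceof check
            return "Cycle " + banner.getCycle();
        }
        return currentTab; // ordinary reports stay on the current tab
    }
}
```

The tool keeps feeding each event through tabFor and switches pane only when a cycle banner arrives, so every other report lands under the most recent cycle.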

Friday, 14 March 2008

Today I uploaded an incremental revision to JWalk 0.8, in response to issues surrounding reloading the edited test class into the current runtime, without restarting the third-party application. The setTestClass API method relies on the third-party application providing its own ClassLoader as before; but now the setTestClassName API method has been changed, so that it no longer uses the bootstrap loader's loadClass method (called by Class.forName). Instead, we create a new local ClassLoader of our own (which therefore does not cache the existing loaded test class) and load from a test class directory, that can be set using setTestClassDirectory.

Thursday, 13 March 2008

Today Neil Griffiths gave me a demonstration of his current Java-and-JWalk editor. The look-and-feel is quite attractive. I have asked for some changes to the way the test results are organised and presented. He would also like to split up the stream of test results into several partitions, to be displayed on separate tabbed panes in the editor tool. To do this, the third-party editor will need access to more detailed information about each test cycle. I'll have a think about this.

We now know how to guarantee loading a fresh copy of the test class, after this has been changed. Step 1: you need to create your own custom ClassLoader. Longhand, this loads a stream of bytes from a file and then defines a class from the bytestream. Shorthand, you can use a URLClassLoader from the java.net package and convert the test class directory path into a URL. This is harder than you think: first, you have the File denoting the directory. The approved way to get a URL is to use the sequence file.toURI().toURL(), because of machine-dependent features. Step 2: you create a new instance of your loader, each time you want to reload the class, because of all the caching that class loaders normally do. Step 3: you only access the loaded class, and any instance of this, through the reflection API, to avoid all possibility of type-mismatch, due to type tagging by different class loaders.
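The three steps can be put together in a short sketch. The class and method names (FreshReload, invokeFresh) are mine, and I assume that the test class has a public no-argument constructor and that its directory is not already on the application's classpath (otherwise the parent loader would supply its cached copy first).

```java
import java.io.File;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

public class FreshReload {
    /**
     * Load className afresh from classDir, create an instance and call a
     * no-argument method on it, touching the class only through reflection.
     */
    public static Object invokeFresh(File classDir, String className, String methodName)
            throws Exception {
        // Step 1: convert the directory to a URL by the approved route.
        URL[] urls = { classDir.toURI().toURL() };
        // Step 2: a brand-new loader on every call, so nothing is cached from last time.
        try (URLClassLoader loader = new URLClassLoader(urls)) {
            Class<?> type = loader.loadClass(className);
            // Step 3: never cast to the class's own type; use Class, Object and Method only.
            Object instance = type.getDeclaredConstructor().newInstance();
            Method method = type.getMethod(methodName);
            return method.invoke(instance);
        }
    }
}
```

Calling invokeFresh again after recompiling the class picks up the new behaviour, because each call builds its own loader and discards it afterwards.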

Friday, 7 March 2008

On the matter of using a custom ClassLoader to reload a class that has been recompiled during the current runtime, all the internet chatter about creating a clean loader (with no existing cache of previously loaded classes) isn't even half of the story. The newly-loaded class is not recognised as being of the same type as the previous version! We are getting a ClassCastException at the point where an instance of the reloaded class is being typecast back from Object to its actual class. Digging into the javadoc, I found that this is because each ClassLoader tags all the classes that it loads with a unique version tag. Different versions are deemed type incompatible (a serious nuisance).

Attempt 3: Remove all reference to the reloaded class's name from the runtime and try to access its methods by some other means. For example, create an interface, known to the runtime, through which the reloaded class can be accessed. This fails too, because the reloaded class is not recognised as implementing the same interface. Experimenting with putting things in different directories, we find that reloading a class also reloads the interface that the class satisfies. We had cases where two different copies of the same interface were being loaded by different loaders.

Attempt 4: Give up trying to treat the loaded class as anything other than a Class, and its instance as anything other than an Object. This means we must use the reflection API to discover a Method as the entry-point to the reloaded class's behaviour. We then call Class.newInstance and Method.invoke to start the ball rolling. Finally, this seems to work. We have edited and recompiled a test class, while not quitting the runtime of the program loading it, and have successfully reloaded the test class and observed its changed behaviour, in the same runtime.

Thursday, 6 March 2008

Neil Griffiths and another student Tom Dixon have come across a problem when reloading a recompiled Java class into the runtime, without quitting the current application. Both Neil and Tom are developing editors that invoke the Java compilation tools. It turns out that Class.forName is inadequate for reloading a class that has previously been seen by the current runtime. My initial suspicion was that class definitions were being cached, once loaded. After navigating through the javadoc for ClassLoader, it seems clear that the default bootstrap loader is designed to cache classes. This makes sense for a bootstrap, but not for the general loading case. So, I spent a couple of hours researching this on the internet.

Attempt 1: clone the bootstrap loader, so that the new loader has a clean cache. This fails, because the bootstrap loader has package protected constructors! (It's really implemented as part of the Java Runtime Environment, and is not meant to be user-accessible in this way).

Attempt 2: write a custom ClassLoader subclass (with public constructors) and reload the desired class. This also caches any previously loaded version; so we have to re-create the custom loader each time, to ensure a clean cache (either that, or perform some extensive surgery on the loadClass method, so that it (a) doesn't search the cache; (b) doesn't look for a parent ClassLoader (which would be the bootstrap loader) and (c) actually fetches the desired class from file). Why does Java make this so difficult? And then, this fails too because of type mismatch problems.

Tuesday, 5 February 2008

Chris and I have submitted our paper to TestBench '08. We repeated our comparative work on expert manual testing with JUnit, versus semi-automated testing with JWalk. This time, the manual tester had full access to the source code and the modifications. The JWalk unit tests were generated in a fraction of the time (for six classes; a matter of 30-50 seconds for each class, rather than 15-20 minutes developing handwritten tests). Also, when judged against a benchmark called the behavioural response for an object, the JWalk tests were nearly 100% effective (the only missing cases were five exemplars from equivalence partitions not fully covered).

This success has prompted me to think about ways in which equivalence partitions (of method inputs) might be covered fully by JWalk. Originally, the testing assumptions were directed towards full exploration of the state space, which JWalk does very well. Clearly, the uninformed cycling of arguments to methods results in many wasted test cases (and useless confirmations from the programmer). A better strategy would be to build an inner loop inside the construction of each test sequence, so that we could populate that sequence with different arguments.

Wednesday, 23 January 2008

Chris Thomson and I are considering submitting a paper to Marc Roper and Mike Holcombe's First Software Testing Benchmark Workshop (TestBench '08), a workshop of the First International Conference on Software Testing, Verification and Validation, to be held at Lillehammer in Norway in April 2008. I'm thinking along the lines of a benchmark for object-oriented unit testing.

Regent Court, 211 Portobello, Sheffield S1 4DP, United Kingdom