
If wishes were tests, code would be perfect

(With apologies to wherever the original came from.)

As I’ve mentioned previously, I’m currently writing a QuickCheck-based test suite for graphviz. Overall, I’m quite pleased with QuickCheck, especially given the amount of moaning from people that QuickCheck-2 (which I’m using) is too different from the 1.x series. The monadic usage of Gen for arbitrary means that in most cases instances are just a matter of picking the right liftM function and making multiple calls to arbitrary. However, in the course of using it, I’ve made some observations about QuickCheck and come across some problems with it.
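To illustrate the liftM pattern, here is a minimal sketch. The Attr type is a hypothetical stand-in rather than graphviz's real attribute type; only liftM, liftM2, oneof and arbitrary come from the standard libraries and QuickCheck.

```haskell
import Control.Monad (liftM, liftM2)
import Test.QuickCheck

-- Hypothetical attribute type, used only for illustration.
data Attr
  = Width Double
  | Label String
  | Pos Int Int
  deriving (Eq, Show)

instance Arbitrary Attr where
  -- One liftM variant per constructor, chosen by its number of fields.
  arbitrary = oneof
    [ liftM  Width arbitrary
    , liftM  Label arbitrary
    , liftM2 Pos   arbitrary arbitrary
    ]
```

A constructor with three fields would use liftM3 in the same way, and so on.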

Default Instances

Usually, it’s great to see a library define instances of its own classes for the common data types (that is, those available from the Prelude, etc.). However, I’m finding the default instances of Arbitrary for lists (and to a lesser but related extent Chars) a pain. Specifically, the problem is how the shrink method is defined: it not only tries to shrink the size of the list (which is great) but also tries to shrink each element of the list individually. My preferred behaviour (for which I’ve defined a custom function that my code explicitly calls) is to just shrink the size of the list, unless it’s a singleton, in which case it tries to shrink that value.
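A rough sketch of such a size-only shrinking function follows. The name shrinkSizeOnly is hypothetical (it is not the function from my test suite), and it assumes a QuickCheck version that exports shrinkList.

```haskell
import Test.QuickCheck

-- Hypothetical helper: shrink only the length of a list, and fall back to
-- shrinking the element itself once a single element is left.
shrinkSizeOnly :: Arbitrary a => [a] -> [[a]]
shrinkSizeOnly [x] = map (: []) (shrink x)
-- An element shrinker of (const []) means shrinkList only drops elements.
shrinkSizeOnly xs  = shrinkList (const []) xs
```

An Arbitrary instance for a type containing a list field would then call shrinkSizeOnly on that field instead of the default shrink. Note that in this sketch a singleton is never shrunk to the empty list; whether that is desirable depends on the data type.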

The reason the default behaviour is so bad in my case is that I quite often have lists of custom data types, which can individually have lots of other sub-types in them, possibly with lists of their own. As such, if I used the default shrinking behaviour on lists, this could result in a lot of attempts at shrinking.

Note that this isn’t really a problem of QuickCheck per se: it’s great that it defines an Arbitrary instance for lists. It would also be great (but probably not type-safe, etc.) if it were possible to override class instances in Haskell.

Why lists anyway?

One of the problems with the shrinking behaviour of lists is due to the number of appends that occur; whilst lists are nicer/easier to deal with, using something like Seq from Data.Sequence might improve performance.

Getting big for its boots

In most of the tests that I’ve done, the problems that occur are usually in printing and parsing Attributes. As such, at the start of my test suite I run a test on lists of Attributes; to try and ensure that they’re valid I run 10000 tests rather than the default 100. These extra tests, however, come at a price: QuickCheck keeps testing longer and longer lists, which means that each individual test takes longer and longer to run. I’d prefer to run even more tests which are individually smaller (around the mid-point of what gets generated with 10000 tests); as it is, 10000 tests take over half an hour here.
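One way to get many tests that stay individually small is to cap the size parameter through the Args record. This is only a sketch: prop_roundTrip is a hypothetical stand-in for the real Attributes property, the cap of 30 is an arbitrary number, and how the size grows between tests differs between QuickCheck versions.

```haskell
import Test.QuickCheck

-- Hypothetical stand-in for the real print/parse round-trip property.
prop_roundTrip :: [Int] -> Bool
prop_roundTrip xs = read (show xs) == xs

-- Many tests, but the size parameter never exceeds 30 (an arbitrary cap).
manySmallTests :: Args
manySmallTests = stdArgs { maxSuccess = 10000, maxSize = 30 }

runRoundTrip :: IO Result
runRoundTrip = quickCheckWithResult manySmallTests prop_roundTrip
```

The same effect can be had for a single generator by wrapping it in resize.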

Documentation

Whilst the high-level details are explained rather well, there are parts of the QuickCheck documentation that are rather lacking. First of all, there is how to use QuickCheck: I wasn’t aware that there was a community standard of starting the names of all properties with prop_ (though Real World Haskell deals relatively well with how to use QuickCheck). Also, it took me a while to dig through the (relatively undocumented) source to work out that a Result of “GaveUp{...}” is returned when too many values were discarded.
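As a small illustration of both points, the sketch below names its property with the prop_ prefix and inspects the Result for GaveUp. The property and the describe helper are hypothetical; the precondition can never hold, so every generated value is discarded.

```haskell
import Test.QuickCheck

-- Named with the conventional prop_ prefix. The precondition is never
-- satisfied, so QuickCheck discards every generated value.
prop_neverApplies :: Int -> Property
prop_neverApplies n = (n > 0 && n < 0) ==> (n + 0 == n)

-- Hypothetical helper turning a Result into a short description.
describe :: Result -> String
describe Success{} = "passed"
describe GaveUp{}  = "gave up: too many values were discarded"
describe Failure{} = "failed"
describe _         = "other"
```

Running quickCheckResult prop_neverApplies and passing the outcome to describe should report that QuickCheck gave up.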

Keep going

You’ve found a value that breaks the property? Excellent (well, not in that it’s great to have a bug, but it’s great that it was picked up)! But can’t you please keep going and trying to find more?

Edit: one of the reasons I would like this behaviour is for when the test isn’t actually a failure per se; it’s just a matter of my Arbitrary instances not being strict enough. For example, if it generates a String value that is actually a number (e.g. "1.2") and my data type can be either a Double or a String, then obviously this value should actually be parsed back as a Double; this, though, breaks the parse . print = id property in its strictest sense. As such, if quickCheck kept going, then I could manually verify whether it is a bug or not and fix it (so that an arbitrary String isn’t actually a number) whilst it kept doing the rest of the tests.
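Making the String generator stricter could look something like the sketch below. Both names are hypothetical, and it assumes that "is a number" means "reads as a Double according to readMaybe", which may well differ from what the real parser accepts.

```haskell
import Data.Maybe (isJust)
import Test.QuickCheck
import Text.Read (readMaybe)

-- True when the whole string reads as a Double, e.g. "1.2".
looksNumeric :: String -> Bool
looksNumeric s = isJust (readMaybe s :: Maybe Double)

-- Generate only strings that cannot be mistaken for a number.
arbNonNumericString :: Gen String
arbNonNumericString = arbitrary `suchThat` (not . looksNumeric)
```

The String field of the data type would then be generated with arbNonNumericString rather than plain arbitrary.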

Getting results

Related to the previous one: once quickCheck has found a data value that breaks a property, the only way of getting that value to manually determine why the property is breaking is to copy/paste it. Whilst the output can be redirected, it’s an all-or-nothing affair of the entire output rather than just the data value itself. It would be even better if the Result data type were parametrised so that it could return the value in its Failure constructor, so that in my code I could manually write it to file using my wrapper script around the QuickCheck tests.
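A partial workaround that doesn't need a parametrised Result is QuickCheck's whenFail combinator, which runs an IO action after the last failure of a property. The sketch below is only that: the file name, the loggingFailures wrapper and the deliberately false property are all hypothetical.

```haskell
import Test.QuickCheck

-- Hypothetical file that collects counterexamples.
failureLog :: FilePath
failureLog = "failing-values.txt"

-- Wrap a predicate so that the final counterexample is appended to the log.
loggingFailures :: Show a => (a -> Bool) -> a -> Property
loggingFailures p x = whenFail (appendFile failureLog (show x ++ "\n")) (p x)

-- Deliberately false property, used only to demonstrate the logging.
prop_allSmall :: Int -> Property
prop_allSmall = loggingFailures (< 10)
```

This still stores only the shown form of the value, so getting the value itself back requires a Read instance or a parser.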

Recursive values

In graphviz, I have a DotGraph data type which contains a DotStatement value; this contains a list of DotSubGraph values, each of which contains a DotStatement value. As such, my initial implementation of Arbitrary for these data types resulted in large, deeply recursive structures even for “small” sample values, which made it almost impossible to track down the source of an error. To solve this I’ve done the following:

  • Define an arbDotStatement :: Bool -> Gen DotStatement function which will only have a non-empty list of sub graphs if the boolean is True.
  • The Arbitrary instance for DotStatement has arbitrary = arbDotStatement True; that is, an arbitrary DotStatement value can contain DotSubGraphs.
  • The Arbitrary instance for DotSubGraph uses arbDotStatement False to generate its DotStatement; that is, a DotSubGraph cannot have any DotSubGraphs of its own.

This results in an Arbitrary instance of any of these data types that won’t endlessly recurse and is thus easier to debug.
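The three steps above can be sketched as follows. The data types here are heavily simplified, hypothetical stand-ins for the real graphviz ones (the field names and the depth helper are made up); only the shape of arbDotStatement follows the description.

```haskell
import Test.QuickCheck

-- Simplified, hypothetical stand-ins for the graphviz types.
data DotStatement = DotStatement
  { subGraphs :: [DotSubGraph]
  , nodeIDs   :: [Int]
  } deriving (Eq, Show)

newtype DotSubGraph = DotSubGraph { subStatement :: DotStatement }
  deriving (Eq, Show)

-- Sub graphs are generated only when the flag is True.
arbDotStatement :: Bool -> Gen DotStatement
arbDotStatement allowSubs = do
  subs  <- if allowSubs then arbitrary else return []
  nodes <- arbitrary
  return (DotStatement subs nodes)

instance Arbitrary DotStatement where
  arbitrary = arbDotStatement True

instance Arbitrary DotSubGraph where
  arbitrary = fmap DotSubGraph (arbDotStatement False)

-- Nesting depth of sub graphs, to check that generated values stay shallow.
depth :: DotStatement -> Int
depth (DotStatement []   _) = 0
depth (DotStatement subs _) = 1 + maximum (map (depth . subStatement) subs)
```

With these instances, depth never exceeds 1 for a generated DotStatement.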

Brent Yorgey is doing work on testing functions that use recursive data types; that should also help in the future.

Shrinking

I think the inclusion of shrinking into QuickCheck is great, in how it helps find a minimal common case for a bug. I’ve found, however, that for large data types you need to be very careful how you implement the shrink method: I’ve found it useful to only shrink the sub-values that are most likely to have errors (that is, the Attributes) rather than checking every possible shrink of the integral node ID, etc.
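As an illustration of shrinking only the interesting sub-values, here is a minimal sketch with a hypothetical Node type (an integral ID plus a list of attributes represented as Strings); it is not the real graphviz type.

```haskell
import Control.Monad (liftM2)
import Test.QuickCheck

-- Hypothetical node type: an integral ID plus a list of attributes.
data Node = Node
  { nodeID    :: Int
  , nodeAttrs :: [String]
  } deriving (Eq, Show)

instance Arbitrary Node where
  arbitrary = liftM2 Node arbitrary arbitrary
  -- Only the attributes are shrunk; the ID is kept exactly as generated.
  shrink (Node n as) = [ Node n as' | as' <- shrink as ]
```

The trade-off is that a reported counterexample may carry a larger ID than strictly necessary, in exchange for far fewer shrink attempts.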

How do you have 0.11043 of a shrink?

With shrinking, however, what does QuickCheck mean when it says something like 0.11043 shrinks? Is it trying to say how deeply it’s shrinking? Note that this doesn’t seem to be a real floating point number; it seems to be treated as Int . Int.


3 thoughts on “If wishes were tests, code would be perfect”

  1. gwern says:

    > One of the problems with the shrinking behaviour of lists is due to the number of appends that occur; whilst lists are nicer/easier to deal with, using something like Seq from Data.Sequence might improve performance.

    It might; looking at the code in Arbitrary.hs, it’s obviously using a finite list (the call to ‘length’ gives it away). But Seq has its own overhead, and the instance is for [a], so you will need a toList somewhere in there, likely; are the ++s really expensive enough to justify a nontrivial rewrite?

      • gwern says:

        I guess I didn’t understand what you meant then; the Arbitrary typeclass demands that any Arbitrary instances have ‘shrink :: a -> [a]’, so even an ‘instance Arbitrary a => Arbitrary (Seq a)’ will require a toList call at the end for its ‘shrink’.

        Or were you musing about your own code, that you ought to replace the various []s with Seqs, and then write an Arbitrary instance with Seq, and hopefully then the fast Seq appends will outweigh the cost of the toList?
