I am talking about the suite of graph visualisation tools rather than my bindings for Haskell (for which I use a lower-case g). These are problems I mostly came across whilst both using Graphviz and writing the bindings.
What is a valid identifier?
In the main language specification page for the Dot language, it is said that the following four types of values are accepted:
- Any string of alphabetic ([a-zA-Z\200-\377]) characters, underscores ('_') or digits ([0-9]), not beginning with a digit;
- a number [-]?(.[0-9]+ | [0-9]+(.[0-9]*)? );
- any double-quoted string (“…”) possibly containing escaped quotes (\”);
- an HTML string (<…>).
Note that quotes are the only escaped values accepted.
However, it isn’t clear what should happen if a number is used as a string value: does it need quotes or not? Furthermore, that page doesn’t specifically mention that keywords (graph, node, edge, etc.) need to be quoted when used as string values (it just says that compass points don’t have to be quoted).
What is a cluster?
The language specification page mentions that it is possible to have sub-graphs inside an overall graph, and that these sub-graphs can have optional identifiers. The Attributes page has mention of cluster attributes. But the only way to tell how define a cluster is to look at the examples page and notice that a sub-graph is a cluster if it has an ID that begins with
cluster_ (with the underscore also appearing to be optional when playing with the Dot code manually). Furthermore, it isn’t specified that if you have more than one cluster, then they must have unique identifiers; it doesn’t even suffice to have two “main” clusters with identifiers of
Bar, each with a sub-cluster with an identifier of
Baz: the sub-clusters have to have unique identifiers as well; it took me a few hours to work this out.
If that isn’t bad enough, the fact that
cluster_ is at the beginning of every cluster identifier means that the normal quotation, etc. rules for values doesn’t seem to work: a HTML identifier for a cluster now has the form of
"cluster_<http://www.haskell.org>"; that’s right, it’s a URL prepended with a string and then wrapped in quotes! This plays merry hell with any attempts at properly generating and parsing identifiers for sub-graphs, especially when considering what happens to escaped quotes inside that string (my approach has been to do a two-level printing/parsing).
In several cases, the documentation for Graphviz contradicts itself. Take output values for example: the official list of output types can be found here. Yet, if we look at the documentation for how to define a color value, we find it mentions a non-existent “mif” output type. Not only that, but there are apparently various renderers and formatters available for each output type; not only are these renderers and formatters not listed anywhere, it isn’t even explained what these renderers and formatters do (let alone what’s the difference between them). Furthermore, to make matters even interesting on my system I have at least one more output type (
x11) than what is listed there.
Another annoying factor is how Graphviz treats named colors. The default colorscheme is to use X11 colors. However, if you compare Graphviz’s X11 colors to the “official” list (such as it is; there’s no real official standard, but most X11 implementations seem to use the same one) you’ll notice that they’re different: some colors have been added and others removed. I admit that it could arise from an older X11 implementation’s definition of X11 colors, but it prevented me from making a common library to use for X11 colors.
Every now and again, Graphviz fails to visualise a graph because an internal assertion failed; for example:
dot: rank.c:237: cluster_leader: Assertion `((n)->u.UF_size <= 1) || (n == leader)' failed. This is extremely annoying, not least because even looking through the relevant source code doesn’t reveal what the problem is. If these assertions are really needed for some reason, please say why and what the actual problem is.
I’m spoiled: #haskell is one of the largest IRC channels on Freenode, and the various Gentoo ones are usually rather large and helpful as well. Usually whenever I try to get help from #graphviz, I get no help; partially because there’s sometimes only two other people there, neither of whom respond (probably due to time zones).
There are other niggles I’ve had with Graphviz, but these are the main big problems I’ve had that I can recall.
Overall, however, Graphviz is a great set of applications; unfortunately, they seem to be feeling their age (along with keeping a large number of deprecated items floating around for compatibility purposes).