Working out the container-classes API

During AusHac, I worked on the container hierarchy I discussed in my previous post, which culminated in the initial release of container-classes. I had initially (and naively) thought I would have been able to whip something like this together on the Friday afternoon and spend the rest of the weekend working on graph libraries; in the end I just managed to release an initial draft version before we had to pack up on Sunday.

Now, I’m not saying this current setup is perfect; it’s basically a direct copy of all list-oriented functions from the Prelude along with a couple of functions from Data.List split into a generic Container class, a Sequence class for containers with a linear structure and Stream for infinite Sequences (i.e. lists and similar structures: those for which it makes sense to define a function like repeat).

First of all, here are a couple of design decisions I made with this library:

  • I want to be able to consider types with kind *; as such, most pre-existing classes are of no use.
  • Even when hacking together support for types of kind * -> * for mapping functions, etc. I couldn’t use Functor as it doesn’t let you constrain the type of the values being stored (for Sets, etc.).
  • To be able to have restrictions, we need to be able to specify the value type as part of the class definition. This means the usage of either MPTCs+fundeps or an Associated Type. I was initially using the latter, but due to the current lack of superclass constraints making the type signatures much uglier and longer, I switched to using MPTCs+fundeps instead.
  • Type signatures should be as short/nice as possible.
  • Provide as many default implementations as possible, and make those as efficient as possible.

However, with these design decisions there are some considerations I have to make:

  • How should I split up the various functions into type-classes? e.g. does it make sense to re-define standard classes like Foldable so that they’ll work with values of kind * (where possible) and if necessary have a constrained value type?
  • At the moment, the main constraints are all inherited from the Container class; if I have lots of smaller classes, is there a nicer way of abstracting out the constraint without duplicating it everywhere? rmonad has the Suitable, but in practice this seems to mean the addition of extra Suitable f a constraints on every function in addition; maybe this is because Suitable isn’t a superclass of the other classes though.
  • I’ve tried to define the default definitions of the various class methods with the eventual goal of implementing and using the foldr/build rule, but I’m not sure how to properly implement such a rule, let alone how well it’s going to work in practice:
    • If someone overrides the defaults to use custom/optimised versions (e.g. all the pre-defined list functions), then the inter-datatype optimisations will no longer be present.
    • As people may use more optimised variants of various class methods, any data type that extends another (using a newtype, etc.) will have to explicitly define each class instance rather than relying on the default definitions (if they want to keep using the optimised variants).
    • Cross-container optimisation could change some fundamental assumptions: e.g. going from a list to a Seq and then back to a list will typically preserve the value ordering; however if we replace the Seq with a Set then we’d expect the ordering in the final list to have changed (and be sorted); if I implement the foldr/build rule would it interfere with this ordering by removing the intermediate Set and the fact that it will insert values in sorted order?
  • Benchmarking: is there a nice way of doing per-class benchmarking to be able to compare the performance of different data structures? For example, being able to compare how long it takes to insert 100 random values into a set (by consing) compared to inserting those same values into a Set.

So, that seems to be the battle I’ve taken upon myself. I’d greatly appreciate any pointers people can give me either as comments here or by emailing me.

Oh, and a reminder: I’m going to stop collecting responses for my survey on what to call the “new FGL” library at about 12 PM UTC this Friday. I’ve already got about 60 votes; more are welcome.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s