«Insert Name Here»

15 March 2010

Repeat after me: “Cabal is not a Package Manager”

Filed under: Gentoo,Haskell,Rants — Ivan Miljenovic @ 11:24 PM

It seems that every few weeks someone (who has usually either just started to use Haskell or doesn’t seem to be active in the Haskell community) comes out with the fallacy that Cabal is the “Haskell package manager”. I’ve gotten sick of replying why this isn’t the case on IRC, blog comments, etc. that I’ve decided to write a blog post to set things straight.

Disclaimer: I am not a Cabal developer (I did submit a patch or two to fix a couple of documentation problems I found, but I don’t know if they were applied or if they were re-written, etc.) so this is not the official line, just my own opinion (hmmm… “IANACD” doesn’t quite trip off the tongue…).

Cabal /= cabal-install

First of all, there is the common misconception that Cabal provides the command line tool cabal. This is not the case: this tool is provided in the cabal-install package. This unfortunately (for comprehension’s sake) named tool is completely distinct from Cabal-the-library except for the fact that it uses that library. To help understand why there is this distinction (as opposed to other language-specific installation tools such as RubyGems), I highly recommend this video by Duncan Coutts. The short version is: Cabal depends directly on Haskell compilers and nothing else (and is indeed shipped with GHC); cabal-install needs other packages (for network access, etc.) as well.

For the rest of this blog post, I will assume that people are instead referring to cabal-install as a package manager (which is what most of them intend) or else the entire Hackage ecosystem (see below).

What is a package manager?

According to that most authoritative source Wikipedia, a Package management system (of which a package manager is merely one component, namely the tool that is used) is:

… a collection of tools to automate the process of installing, upgrading, configuring, and removing software packages from a computer.

Coupled with such a tool is also a large (well, it could be small but that kind of defeats the point) collection of packages used by the package manager. Together, these provide a way of (hopefully) seamlessly installing packages without end users having to consider what is needed to do so. For example, let us assume that a Gentoo user with no other Haskell-related software present is wishing to install XMonad (one of the most popular pieces of software written in Haskell). In that case, all they need to do is:

emerge xmonad

This will bring in GHC and all other related dependencies (even non-Haskell ones!) so that the user doesn’t have to worry about all that kind of stuff.

The Haskell “Package Management System”

The “equivalent” to a package management system for Haskell is the Hackage ecosystem: HackageDB + Cabal + cabal-install, which provide the list of packages, the build system and the command line tool respectively. However, this ecosystem is not a package management system:

HackageDB

HackageDB is the central repository of open-source Haskell software. However, it is limited solely to Haskell software that is installed using Cabal. As such, it is not closed under the dependency relation: entering the incantation cabal install xmonad into a prompt on a Haskell-free system will not bring in first GHC and then all other dependencies (actually, it is not even possible to utter that incantation since neither Cabal nor cabal-install will be available, unless a pre-built cabal-install is being used). Furthermore, for any GUI libraries/applications that use the Gtk2Hs wrapper library around Gtk+, then Hackage will also be unable to install those packages (to the great confusion of many who state that they do indeed have gtk+ installed, when Cabal and cabal-install are complaining about the gtk+ sub-library from Gtk2Hs) since it isn’t Cabalised.

As such, HackageDB cannot really fulfill the requirements of being a proper component of a package management system.

Cabal

Cabal is the Common Architecture for Building Applications and Libraries, which not only acts as the build system (analogous to ./configure && make && make install) of choice for Haskell packages, but also provides this metadata information to other libraries and applications that need it via a library interface. Nowadays, the only mainstream Haskell packages that don’t use Cabal are GHC itself (since it contains non-Haskell components; furthermore this would involve bootstrapping issues since a Haskell compiler is needed to build Cabal) and “legacy” libraries and tools such a Gtk2Hs (for which it is currently not possible to build using the Cabal framework for various reasons).

Cabal obtains its information from two two files:

  1. A .cabal file that contains a human-readable description of the package including dependencies, exported modules, any libraries and executables it builds, etc.
  2. A Setup.[l]hs file that is a valid Haskell program/script using the Cabal and which performs the actual configuration, building and installation of the package; for most packages this is a mere two lines long (one to import the Cabal library, the second that states that it uses the default build setup).

As such, Cabal is extremely elegant, especially when compared to ones such as Ant. However, it is not a valid specification format for packages that are part of a fully-fledged package management system: it cannot deal with all possibilities (e.g. Gtk2Hs) nor all dependencies (it allows you to state any C libraries needed at build time, but not for any libraries needed from other languages nor any other tools or libraries needed at run-time; for example my graphviz library for Haskell really needs the “real” Graphviz tool suite installed to work properly but there is no way of telling Cabal that).

Why not XML?

Some people have stated that Cabal should really switch to an XML-based file format because it is a “standard”. Even if we assume that XML really is such a well-defined standard (though we’d need to define a Schema for use with Cabal), XML has one large failing: it is not human readable. One of the greatest features of Cabal is that its file format is perfectly readable and understandable by humans (even if it is at times imprecisely defined in some aspects), such that if I have to check dependencies, etc. for a Cabalised package I can quickly skim through its .cabal file and just read it without having to decode it (especially since usage of XML would remove any need for newlines, etc. which are used for readability purposes). Furthermore, it would require Cabal to use an XML parser, which means an extra dependency (whereas at the moment it needs only a compiler).

cabal-install

The choice of both package and executable name for cabal-install was unfortunate (if understandable) in that too many people confuse it for Cabal. Whilst it may use Cabal and act as a wrapper around both it and HackageDB, it is indeed a completely separate package. So remember, whilst you may do cabal install xmonad, you’re not using Cabal to do that but rather cabal-install.

cabal-install brings a convenient command-line interface, dependency resolution and downloading to the Hackage ecosystem. Assuming that you have GHC install, cabal install xmonad will indeed determine, download and build all Haskell dependencies for XMonad. However, this is not always the case.

As a wrapper around Cabal and HackageDB, cabal-install inherits all of their reasons why it is not a valid package manager. However, to that it brings in a few warts of its own: it only manages libraries. That doesn’t mean that it can’t install applications, because it can; it’s just that once it has installed an application it can’t tell that it has done so. That’s because rather than have its own record of what it has installed and what is available, cabal-install uses GHC’s library manager ghc-pkg to determine which libraries are installed. Since GHC doesn’t know which applications are installed, neither does cabal-install. As such, if one tries to install haskell-src-exts (or any package that depends upon it) on a “virgin” Haskell system, then it will fail since the parser generator happy isn’t installed. cabal-install (or rather Cabal) can detect if happy is available, but will not automatically offer to download and install it for you. Whilst this might be more a temporary limitation (in the sense that no-one has yet added support for build-time tools to cabal-install’s dependency resolution system) rather than a problem with cabal-install, it still requires extra user intervention.

There is a yet more serious impediment to being able to properly consider cabal-install a package manager (since other package managers may also require user intervention at times when installation fails for some reason). Let us once again consider that definition of a package management system, this time with some emphasis added:

… a collection of tools to automate the process of installing, upgrading, configuring, and removing software packages from a computer.

Did everyone spot that not-so-subtle hint? cabal-install can’t un-install Haskell software. Why not? Partially because Cabal doesn’t support uninstallation: whilst Cabal can un-register a library from ghc-pkg, it won’t remove any files it installed. Furthermore, because cabal-install doesn’t track which applications it has installed, it is definitely unable to uninstall them since it has no idea which files it needs to delete.

Partially linked to this problem of uninstallation is another segment of that definition: upgrading. Whilst it can install a new version of a package, it cannot remove old versions. However, this isn’t why the cabal upgrade option has been disabled: GHC ships with several libraries upon which it itself depends upon; these are known as the boot libraries. Originally cabal-install offered to upgrade these libraries, with at times disastrous results.

But what if we fix cabal-install???

What if cabal-install starts recording which packages it installs and which files it installs? With that uninstallation will be possible, and if it can tell which libraries are boot libraries then upgrading should also be possible. As such, cabal-install could be considered a proper package management system, couldn’t it? Pretty please?

Unfortunately, no: as mentioned earlier, between them HackageDB and Cabal can only be used to install packages written in Haskell that are cabalised. As such packages cannot be closed under dependencies and cabal-install cannot install all necessary dependencies (both build-time and run-time).

Why you should use your distribution’s package management system

Many GNU/Linux users in the Haskell community express disdain for their distribution’s package management system and vehemently express their preference of cabal-install. Here are several reasons, however, that you should use your distribution’s package management system:

  • Proper dependencies: system packages can bring in all required dependencies, no matter what language they were written in, etc. How are they able to do this when cabal-install can’t? Because they are (hopefully) checked by the most clever computational device known: the human brain. Good system packages have all dependencies listed explicitly, and in the case of those that are marked as “stable” are usually tested on a larger number of machines, architectures and software configurations than the upstream developer is capable of.
  • Package patching: a common complaint with the current status of HackageDB and Cabal is that if there is a mistake with a package’s .cabal file (usually due to package maintainers being either too strict or too lax in terms of dependency version ranges), then users are forced to either manually download, edit and install (thus losing cabal-install’s dependency resolution) those packages or else wait for the package maintainer to release an update. System packages, however, are able to work around these problems by either providing ready-built binary versions of those packages or else patching the package to get it working. For example, when the duplicated instance problem arose between yi and data-accessor, I was able to edit the yi ebuild to remove its own instance definition so that it would clash with data-accessor’s: as such, Gentoo users who wanted to install yi wouldn’t even know such a problem existed, which is how it should be.
  • It Just Works: linked to the above point, system packages are more likely to install and work on the first attempt than using cabal-install, because they have (hopefully) been tested to do so within that particular distribution. This also includes choosing the correct compile-time flags, etc.
  • Integration: when using system packages, the Haskell packages you have installed are first class citizens of your machine, just like every other package. Applications are installed into standard directories, as are libraries, documentation, etc. They will also interact with all the other packages on your system better: for example, there are various wrapper scripts that come with different packages that wrap around darcs; by using a system install of darcs then these wrapper scripts will work, whereas they might not if it’s installed in your home directory.
  • Done for you: someone has put in the effort to write the system package for that particular Haskell package; why shouldn’t you be grateful for them for doing so and use it?

There are of course various reasons why people prefer not to use system packages for Haskell packages:

  • Out of date packages;
  • Limited variety of packages;
  • Not built with the wanted options.

However, there are two possible solutions to these problems. The first one is that if you are a serious Haskell hacker and your distribution doesn’t support Haskell well, then why not try another distribution? Arch and Gentoo are usually recognised as being those with the best Haskell support, with Fedora seeming to have decent Haskell support for a more release- and ready-to-go-oriented distribution.

Alternatively, get more involved with your distribution’s Haskell packaging team or start one if there isn’t one there already: get what you want in your distribution and help other people out at the same time. It usually isn’t hard to at least make unofficial packages: I started off writing Haskell ebuilds for Gentoo by copying and editing ones that were already there; nowadays we have our hackport tool (available at app-portage/hackport in the Haskell overlay) that generates most of the ebuild for you, especially for simple packages that don’t need much tweaking. Failing that, just ask for a new/updated Haskell package.

So is cabal-install useless?

Not at all: cabal-install still serves four useful purposes:

  1. Building and testing your own packages during development;
  2. On OSs without a package management system (e.g. Windows);
  3. You are unable to use your system’s package manager for some reason (e.g. need a custom build with different compile-time flags);
  4. You are unable to install system-wide packages on your work/university computer (which is the situation I face at uni, unless I take the “manage your own computer and if it breaks don’t come to us crying” approach).

However, I still strongly recommend you use your system package manager whenever possible.

Conclusion

I have tried to set out at least some of my reasonings about why I believe that cabal-install is not a package manager like so many people seem to believe, and that the Hackage ecosystem overall is not a valid package management system. Furthermore, I have covered why I believe Cabal should not switch to XML-based files for its metadata and why you should strongly consider using your OS’s package management system (if it has one) over installing packages by hand with cabal-install.

If nothing else, it is my sincere hope that this blog post will at least stop people talking about Cabal when they mean cabal-install.

15 November 2009

Past, Present and PEPM

Filed under: Gentoo,Haskell,Uni — Ivan Miljenovic @ 11:14 PM

I was planning on posting semi-regularly here, but I’ve been procrastinating way too much. It’s not that I don’t have anything to write about, it’s just that by the time I get on the computer I can usually find better things to do ;-)

Anyway, this is kind of a summary update about what I’ve been working on and what I’m going to be doing.

The Graph Trifecta

I’m responsible for three graph-related Haskell packages on HackageDB: graphviz, Graphalyze and SourceGraph. These three packages are interrelated: graphviz is used by Graphalyze which is in turn used by SourceGraph (which also uses graphviz).

graphviz

The graphviz provides bindings to the Graphviz (note the differences in capitalisation) suite of tools for visualising graphs. Well, OK, I’m not actually binding the actual C library, just generating the appropriate representation into Graphviz’s Dot code and calling the appropriate tool, but close enough. I’m hoping that it will eventually become the definitive way of using Graphviz in Haskell (as opposed to all the other packages that either provide “bindings” to Graphviz or do so internally). It does have some limitations, however:

  • Only supports conversion to/from FGL graphs for now, as there’s (as-yet) no standard way of considering an arbitrary graph data structure.
  • Does not support arbitrary placement of Dot statements but follows a specific order; I did ask on the Haskell mailing lists if anyone wanted this feature but was told resoundly “no”; I might add it as an option in a future release however.

It seems every release of graphviz introduces an API change. Well, OK, not all; there are bug-fix releases, etc., and I use the Haskell Package Versioning Policy (PVP) so releases that do break the API are obvious. Also, after the 2999.5.0.0 release, I’m trying to keep the API changes to a minimum whilst still improving the library. After all, it’s all part of the process of making A Better Library (TM) for everyone ;-)

What I’m working on at the moment for graphviz is improved support for interaction with the actual Graphviz tools in how you call them and how you choose which format they’re created in. Rather than just indicating if the operation was successful or not and printing any error messages to stderror, it’s now using the Either datatype to return the actual error if there is one. The code isn’t perfect (a full disk exception will currently freeze it, etc.) but it’s getting there. I’ve also improved the image output support by allowing the programmer to use any (non-deprecated) image type that Graphviz officially supports and not just those that I happened to have had available on my system when the first version was written (note that there might be other types as well that aren’t listed in the official documentation; my install has support for a gv output type and I have no idea what it is); there is also the ability for the library to automatically add the correct file extension to the output filename. Canvas (i.e. make a window with the graph visualisation rather than create a file) output types are now also treated separately.

I’m also working on adding QuickCheck support to ensure that parse . print == id for all supported types. Note that it isn’t valid to consider this the other way round: graphviz doesn’t support arbitrary placement of statements in the Dot code, and it parses more liberally than it prints. This should hopefully find any lingering bugs I’ve got lying around in the parsing and printing side of things. The next step would then be to ensure that any generated Dot code is then able to be processed by Graphviz using the Dot (why does Graphviz have to call so many things “Dot”?) output type to add position information, etc. and have that fully parseable.

Also, documentation is gradually being improved; I’m never intending that users will be able to completely ignore the official Graphviz documentation; however, I’m trying to make it as obvious as possible the differences of opinions/terminology/etc. between Graphviz and what graphviz supports, as well as providing broad overviews of the different options and giving links to the appropriate upstream documentation.

Graphalyze

Graphalyze was the main focus of my mathematics honours thesis last year. It provides a library to use graph-theory to analyse the relationships in discrete data sets. Nowadays, it mainly serves to do the heavy lifting for SourceGraph.

Work on Graphalyze is mainly to add extra graph algorithms that I want for SourceGraph. I’m also wanting to eventually replace the current reporting framework with a pretty-printing based one; possibly removing the pseudo-option of using something apart from Pandoc and just making it the default; this will definitely necessitate a licensing change for most of the library however (which I have no problems with, and it doesn’t appear to be used by anyone else…).

SourceGraph

First of all, I was urged (by people who shall remain nameless unless they say its OK) to submit a paper based upon SourceGraph to Partial Evaluation and Program Manipulation 2010 (PEPM) which is going to be held in Madrid next January. I didn’t expect to get it expected (to be honest, there’s nothing that I think is that interesting/new in SourceGraph for the moment except for possibly the overall concept), but it gave me a figurative kick up the proverbial to actually update SourceGraph (which was my first ever real released Haskell program… :o ) and I thought that the actual process of writing the paper would be good practice. It turns out however that I was chosen, which resulted in a mad rush to update the paper (many thanks to quicksilver on #haskell for fixing up my Haskell terminology). So I now find myself preparing for my first ever real academic conference, let alone the first one that I’ll be presenting at! What makes the situation even more interesting is that I’m currently not a student (more on that later) and so the financial aspect will be interesting… I’ll be putting a version of my paper up here as soon as I fully understand the hoops I have to jump through to do so.

Whilst SourceGraph was just a sample application of Graphalyze in my honours thesis, it was why I’d originally thought up the topic; my supervisor convinced me to make the application aspects more general by splitting out the graph-theory side of things into a library and having several sample usages, of which I only ever ended up doing the one (mainly due to time constraints). It is now the driving force behind graphviz and Graphalyze (well, OK, users sending me bug reports/feature requests and my own sense of pride do help with graphviz; the latter was what prompted the re-writing of the printing/parsing layout whilst I was meant to be on holidays overseas visiting family…) and the focus (transitively or otherwise) of most of my Haskell hacking at the moment.

Anyway, I’m currently trying to improve the quality of the output produced by SourceGraph in terms of usage (by adding command line parameters) and in what kind of analyses it produces. I have kind of shot myself in the foot, however, as I’ve already updated it to use the as-yet-unreleased 1.8 series of Cabal, so I can’t release it (not that it’s anywhere close to being in a releasable state) until Cabal-1.8 is available…

Haskell in Gentoo

I’m going to do a proper post over on the official Gentoo-Haskell blog later on about most of this, but here’s a brief rundown on what I’ve been doing (along with kolmodin, trofi, etc.):

  • I’m still not an official dev, as I don’t have time/can’t be bothered (pick whichever you think is more accurate :p ) to finish off the dev quiz; I still help out on the IRC channel, answer (but not close) bugs on bugzilla, etc.
  • I’ve been doing some “spring-cleaning” in the overlay to try and remove old, unused packages (and versions of packages). I’m defining “unused” as “the version in the overlay is out of date, no-one has complained about it being out of date and I can’t be bothered checking through all the dependencies, etc.” ;-)
  • Trying to get GHC 6.12 RC1 to work; whilst I’ve managed to build an x86 binary bootstrap package, I’m not able to actually use it to bootstrap, and if I try and install that with USE="binary", then it appears our hack for not installing haddock with GHC doesn’t seem to work anymore :s
  • haskell-updater has been updated to work with GHC 6.12/Cabal 1.8 \o/ . Of course, I can’t actually test this out fully as I’m not able to install 6.12 here… The underlying code has also been rewritten to make it easier to add extra options down the track.

University

After I finished my maths honours last year, I had been hoping to study at the University of Edinburgh this year (starting in September, which would have been convenient for going to ICFP), but whilst I was accepted I was unable to obtain any funding :( As such, I’ve been either working at the University of Queensland (doing some casual sysadmin-ing as well as developing material for and helping to run the new SCIE1000 subject) or working on a few odds and ends at home (that is, the projects mentioned above) all this year.

Because my dreams of studying overseas have been dashed, I’ve now applied to study at Australian National University in Canberra under the supervision of Professor Brendan McKay. Whilst I haven’t been officially accepted, I’ve been told that I’m guaranteed a spot (and at least some funding) due to my academic results for honours last year. I’ll be working on combinatorial algorithms, so nothing officially Haskelly in nature (but I’m intending on doing most – if not all – of my coding in my favourite programming language).

PEPM

I mentioned previously that I’d be going to PEPM next year… will anyone else from the Haskell community be there (or at any of the other conferences associated with POPL)? I’ve as yet only had the pleasure of meeting one person (dibblego), and it’d be great if I could put some faces to some of the other names I see in #haskell or on the mailing lists…

That’s all folks

In case anyone cares, this is what I’ve been up to and am planning on doing for the most part. This blog post on the main was to try and get me writing here more often, as well as an attempt to stifle Joe Fredette’s continual complaints of lack of Haskell Weekly News material because I haven’t released anything recently ;-) So Joe, go ahead and write about this! :p

29 October 2008

Honours + LXDE

Filed under: Gentoo,Haskell,LXDE,Uni,Xfce,XMonad — Ivan Miljenovic @ 11:36 PM

In this post, I’m going to discuss the status of my Haskell-oriented Mathematics Honours thesis, as well as the new Desktop Environment I’m using.

(more…)

24 February 2008

Uni and Computers

Filed under: Gentoo,Haskell,Uni — Ivan Miljenovic @ 9:51 PM

The first semester of 2008 starts tomorrow. This year, I’ll be doing Honours in Mathematics at the University of Queensland. I’ve also got a couple more computers to put Gentoo onto!

(more…)

28 January 2008

LCA 2008

Filed under: Gentoo,Uncategorized,XMonad — Ivan Miljenovic @ 8:21 PM
Tags: ,

So I’m here at LCA 2008. Tomorrow morning there’s the Gentoo Down Under Mini-Conf, at which I’ll be giving a short-ish tutorial/talk on what I’m calling “The Portage Ecosystem”. That link will give you the slides I’m going to use. As it stands, I don’t know if I’ll end up using all of them as I’m only meant to be talking for 10-15 minutes :s

I’ll also be helping to man the Gentoo stall at the LCA Open Day, where I’m going to demo my laptop running Xfce+XMonad, on which I’ve spent a couple of hours playing with different layouts, etc. since posting my previous post.

20 January 2008

X²Monad

Filed under: Gentoo,Haskell,XMonad — Ivan Miljenovic @ 12:59 AM

I’ve been following the XMonad project recently. For those of you who don’t know what XMonad is, I recommend you have a browse through the home page and it’s Wikipedia article. I’ve been planning on playing with XMonad for a while, though my initial plans were to wait till I bought myself a new über box to use in conjunction with my laptop. However, I decided to delay purchasing a new computer (because I couldn’t decide which CPU to get, not due to lack of cash) but couldn’t delay using XMonad any further.

Square that X!

One problem I’ve always had with all the XMonad screenshots I’d seen was that to my mind, dzen and xmobar look rather ugly to me, though this could be due to everyone loving dark colours for their setup. Personally, I much prefer a live graphical image of the various meters, etc. that I have residing on my panels.

Since I heard that KDE 4.0 was going to be released soon (this was around the 9th of January when I started this), I thought I’d do one better than Spencer Janssen’s usage of kicker with XMonad and actually use the full KDE environment with XMonad replacing kwin. After all, if it’s possible to integrate XMonad into Gnome, doing it with KDE should be possible. However, I got sick of waiting for the release date and decided to install version 3.5.8 instead.

Whilst I prefer KDE to Gnome, in the end I couldn’t stand using KDE either. So I decided to go back to using Xfce 4.4, which I’ve been using since one of its betas. I quite like Xfce: for a Desktop Environment, it’s rather responsive and fast. My only complaints is that sometimes it’s too simplistic: so to change an autostart application I have to manually edit the file rather than using the GUI; and some Gnome-oriented apps keep dumbing down. So I decided to combine Xfce with XMonad to form X2Monad (though it turns out gour on #gentoo-haskell has been doing this for a while already without telling anyone).

So why use a DE?

First of all, why not? :p Sure, most people use XMonad in a very minimalistic-environment. However, I’ve become used to having an overall set of applications that integrate nicely, with a consistent manner for configuration (including setting the theme, etc.). Furthermore, I quite like the Xfce panels and the plugins for them (more so than kicker & co.).

Secondly, I’ve found that DEs have better menus and have them by default. Whilst I don’t use the menu often, it is often useful to get an idea of what’s installed and what the name of a particular program you have is (I’ve done this before where I installed something, decided I didn’t like it but couldn’t remember what it was called to uninstall it).

Finally, I want to use a DE with XMonad because I can… that’s the beauty of FLOSS! Especially when you bring in Gentoo’s USE-flags, etc.

How to setup Xfce and XMonad to play nicely together.

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.