This is another short tale about lazy evaluation but this time it is about a lack thereof, a moment of brief despair and confusion and how the underlying simplicity of a purely functional language like Haskell comes to the rescue.
Graphs are a fundamental data structure in computer science because a lot of problems can be modelled with them. Graph traversal, shortest path between two vertices, minimum spanning trees are all well-known algorithms and there is plenty of literature available. This applies to imperative languages but is it the same for functional languages? My first-hand experience is that this is not quite the case and answering a seemingly simple question like “how should I implement a graph algorithm in a functional programming language?” ends up being unexpectedly challenging.
Providing an application as a Docker executable image is a handy way to distribute an application: no need to install toolchains, frameworks and dependencies. One can just pull a Docker image and run it. It’s really that simple. Docker images can grow wildly in size because they need to install all the dependecies needed to run the application: this as a user can be quite annoying. Imagine you want to use a tiny application that solves a very specific problem and you have to download a 2GB Docker image! It’s undesirable. And it’s actually not needed: why not shipping only the executable in a very compact Docker image? How can this be achieved if the application is built in Haskell?
Lazy IO is so tricky to get right and has some intrinsic limitations that the usual recommendation is to simply avoid it. On the other hand sometimes it’s not desirable (or even possible) to use strict IO, mostly for memory efficiency reasons. This is the kind of problems that streaming libraries like conduit or pipes are designed to solve. In this post I want to show how I refactored a piece of code that uses lazy IO to use the conduit library (for those not familiar with it, please read this conduit tutorial first).
Lazy evaluation sometimes makes it trickier to really understand how a piece of code for folks used to languages with strict semantics (as I am). Sometimes introducing strictness is necessary to avoid space leaks and to make memory allocations more predictable in certain parts of our code. The usual suggestion is to “carefully sprinkle strict evaluation” in our code; one of the classic examples of memory leak is using foldl
to sum a list of ints, with the result that instead of returning a result using constant space, it ends up taking an outrageous amount of memory before returning a result because thunks pile up (this behaviour is known as space leak). Most of the times I personally find it tricky to add strictness to a piece of Haskell code, so I’d like to share my latest experience doing that.
We’ll be using the Bloom filter implemented in chapter 26 of Real World Haskell as an example, the version contained in the book creates the filter lazily: our goal will be to create a strict version of that particular piece of code.