A Small Adventure In Functors

The function composition operator in Haskell, i.e. (.), is much more complicated than at first sight, or rather, the function composition operator plus the type system and default currying of functions. The simple case is, well, simple, as this function that joins integers into a string demonstrates:

--concatenate a list of integers
concatIntegers :: [Integer] -> String  
concatIntegers = concat . map show  

Once Haskell type unification does its magic, the types of the functions and the composition operator fit like hand and glove:

map show :: Show a => [a] -> [String]  
concat :: [[a]] -> [a]  
(.) :: (b -> c) -> (a -> b) -> (b -> c)

a in the type of map show is matched to Integer, and [String], which is in fact [[Char]] is mapped to the first argument of concat, leading to the final output of [Char]. So far so simple.

Things get rather complicated when currying and more complicated types come into the picture, though. I ran into one such case recently while trying to complete the exercises of the CIS lectures. The homework for week 10 requires you to write a functor instance for the following Parser type:

newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }  

This is a relatively straightforward task for which my solution was also straightforward, from a Python dev's perspective:

first :: (a -> b) -> (a,c) -> (b,c)  
first f (a,c) = (f a, c)

instance Functor Parser where  
    fmap f parser = Parser (\string -> (first f) <$> (runParser parser string))

The first function was given as a tip in the task. The solution was included as a part of the homework for week 11, and it looked radically different from mine:

first :: (a -> b) -> (a,c) -> (b,c)  
first f (x,y) = (f x, y)

inParser f = Parser . f . runParser

instance Functor Parser where  
  fmap = inParser . fmap . fmap . first

Well, at least the implementation of first is the same. The implementation of fmap for Parser here is in the celebrated point-free style which takes a while to get used to, but once you understand it, makes a whole lot of sense. I had to go through a pretty detailed step-by-step analysis of fmap for Parser to get how it's constructed, and the kind of thinking required to write such code; read on if you're interested.

inParser is superficially similar to the simple composition example above, but fmap is a totally different game. To understand it, let's look at each composition separately. The first one is fmap . first; here are the types of the individual functions:

first :: (a -> b) -> (a, c) -> (b, c)  
fmap :: Functor f => (a -> b) -> f a -> f b  

How is fmap mapped on to first? They both receive three arguments, but (.), whose type is (b -> c) -> (a -> b) -> (b -> c) (repeating because it's worth memorizing), combines functions with two arguments. The secret here is in that Haskell functions are curried by default. When they are passed two or more arguments, it is allright to think of functions like first and fmap as functions with two arguments, but in higher-order contexts like this one, you have to keep in mind that you can think of first as a function that receives one argument of type (a -> b), returning a function of type (a, c) -> (b, c). The same thing is valid for fmap: The output of first with only one argument (b in the type of (.)) is the first argument of fmap. This is where the type system of Haskell comes into play, unifying the types (a,c) -> (b,c) and (a -> b), so that the first argument to fmap becomes (a,c) -> (b,c). The type of the first composition is thus the following:

ghci> :t  fmap . first  
fmap . first :: Functor f => (a -> b) -> f (a, c) -> f (b, c)  

There are two more compositions to go. What does the second fmap composition do? There are two kinds of answers to this question: What does it programmatically do (at the level of types and transformations), and how should we think about it to write and understand similar code (aka functional thinking)? We can figure out the programmatic part again by treating the output of the second component of the composition (now fmap . first) as the first argument to fmap, and type-matching. So now f (a, c) -> f (b, c) has to be mapped to (a -> b), where we end up with a matched to f (a, c) and b matched to f (b, c). The result of this second fmap is the following (to be referred to as S1 later):

ghci> :t fmap . fmap . first  
fmap . fmap . first  
  :: (Functor f1, Functor f) =>
     (a -> b) -> f (f1 (a, c)) -> f (f1 (b, c))     -- [S1]

The functional thinking part has to wait for the last composition, because the types are determined right to left ((.) is right-associative), but functionality is built left to right, since the end result is determined by the last application. If you look at the type of inParser in ghci, you might get a bit confused:

  :: ((String -> Maybe (a1, String)) -> String -> Maybe (a, String))
     -> Parser a1 -> Parser a

What the hell is this supposed to be? And how is it going to match the output type of fmap . fmap . first? Let's start by understanding what inParser does. It takes a pretty complicated function (the argument f), and a parser, and returns another parser. The complicated function f has the following signature:

((String -> Maybe (a1, String)) -> String -> Maybe (a, String))  -- S2

and is supposed to modify the function that the first Parser wraps. As a reminder, this wrapped function has the following signature:

runParser :: String -> Maybe (a, String)  

This is also the first argument of the complicated function, which, only with this first argument, returns another function. So we're talking about a function that gets a function as an argument, and returns another function with the same signature. We can see this better if we regroup the arguments of f:

((String -> Maybe (a1, String)) ->
         (String -> Maybe (a, String)))

This signature has to match the return value of fmap . fmap . first. As you can remember, that was f (f1 (a, c)) -> f (f1 (b, c)). This is not possible without functions being functors themselves, which they in fact are:

instance Functor ((->) r) where  
    fmap = (.)

And fmap is nothing other than function composition itself! Nice plot twist there, no? We can make sure that the first fmap is getting applied to a function instance of functor by replacing it with (.), and checking the type of the resulting composition:

ghci> :t inParser . (.) . fmap . first  
inParser . (.) . fmap . first :: (a1 -> a) -> Parser a1 -> Parser a  

Good to know I'm not making stuff up. Keep in mind that the functor instance for function definition is made for ((->) r), fixing the first argument and polymorphic on the second (or rest, to be more precise). This means that f1 in S1 is matched to the second argument of the individual functions in S2, namely to Maybe (a, String). Maybe is of course also a functor, which takes care of the next functor instance in S1. We end up with the following signature instead of S1:

((String -> Maybe (a1, String)) -> String -> Maybe (a, String))

And the following actual implementation:

(Functor f1) => (.) (maybeFmap (a, c)) -> (.) (maybeFmap (b, c))

maybeFmap refers to the fmap of Functor instance for Maybe.

Now we can also grasp the kind of functional thinking that leads to the double fmaps. Texts on functors usually focus on how they are kinds of "containers" or "context providers" where operating on the value contained within requires a special access method that preserves the context, best examplified by Maybe a. If you think of the functor instance for function creation, though, it becomes obvious that functors go beyond containers. Functors are an abstract relationship between two types, where one is restricting how to compose on the second. When you are trying to reach into a parametric type and act on the inner type, and if one of the common abstract relationships such as functors is defined, it is possible to write code that is simpler in terms of new definitions. Instead of parsing a new function, you need only to be acquianted with standard ways of composing. In the case of that second fmap that turns out to be function composition, I wrote a function that executes the function wrapped by Parser and wraps its result. An experienced Haskell developer, on the other hand, sees the commonality between manipulating the result of a function, and what is inside a Maybe,and uses the same interface for both.

Functional programming is more than being able to pass functions as arguments; it's about treating behavior the same way as data structures, in fact turning functions into structures that we can reason about as we do with data.