
Monads (finally)

By Noam Yadgar
#haskell #functional-programming #software-engineering

If you have searched for monads and reached this article, you probably have some degree of knowledge about them already. You may have put a fair amount of effort into learning monads. Either way, whenever you’ve tried to learn monads, you have likely encountered two main approaches:

  • The: Monads are simple, here’s a simple example of a monad.
  • Or the famous: A monad is a monoid in the category of endofunctors.

In both approaches, the layout is similar. We start with a strong statement like: A monad is a powerful concept when applied to programming, and later apologize for it being one of the most difficult programming concepts to grasp. The reality is that all of these statements about monads have some validity, and the truth lies somewhere in the middle.

Part I - What is a monad?

Monads are indeed simple and powerful, but understanding the concept and its effects in their most generic form requires some patience. In this article, we will build up the knowledge for monads step by step. We will use Haskell to get a good intuition for monads and try to consume just the right dose of information for a single article.

But first, let’s ask: Why do we even need monads?

Why monads?

While other programming styles mix composed functions with procedures, functional programming is based exclusively on composition. If we want a system built entirely out of composed functions, we had better use pure functions; otherwise, it would be hard to rely on such a system.

But what are pure functions?

Pure functions

For a function to be pure, it needs to:

  • Always return a value.
  • Always map the same input to the same output.
  • Produce no side effects (more on that later).

Here’s a simple function written in JavaScript. Is it a pure function?

function g(x) {
  console.log(`adding ${x} to 5`)
  return x + 5
}

It returns a value, and every input maps to exactly one output, but the function produces a side effect. Any IO, errors, and manipulation of outside state are considered side effects. In the case of g(x), we’re calling console.log, which writes to standard output.

What is a side effect?

Let \(S\) represent drawing over some piece of paper. Every time we apply \(S\), the same piece of paper has a new mark.

Can \(S\) satisfy the criteria of a pure function?

As an input, \(S\) can be given a list of coordinates to describe point-to-point motion; if we decide to show the paper after application (i.e., return the paper), \(S\) cannot consistently map inputs to outputs. If we have two different sets of coordinates \(u\) and \(v\) and we apply \(S\) in the order:

\[S(u) \rightarrow S(v) \rightarrow S(u)\]

The second application of \(S(u)\) will yield a different result than the first one, because by the second application of \(S(u)\), the state of the paper has already been changed by \(S(v)\).

\(S\) cannot be a pure function. The drawing action creates marks on the paper, and those are side effects, which make \(S\) itself inconsistent. Even if \(S\) doesn’t return the paper as a result, other things might interact with the paper, and \(S\) would make them inconsistent as well. Similar to writing on a piece of paper, writing to standard output is also a side effect.

Bridging the gap

A programming language that cannot support side effects is pretty useless, and the people behind Haskell understood that. During research conducted in the late 1980s and early 1990s, they managed to keep Haskell pure while supporting side effects.

Building a pure functional programming language that supports side effects is not an easy task; it requires a way of supporting behavior like the above g(x) while maintaining the properties of a pure function. The people working on Haskell found that they could borrow the concept of a monad from Category theory, and that it would provide the missing link between side effects and pure functional programming.

Although we borrow the monad from Category theory, we don’t have to study the theory.

We can rewrite g(x) in Haskell:

g x = do
  putStrLn 
    ("adding " ++ show x ++ " to 5")
  return (x + 5)

What makes the Haskell version pure? At first glance, both versions look similar, but they’re very different. If we examine the function signature of g we will see:

g :: Integer -> IO Integer

In the JavaScript version, the input and output of g(x) are the same type (the Number type). In Haskell, g takes an Integer and returns a wrapped Integer called IO Integer. The g function in Haskell is written using a syntactic sugar called do notation, but if we clear out this syntax, our function will look like this:

g x = putStrLn s >> return (x + 5)
  where s = "adding " ++ show x ++ " to 5"

It appears as if the functions putStrLn and return are connected with some operator (>>) (I pronounce it ta-ta for easier reading). Let’s examine the function signatures of putStrLn, return, and (>>).

putStrLn :: String -> IO ()

Interestingly, putStrLn also returns an IO type; this time, it wraps the empty tuple type. The signature of putStrLn gives us a clue - maybe we’re passing the IO type from one function to another. Let’s continue with checking the return function:

class Applicative m => Monad m where
  ...
  return :: a -> m a

The return function is defined as part of the Monad class; it simply takes any value and returns it in a monadic context. In our example, return (x + 5) takes the Integer value of (x + 5) and lifts it to be an IO Integer. Even if we don’t yet understand what a monad is, our first clue is that the IO type is probably one.
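To see this lifting in isolation, we can ask GHCi to target different monads with an explicit type annotation (the Maybe monad shown here is covered in Part II, and lists are another built-in monad; this is just an illustration):

>> return 5 :: Maybe Integer
Just 5
>> return 5 :: [Integer]
[5]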

We never specified the type signature of g, yet somehow the compiler knew that our call to return should target the IO monad. That’s not surprising, because the return function is connected to the putStrLn function by the mysterious (>>) operator:

class Applicative m => Monad m where
  ...
  (>>) :: m a -> m b -> m b

The (>>) operator is also part of the Monad class definition. It’s a binary operator that takes two monadic values of type \(m\): one wrapping a value of some type \(a\) and the other wrapping a value of some type \(b\). It then returns a monad \(m\) wrapping a value of type \(b\). This implies that when defining some type as a monad, we need to define how to make the natural transition from a monad with some value \(a\) to the same monad with some value \(b\).

IO handles the (>>) operator by simply ignoring the first argument’s value and returning the second argument. This clever trick allows us to perform IO actions without leaving the monadic context. Since we’re not interested in the value of the first monad (the empty tuple (), which is roughly equivalent to the void type in C), we use the (>>) operator to discard this value and pick the second IO monad (IO Integer).

flowchart LR
    A("IO ()") -->| \>\> | B("IO (x + 5)")
    B -->|"Ignoring ()"| C("IO (x + 5)")
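Here is a minimal standalone sketch of the same chaining (the name greet and the values are just for illustration):

greet :: IO Integer
greet = putStrLn "first" >> putStrLn "second" >> return 42

Running greet prints both lines in order, and the result of the whole chain is the IO Integer produced by the last action.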

How is g pure?

For every x we pass to g, we will always get the same IO Integer in return; however, we do call putStrLn to write to standard output. How can we say that g is pure?

Imagine a magical paper, one that can replicate itself out of thin air and communicate with some real paper at some mysterious dimension we cannot reach.

Let \(F\) represent a function that takes a magical paper with a set of coordinates and returns a copy of the magical paper, with marks drawn from one position to another.

Does \(F\) meet the criteria of a pure function?

  • \(F\) returns a magical paper.
  • We can always map the same set of inputs to the same output. For example, if we have two different sets of coordinates \(u\) and \(v\) and a magical paper \(p\), we can apply \(F\) as follows: \[F(F(F(p, u), v), u)\] Every application of \(F\) will be consistent.
  • In the safe dimension in which \(F\) exists, \(F\) does not create any side effects. It doesn’t live in a context where a real paper even exists.

If we accept the above, we understand that side effects are relative, and we can write a completely pure functional program that performs side effects in the real-world dimension. With this approach, relative to our perspective, g is a pure function that takes an Integer and returns an IO Integer. From a system-level standpoint, it invokes what Haskellers call an IO action that interacts with the real world, in our case, writing to standard output.
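One consequence worth seeing in code: a value like g 10 is just a description of an action, and nothing is printed until that action is actually executed. A minimal sketch, assuming the g defined above is in scope:

main :: IO ()
main = do
  let action = g 10  -- nothing is printed here; action is just a value
  n <- action        -- the IO action runs only now, printing the message
  print n            -- prints 15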

This is a monad

This is it. The whole point of monads is that they provide a generic framework for wrapping and passing values with contexts. These contexts can have many things in the background, such as IO actions (writing to files, reading from files, logs, networking, etc.), error management, and more. All the side effects we need to make useful software are encapsulated within monads.

If you look at the definition of the IO type, you’ll see that it’s being constructed with a function that literally takes the State# RealWorld type and returns a pair of State# RealWorld and some type a. The State# RealWorld type is the magical paper of the IO monad.

newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))

Part II - Bind

Let’s make a more interesting program and see another type of monad in action. We start by using a function to convert a String to an Integer. We can use the built-in function read:

read :: Read a => String -> a

The read function takes a String and converts it to a type that satisfies the Read class. Since Integer has an implementation for Read, we can use read to convert a String to an Integer.

The program can crash

This function has a problem: what happens if the String argument we pass to read cannot be converted to an Integer? The answer is that the program will crash. To fix this, let’s use a modified version of read, called readMaybe.
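For example, in GHCi (the exact error text may differ slightly between GHC versions):

>> read "123" :: Integer
123
>> read "NaN" :: Integer
*** Exception: Prelude.read: no parse

The safer readMaybe lives in the Text.Read module, so a source file needs import Text.Read (readMaybe) to use it.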

readMaybe :: Read a => String -> Maybe a

The Maybe monad

The Maybe type is one of Haskell’s most famous built-in types. It’s an interesting monad that can capture the behavior of operations that might fail or are undefined in certain areas of their domain. The Maybe type can be constructed in two ways:

  • Nothing - Usually indicates that the function has failed and the returned value is nothing.
  • Just a - A constructor that holds a generic value a, usually the returned value from a successful evaluation.
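For reference, the core of the definition in the Prelude looks like this:

data Maybe a = Nothing | Just a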

If we call readMaybe with a string that cannot be parsed as an Integer, we get Nothing, and if the provided string can be parsed as an Integer, we get Just x, where x is the parsed value.

>> readMaybe "NaN" :: Maybe Integer
Nothing
>> readMaybe "123" :: Maybe Integer
Just 123

The Bind (>>=) operator

I wrote about the notion of progressing a value from one monad to another, but I’ve yet to reveal one of the most important operators of monads, the Bind operator (or (>>=)).

class Applicative m => Monad m where
  (>>=) :: m a -> (a -> m b) -> m b

This binary operator takes a monad and a function. The idea is that the provided function is applied to the inner value of the monad, returning a new monad with the result.

flowchart LR
    A(m a) -->|taking a from m a| B(f :: a -> m b)
    B -->|applying f to a to get m b| C(m b)

To show binding in action, let’s first define a function that takes an Integer, checks if it’s an even number, and returns a Maybe Integer. If the number is odd, we return Nothing; otherwise, we return Just x, where x is the original argument.

justEvens :: Integer -> Maybe Integer
justEvens x = 
  if even x then return x else Nothing
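Calling it directly in GHCi behaves as expected:

>> justEvens 4
Just 4
>> justEvens 3
Nothing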

Now, let’s try to bind readMaybe x with justEvens:

>> readMaybe "100" >>= justEvens
Just 100
>> readMaybe "123" >>= justEvens
Nothing
>> readMaybe "NaN" >>= justEvens
Nothing

Maybe has a zero

We see that no matter what, once Nothing pops up, it gets returned. This is the perfect behavior for a monad that aims to capture failures: no matter how many functions we chain, if something fails along the way, the rest of the evaluation is skipped and the Nothing constructor falls through.
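This fall-through comes directly from how Maybe implements (>>=); here is a sketch of the instance, essentially as it appears in base:

instance Monad Maybe where
  Nothing >>= _ = Nothing  -- a failure short-circuits the rest of the chain
  Just x  >>= f = f x      -- a success feeds its value to the next function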

If we compare the Maybe type to numbers and the (>>=) and (>>) operators to multiplication, the Nothing constructor acts as a \(0\). Much like:

\[(a \cdot b) \cdot 0 = a \cdot (b \cdot 0) = 0\]

Identities

To ensure monads behave correctly, we can check if they follow the monadic identities (also known as the monad laws):

  • Left identity:
(return a >>= k) == k a
  • Right identity:
(m >>= return) == m
  • Associativity:
(m >>= (\x -> k x >>= h)) == (m >>= k) >>= h

Let’s test the Maybe monad:

The left identity states that any value passed to return and bound to some function k is the same as applying k to this value. This identity ensures that the bind operator applies the provided function to the inner value of the monad.

>> (return 10 >>= justEvens) == justEvens 10
True

The right identity states that any monad bound to the return function should be the same monad with the same inner value. This identity ensures that the return function is just lifting a value to a monadic context.

>> (justEvens 10 >>= return) == justEvens 10
True

Associativity means that the grouping of binds is not essential; nesting them one way or another yields the same result. In our example:

>> a = return "10" >>= (\s -> readMaybe s >>= justEvens)
>> b = (return "10" >>= readMaybe) >>= justEvens
>> c = return "10" >>= readMaybe >>= justEvens
>> a == b && b == c 
True

Final program

Let’s write a program that:

  • Reads a string from the user.
  • Tries to parse it as an Integer.
  • If the string can be parsed as an Integer, it checks whether it’s an even number.
  • If parsing fails or the number is odd, it prints “failed”.
  • Otherwise, it applies g to the number and prints the result.

We’ve already encountered all the functions in this program, except getLine, which reads a line of input from the user.

getLine :: IO String

Here’s the full program:

run :: IO ()
run = getLine >>= 
      (\s -> case readMaybe s >>= justEvens of
              Nothing -> putStrLn "failed"
              Just x  -> g x >>= (\n -> 
                putStrLn ("= " ++ show n)))

Let’s run it three times and provide different inputs:

>> run
10
adding 10 to 5
= 15

>> run
hello
failed

>> run
5
failed

The do notation

Although it is very exact and explicit, our run function is a bit hard to read. It involves one bind between getLine and a big lambda function, and the lambda itself has a nested lambda inside a case clause. This is a known issue in Haskell; for a language that often receives the informal title of the most beautiful programming language, the overuse of arrows, parentheses, and lambdas makes for a pretty unpleasant syntax.

To fix this, Haskell has an alternative syntax known as do notation. The do notation can be used to replace a pattern like:

f = m >>= 
  (\x -> k x >>= 
    (\y -> g y >>= return)) 

With:

f = do 
  x <- m
  y <- k x 
  g y

That is undoubtedly a much nicer syntax to read. We can rewrite the run function with do notation, and you will notice that the code changes to be almost one-to-one with the requirements that we wrote in plain English.

run :: IO ()
run = do
  s <- getLine
  case readMaybe s >>= justEvens of
    Nothing -> putStrLn "failed"
    Just x  -> do
      y <- g x
      putStrLn ("= " ++ show y)

Summary

Besides letting their inner values naturally pass through from function to function, monads are values by themselves. They represent actions, states, and statuses that are carried over from one function to another. They are context carriers that provide a background layer for performing all sorts of necessary effects. That makes them the ultimate link between pure functional programming and the real world.