Monads (finally)
By Noam YadgarIf you have searched for monads and reached this article, you probably have some degree of knowledge about monads. You may have already put a fair amount of effort into learning monads. In any way, whenever you’ve tried to learn monads, you have likely encountered two main approaches:
- The: Monads are simple, here’s a simple example of a monad.
- Or the famous: A monad is a monoid in the category of endofunctors.
In both approaches, the layout is similar. We start with a strong statement like: A monad is a powerful concept when applied to programming, and later apologize for being one of the most difficult programming concepts to grasp. The reality is that all of these statements about monads are valid, and the truth lies somewhere in the middle.
Part I - What is a monad?
Monads are indeed simple and powerful, but understanding the effects and conpect in their most generic form, requires some patience. In this article, we will build the knowledge for monads, step by step. We will use Haskell to get a good intuition for monads and try to consume just the correct dose of information for a single article.
But first, let’s ask: Why do we even need monads?
Why monads?
While other programming styles combine composed functions and procedures, functional programming is based exclusively on composition. If we want a system that is based exclusively on compositions, we better use pure functions; otherwise, it would be problematic to rely on such a system.
But what are pure functions?
Pure functions
For a function to be pure, it needs to:
- Always return a value.
- Always map the same input to the same output.
- Produce no side effects (more on that later).
Here’s a simple function written in JavaScript. Is it a pure function?
function g(x) {
console.log(`adding ${x} to 5`)
return x + 5
}
It returns a value, and every input maps exactly to the same output, but the function produces a side effect.
Any IO, errors, and manipulations of outside states are considered side effects. In the case of g(x)
,
we’re calling for console.log
which writes to standard output.
What is a side effect?
Let \(S\) represent drawing over some piece of paper. Every time we apply \(S\), the same piece of paper has a new mark.
Can \(S\) satisfy the criteria of a pure function?
As an input, \(S\) can be given a list of coordinates to describe point-to-point motion; if we decide to show the paper after application (i.e., return the paper), \(S\) cannot consistently map inputs to outputs. If we have two different sets of coordinates \(u\) and \(v\) and we apply \(S\) in the order:
\[S(u) \rightarrow S(v) \rightarrow S(u)\]The second application of \(S(u)\) will yield different results than the first one. Because by the second application of \(S(u)\), the results on the paper have changed by \(S(v)\).
\(S\) cannot be a pure function. The drawing action creates marks on the paper, and those are side effects, which make \(S\) itself inconsistent. Even if \(S\) doesn’t return the paper as a result, other things might interact with the paper, and \(S\) would make them inconsistent as well. Similar to writing on a piece of paper, writing to standard output is also a side effect.
Bridging the gap
A programming language that cannot support side effects is pretty useless, and the people behind Haskell understood that, during research conducted in the late 1980s to early 1990s, they managed to keep Haskell pure while supporting side effects.
A pure functional programming language that supports side effects is not an easy task,
it requires a way of supporting the behavior like the above g(x)
while
maintaining the properties of a pure function. The people working on Haskell found that they
can borrow the concept of a monad from Category theory, and it will provide the missing link
between side effects and pure functional programming.
Although we borrow the monad from Category theory , we don’t have to study the theory.
We can rewrite g(x)
in Haskell:
g x = do
putStrLn
("adding " ++ show x ++ " to 5")
return (x + 5)
What makes the Haskell version pure? At first glance, both versions look similar,
but they’re very different. If we examine the function signature of g
we will see:
g :: Integer -> IO Integer
In the JavaScript version, the input and output of g(x)
are the same type (the Number
type).
In Haskell, g
takes an Integer
and returns a wrapped Integer
called IO Integer
.
The g
function in Haskell is written using a syntactic sugar called do
notation,
but if we clear out this syntax, our function will look like this:
g x = putStrLn s >> return (x + 5)
where s = "adding " ++ show x ++ " to 5"
It appears as if the functions putStrLn
and return
are connected with some operator (>>)
(I pronounce it ta-ta for easier read).
Let’s examine the function signatures of putStrLn
, return
, and (>>)
.
putStrLn :: String -> IO ()
Interestingly, putStrLn
also returns an IO
type; this time, it wraps the empty tuple type.
The signature of putStrLn
gives us a clue - maybe we’re passing the IO
type from one function to another.
Let’s continue with checking the return
function:
class Applicative m => Monad m where
...
return :: a -> m a
The return
function is specifically defined for Monad
, it simply takes any value
and returns it in a monadic context. In our example, return (x + 5)
takes the
Integer
value of (x + 5)
and lifts it to be an IO Integer
. Even if we don’t
yet understand what is a monad, our first clue is that the IO
type is probably one.
We’ve never specified the type signature of g
, yet somehow, the compiler knew that
our call for return
should invoke the IO
monad. It’s not surprising
because the return
function is connected to the putStrLn
function by the mysterious (>>)
operator:
class Applicative m => Monad m where
...
(>>) :: m a -> m b -> m b
The (>>)
operator is also a function definition for a Monad
,
it’s a binary operator that takes two monads of type \(m\): One with a value of some type \(a\) and the other
with a value of some type \(b\). Then, it returns a Monad
\(m\) with a value of type \(b\). It generally implies that when
defining some type as a monad, we need to define how we reconcile the natural transition from
a monad with some value \(a\) to the same monad with some value \(b\).
IO
handles the (>>)
operator by simply ignoring the first argument’s
value and returning the second argument.
This clever trick allows us to perform IO actions without leaving the monadic context.
Since we’re not interested in the value of the first monad
(the empty tuple - ()
, which is roughly equivalent to the void
type in C),
we use the (>>)
operator to dispose this value and pick the second IO
monad (IO Integer
).
flowchart LR A("IO ()") -->| \>\> | B("IO (x + 5)") B -->|"Ignoring ()"| C("IO (x + 5)")
How is g
pure?
For every x
we pass to g
,
we will always get the same IO Integer
in return;
however, we call for putStrLn
to write
to standard output. How can we say that g
is pure?
Imagine a magical paper, one that can replicate itself out of thin air and communicate with some real paper at some mysterious dimension we cannot reach.
Let \(F\) represent a function that takes a magical paper with a set of coordinates and returns a copy of the magical paper, with marks drawn from one position to another.
Does \(F\) meet the criteria of a pure function?
- \(F\) returns a magical paper.
- We can always map the same set of inputs to the same output. For example, if we have two different sets of coordinates \(u\) and \(v\) and a magical paper \(p\), we can apply \(F\) as following: \[F(F(F(p, u), v), u)\] Every application of \(F\) will be consistent.
- In the safe dimension of which \(F\) exists, \(F\) does not create any side effects. It doesn’t live in a context where a real paper even exist.
If we accept the above, we understand that side effects are relative, and we can write a completely
pure functional program that performs side effects at the real-world dimension.
With this approach, relative to our perspective, g
is a pure function that takes
an Integer
and returns an IO Integer
. From a system-level standpoint, it invokes what Haskellers call
an IO action that interacts with the real world, in our case, writing to standard output.
This is monad
This is it. The whole point of monads is that they provide a generic framework for wrapping and passing values with contexts. These contexts can have many things in the background, such as IO actions (writing to files, reading from files, logs, networking, etc.), error management, and more. All the side effects we need to make useful software are encapsulated within monads.
If you look at the definition of the IO
type, you’ll see that it’s being constructed with
a function that literally takes the State# RealWorld
type and returns a pair of State# RealWorld
and some type a
. The State# RealWorld
type is the magical paper of the IO
monad.
newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))
Part II - Bind
Let’s make a more interesting program and see another type of monad in action.
We start by using a function to convert a String
to an Integer
.
We can use the built-in function read
:
read :: Read a => String -> a
The read
function takes a String
and
converts it to a type that satisfies the Read
class.
Since Integer
has an implementation for Read
,
we can use read
to convert a String
to an Integer
.
The program can crash
This function has a problem; what happens if the String
argument we pass to read
cannot be converted to an Integer
? The answer is that the program will crash.
To fix this, Let’s use a modified version of read
, called readMaybe
.
readMaybe :: Read a => String -> Maybe a
The Maybe monad
The Maybe
type is one of Haskell’s most famous built-in types.
It’s an interesting monad that can capture the behavior of operations that might
fail or are undefined in certain areas of their domain. Two ways can construct the Maybe
type:
Nothing
- Usually indicates that the function has failed and the returned value is nothing.Just a
- A constructor that holds a generic valuea
, usually the returned value from a successful evaluation.
If we call readMaybe
with a string that cannot be parsed as an Integer
, we get Nothing
,
and if the provided string can be parsed as an Integer
, we get Just x
, where x
is the parsed value.
>> readMaybe "NaN" :: Maybe Integer
Nothing
>> readMaybe "123" :: Maybe Integer
Just 123
The Bind (>>=)
operator
I wrote about the notion of progressing a value from one monad
to another, but I’ve yet to reveal one of the most important operators of monads, the Bind
operator (or (>>=)
).
class Applicative m => Monad m where
(>>=) :: m a -> (a -> m b) -> m b
This binary operator takes a monad and a function. The idea of this operator is that the provided function will be applied to the inner value of the monad and return a new monad with the results.
flowchart LR A(m a) -->|taking a from m a| B(f :: a -> m b) B -->|applying f to a to get m b| C(m b)
To show binding in action, let’s first define a function that takes an Integer
, checks if it’s an even number,
and returns a Maybe Integer
. If the number is odd - we return Nothing
; otherwise,
we will return Just x
where x
is the original argument.
justEvens :: Integer -> Maybe Integer
justEvens x =
if even x then return x else Nothing
Now, let’s try to bind readMaybe x
with justEvens
:
>> readMaybe "100" >>= justEvens
Just 100
>> readMaybe "123" >>= justEvens
Nothing
>> readMaybe "NaN" >>= justEvens
Nothing
Maybe has a zero
We see that no matter what, if Nothing
pops up, it will be returned.
It is a perfect behavior for a monad that aims to capture failures,
since it doesn’t matter how many functions we chain, if something fails in the process,
we skip the evaluation and let the Nothing
constructor fall through.
If we compare the Maybe
type to numbers and (>>= | >>)
functions
to multiplication, the Nothing
constructor acts as a \(0\).
Much like:
Identities
To ensure monads behave correctly, we can check if they follow the monadic Identities:
- Left identity:
(return a >>= k) == k a
- Right identity:
(m >>= return) == m
- Associativity:
(m >>= (\x -> k x >>= h)) == (m >>= k) >>= h
Let’s test the Maybe
monad:
The left identity states that any value passed to return
and bound
to some function k
is the same as applying k
to this value. This identity ensures that the
bind operator applies the provided function to the inner value of the monad.
>> (return 10 >>= justEvens) == justEvens 10
True
The right identity states that any monad bound to the return
function,
should return the same monad with the same inner value. This identity ensures that the return
function is just
lifting a value to a monadic context.
>> (justEvens 10 >>= return) == justEvens 10
True
Associativity means that the order of binding is not essential. In our example:
>> a = return "10" >>= (\s -> readMaybe s >>= justEvens)
>> b = (return "10" >>= readMaybe) >>= justEvens
>> c = return "10" >>= readMaybe >>= justEvens
>> a == b && b == c
True
Final program
Let’s write a program that:
- Reads a string from the user.
- Tries to parse it as an
Integer
- If the string can be an
Integer
it checks if it’s an even number - If the result is not an even number, it prints “failed”
- Otherwise, it applies
g
to the number and prints the results
We’ve already encountered all the functions in this program, except
getLine
, which takes an input string from the user.
getLine :: IO String
Here’s the full program:
run :: IO ()
run = getLine >>=
(\s -> case readMaybe s >>= justEvens of
Nothing -> putStrLn "failed"
Just x -> g x >>= (\n ->
putStrLn ("= " ++ show n)))
Let’s run it three times and provide different inputs:
>> run
10
adding 10 to 5
= 15
>> run
hello
failed
>> run
5
failed
The do notation
Although being very exact and explicit, our run
function is a bit
hard to read. It involves one bind between getLine
and some big
lambda function, and the lambda itself has a nested lambda inside a case
clause. This is a known issue in Haskell; for a language that often
receives the informal title of the most beautiful programming language,
the overuse of arrows, parenthesis, and lambdas makes for a pretty unpleasant syntax.
To fix this, Haskell has an alternative syntax known as do notation. The do notation can be used to replace a pattern like:
f = m >>=
(\x -> f x >>=
(\y -> g y >>= return))
With:
f = do
x <- m
y <- f x
g y
That is undoubtedly a much nicer syntax to read. We can rewrite the run
function with do notation, and you will notice that the code changes to
be almost one-to-one with the requirements that we wrote in plain English.
run :: IO ()
run = do
s <- getLine
case readMaybe s >>= justEvens of
Nothing -> putStrLn "failed"
Just x -> do
y <- g x
putStrLn ("= " ++ show y)
Summary
Besides letting their inner values naturally pass through from function to function, monads are values by themselves. They represent actions, states, and statuses that are carried over from one function to another, They are context carriers that provide a background layer to perform all sorts of necessary effects. Therefore, they are the ultimate link between pure functional programming and the real world.