## Monads, part three: monads made easy

In the previous three parts of this series (zero, one, two), I have tried to explain what makes something "a monad" and why and when you could want to use one. In this post, I will show you a simple process to create a brand new monad tailor-made for a particular use-case, and walk through a couple examples.

### Factoring monads

To reiterate, a monad is a way to define an abstract machine with different computational semantics. In a sense, the monadic abstraction is the reification of the Church-Turing equivalence: it's an easy, cheap way to simulate other models of computation (e.g. Turing machine/imperative programming) within lambda calculus (i.e. functional programming). This is made easy by decomposing the notion of an abstract machine into three essential components:

- Pure computation. All computing models still have some "leaf" computations that are just pure arithmetic, and if you start with a language that can handle those, there is no need to reinvent the wheel here. The cheapness of monads as a mini-language implementation mechanism mostly comes from this fact: that you can reuse your host language for the purely computational part of the new language.
- Sequencing pure computations and effects. Structurally,
`bind`

gives monads an opportunity to interleave simulated effects with computation. While we've not really done that in the previous posts, one could think of constructing a monadic value as purely describing the desired sequence of interleaved computations and effects. In other words, a monadic value is just a data structure. - Execution. Once we have a description of what we want to happen, we can make
it happen. This is the role of the
`run`

function, which from this perspective is an integral part of what makes a given monad useful, despite not being part of the`Monad`

type class in Haskell.

This decomposition suggests a starker demarcation between constructing a monadic value and executing it than we've done in the past few posts; let's first walk through the abstract process of creating a monad based on this decomposition, then we'll look at some examples.

### Monad factory

For the first part, we have literally nothing to do, and that's the magic of
it. We can rely entirely on the host language for all of the pure computations.
If we think of the second part as defining *structure*, it follows that it
should be possible to define it purely as a data type. This turns out to be the
case. Finally, the third part is pretty much just writing a function. At this
point, we essentially have a data structure representing a language to evaluate
(but where all computations are host-language functions), and we can just
implement a tiny interpreter that only needs to care about effects (as
computations are taken care of by the host language already).

Because `bind`

(`>>=`

) and `return`

always have the same *structure* and differ
only in terms of their assigned *meaning*, their structural definition is
always going to be the same. Here is a "naked", meaningless monadic structure
definition in Haskell:

```
{-# LANGUAGE GADTs #-}
import Control.Monad (ap,liftM)
instance Functor MyMonad where fmap = liftM
instance Applicative MyMonad where pure = return; (<*>) = ap
instance Monad MyMonad where return = Return; (>>=) = Bind
data MyMonad a where
Bind :: MyMonad a -> (a -> MyMonad b) -> MyMonad b
Return :: a -> MyMonad a
```

That's it. The actual definition is just the three lines starting with `data`

;
the three `instance`

lines are there to tell the Haskell compiler that we want
to be able to use `do`

syntax.

This really just says: a value of type `MyMonad a`

is either `Return a`

or
`Bind ma f`

.^{1} This part will always be the same for all monads (modulo the name
of the data type we're defining, which in most cases you'll want to be more
specific than just `MyMonad`

). With this definition, you can write:

```
example :: MyMonad Int
example = do
a <- pure 15
b <- pure 18
return $ a + b
```

and that will produce the same as if you had written:

```
example :: MyMonad Int
example = Bind (Return 15) (\a -> Bind (Return 18) (\b -> Return (a + b)))
```

In the general sense, the monad abstraction is purely structural and does not have much meaning; we can assign a meaning to this structure by writing an interpreter for it. For this first dry run, let us just assign the simplest possible meaning, which is that of executing each step in order. The skeleton of our interpreter becomes:

```
{-# LANGUAGE LambdaCase #-}
run :: MyMonad a -> a
run = \case
Return a -> a
Bind ma f -> run (f (run ma))
```

This is the simplest possible monad: it just sequences pure computations. I
claimed above that the essence of monads is to interleave pure computations and
effects; it would make sense that the simplest possible monad is the one with
no effect to interleave, and that makes it a good first example. However, in
most cases we will want to have *some* effects, so let's look at a slightly
less simple case next.

### Revisiting Maybe

`Maybe`

is a bit special^{2} in Haskell as it serves as both a "standard" data
type and a monadic one. This is compact and efficient, but it also muddies the
waters a bit, so let's revisit it through our new process here. First, we need
to think about the effects we want it to have. In the case of the `Maybe`

monad, the one effect we wanted to simulate was short-circuiting. We'll
represent this as a separate data constructor in our monad, and we'll therefore
call the monad `ShortCircuit`

.

```
instance Functor ShortCircuit where fmap = liftM
instance Applicative ShortCircuit where pure = return; (<*>) = ap
instance Monad ShortCircuit where return = SCReturn; (>>=) = SCBind
data ShortCircuit a where
SCBind :: ShortCircuit a -> (a -> ShortCircuit b) -> ShortCircuit b
SCReturn :: a -> ShortCircuit a
SCStop :: ShortCircuit a
```

Yes, it's that easy: just add a type constructor for each effect you want to
simulate. Note that we had to add an `SC`

prefix to all the names here because
I'm working through all these examples in the same file, and Haskell
wants all type names to be unique (hence `SCBind`

, `SCReturn`

, etc. instead of
`Bind`

, `Return`

, etc.).

We can rewrite the computation from part one as:

```
exampleSC :: ShortCircuit Double
exampleSC = do
r <- pure 5.0
r <- pure $ r + 7.0
r <- pure $ r - 2.0
r <- sqr_inv r
return $ r * 3.0
where
sqr_inv :: Double -> ShortCircuit Double
sqr_inv a = do
x <- pure a
x <- sqrt x
div 1.0 x
sqrt :: Double -> ShortCircuit Double
sqrt a = if a < 0.0 then SCStop else return $ a ** 0.5
div :: Double -> Double -> ShortCircuit Double
div a b = if b == 0 then SCStop else return $ a / b
```

This is pretty much the exact same code (translated to Haskell), except we're
now returning `SCStop`

instead of `Nothing`

to signal termination.

What should the meaning of that `SCStop`

case be? If we hit it, we need to
signal to our caller that the computation did not properly terminate. If the
computation failed, we need to express the absence of a value, which we can do
by using a `Maybe`

(using it strictly as a data structure, not as a monad, this
time):

```
runSC :: ShortCircuit a -> Maybe a
runSC = \case
SCReturn a -> Just a
SCBind ma f -> case runSC ma of
Nothing -> Nothing
Just a -> runSC (f a)
SCStop -> Nothing
```

Most of the power of the monad abstraction is already illustrated here, and
lies in the ability of the `run`

method to make decisions on the result of
`run ma`

.

### Revisiting Logger

The `Logger`

monad we defined in part one cannot be translated directly
to Haskell because, in Haskell, not everything can be turned into a string. We
also cannot collect all the values themselves because there is no "type
hierarchy" and therefore no root to it in Haskell.

So what can we do? We can define a monad with one extra effect, that of saving a value to some sort of log.

```
data Logger l a where
LBind :: Logger l a -> (a -> Logger l b) -> Logger l b
LReturn :: a -> Logger l a
LOutput :: l -> Logger l ()
```

We could have gone with `String`

for the log, but why needlessly restrict it?
The magic incantations to get `do`

syntax still work with this extra type
parameter:

```
instance Functor (Logger l) where fmap = liftM
instance Applicative (Logger l) where pure = return; (<*>) = ap
instance Monad (Logger l) where return = LReturn; (>>=) = LBind
```

The `run`

function needs to thread through, and return, a log. For simplicity
we'll keep it as a list of `l`

. When sequencing computations, we first need to
run the first computation, then save its log, then run the second computation,
and then combine both logs.

```
runL :: Logger l a -> ([l], a)
runL = \case
LReturn a -> ([], a)
LBind ma f ->
let (prev, a) = runL ma
(next, b) = runL (f a)
in (prev <> next, b)
LOutput l -> ([l], ())
```

The example from part one can be rewritten:

```
exampleL :: Logger Int Int
exampleL = do
let a = 1
LOutput a
let b = a + 3
LOutput b
let c = b + 4
LOutput c
let result = 3 * c
LOutput result
return result
```

where, as mentioned, we now need to explicitly decide when (and what) to log.

### Revisiting Fiber

The `Fiber`

monad we defined in part one actually was following this
recipe, only it was doing it in Java. This is why it defines two subclasses
called `Bind`

and `Return`

.

In part one, we were working in Java, which lets us print at any point,
so we could define the `debug`

function as a pseudo-monadic step. Haskell is
not so lenient, but we still want to see the order in which computations
happen. To get that, we'll reuse the `Logger`

structure with its `LOutput`

effect.

Apart from this logging behaviour we want to add, the `Fiber`

monad itself does
not have any special structure. Therefore, we can directly reuse the `Logger`

structure and layer on concurrent semantics just by writing a different
interpreter.

```
exampleF1 :: Logger String Double
exampleF1 = do
let a = 5
LOutput $ "[1]: a = " <> show a
let b = a + 3
LOutput $ "[1]: b = " <> show b
let c = b + 5
LOutput $ "[1]: c = " <> show c
let i = b * c
LOutput $ "[1]: i = " <> show i
return $ i
exampleF2 :: Logger String Double
exampleF2 = do
let a = 1
LOutput $ "[2]: a = " <> show a
let b = 2
LOutput $ "[2]: b = " <> show b
let c = 1
LOutput $ "[2]: c = " <> show c
let b2 = b * b
LOutput $ "[2]: b2 = " <> show b2
let ac = a * c
LOutput $ "[2]: ac = " <> show ac
let delta = b2 - 4 * ac
LOutput $ "[2]: delta = " <> show delta
let rac = delta ** 0.5
LOutput $ "[2]: rac = " <> show rac
return ((rac - b) / (2 * a))
runFiber :: [Logger l a] -> ([l], [a])
runFiber ms = loop ms [] []
where
step :: Logger l a -> ([l], Logger l a)
step = \case
LReturn a -> ([], LReturn a)
LOutput l -> ([l], LReturn ())
LBind ma f -> let (l, ma') = step ma
in (l, LBind ma' f)
loop :: [Logger l a] -> [l] -> [a] -> ([l], [a])
loop [] log rets = (reverse log, rets)
loop (ma:ms) log rets = case ma of
LReturn a -> loop ms log (a:rets)
LOutput l -> loop ms (l:log) rets
LBind (LReturn a) f -> loop (ms <> [f a]) log rets
LBind (LOutput l) f -> loop (ms <> [f ()]) (l:log) rets
LBind ma f -> let (l, ma') = step ma
in loop (ms <> [LBind ma' f]) (l <> log) rets
```

### Revisiting State

Finally, let's walk through the `State`

monad from part two using this
process. We want some sort of "mutable" state with the two operations `get`

and
`put`

, respectively getting the current state and setting up a new current
state. The structure for this just needs two new cases:

```
data State s a where
SBind :: State s a -> (a -> State s b) -> State s b
SReturn :: a -> State s a
SPut :: s -> State s ()
SGet :: State s s
```

and the inerpreter just threads the state through as before:

```
runS :: s -> State s a -> (s, a)
runS old_state = \case
SReturn a -> (old_state, a)
SPut new_state -> (new_state, ())
SGet -> (old_state, old_state)
SBind ma f -> let (new_state, a) = runS old_state ma
in runS new_state (f a)
```

While we *can* build up a stack machine on top of that, I find it generally
simpler to build up the monad I want directly than to try to build it on top of
another monad. So here is a stack machine monad:

```
instance Functor (Stack s) where fmap = liftM
instance Applicative (Stack s) where pure = return; (<*>) = ap
instance Monad (Stack s) where return = StReturn; (>>=) = StBind
data Stack s a where
StBind :: Stack s a -> (a -> Stack s b) -> Stack s b
StReturn :: a -> Stack s a
StPush :: s -> Stack s ()
StPop :: Stack s s
runSt :: [s] -> Stack s a -> ([s], a)
runSt stack = \case
StReturn a -> (stack, a)
StBind ma f -> let (new_stack, a) = runSt stack ma
in runSt new_stack (f a)
StPop -> (tail stack, head stack)
StPush s -> (s:stack, ())
```

### Conclusion

At this point I hope you're convinced that making a monad is pretty easy, and that, at least in Haskell, it can be pretty useful. In the next post, I'll walk through a couple real examples from my own code.