At Azavea, full-time software developers have the opportunity to put 10% of their time towards learning and research. I am using my 10% time to explore a pure, statically-typed functional programming language called Idris by working my way through the excellent Type-Driven Development with Idris written by Edwin Brady, the creator of the language.
The thesis of the book is that instead of trying to use tests to check for the presence of errors, we can use types to demonstrate the absence of errors in our code. Not only that, we can make types work for us by leveraging the properties of Idris and its powerful type system to help us write code. In type-driven development, we start with the types and the compiler guides us along as we implement our program.
Let’s take a look at how.
Functions
Here is a simple function signature in Idris:
add : Int -> Int -> Int
Function signatures describe a function’s inputs and outputs. In Idris, you can think of the function signature as a contract that describes exactly what the function must do. In the above example, we are defining a function named add
which takes two Int
s as input and returns an Int
as output (Int
is a built-in type representing integers):
add : Int -> Int -> Int ^^^ ^^^ ^^^ input input output
We’ll always start with a function signature and work towards the implementation from there. In this case, the implementation would look like this:
add x y = x + y
add
takes two arguments x
and y
and we add them together using the +
operator. The body of our function is an expression which evaluates to a value that is implicitly returned by the function. As such, no explicit returns are needed. We end up with this:
add : Int -> Int -> Int add x y = x + y
We can fire up an Idris REPL within the context of the Idris file we’re working in (Add.idr
) by running idris Add.idr
and then call our function like so (notice the lack of parentheses):
*Add> add 1 2 3 : Int
We can see from the output (3 : Int
) that we got back a 3
that is of type Int
.
This function, like all Idris functions, is pure. That means that the same input to the function must produce the same output and the function can have no side effects (nothing besides returning a value). In short, pure functions are dependable and predictable. They can always be relied on to do exactly what they say and nothing more; they won’t fetch data from a server, mutate some global state, launch a rocket into the sun, or any other shenanigans.
As we will see later, this property of purity will be leveraged extensively by the compiler to not only make sure our programs are correct, but also to help us write them!
Types
You can think of types as a description of how data should be used. When practicing type-driven development in Idris, our function definitions flow from the function signature and the types contained within them. The function signatures are more than just documentation enforced by the compiler — they’re a starting point for our implementation.
Let’s start with a very simple example:
invert : Bool -> Bool
We want to write a function that will take a Bool
and return a Bool
(note that types in Idris must start with a capital letter). The Bool
type is defined using the data
keyword like so:
data Bool = False | True
Bool
is a union type that can be either True
or False
. Thus, our invert
function should switch between these two possibilities (True
becomes False
and False
becomes True
).
Now, to help us code in Idris, there are plugins available for common text editors such as Vim, Emacs, and Atom which provide a set of standard commands to facilitate type-driven development. To start, we can use the definition command (<LocalLeader>d
in Vim) to build out an initial function definition:
invert : Bool -> Bool invert x = ?invert_rhs
We see that Idris has started implementing this function for us and already has added an x
argument for our function equation. Cool!
But what about the ?invert_rhs
? That is called a type hole. A type hole always starts with a question mark and represents a yet-to-be-implemented part of your program. If we check the type of our type hole (<LocalLeader>t
in Vim) using the plugin we see this output:
x : Bool -------------------------------------- invert_rhs : Bool
That’s telling us that x
is a Bool
and invert_rhs
(the expression our function body evaluates to) is too, as we would expect.
Next we can use the case-split command (<LocalLeader>c
in Vim) to continue implementing the function. By case-splitting on the argument, we get:
invert : Bool -> Bool invert False = ?invert_rhs_1 invert True = ?invert_rhs_2
Idris has helpfully leveraged what it knows about Bool
and filled in the possible values for the implementation of our function. It has created a new function equation for each case to pattern match on. Now it’s up to us to fill in those type holes to complete our function:
invert : Bool -> Bool invert False = True invert True = False
Here’s how the function can be implemented start-to-finish using interactive editing:
We can run the function like so:
*Invert> invert True False : Bool *Invert> invert False True : Bool
Runtime Errors
If you’re not used to static typing, you may at this point be wondering what might happen if you gave invert
the wrong type, something that’s not a Bool
. For example, let’s try to apply invert
to "banana"
(a String
instead of a Bool
):
invert : Bool -> Bool invert False = True invert True = False result : Bool result = invert "banana"
We get this compilation error:
| 6 | result = invert "banana" | ~~~~~~~~~~~~~ When checking right hand side of result with expected type Bool When checking an application of function Main.invert: Type mismatch between String (Type of "banana") and Bool (Expected type)
This is not valid Idris code; it does not compile and we can’t run it since it did not pass type checking. This is actually a good thing since the compiler won’t let us write incorrect code and thus catches many errors during development before they can become bugs.
One of the great things about static typing is that you’ll never have to write a test for what happens if your function gets or returns the wrong type because it simply can’t happen — it would be impossible to write a test case like that because it wouldn’t compile! The more powerful the type system, the more precisely we can define our types and the more errors we can prevent at runtime. And as we will continue to see, Idris’s type system is quite powerful!
Defining Custom Types
We can of course define our own types based on our needs. In this example we are going to encode a simple AST in Expr
for evaluating mathematical expressions which supports a few simple operations: addition, subtraction, and multiplication.
data Expr = Val Int | Multiply Expr Expr | Add Expr Expr | Subtract Expr Expr
This type is recursive, meaning it is defined in terms of itself (as such, expressions can be nested). We have types for our three binary operations we have represented (addition, subtraction, and multiplication) with the constructors Add
, Subtract
, and Multiply
, respectively, as well as Val
to hold our integer literals.
Let’s examine Expr
more closely in the REPL using :doc
which will show the documentation for any function or type:
*Expr> :doc Expr Data type Main.Expr : Type Constructors: Val : Int -> Expr Multiply : Expr -> Expr -> Expr Add : Expr -> Expr -> Expr Subtract : Expr -> Expr -> Expr
We can see that Expr
is in fact a Type
, whereas Val
, Multiply
, Add
, and Subtract
are constructors for that type, meaning that they are simply functions that return an Expr
. Notice that we didn’t have to actually write any documentation; Idris knows this about Expr
by virtue of the type definition.
Let’s look at some examples to see how we could use Expr
. We can represent the value 4
by itself like so:
*Expr> Val 4 Val 4 : Expr
And 4 * 2
like so:
*Expr> Multiply (Val 4) (Val 2) Multiply (Val 4) (Val 2) : Expr
And 4 * (2 - 1)
like so:
*Expr> Multiply (Val 4) (Subtract (Val 2) (Val 1)) Multiply (Val 4) (Subtract (Val 2) (Val 1)) : Expr
Notice the type in all of these examples is Expr
.
Now let’s write a function to evaluate our Expr
with the following signature:
evaluate : Expr -> Int
Using our handy plugin, let the compiler be our guide! Running the definition command gives us:
evaluate : Expr -> Int evaluate x = ?evaluate_rhs
And case-splitting on our x
argument gives us:
evaluate : Expr -> Int evaluate (Val x) = ?evaluate_rhs_1 evaluate (Multiply x y) = ?evaluate_rhs_2 evaluate (Add x y) = ?evaluate_rhs_3 evaluate (Subtract x y) = ?evaluate_rhs_4
Once we finish fleshing out our implementation by filling in the type holes, we get this:
evaluate : Expr -> Int evaluate (Val x) = x evaluate (Multiply x y) = evaluate x * evaluate y evaluate (Add x y) = evaluate x + evaluate y evaluate (Subtract x y) = evaluate x - evaluate y
We can now use our evaluate
function to evaluate the examples shown earlier:
*Expr> evaluate (Val 4) 4 : Int *Expr> evaluate (Multiply (Val 4) (Val 2)) 8 : Int *Expr> evaluate (Multiply (Val 4) (Subtract (Val 2) (Val 1))) 4 : Int
This shows the process of implementing the function using interactive editing:
Modifying Our Code
Let’s say we want to add a new operation to our program: division. This will require a change to our Expr
type:
data Expr = Val Int | Multiply Expr Expr | Add Expr Expr | Subtract Expr Expr | Divide Expr Expr
Immediately the compiler lets us know of a potential problem:
| 8 | evaluate (Val x) = x | ~~~~~~~~~~~~~~~~~~~~ Main.evaluate is not total as there are missing cases
The error message indicates that the compiler is looking for our evaluate
function to account for the possible input of Divide x y
. This is because Idris enforces totality, an even stronger property than purity. A total function is guaranteed to produce an output for any well-typed input in finite time, and a function is assumed to be total unless it is explicitly annotated as a partial function using partial
. This means that we are forced to account for all possible inputs, which we are not doing in this case.
To make our function total, we can modify evaluate
like so:
evaluate : Expr -> Int evaluate (Val x) = x evaluate (Multiply x y) = evaluate x * evaluate y evaluate (Add x y) = evaluate x + evaluate y evaluate (Subtract x y) = evaluate x - evaluate y evaluate (Divide x y) = evaluate x / evaluate y
However, once we do so we see another compiler error:
| 12 | evaluate (Divide x y) = evaluate x / evaluate y | ~~~~~~~~~~~~~~~~~~~~~~~ When checking right hand side of evaluate with expected type Int Can't find implementation for Fractional Int
Division is not a valid operation on Int
since we cannot have fractional integers. If we replace our Int
with Double
our program will compile and we can use the division operator to calculate 4 / (2 - 1)
:
*Expr> evaluate (Divide (Val 4) (Subtract (Val 2) (Val 1))) 4.0 : Double
If we check the documentation for our function in the REPL using :doc
, we see that it is total:
*Expr> :doc evaluate Main.evaluate : Expr -> Double The function is Total
In making this change, we see that the strong guarantees Idris affords us around type safety and totality help us avoid errors at compile time.
Generic Types
Sometimes we will want to make our types be able to take a parameter which could be of any type. For instance, the generic type List
takes a type parameter elem
(note that types are always UpperCamelCase whereas type parameters are always lowercase).
Observe that we can have lists of different types:
Idris> [1, 2, 3] [1, 2, 3] : List Integer Idris> ["a", "b", "c"] ["a", "b", "c"] : List String
The ::
operator (pronounced “cons”) is used for adding an element to the front of a list:
Idris> 0 :: [1, 2, 3] [0, 1, 2, 3] : List Integer
The above examples use syntactic sugar that makes working with List
s easier. We can use the more verbose syntax to construct a list as well, starting with the empty list primitive Nil
and using ::
to build up our list backwards by glomming head elements onto it:
Idris> 0 :: (1 :: (2 :: (3 :: Nil))) [0, 1, 2, 3] : List Integer
When using syntactic sugar for lists, Idris will desugar the comma-separated elements inside of square brackets to these primitive forms.
To better understand how this works, let’s examine the List
type definition:
data List elem = Nil | (::) elem (List elem)
We can see that List
is a union type of Nil
(empty list) and potentially a chain of other List
s constructed via cons-ing together heads and other lists (::
is a constructor that takes a head elem
and a tail List elem
). Notice that all of the List
s take the same elem
parameter which will necessarily be the same type.
Let’s get some practice working with List
s by implementing the takeList
function (we’ll call it takeList
to avoid a name conflict with the built-in take
function). This function will retrieve the first n
elements from a list, returning the whole list if too many elements are requested:
Idris> takeList 1 [1, 2, 3] [1] : List Integer Idris> takeList 2 [1, 2, 3] [1, 2] : List Integer Idris> takeList 3 [1, 2, 3] [1, 2, 3] : List Integer Idris> takeList 4 [1, 2, 3] [1, 2, 3] : List Integer
The function signature will look like this:
takeList : (n : Nat) -> (list : List a) -> List a
Notice that this function does not take an Int
but instead takes a Nat
, a natural number. This type is commonly used for working with collections in Idris since Nat
must be greater than or equal to zero and collections cannot have a negative length. This is yet another example of precisely defining our types to remove potential error cases that one would otherwise need to account for.
Nat
is a very simple recursive type:
data Nat = Z | S Nat
We can build up our Nat
recursively starting from Z
(Zero) by using S
(Successor) to add to it:
Idris> Z 0 : Nat Idris> S Z 1 : Nat Idris> S (S Z) 2 : Nat
With that new knowledge of Nat
in hand, let’s get back to implementing our takeList
function, starting with the function signature:
takeList : (n : Nat) -> (list : List a) -> List a
You’ll notice that our function arguments look a little different than what we’ve seen up until this point. This function signature provides names for its arguments. Our first argument of type Nat
is called n
and our second argument of type List a
is called list
. As we will soon see, these names are used for better variable naming when code is generated via interactive editing.
Again using our plugin, let’s first build out an initial definition:
takeList : (n : Nat) -> (list : List a) -> List a takeList n list = ?takeList_rhs
Next, we case-split on n
:
takeList : (n : Nat) -> (list : List a) -> List a takeList Z list = ?takeList_rhs_1 takeList (S k) list = ?takeList_rhs_2
We know that if Nat
is Z
(zero), then we’re requesting zero elements and we can return an empty list, so we can fill in ?takeList_rhs_1
with []
like so:
takeList : (n : Nat) -> (list : List a) -> List a takeList Z list = [] takeList (S k) list = ?takeList_rhs_2
The core of our function logic will depend on what we do with list
. Let’s case-split on the second list
to see what values it might be:
takeList : (n : Nat) -> (list : List a) -> List a takeList Z list = [] takeList (S k) [] = ?takeList_rhs_1 takeList (S k) (x :: xs) = ?takeList_rhs_3
We also know that if the list we’re working with is empty, we want to return an empty list, no matter the value of k
, which means that ?takeList_rhs_1
can also be an empty list, giving us:
takeList : (n : Nat) -> (list : List a) -> List a takeList Z list = [] takeList (S k) [] = [] takeList (S k) (x :: xs) = ?takeList_rhs_3
The type hole ?takeList_rhs_3
is all that remains to be implemented. In the function equation on the last line of our implementation, we’re using the cons operator to pattern match on the head and tail (x
and xs
) of our list so we can work with them. By matching using ::
, we not only access the head and tail but also assert that there is a head and tail to match on.
To build out the result that will be returned, we want to grab the head and then continue taking the remaining k
elements from the rest of our list:
takeList : (n : Nat) -> (list : List a) -> List a takeList Z list = [] takeList (S k) [] = [] takeList (S k) (x :: xs) = x :: takeList k xs
Here is how this can be implemented with interactive editing:
While this function works great on lists, in Idris we can do one better. By using a more precise collection type, we can write a takeList
function where you cannot request more elements than exist in the collection and have this be enforced at compile time. Seem impossible? That’s the power of dependent types!
Dependent Types
Idris makes it possible to define types that are computed from some other value. These are called dependent types. Vect
is an example of a dependent type: its length is part of its type and can therefore be checked at compile time. To put it another way, the type of Vect
depends on its length. Whereas generics have a type specified as a type parameter, dependent types depend on a value being specified to determine their type.
Here is an example of a Vect
:
import Data.Vect fourInts : Vect 4 Int fourInts = [1, 2, 3, 4]
This Vect
has a length of 4 and is made up of Int
s. If we look at the type of Vect
in a REPL using :t
we see:
> :t Vect Vect : Nat -> Type -> Type
Vect
takes a Nat
which defines its length explicitly and a Type
for the type of element contained within it and returns a new Type
. This is a distinctive property of Idris: types are first-class. That means we can pass them into and return them from functions. This property is important for making dependent types possible.
In the case of Vect
, crucially, a Vect 4 Int
is a different type than Vect 5 Int
.
Let’s get some practice working with dependent types by writing a takeVect
function that operates on Vect
s. As usual, we start with the function signature:
takeVect : (n : Nat) -> Vect (n + m) elem -> Vect n elem
This function signature looks different than others we’ve seen so far, specifically the second argument: Vect (n + m) elem
. This is saying that the Nat
used to construct the Vect
is greater than or equal to n
(since m
can be greater than or equal to zero).
To start implementing our function, first we use the definition command:
takeVect : (n : Nat) -> Vect (n + m) elem -> Vect n elem takeVect n xs = ?takeVect_rhs
Followed by case-splitting on n
:
takeVect : (n : Nat) -> Vect (n + m) elem -> Vect n elem takeVect Z xs = ?takeVect_rhs_1 takeVect (S k) xs = ?takeVect_rhs_2
Next we can use the search command (<LocalLeader>o
in Vim) to search for an expression that satisfies the type hole:
takeVect : (n : Nat) -> Vect (n + m) elem -> Vect n elem takeVect Z xs = [] takeVect (S k) xs = ?takeVect_rhs_2
In this case, Idris was able to fill in the expression for us since when n
is zero, we necessarily must have an empty Vect
. The last remaining step is to implement ?take_rhs_2
similarly to how it was done for takeList
:
takeVect : (n : Nat) -> Vect (n + m) elem -> Vect n elem takeVect Z xs = [] takeVect (S k) (x :: xs) = x :: takeVect k xs
This shows the process of implementing our function:
So what does this get us? Well, if we try to take more elements than are available, we get a compile time error:
*Take> takeVect 5 [1, 2, 3, 4] (input):1:13:When checking argument xs to constructor Data.Vect.::: Type mismatch between Vect 0 elem1 (Type of []) and Vect (S m) elem (Expected type) Specifically: Type mismatch between 0 and S m
As we can see, by crafting our function signatures correctly, we can leverage the power of dependent types to make invalid operations impossible and have that be enforced at compile time.
Conclusion
As we have seen, instead of using types to check for correctness after the fact, they can be the starting point for your programs. In Idris, your functions flow from your types with the compiler as your guide. By precisely defining our types, once the code has compiled successfully, we can be reasonably confident that it is working as intended. And any future changes can be made knowing that the compiler has you covered.
Though Idris is primarily a research language and is not yet ready for production, the ideas behind it are compelling. Dependent types have many potentially interesting applications to aid in writing correct programs beyond the examples already shown, such as money being dependent on time. As an example in a geospatial context, we could imagine using dependent types to ensure map algebra operations are only performed on rasters with the same CRS and resolution.
Type-driven development is a powerful way to write correct programs, and what’s shown here is only a taste of what Idris has to offer. To learn more, check out the Idris docs or pick up a copy of Type-Driven Development with Idris.