What the Heck are Algebraic Data Types? ( for Programmers )

This post is meant to be a gentle introduction to Algebraic Data Types.

Some of you may be asking why you should learn Algebraic Data Types and how will they change your world?  I am not going to answer that, but suffice it to say that Algebraic Data Types are the underpinning of the type systems to the ML derived languages, Haskell and OCaml included, and their construction and properties allow for the power (and inference) that accompanies these type systems.  They are cropping up in other languages, like Scala, F#, and Clojure.  Well, here are my 2 cents.

I wrote this blog post because I had trouble learning what an Algebraic Data Type was, and could not find an introduction that did not either 1) leap immediately to copy and pasted phrases, 2) mislead, like make the claim that "algebraic data types are the same thing as discriminated unions" or 3) delve into graduate level formalism.  This blog post represents the results of what I have teased from numerous sources.

There are now tons of us programmers with sadly insufficient mathematical grounding, who are attempting to branch out and learn something new.  The confusion that comes from a sea of new words and overloaded acronyms can be very frustrating  (e.g. don't use the acronym ADT for algebraic data types, because that generally means abstract data type -- a completely orthogonal concept).   I wanted to be able to put a check mark next to my internal to-do list item for "What the heck are Algebraic Data Types?"   This blog post is meant to help people do the same.

Quick Start
 To start, let us attempt a definition of algebraic because for some weird reason, no introduction to Algebraic Data Types ever tries to explain this etymology.  For purposes of simplicity, let us define algebra to be two things: 1) a set of objects (do not think objects as in object oriented) and 2) the operations used on those objects to create new objects from that same set.  An introduction to algebraic data types is therefore an introduction to types and the operations used to create new types.

Examples of some other Algebras
In the case of numeric algebra (informally known as high-school algebra), the set is the set of numbers (whether they be natural, rational, real, or complex) and the operations used on these objects can be (but definitely not limited to be) addition, or multiplication.  The algebra of numbers is therefore the study of this set, and the laws by which these operators generate (or don't generate) new members from this set.

Another example: the relational algebra.  In this algebra, the set of objects are the set of relations.  Unlike numbers which we tend to (erroneously) view as atomic, a relation is a thicker kind of object containing a number of named columns, their associated domains, and a number of rows.  Each 'relation' (otherwise known as 'table') is therefore the object unto which operators may operate.  These operations include, amongst others,
  • projection (π) create a new relation from the existing one by removing columns, 
  • selection(σ) create a new relation from an existing one by removing rows that satisfy a predicate function,  
  • joining (⋈) create a new relation from two existing ones by matching up the domains and values of a chosen column,
  • renaming (ρ) create a new relation from an existing one by renaming a column

The key thing here to realize is that an algebra allows us to talk about the objects and the operations abstractly, and to consider the laws that these operations obey as they operate on the underlying set.

So, again, what is an algebraic data type? Let's repeat these questions. What is the set?  What are the operators?  First, the underlying set is the set of types.  For you programmers out there, types are so familiar that we often overlook how fundamental they are.  If we were to start with a set of base types, we could use operators to create new ones.  Our first operator, the product operator, will create what is known as a product type, or a record type, or a tuple-type.  We pair, or treble, or quadruple, or quintuple, or n-tuple existing types together, to create an aggregate type.   A classic example is the Pair type.  As a hard example, consider the Pair as implemented in a bevy of object oriented languages:

class Pair{
   float x;
   float y;
}

change the above code's syntax to your favorite python, ruby, scala, etc.  It doesn't matter.   The act of writing 'class' and defining this type IS our product operator.  Object oriented programmers tend to call this operation composing -- but whatever you call it, it's the same thing. From the pre-existing type float we have created a new type called Pair.  Does our operator allow us to keep on going with this?  Let's try it:

class PairOPairs{
   Pair start;
   Pair end;
   int time;
}

Well, there you go.  From two existing types (Pair and int), we have created a third. This is what operators do in an algebra.

Our class definition as a product operator has spun a new type into existence from other ones, just like
  • '+' allows us to combine 3 and 4 to get 7,  
  • '*' gives us 12 from  3 and 4,
  • 'join' allows to create a new relation (which we generally don't name) from 'Employee' and 'Salary',
  • 'or' allows us to get 'true' from 'true' and 'false' together,
  • '+' allows us to get "hotdamn" from "hot" and "damn"  (commutativity matters though!),
  • 'not' allows to turn 'true' into 'false' (the only non-binary operator in this list).
We have used an operator on an underlying set to create/synthesize a new member of that set.

Now, some languages allow us to create these types anonymously, what we call tuples.  For theoretic purposes, the tuple containing real numbers can be much the same (isomorphic) as a Pair class containing real numbers.  The differences?  Well those are practical and vary from programming language to programming language, but are about ease-of-use and the languages implementation of type checking.  Practical matters are of great importance to programmers.

The Haskell syntax for product types is:
  data Pair = Pair Real Real.

Note how the 'Pair' on the left is different from the 'Pair' on the right. The one on the left is the type.  The one on the right is the particular named product operator, what Haskell calls a data constructor.

All this time we have used the product operator to create new types from existing ones.  And we called it a product type.  Why?  Well.  The answer is that we are producing a Cartesian product (or multiplication) on members of several underlying sets.  Again, consider the Pair.  Let's rename it to Point to drive home a point:

class Point{
   float x_coordinate;
   float y_coordinate;
}

In the act of creating the class 'Point' we have created a new type of x's and y's together.   The x's have values in the real domain.  The y's have values in the real domain too.  The Point is a Cartesian product from values of type of Reals against values of type Reals or (R X R).  As another example, we could have done this:

class Person{
   String firstName;
   String lastName;
   int ssn;
}

Here we are using Cartesian multiplication to produce a Cartesian product of two strings and an integer ("firstname","lastname",123456714).  The three constituent elements may range throughout the space of all possible strings and integers (this is not yet ideal, as we would love to restrict the length of our integer type -- see dependent types) to create a value that is a triple of strings and an integer.  Thus our new type is of (String X String X Int).  Are there any differences?  Of course.  I don't know of a way to implement a comparison operator between the Person, modulo subjective judgements on character.  With Point, I can implement a 'distance' from the origin metric for some ordering.

How you use the types is 100% up to you.  Do you want to use 'Pair' to capture a point in x/y space, or to represent the length of your left foot and right foot?  Well that's up to you. The algebra of data types is abstract.  The choice of names and how we compose these types from other types is the act of domain modelling and is something for which programmers get paid.

So far we have looked at one operator.  A 'product' operator.  Are there others? Sure.  The object oriented guys have something called sub-typing which allows you to generate a new type from another via 'extension'.  Anything else?  Well, there is an interesting one that the typical Java enterprise developer rarely sees.  It is called a variant type or a sum type or a discriminated union, and allows you again to compose a new aggregate type from existing types.  But what is the difference between this and a product type? Well, consider again a product type:

class Point{
   float x
   float y
}

and now consider a completely different example as a  sum type:

union Identity{
   String name
   UUID uuid
}

In the first case, a value of type 'Point' has constituent values of an 'x' AND a 'y'.  In the case of the discriminated or disjoint union (sum type), values of the type Identity have constituent values of a 'name' OR a 'uuid' -- but not both at the same time.  The difference is fundamental.  If Java had this then something like the following would be possible, AND typecheck!

Identity myIdentity = new Identity();
if( db.hasOnlyName() ){
    myIdentity = "Daniel Eklund";       // could be a string
}else{
    myIdentity = java.util.UUID.randomUUID();  // OR a UUID
}

The Haksell notation for the sum type is:
   data Identity = S String | U UUID

(note how the bar symbol '|' is the sum operator)

Mathematicians like to overload symbols and words.  We all subconsciously respect that '+' means something different when it is used on two numbers versus when it used with two strings versus when it used on two vectors.  Why do we call this new operator a 'sum' and its result a 'sum type' then?  Well I could answer that, or let you figure that out on your own.  The important thing to do is to let go of concrete biases about what a 'sum' is but also realize that there must be a reason that we analogize with the pre-existing operator 'sum' from a different algebra. (Hint: think about the algebra of boolean values and what AND and OR mean.  Now think about 0s and 1s.)

What programming languages choose to expose as mechanisms for operating on types tells us a lot about what the language wants us to respect.  Some languages are concerned about the mathematical tractability of what can be expressed or inferred about these types from limiting the operators.  When we have this tractability, we can write algorithms in our type checker that do wondrous things for us.  Some languages are not so concerned about what can be proven mathematically about these types, but want us to have notions that make sense to us.  These languages often require us to constantly remind ourselves what types we have by annotating the values when they are used as parameters or return values.  Some languages ignore types. 

It is important to realize that an algebra of data types exists in all of these languages.  But what kind of algebra is a different question. 

Thus when you hear about algebraic data types, you will most often be referring to a Haskell-like system -- something called an initial algebra.  Definitely, the wikipedia article presumes this.  The wikipedia article is a hodge-podge mess.

In this case we mean algebraic data type to mean one which has both a 'product' and 'sum' operator, which is closed, which has a unit-type, which has associative properties, a fixed-point operator, and a bunch of other stuff which someone else's blog post  and a stackoverflow discussion do a better job explaining.  Please do read this other post as it is both more rigorous and continues exploring how the analogy (morphism) of arithmetic algebra with type algebra can give us beautiful things like derivative types.  My gosh. It's cool.

The purpose of this posting is merely an introduction, and, as such, will not be rigorous.  I have erred on the side of hand-wavey explanation and can even now see things which are technically suspect. But whatever.  Study.  Figure it out.

17 comments:

  1. Ah, now *that* was genuinely useful... the first truly ground-floor introduction to algebraic datatypes that I've come across. Thanks for putting this together, I may have actually learned something today!

    ReplyDelete
  2. Very tasty ground knowledge for a layman, thanks!

    ReplyDelete
  3. Sum types do exist (and are typesafe) in Java; you simply have to encode them either by using subclassing and the Visitor pattern (which acts kind of like a catamorphism for the sum type) or by writing the catamorphism directly. Visitor is definitely the more standard way to go for sum types in the Java world, though, at least the Visitor pattern *implemented correctly* (i.e. neither the accept or the visit calls return unit, as in this Scala example: http://logji.blogspot.com/2009/12/reader-monad-for-visitor-pattern.html) - this post is about that the fact that Visitor is also a monad, but I think if you squint you can see the sum-type-ness.

    ReplyDelete
  4. What a superb tutorial. Thanks.

    ReplyDelete
  5. Thanks a lot. Easy. Clear. Short. Useful.

    ReplyDelete
  6. Thanks! Really useful introduction to the Algebraic Data Types.

    ReplyDelete
  7. This comment has been removed by a blog administrator.

    ReplyDelete
  8. This was helpful. Thanks!

    ReplyDelete
  9. This is a very useful and understandable explanation! It helps the reader put algebraic data types in a larger context of algebras and programming.

    ReplyDelete
  10. Best introductory on ADTs I've ever seen

    ReplyDelete
  11. This was excellent, thanks.

    ReplyDelete
  12. currently munching through Learn You a Haskell, which is a great resource but takes algebraic data types for granted - your text really helped me out, thanks! nicely written & explanatory.

    ReplyDelete