Algebraic Data Types in JavaScript
Merging unfold, fold and map transformations
Download
adt.js (9.2Kb)
Introduction
Algebraic data types are the kind of types you can declare in Haskell with data declarations.
If you are not familiar with Haskell, you can start with the
Wikipedia page about algebraic data types.
I have written the library in JavaScript 1.8, which means that as of this writing it only runs in Firefox 3.0. I have chosen to stay with 1.8 because the new expression closures make the code a lot cleaner. Still, there is nothing in it that cannot be made to work in ECMAScript edition 3, and I have no doubt that the same thing could be done in Python or Ruby.
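For readers without a JavaScript 1.8 engine, here is a minimal sketch of how the expression closures used throughout this page translate to ordinary function expressions. The names twice and colors are hypothetical and not part of adt.js; the 1.8 forms appear only in comments so that the block runs in any engine.

```javascript
// JavaScript 1.8 expression closure (comment only, not valid elsewhere):
//   var twice = function(x) x * 2
// Equivalent ECMAScript edition 3 function expression:
var twice = function (x) { return x * 2; };

// An expression closure that yields an object literal wraps it in parentheses:
//   var colors = function() ({ Red: {} })
// which corresponds to an explicit return of that literal:
var colors = function () { return { Red: {} }; };
```

Every code sample on this page can be rewritten this way by adding braces and a return statement.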
Declaring algebraic data types
A first example:
// Haskell:
// data Color = Red | Green | Blue | Yellow
// data Point = Pt Float Float Color
Color = Data(function() ({ Red: {}, Green: {}, Blue: {}, Yellow: {} }))
Point = Data(function() ({ Pt: { x: Number, y: Number, color: Color } }))
Now you can create point objects like this: Pt(1, 2, Red). Note that you don't need new,
although you could use it; Pt is a real constructor. All of the following work as expected:
Pt(1, 2, Red).x == 1
Pt(1, 2, Red).color == Red
Pt(1, 2, Red).constructor == Pt
Pt(1, 2, Red) instanceof Pt
Pt(1, 2, Red) instanceof Point
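To illustrate how a constructor can work both with and without new while still satisfying both instanceof checks, here is a hand-written sketch. It is an assumption about one possible technique, not the code adt.js generates, and a plain string stands in for the Red value.

```javascript
// Hypothetical hand-written equivalent of one constructor; not adt.js code.
function Point() {} // the data type plays the role of a base "class"

function Pt(x, y, color) {
  // Called without `new`: `this` is not a Pt, so re-enter with `new`.
  if (!(this instanceof Pt)) return new Pt(x, y, color);
  this.x = x;
  this.y = y;
  this.color = color;
}

// Chain Pt's prototype to Point's so that instances pass both checks.
Pt.prototype = Object.create(Point.prototype);
Pt.prototype.constructor = Pt;

const p = Pt(1, 2, "Red"); // a string stands in for the Red value
```

With this setup, p.x, p.constructor, p instanceof Pt and p instanceof Point behave like the expressions listed above.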
You can also define recursive types, for example Peano numbers:
// Haskell: data Peano = Z | S Peano
Peano = Data(function(peano) ({ Z: {}, S: { prev: peano }}))
With Peano numbers you would represent, for example, 2 as S(S(Z)). Note that for the recursion
we use the first argument of the declaration function.
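As a small illustration of the representation, the following sketch encodes Peano numbers as plain tagged objects and converts them back to ordinary numbers. The tag field and the toNumber helper are hypothetical; adt.js creates real constructors instead.

```javascript
// Hypothetical plain-object encoding of Peano numbers; not adt.js code.
const Z = { tag: "Z" };
const S = prev => ({ tag: "S", prev });

// Count the S wrappers to recover an ordinary number.
function toNumber(p) {
  let n = 0;
  while (p.tag === "S") {
    n += 1;
    p = p.prev;
  }
  return n;
}

const two = S(S(Z));
```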
Now let's take a look at type parameters, using lists as an example:
// Haskell: data List a = Nil | Cons a (List a)
List = Data(function(list, a) ({ Nil : {}, Cons: { head: a, tail: list } }))
Type parameters become extra arguments of the declaration function, after the recursion argument.
Finally, let's take a look at a more complete example, a simplified XML data structure:
// Haskell:
// data Attrs n v = Attr n v
// data Node n v = Elem n (List (Attrs n v)) (List (Node n v)) | Text v
Attrs = Data(function(_, n, v) ({ Attr: {name: n, value: v} }))
Node = Data(function(node, n, v) ({
  Elem: { name: n, attributes: List(Attrs(n, v)), childNodes: List(node) },
  Text: v
}))
There are two things to note here: one is that we instantiate the parameterized types List
and Attrs, and the second is that adding names to the constructor arguments makes the type easier to understand.
Using algebraic data types
Unfold
With unfold you can easily generate algebraic data structures from other data. The following example shows how to generate a decreasing list of numbers.
var counter = List.unfold(function(n, c) n ? c.Cons(n, n - 1) : c.Nil)
Unfolds do not use the actual constructors, but stand-ins that perform recursion at the right places.
In this example c.Cons recursively calls the unfold function for its second argument,
and then calls the Cons constructor with the result. Let's try it out:
>>> counter(5)
Cons(5, Cons(4, Cons(3, Cons(2, Cons(1, Nil)))))
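The stand-in mechanism can be sketched in a few lines of modern JavaScript. This is an assumption about how such an unfold could work, not the adt.js implementation; the tagged-object list and the names unfoldList and toArray are hypothetical.

```javascript
// Hypothetical tagged-object list; adt.js builds real constructors instead.
const Nil = { tag: "Nil" };
const Cons = (head, tail) => ({ tag: "Cons", head, tail });

// unfoldList(step) returns a function from a seed to a list.
// `step` receives the seed and the stand-in constructors `c`.
function unfoldList(step) {
  const standIns = {
    Nil,
    // The second argument is a new seed, not a list: recurse on it first,
    // then call the real Cons with the result.
    Cons: (head, nextSeed) => Cons(head, generate(nextSeed)),
  };
  function generate(seed) {
    return step(seed, standIns);
  }
  return generate;
}

const counter = unfoldList((n, c) => (n ? c.Cons(n, n - 1) : c.Nil));

// Flatten a list to an array for easy inspection.
function toArray(list) {
  const out = [];
  for (let l = list; l.tag === "Cons"; l = l.tail) out.push(l.head);
  return out;
}
```

Here toArray(counter(5)) yields the numbers 5 down to 1, mirroring the session above.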
Fold
Fold is the reverse of unfold: it reduces a data structure to a return value. Here is an example that multiplies the numbers in a list.
var prod = List.fold({ Nil: 1, Cons: function(h, t) h * t })
You can understand a fold as replacing the constructors with the given functions. Let's test this code:
>>> prod(counter(5))
120
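The idea of replacing constructors with functions can be sketched as follows. This is a minimal illustration on a hypothetical tagged-object list, not the adt.js implementation; foldList and sample are assumed names.

```javascript
// Hypothetical tagged-object list; adt.js builds real constructors instead.
const Nil = { tag: "Nil" };
const Cons = (head, tail) => ({ tag: "Cons", head, tail });

// foldList(algebra) replaces Nil by algebra.Nil and every Cons by
// algebra.Cons, folding the tail before combining it with the head.
function foldList(algebra) {
  return function reduce(list) {
    return list.tag === "Nil"
      ? algebra.Nil
      : algebra.Cons(list.head, reduce(list.tail));
  };
}

const prod = foldList({ Nil: 1, Cons: (h, t) => h * t });
const sample = Cons(5, Cons(4, Cons(3, Cons(2, Cons(1, Nil)))));
```

Applying prod to sample computes 5 * (4 * (3 * (2 * (1 * 1)))).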
Merging unfold and fold
You may have noticed that by combining prod and counter we have created the
factorial function. However, it is quite wasteful, as it builds up a list only to destroy it again.
The adt library has a solution for that: it allows you to merge functions that are defined over the same
data type:
>>> var fact = counter.merge(prod)
>>> fact(5)
120
Now we have a factorial function that was neatly defined using the structure of List, but no list is created.
Instead of calling the Cons constructor, c.Cons now directly calls the Cons
function from prod.
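A sketch of this merging, under the assumption that the stand-ins are simply wired to the fold's functions: no list value is ever allocated. The name unfoldFold is hypothetical, and the real merge method of adt.js may work differently.

```javascript
// unfoldFold(step, algebra): the stand-ins passed to `step` call the
// fold's functions directly instead of the real list constructors.
function unfoldFold(step, algebra) {
  const standIns = {
    Nil: algebra.Nil,
    // Recurse on the new seed, then hand the result to the fold's Cons.
    Cons: (head, nextSeed) => algebra.Cons(head, run(nextSeed)),
  };
  function run(seed) {
    return step(seed, standIns);
  }
  return run;
}

const fact = unfoldFold(
  (n, c) => (n ? c.Cons(n, n - 1) : c.Nil), // same step as counter
  { Nil: 1, Cons: (h, t) => h * t }         // same algebra as prod
);
```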
Map
With map you provide a function for each type parameter. Here are 2 examples using the XML data types:
var addPrefix = Node.map(function (name) "x:" + name, id)
var normalizeSpace = Node.map(id, function (value) value.replace(/^\s+|\s+$/g, ""))
The first adds a prefix to each name, leaving values alone (using the id function), and the second strips whitespace from
values, leaving names alone. For a more complete example, let's write a serialization function for the XML data:
var serialize = Node.fold({
  Elem: function(name, attrs, children) "<" + name + attrs + ">" + children + "</" + name + ">",
  Text: function(s) s,
  Attr: function(name, value) " " + name + "='" + value + "'",
  Cons: function(h, t) h + t,
  Nil : ""
})
Now if we wanted to add prefixes, normalize space and then serialize the xml, we could write:
serialize(normalizeSpace(addPrefix(xml)))
But then we would traverse the data structure three times. We would rather merge these functions so that the data structure is traversed only once. (When a compiler does this, it is called deforestation or fusion.)
>>> var prefixNormalizeAndSerialize = addPrefix.merge(normalizeSpace).merge(serialize)
>>> var xml = Elem("test", Nil, Cons(
      Elem("hoi", Cons(Attr("href", "w3future.com"), Nil), Nil), Cons(
      Text(" bla "), Cons(
      Elem("doei", Nil, Nil), Nil))))
>>> prefixNormalizeAndSerialize(xml)
"<x:test><x:hoi x:href='w3future.com'></x:hoi>bla<x:doei></x:doei></x:test>"
In general you can merge one fold, one unfold and any number of maps, as long as they are defined on the same data type.
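Merging maps into a fold can be illustrated on plain lists rather than on the Node type. The sketch below is a simplification under that assumption and does not reproduce the adt.js merge method; mapThenFold and the other names are hypothetical.

```javascript
// Hypothetical tagged-object list; adt.js builds real constructors instead.
const Nil = { tag: "Nil" };
const Cons = (head, tail) => ({ tag: "Cons", head, tail });

function foldList(algebra) {
  return function reduce(list) {
    return list.tag === "Nil"
      ? algebra.Nil
      : algebra.Cons(list.head, reduce(list.tail));
  };
}

// Fusing a map into a fold: apply `f` to each element just before the
// fold's Cons function sees it, so no intermediate list is built.
function mapThenFold(f, algebra) {
  return foldList({
    Nil: algebra.Nil,
    Cons: (h, t) => algebra.Cons(f(h), t),
  });
}

const trim = s => s.replace(/^\s+|\s+$/g, "");
const addPrefix = s => "x:" + s;
const concat = { Nil: "", Cons: (h, t) => h + t };

// Two maps (composed into one function) and one fold, one traversal.
const prefixTrimConcat = mapThenFold(s => addPrefix(trim(s)), concat);
const words = Cons(" a ", Cons("b ", Nil));
```

Calling prefixTrimConcat(words) trims, prefixes and concatenates in a single pass over the list.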
Advanced uses
Transformers
All these unfolds, folds, maps and any merged combination of them are special cases of transformers. A transformer declaration has four parts: a generator (unfold), a transformation of constructors (fold), a transformation of type parameters (map) and a transformation of non-algebraic types. The last three are best explained with an example:
Test = Data(function(test, a) ({ Foo: Number, Bar: [a, test] }))
var xf = Test.getTransformer({
  getCtorXF : function(ctor) f,
  getParamXF: function(pos ) g,
  getAtomXF : function(ctor) h
})
Then the following are equivalent:
xf(Bar("x", Bar("y", Foo(1))))
f(g("x"), f(g("y"), f(h(1))))
That is, each constructor is replaced with f, each value in a type parameter position is wrapped with g,
and other (non-algebraic) values are wrapped with h.
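The equivalence can be checked with a hand-written transformer for the Test type. This sketch is an assumption: it uses a hypothetical tagged-object encoding, a hypothetical makeXF helper, and one f for all constructors, whereas getCtorXF could return a different function for each constructor.

```javascript
// Hypothetical encoding of  Test a = Foo Number | Bar a (Test a)
const Foo = n => ({ tag: "Foo", value: n });
const Bar = (a, rest) => ({ tag: "Bar", a, rest });

// makeXF(f, g, h) mirrors the three roles described above.
function makeXF(f, g, h) {
  return function xf(t) {
    return t.tag === "Foo"
      ? f(h(t.value))           // Number field: an atom, wrapped with h
      : f(g(t.a), xf(t.rest));  // parameter field wrapped with g, then recursion
  };
}

// String-building f, g and h make the resulting call structure visible.
const show = name => (...args) => name + "(" + args.join(", ") + ")";
const xf = makeXF(show("f"), show("g"), show("h"));

const result = xf(Bar("x", Bar("y", Foo(1))));
```

With these string-building functions, result spells out the same nesting as the second expression above, without the quotes around x and y.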
Derived properties
Derived properties are like derived type classes in Haskell: they apply to every algebraic data object that is created.
The adt library already contains clone, source, equals, size and
items. Some examples for the previously defined xml object:
>>> xml.size
6
>>> xml.items
["test", "hoi", "href", "w3future.com", " bla ", "doei"]
>>> xml.equals(xml.clone)
true
>>> xml.source
"Elem("test", Nil, Cons(Elem("hoi", Cons(Attr("href", "w3future.com"), Nil), Nil), Cons(Text(" bla "),
Cons(Elem("doei", Nil, Nil), Nil))))"
But you can also add your own derived properties. Here is how you would define items:
Array.collect = function(list) Array.reduce(list, function(a, b) a.concat(b), [])
Data.addDerivedProperty("items", {
  getCtorXF : function(ctor) function() Array.collect(arguments),
  getParamXF: function(pos ) function(x) [x],
  getAtomXF : function(ctor) function(x) [x]
})
Data types à la carte
To see if even advanced Haskell code would work in JavaScript, I took a shot at Data types à la carte by Wouter Swierstra. It addresses the Expression Problem:
The goal is to define a data type by cases, where one can add new cases to the data type and new functions over the data type, without recompiling existing code, and while retaining static type safety.
Of course, that last property is lost in the translation to JavaScript, but otherwise this code works.
First the initial data type, expressions with values and addition.
Expr = function(f) Data(function(expr) ({ In: f(expr) }))
Plus = function(f, g) Data(function(_, e) ({ Inl: f(e), Inr: g(e) }))
ValT = Data(function(_, e) ({ Val: Number }))
AddT = Data(function(_, e) ({ Add: [e, e] }))
ExprValAdd = Expr(Plus(ValT, AddT))
Next, we create an evaluation function with fold.
var evalAlgebra = {
  Val: function(x) x,
  Add: function(e1, e2) e1 + e2,
  In: id, Inl: id, Inr: id
};
var ev = ExprValAdd.fold(evalAlgebra)
var val = function(x) In(Val(x))
var add = function(x, y) In(Add(x, y))
>>> ev(add(val(30000), add(val(1330), val(7))))
31337
Now we can add multiplication to the expression type.
MulT = Data(function(_, e) ({ Mul: [e, e] }))
var mul = function(x, y) In(Mul(x, y))
ExprValAddMul = Expr(Plus(ValT, Plus(AddT, MulT)))
evalAlgebra.Mul = function(e1, e2) e1 * e2
var ev = ExprValAddMul.fold(evalAlgebra)
>>> ev(add(mul(val(80), val(5)), val(4)))
404
And finally we add a render function.
var render = ExprValAddMul.fold({
  Val: function(x) "" + x,
  Add: function(e1, e2) "(" + e1 + " + " + e2 + ")",
  Mul: function(e1, e2) "(" + e1 + " * " + e2 + ")",
  In: id, Inl: id, Inr: id
})
>>> render(add(mul(val(80), val(5)), val(4)))
((80 * 5) + 4)
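The two directions of extension shown above, new cases and new functions, can be sketched without adt.js. This is a simplified illustration that drops the In, Inl and Inr wrappers and uses a hypothetical { tag, args } encoding with an assumed foldExpr helper.

```javascript
// Hypothetical generic encoding: every expression is { tag, args }.
const node = (tag, ...args) => ({ tag, args });
const val = x => node("Val", x);
const add = (x, y) => node("Add", x, y);
const mul = (x, y) => node("Mul", x, y);

// Generic fold: fold sub-expressions first, then dispatch on the tag.
function foldExpr(algebra) {
  return function go(e) {
    const args = e.args.map(a => (a && a.tag ? go(a) : a));
    return algebra[e.tag](...args);
  };
}

// Start with values and addition...
const evalAlgebra = { Val: x => x, Add: (a, b) => a + b };
// ...then add a new case without touching the existing ones.
evalAlgebra.Mul = (a, b) => a * b;
const ev = foldExpr(evalAlgebra);

// A new function over the same data is just another algebra.
const render = foldExpr({
  Val: x => "" + x,
  Add: (a, b) => "(" + a + " + " + b + ")",
  Mul: (a, b) => "(" + a + " * " + b + ")",
});

const expr = add(mul(val(80), val(5)), val(4));
```

Because the fold looks up functions by tag at call time, adding Mul to evalAlgebra is enough for ev to handle multiplication.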