Tuesday, September 21, 2010

Why Functional Programming Matters - An exploration of functional Scala.

package nl.grons.whyfp
/**

A translation of Miranda examples in the paper Why Functional Programming Matters by John Hughes, 1984.

@author Erik van Oosten
*/

object WhyFp_Abstract {
/*

Scala is in the focus of attention because it allows a smooth migration from one of the most popular programming languages ever to a language that supports (among other things) a functional style of programming.

In this article I explore the functional aspects of Scala based on a very influential paper on functional programming: ‘Why Functional Programming matters’. To do so I will translate the Miranda examples of chapter 3 and 4 (next article) to Scala. To keep things pure, the standard Scala libraries are avoided.

I will conclude that Scala has sufficient features to live up to Hughes’ standards with the added bonus of being statically type safe.

Just like the paper I just assume you know a bit about functions, partial functions and partially applied functions. In addition I assume you know basic Scala syntax. However, as I am still a Scala newby myself, any comments that will improve this article are welcomed.

This article focusses on the Miranda -> Scala translation and will not repeat the message of the paper. So go ahead, and read the paper first.

All text in this article is properly commented, so the whole article can be copy pasted into your favourite text-editor.

*/
object Chapter3_GlueingFunctionsTogether {
import RichFunction2._
/*

The original article completely defines a list right there in just one line:

listof X ::= nil | cons X (listof X) (Miranda)

Scala is not so powerful and this mainly because we need to translate Miranda’s ‘|’ operator to a class hierarchy.

Lets define List more or less as the Scala library does it (as anonymous points out, it can be done much shorter).

*/
/** The base type for our list. */ sealed abstract class List[+A] { def head: A def tail: List[A] } /** Cons represents an element and the rest of the list. */ case class Cons[A](hd: A, tl: List[A]) extends List[A] { override def head: A = hd override def tail: List[A] = tl override def toString = "Cons(" + hd + "," + tl + ")" } /** Nil represents the empty list. */ case object Nil extends List[Nothing] { override def head: Nothing = { throw new NoSuchElementException("head of empty list") } override def tail: List[Nothing] = { throw new NoSuchElementException("tail of empty list") } override def toString = "Nil" }
/*

Note that Cons is a case class, therefore we can use Cons(x,y) as synonym for the constructor and also as extractor during pattern matching.

We can now use the list as follows:

[] => Nil [1] => Cons(1, Nil) [1,2,3] => Cons(1, Cons(2, Cons(3, Nil)))

As in the original article, what follows is a number of definitions of ‘sum’. I name each definition differently so that this whole article can be compiled as is.

My first approach mimics the Miranda approach with 2 partial functions. Scala does not automatically combine partial functions se we’ll need to do ourselves.

*/
// Miranda: // sum nil = 0 // sum (cons num list) = num + sum list // val sum_nil: PartialFunction[List[Int], Int] = { case Nil => 0 } val sum_cons: PartialFunction[List[Int], Int] = { case Cons(num, list) => num + sum_combinedpartials(list) } def sum_combinedpartials = sum_nil orElse sum_cons
/*

The second, more idiomatic Scala, version wraps this into a single method:

*/
def sum_idiomatic(li: List[Int]): Int = li match { case Nil => 0 case Cons(num, list) => num + sum_idiomatic(list) }
/*

Note that both approaches use pattern matching, just like the Miranda original.

The following definitions of ‘sum’ are more interesting. They are very close to the original example. Note that this actually defines an immutable reference (a val) to a function. In case you are wondering, List[Int] => Int is the type of the function: it accepts one argument, List[Int], and has Int as return type.

*/
// Miranda: sum = reduce add 0 val sum: List[Int] => Int = reduce(add, 0)
/*

Note that nowadays most languages instead of ‘reduce’, use the term ‘foldr’ (fold-right).

Scala has type inference, so if we let the compiler derive the return type, its even closer. Note that we need to tuck an underscore to the end to let the compiler know we really want a partially applied function.

*/
val sum_noTypes = reduce(add, 0) _
/*

And of course we need a definition for ‘add’:

*/
// Miranda: add x y = x + y def add(x: Int, y: Int): Int = x + y
/*

For function ‘reduce’, I again use pattern matching just like the Miranda original. Notice that ‘reduce’ defines 2 parameter lists. One with parameters f and x, and the second with parameter l. This allows for easy creation of partially applied functions. And this is exactly what happened in the definitions of ‘sum’ and ‘sum_noTypes’ above.

*/
// Miranda // (reduce f x) nil = x // (reduce f x) (cons a l) = f a ((reduce f x) l) // /** * Right fold a list with a function f and a constant x. * * @param f a function to apply * @param x the result for an empty list * @param l the list to fold * @param [A] the type of list elements * @param [B] the result type * @return the result of the expression where in l, * each Cons is replaced by f, and Nil by x */ def reduce[A, B](f: (A, B) => B, x: B)(l: List[A]): B = l match { case Nil => x case Cons(head, tail) => f(head, reduce(f, x)(tail)) }
/*

More examples:

*/
// Miranda: product = reduce multiply 1 def product: List[Int] => Int = reduce(multiply, 1)
/*

Just for fun I define ‘multiply’ a little bit different then ‘add’. Where ‘add’ is a method (which can be used as a function), ‘multiply’ is directly defined as a function. Each underscore declares an anonymous parameter. To avoid the ugly (_: Int) syntax, the type of these anonymous parameters are derived form the return type of ‘multiply’.

*/
val multiply: (Int, Int) => Int = _ * _
/*

We can also inline these functions. If you do so the compiler still needs some hints to infer the types of the anonymous parameters. In the following definitions of ‘anytrue’ and ‘alltrue’ I solve that by adding the generic types of ‘reduce’ (the [Boolean,Boolean] part).

With ‘alltrue’ the return type is inferred by the compiler (again, this requires an underscore to confirm we want partial application of ‘reduce’). In the definition of ‘alltrue’, the return type is defined and therefore the underscore is not needed.

Personally I prefer the explicit return type as the function is too complex to just see what it should be. My rule is omit the return type only when it is blindingly obvious (as in ‘add’ above).

*/
// Miranda // anytrue = reduce or false // alltrue = reduce and true // def anytrue: List[Boolean] => Boolean = reduce[Boolean, Boolean](_ || _, false) def alltrue = reduce[Boolean, Boolean](_ && _, true) _
/*

Lets also explore how ‘reduce’ works and calculate the sum of [1,2,3]. Note how we replace Cons with the given function ‘add’ and Nil with the given constant ‘0’.

Starting point: Cons(1, Cons(2, Cons(3, Nil))) Reduce applied: add(1, add(2, add(3, 0 ))) == 6 And for multiply: multiply(1, multiply(2, multiply(3, 1 ))) == 6

Here is my translation of ‘append’.

*/
// Miranda: append a b = reduce cons b a def append[A](a: List[A], b: List[A]): List[A] = reduce[A, List[A]](Cons[A], b)(a)
/*

Note that the first parameter we pass to ‘reduce’ is actually the Cons object (with generic type parameter [A]). This is possible because anything with an ‘apply’ method is recognized as a Scala function.

The generic type ‘[A]’ on Cons is essential; without it the compiler generates nothing but misleading error messages. It took me many hours to figure this one out.

If you are new to this complex kind of type declarations (like I am), you’ll find that careful reading is in order. I am afraid this is just something you need to practice. Also, it helps greatly when you wrote this kind of stuff yourself.

Lets skip ahead to ‘doubleall’.

*/
// Miranda: // doubleall = reduce doubleandcons nil // where doubleandcons num list = cons (2*num) list def doubleall_1: List[Int] => List[Int] = { def doubleandcons(num: Int, list: List[Int]) = Cons(2 * num, list) reduce(doubleandcons, Nil) }
/*

As you can see Miranda’s where keyword nicely translates to nested methods.

Next version:

*/
// Miranda: // doubleandcons = fandcons double // where double n = 2*n // fandcons f el list = cons (f el) list def doubleandcons: (Int, List[Int]) => List[Int] = { def double(n: Int) = 2 * n def fandcons[A, B](f: A => B)(el: A, list: List[B]) = Cons(f(el), list) fandcons(double) }
/*

So far, I had little problem with translating this to Scala. However, the next line needed quite a bit more thought:

fandcons f = cons . f

Scala (Miranda as well) only defines the ‘compose’ method on functions of arity 1. For Miranda this is no problem as all functions written down with multiple arguments are actually syntactic sugar for functions with a single argument that produce further functions with 1 argument. So the type of ‘add’ in

add x y = x + y

is not (Int, Int) => Int as it would be in Scala, but Int => Int => Int.

In our Scala code Cons.apply has 2 arguments and thus arity 2. To be precise, the type of Cons[B] _ is (B, List[B]) => List[B]. (Note in Scala we write Cons[B] _ to indicate we need it as a partially applied function).

In order to apply ‘compose’ we first need to convert Cons[B].apply to something of type B => List[B] => List[B] (a unary function that accepts a B, and gives another unary function that accepts a List[B], and gives a List[B]). This is called currying and is performed with the ‘curried’ function.

The result of currying is a unary function that we can compose with ‘f’. The next step is to uncurry back to a function of arity 2. This is done with ‘uncurried’. Not lets put this in code:

*/
// Miranda: fandcons f = cons . f def fandcons[A, B](f: A => B): (A, List[B]) => List[B] = Function.uncurried( (Cons[B] _).curried.compose(f) )
/*

However, this is very a unwieldy syntax and I am going to need it a lot in the upcoming examples. Therefore, I’ll pimp the Scala library to include ‘compose’ also on functions of arity 2 (see RichFunction2 below).
We can now condense the code to:

*/
// Miranda: fandcons f = cons . f def fandcons_short[A, B](f: A => B): (A, List[B]) => List[B] = (Cons[B] _).compose(f)
/*

And then we try it out with some arguments to see it is correct:

fandcons(f, el)(list) = Cons.compose(f)(el)(list) = Cons(f(el), list)

And finally (no longer defining ‘double’ as a nested function):

*/
// Miranda: doubleall = reduce (cons . double) nil def double(n: Int) = 2 * n def doubleall: List[Int] => List[Int] = reduce((Cons[Int] _).compose(double), Nil) _
/*

The next step is to extract ‘map’:

*/
// Miranda: // doubleall = map double // map f = reduce (cons . f) nil def doubleall_map: List[Int] => List[Int] = map(double) def map[A, B](f: A => B): List[A] => List[B] = reduce((Cons[B] _).compose(f), Nil) _
/*

Map could for example be used to calculate the sum in a matrix, a list of list of int.

*/
// Miranda: summatrix = sum . map sum val summatrix: List[List[Int]] => Int = sum.compose(map(sum))
/*

Here is a fragment of the scala console in which I try out summatrix:

scala> import nl.grons.whyfp.Chapter3_GlueingFunctionsTogether._ scala> val a: List[List[Int]] = Cons(Nil, Cons(Cons(1, Cons(2, Nil)), Cons(Cons(4, Nil), Nil))) a: nl.grons.whyfp.Chapter3_GlueingFunctionsTogether.List[nl.grons.whyfp.Chapter3_GlueingFunctionsTogether.List[Int]] = Cons(Nil,Cons(Cons(1,Cons(2,Nil)),Cons(Cons(4,Nil),Nil))) scala> summatrix(a) res0: Int = 7
/*

Lets move on and create a tree.

*/
//Miranda: treeof X ::= node X (listof (treeof X)) sealed abstract class Tree[+A] { def label: A def children: List[Tree[A]] } case class Node[A](l: A, c: List[Tree[A]]) extends Tree[A] { override def label: A = l override def children: List[Tree[A]] = c }
/*

The tree 1 o / \ / \ 2 o o 3 | | o 4 in Scala would look like:

Node(1, Cons( Node(2, Nil), Cons(Node(3, Cons(Node(4, Nil), Nil)), Nil)))

Here is my version of ‘redtree’:

*/
// Miranda: // redtree f g a (node label subtrees) = // f label (redtree' f g a subtrees) // redtree' f g a (cons subtree rest) = // g (redtree f g a subtree) (redtree' f g a rest) // redtree' f g a nil = a /** * Reduce a tree with functions f and g and constant a. * @param f a function to apply to a node label and * the result of g on the node's subtree * @param g a function to apply to a subtree and * the result of g on the following sibling subtrees * @param a the result for an empty tree * @param tree the tree to reduce * @param [F] the return type of f and redtree itself * @param [G] the return type of g * @param [A] the type of the tree labels * @return the result of the expression where in tree, * each Node is replaced by f, each Cons by g, and Nil by x */ def redtree[F, G, A](f: (A, G) => F, g: (F, G) => G, a: G)(tree: Tree[A]): F = { def redchildren(children: List[Tree[A]]): G = children match { case Cons(subtree, rest) => g(redtree(f, g, a)(subtree), redchildren(rest)) case Nil => a } f(tree.label, redchildren(tree.children)) }
/*

Notice how Scala’s nested functions are more elegant then Miranda’s top level function ‘redtree' ’ (renamed to ‘redchildren’ here); in Scala the arguments of ‘redtree’ are not passed to ‘redchildren’ as the nested function can access them directly.

Lets use redtree to sum all integer labels in a tree of type Tree[Int]:

*/
// Miranda: sumtree = add add 0 val sumtree: Tree[Int] => Int = redtree(add, add, 0)
/*

Next: collecting all the labels in a list:

*/
// Miranda: labels = redtree cons append nil def labels[A]: Tree[A] => List[A] = redtree(Cons[A], append[A], Nil)
/*

And the last example: create a new tree by applying a function to each label:

*/
// Miranda: maptree f = redtree (node . f) cons nil def maptree[A, B](f: A => B): Tree[A] => Tree[B] = redtree((Node[B] _).compose(f), Cons[Tree[B]], Nil)
} */

Parting thoughts

Although it took quite some time to write the code and then translate this to my 100th article, I had great fun doing so. In addition, I learned quite a bit on writing more complex Scala code.

The code for chapter 4 has already been translated, so expect another article soon.

P.S. RichFunction2

As was written above, Scala does not define ‘compose’ on Function2 (the class representing functions with 2 parameters), only on Function1 ‘compose’ is available. To remedy this, we’ll define it ourselves with the pimp-my-library pattern.

*/
class RichFunction2[T1, T2, R](f: Function2[T1, T2, R]) { /** * (f compose g)(x, y) == f(g(x), y) */ def compose[A](g: (A) => T1): (A, T2) => R = { (x: A, y: T2) => f(g(x), y) } } object RichFunction2 { // Note: complete return type is required here because the method // is used before it is declared. implicit def richFunction2[T1, T2, R](f: Function2[T1, T2, R]): RichFunction2[T1, T2, R] = new RichFunction2(f) }