Friday, November 7, 2008

Beautiful folding

> {-# LANGUAGE ExistentialQuantification #-}

> import Data.List (foldl')

If you're not a Haskeller, and were thus hoping to learn how to fold a shirt beautifully, I'm afraid you're out of luck. I don't know either.

Much has been said about writing a Haskell function to calculate the mean of a list of numbers. For example, see Don Stewart's "Write Haskell as fast as C". Basically, one wants to write "nice, declarative" code like this:

> naiveMean :: Fractional a => [a] -> a
> naiveMean xs = sum xs / fromIntegral (length xs)

but if xs is large, sum will bring the whole thing into memory, but the garbage collector won't be able to collect it, since we still need it to calculate the length.

The solution is to calculate both the sum and the length in one pass, and it's usually written something like this:

> uglyMean :: Fractional a => [a] -> a
> uglyMean xs = divide $ foldl' f (P 0 0) xs
> where
> f :: Num a => Pair a Int -> a -> Pair a Int
> f (P s l) x = P (s + x) (l + 1)
> divide (P x y) = x / fromIntegral y

where P is a strict pair constructor. This works, but where is the elegance, abstraction and modularity that Haskell is supposed to be famous for? Don's solution is even uglier (sorry, Don): not only does he write the reductor (our f) explicitly, but also the fold itself.

What I hope to do here is to abstract this pattern away, by making "combinable folds". I only do foldl', although foldl1' could be handy.

To make folds combinable, we need to turn folds into data: a fold is a function (the reductor) with an initial value. To make folds more readily combinable, we add a post-processing function (here it is divide). Now that we have the post-processor, we don't need to look at the accumulator directly, so we make it existential. The type Fold b c is for folds overs lists of type [b], with
results of type c.

> data Fold b c = forall a. F (a -> b -> a) a (a -> c)

We'll need a strict pair type, and I don't want to give my blog a dependency on the strict package, so I introduce my own:

> data Pair a b = P !a !b

Now that folds are data, we can start manipulating them. For example, we can
combine two folds to get a pair of results (we make the result an ordinary tuple for convenience, but use strict pairs for the accumulator to get the rightstrictness). The (***) defined here is like the one in Control.Arrow, but takes a strict pair as input. The reductor (comb f g) is basically (first f) . (second g) for strict pairs.

> both :: Fold b c -> Fold b c' -> Fold b (c, c')
> both (F f x c) (F g y c') = F (comb f g) (P x y) (c *** c')
> where
> comb f g (P a a') b = P (f a b) (g a' b)
> (***) f g (P x y) = (f x, g y)

Our next combinator simply adds an extra post-processor.

> after :: Fold b c -> (c -> c') -> Fold b c'
> after (F f x c) d = F f x (d . c)

The next one, bothWith, is a combination of both and after.

> bothWith :: (c -> c' -> d) -> Fold b c -> Fold b c' -> Fold b d
> bothWith combiner f1 f2 = after (both f1 f2) (uncurry combiner)

Now that we have tools to build folds, we want to actually fold them, so here is combinator foldl':

> cfoldl' :: Fold b c -> [b] -> c
> cfoldl' (F f x c) = c . (foldl' f x)

Now lets see a few basic folds:

> sumF :: Num a => Fold a a
> sumF = F (+) 0 id

> productF :: Num a => Fold a a
> productF = F (*) 1 id

> lengthF :: Fold a Int
> lengthF = F (const . (+1)) 0 id

And, the moment we've all been waiting for, combining basic folds to get the mean of a list:

> meanF :: Fractional a => Fold a a
> meanF = bothWith (/) sumF (after lengthF fromIntegral)

> mean :: Fractional a => [a] -> a
> mean = cfoldl' meanF

Pretty simple, eh? Perhaps not quite as pretty as naiveMean, but best of all, it doesn't eat your memory and kill your swap like naiveMean does.

> main = do
> let xs = [1..10000000]
> print $ mean xs

Compiling with GHC 6.8.2 and -O2, this runs in about 1.2 seconds (on my three-year-old laptop) and uses less than a meg of memory. GHC generates the same code for mean and uglyMean. [Originally uglyMean was slightly faster, but this was because of type defaulting: the result of lengthF defaulted to Integer]

One thing remains. What do Haskellers do when there's a pretty way and a fast way (or at least a way that's more susceptible to optimisation) to do the same thing? We write rewrite rules. So we'd like to convert sum, length, etc. into combinable folds, and then combine them. Something like this:

> {-
> {-# RULES
> "sum/sumF" sum = cfoldl' sumF
> "product/productF" product = cfoldl' productF
> "length/lengthF" length = cfoldl' lengthF
> "multi-cfoldl'" forall c f g xs. c (cfoldl' f xs) (cfoldl' g xs)
> = cfoldl' (bothWith c f g) xs
> #-}
> -}

So why are these commented out? Unfortunately, GHC doesn't like the
all-important "multi-foldl'" rule: it doesn't have a named function at its head (it has the variable c). GHC doesn't allow rules of this form, presumably for efficiency and simplicity in the compiler.

So unfortunately, we can't go back to writing pretty-but-naive code, but with these combinators at our disposal, we are at least saved from writing *ugly* code.

Monday, March 10, 2008

Mathematics for Arithmeticians: Pons asinorum

In today's post, I will show Pappus's proof of the so-called Pons asinorum or Bridge of Asses. The theorem is given this name because it is the first difficult theorem in Euclid's Elements. The theorem states that the base angles of an isosceles triangle are equal.

Consider triangle ABC, with AB = BC. Now we want to prove that angle A is equal to angle C. One useful trick in geometry for proving things equal is to use congruence of triangles, but we only have one triangle.

Or do we? Pappus' trick is to look at one triangle two ways: ABC and its reflection CBA. AB = CB (by our starting assumption), BA = BC (by the same assumption), and AC = CA (they're the same line). So ABC ≡ CBA (all three corresponding sides are equal, so they're congruent). Since they're congruent, the corresponding angles are equal: angle A equals angle B.

That's it. This trick allows us to prove the theorem in just a few steps.

Why do I think this elegant? By looking at one thing in two ways, we learnt something new, and remarkably quickly. And the great thing is that you can use this trick again, to prove the converse using angle-side-angle congruence. And a trick you can use more than once is a technique.

If you followed this, well done. You have crossed the Bridge of Asses.

Monday, March 3, 2008

Mathematics for Arithmeticians: Prime Numbers

I've decided to stop calling myself a computer science student — I got tired of being the computer guy — and go for mathematics instead.

People used to ask me to fix their computers. Now they ask me to calculate the tip and divide the bill when we go to restaurants, too.

Yeah, so people don't know what mathematics is any better than they know what computer science is.

Maths is not arithmetic (though arithmetic sometimes helps us do maths). Maths is not a set of facts you don't understand handed down from on high for you to remember. It's a world to explore and understand. To enjoy it, you'll want to explore yourself, but I'm going to show you some of the sights. Hopefully, a bit of high-school vocabulary and a bit of thought should be enough to understand this. *

As one of my lecturers pointed out the other day, one of the reasons the natural numbers (whole, positive numbers) are interesting is because they have two kinds of structure: additive and multiplicative. The building block of the additive structure is one: we can make any number by taking one, adding one, adding one again, etc.

But the multiplicative structure is a little more intricate. The building blocks are called "prime": the numbers you can't make by multiplying other numbers together. I'm sure you know them. 2, 3, 5, 7, 11, and so on.

And so on? Does that mean I'm too lazy to list them all. Nope... there are infinitely many primes. How do I know? I have a proof.

First off, what does it mean for there to be infinitely many primes? Clearly, it means for any prime, there is another one bigger than it. So how can we show that there is a prime bigger than p?

Take all the primes up to p, multiply them together, and add one: 2×3×5×...p + 1. This number isn't divisible by 2 (since the number is two times something plus 1, when you divide it by two you get a remainder of 1). Similarly, it isn't divisible by 3, 5, or any other prime up to p.

So either 2×3×5×...p + 1 is prime, or there is some prime number bigger than p which divides it.

So p is not the largest prime. But this argument works for any p, so there is no largest prime.

This proof is due to Euclid. We proceeded from an obvious starting point, by obvious steps (if any of them were not obvious, leave a comment and I'll explain in further detail), to a non-obvious, interesting conclusion.

I hope you think this is as beautiful as I do.

* The things I plan to present in this series are basically the subset of things that I've been thinking about and discussing with my friends which I can explain to my girlfriend, who is interested in maths but has never studied it beyond school.

Tuesday, January 22, 2008

Unmonad Tutorial: IO in Haskell without understanding monads

Forget monads!

These days, I hear a lot of "what kind of language makes you learn category theory to do IO" and "monads are a kludge to allow IO in a pure language."

Both of these ideas are rubbish. You can do IO without understanding monads, and monads aren't a way of doing IO. The usual way to demonstrate this is to write yet another monad tutorial. But I'm bored of monad tutorials: half of them don't make sense, and anyway, I just want to do some frickin' IO.

It's pretty simple. Haskell introduces a datatype for IO actions, and a DSL for creating them. The fact that IO is an instance of the Monad typeclass is irrelevant for our purposes. I just want to do some frickin' IO.

The DSL is simple: do introduces an IO action. IO actions are formatted either as brace-enclosed, semicolon-delimited (like C) or indented, newline-delimited statements (like Python). Each statement is one of the following:

  • let n = v binds n to the value v.

  • n <- a executes the action a and binds the name n to the result.

  • a, on its own, executes the action a.

This is all rather similar to an imperative language, except that two types of assignment are distinguished.

The result of running your newly constructed action is the result of the last action in your do block. Which leaves just one more thing you'll need return. This is not like the return of other languages: it's not a flow control statement. return x is an IO action that does nothing but yield x when executed.

Notice that the let n = v form is actually unneeded: it is equivalent to n <- return v. So there's really only one type of assignment, if you prefer to look at it that way.

You may be wondering how you pass arguments to an IO action. You don't. You make a function which takes arguments and returns an IO action.

As a short example, I'll write a program that reads in some integers, and outputs a running total.

totalLoop takes the old total, and produces an IO action which reads a line from standard input, converts it to an integer, prints the total, and then runs itself with the new total. It loops forever, so we give it the return type (), the empty tuple, which is used like void in Haskell.

> totalLoop :: Integer -> IO ()
> totalLoop oldTotal =
> do
> input <- getLine -- getLine :: IO String
> let number = read input -- read :: String -> Integer
> let newTotal = oldTotal + number
> print newTotal -- print :: Integer -> IO ()
> totalLoop newTotal

main simply runs totalLoop with a zero total.

> main :: IO ()
> main =
> do
> totalLoop 0

And there you have it: an IO-performing (if uninteresting) Haskell program, without any understanding of what monads are.

Friday, January 18, 2008

First player wins Superghost

Inspired by the XKCD's blag post on the solution to Ghost, I first verified his results, and then moved onto the somewhat harder task of solving Ghost's wicked step-sister, Superghost.

Briefly, the rules of Superghost: the first player names a letter; then, each player in turn either adds a letter to the beginning ("conses" a letter) or to the end ("snocs" a letter; the words "cons" and "snoc" are not game terminology, but functional-programming jargon). The first player to create a string of letters which is not part of a real word, or completes a real word, loses. I consider only the two-player version.

The solver is written in Haskell, and uses the Ubuntu British English word list. Only words with four or more letters are considered. Words containing capital or accented letters are ignored.

The program took about 22.5 seconds to find the solution: The first player wins, by playing a, i, o, s, or v.

The winning responses for the second player:

('b',[Snoc 'w'])
('c',[Cons 'g',Cons 'w',Snoc 'd',Snoc 'q'])
('d',[Cons 'c',Cons 'f',Cons 'g',Snoc 'w'])
('e',[Snoc 'r'])
('f',[Cons 'f',Cons 'h',Cons 'l',Cons 'x',Snoc 'd',Snoc 'f',Snoc 'g',Snoc 'n',Snoc 'p',Snoc 'w'])
('g',[Cons 'f',Cons 'h',Cons 'l',Cons 'x',Snoc 'c',Snoc 'd',Snoc 'j',Snoc 'm',Snoc 'w',Snoc 'z'])
('h',[Cons 'm',Cons 'n',Cons 'x',Snoc 'f',Snoc 'g',Snoc 'k',Snoc 'q'])
('j',[Cons 'g',Cons 'k',Cons 'p',Cons 'r',Cons 'u'])
('k',[Cons 'h',Cons 'k',Cons 't',Cons 'w',Cons 'y',Snoc 'j',Snoc 'k'])
('l',[Cons 'w',Cons 'x',Snoc 'f',Snoc 'g'])
('m',[Cons 'g',Snoc 'h'])
('n',[Cons 'f',Cons 'x',Snoc 'h',Snoc 'w',Snoc 'x'])
('p',[Cons 'f',Snoc 'j'])
('q',[Cons 'c',Cons 'h',Cons 'x'])
('r',[Cons 'e',Cons 'r',Snoc 'j',Snoc 'r'])
('t',[Snoc 'k'])
('u',[Snoc 'j'])
('w',[Cons 'b',Cons 'd',Cons 'f',Cons 'g',Cons 'n',Cons 'w',Snoc 'c',Snoc 'k',Snoc 'l',Snoc 'w',Snoc 'y'])
('x',[Cons 'n',Snoc 'f',Snoc 'g',Snoc 'h',Snoc 'l',Snoc 'n',Snoc 'q'])
('y',[Cons 'w',Snoc 'k'])
('z',[Cons 'g'])

Byorgey wanted code... he gets code. Sorry about the lack of comments, but this is a for-fun hack.

module Main where

import qualified Data.Set as S
import qualified Data.Map as M
import qualified Data.List as L
import Data.Maybe (fromMaybe)
import Control.Applicative

type SuperghostDict = M.Map String (S.Set String)

data Play = Cons Char | Snoc Char deriving (Show, Eq, Ord)

getWords :: IO [String]
getWords = filter (all (`elem` ['a'..'z'])) <$>
lines <$>
readFile "/usr/share/dict/words"

makeDict :: [String] -> SuperghostDict
makeDict words = M.unionWith S.union
(M.fromListWith S.union $
concatMap (\word -> let tails = L.tails word in
zip (tail tails) $ map S.singleton tails)
(M.fromAscList $ map (\x -> (x, S.empty)) $ words)

wordsEnding :: String -> SuperghostDict -> S.Set String
wordsEnding word dict = (fromMaybe S.empty $ M.lookup word dict)

wordsStarting :: String -> SuperghostDict -> S.Set String
wordsStarting word dict = S.fromAscList $
(takeWhile (word `L.isPrefixOf`) $ map fst $
M.toAscList $ snd $ M.split word dict)

wordsMiddling :: String -> SuperghostDict -> S.Set String
wordsMiddling word dict = (foldr S.union S.empty
(map snd $ takeWhile ((word `L.isPrefixOf`) . fst) $
M.toAscList $ snd $ M.split word dict))

wordsWith :: String -> SuperghostDict -> S.Set String
wordsWith word dict = (wordsEnding word dict) `S.union`
(wordsStarting word dict) `S.union`
(wordsMiddling word dict)

plays :: String -> SuperghostDict -> S.Set Play
plays word dict = ( (\ w -> (Snoc $ w !! (length word))) $
wordsStarting word dict)
`S.union` ( (\ w -> Cons $ head w)
$ wordsEnding word dict)
`S.union` ( (\ w -> Cons $ head w)
$ wordsMiddling word dict)

apply :: String -> Play -> String
apply word (Snoc c) = word ++ [c]
apply word (Cons c) = c:word

winnable :: SuperghostDict -> String -> Bool
winnable dict word = let moves = S.toList $ plays word dict in
if null moves then
any (not . (winnable dict) . (apply word)) moves

winningPlays :: SuperghostDict -> String -> [Play]
winningPlays dict word = let moves = S.toList $ plays word dict in
filter (not . (winnable dict) . (apply word)) moves

forever :: (Monad m) => m a -> m ()
forever x = x >> forever x

main = do
dict <- makeDict <$> filter ((> 3) . length) <$> getWords
print $ winningPlays dict ""