The Most Fuun You Can Have

Friday, November 7, 2008

Beautiful folding


> {-# LANGUAGE ExistentialQuantification #-}

> import Data.List (foldl')

If you're not a Haskeller, and were thus hoping to learn how to fold a shirt beautifully, I'm afraid you're out of luck. I don't know either.

Much has been said about writing a Haskell function to calculate the mean of a list of numbers. For example, see Don Stewart's "Write Haskell as fast as C". Basically, one wants to write "nice, declarative" code like this:


> naiveMean :: Fractional a => [a] -> a
> naiveMean xs = sum xs / fromIntegral (length xs)

but if xs is large, sum will bring the whole thing into memory, but the garbage collector won't be able to collect it, since we still need it to calculate the length.

The solution is to calculate both the sum and the length in one pass, and it's usually written something like this:


> uglyMean :: Fractional a => [a] -> a 
> uglyMean xs = divide $ foldl' f (P 0 0) xs
>     where 
>         f :: Num a => Pair a Int -> a -> Pair a Int
>         f (P s l) x = P (s + x) (l + 1) 
>         divide (P x y) = x / fromIntegral y

where P is a strict pair constructor. This works, but where is the elegance, abstraction and modularity that Haskell is supposed to be famous for? Don's solution is even uglier (sorry, Don): not only does he write the reductor (our f) explicitly, but also the fold itself.

What I hope to do here is to abstract this pattern away, by making "combinable folds". I only do foldl', although foldl1' could be handy.

To make folds combinable, we need to turn folds into data: a fold is a function (the reductor) with an initial value. To make folds more readily combinable, we add a post-processing function (here it is divide). Now that we have the post-processor, we don't need to look at the accumulator directly, so we make it existential. The type Fold b c is for folds overs lists of type [b], with
results of type c.


> data Fold b c = forall a. F (a -> b -> a) a (a -> c)

We'll need a strict pair type, and I don't want to give my blog a dependency on the strict package, so I introduce my own:


> data Pair a b = P !a !b

Now that folds are data, we can start manipulating them. For example, we can
combine two folds to get a pair of results (we make the result an ordinary tuple for convenience, but use strict pairs for the accumulator to get the rightstrictness). The (***) defined here is like the one in Control.Arrow, but takes a strict pair as input. The reductor (comb f g) is basically (first f) . (second g) for strict pairs.


> both :: Fold b c -> Fold b c' -> Fold b (c, c')
> both (F f x c) (F g y c') = F (comb f g) (P x y) (c *** c')
>     where
>         comb f g (P a a') b = P (f a b) (g a' b)
>         (***) f g (P x y) = (f x, g y)

Our next combinator simply adds an extra post-processor.


> after :: Fold b c -> (c -> c') -> Fold b c'
> after (F f x c) d = F f x (d . c)

The next one, bothWith, is a combination of both and after.


> bothWith :: (c -> c' -> d) -> Fold b c -> Fold b c' -> Fold b d
> bothWith combiner f1 f2 = after (both f1 f2) (uncurry combiner)

Now that we have tools to build folds, we want to actually fold them, so here is combinator foldl':


> cfoldl' :: Fold b c -> [b] -> c
> cfoldl' (F f x c) = c . (foldl' f x)

Now lets see a few basic folds:


> sumF :: Num a => Fold a a
> sumF = F (+) 0 id

> productF :: Num a => Fold a a
> productF = F (*) 1 id

> lengthF :: Fold a Int
> lengthF = F (const . (+1)) 0 id

And, the moment we've all been waiting for, combining basic folds to get the mean of a list:


> meanF :: Fractional a => Fold a a
> meanF = bothWith (/) sumF (after lengthF fromIntegral)

> mean :: Fractional a => [a] -> a
> mean = cfoldl' meanF

Pretty simple, eh? Perhaps not quite as pretty as naiveMean, but best of all, it doesn't eat your memory and kill your swap like naiveMean does.


> main = do
>         let xs = [1..10000000]
>         print $ mean xs

Compiling with GHC 6.8.2 and -O2, this runs in about 1.2 seconds (on my three-year-old laptop) and uses less than a meg of memory. GHC generates the same code for mean and uglyMean. [Originally uglyMean was slightly faster, but this was because of type defaulting: the result of lengthF defaulted to Integer]

One thing remains. What do Haskellers do when there's a pretty way and a fast way (or at least a way that's more susceptible to optimisation) to do the same thing? We write rewrite rules. So we'd like to convert sum, length, etc. into combinable folds, and then combine them. Something like this:


> {-
> {-# RULES
> "sum/sumF"         sum = cfoldl' sumF
> "product/productF" product = cfoldl' productF
> "length/lengthF"   length = cfoldl' lengthF
> "multi-cfoldl'"    forall c f g xs. c (cfoldl' f xs) (cfoldl' g xs) 
>                                         = cfoldl' (bothWith c f g) xs
>  #-}
> -}

So why are these commented out? Unfortunately, GHC doesn't like the
all-important "multi-foldl'" rule: it doesn't have a named function at its head (it has the variable c). GHC doesn't allow rules of this form, presumably for efficiency and simplicity in the compiler.

So unfortunately, we can't go back to writing pretty-but-naive code, but with these combinators at our disposal, we are at least saved from writing *ugly* code.

Monday, March 10, 2008

Mathematics for Arithmeticians: Pons asinorum

In today's post, I will show Pappus's proof of the so-called Pons asinorum or Bridge of Asses. The theorem is given this name because it is the first difficult theorem in Euclid's Elements. The theorem states that the base angles of an isosceles triangle are equal.

Consider triangle ABC, with AB = BC. Now we want to prove that angle A is equal to angle C. One useful trick in geometry for proving things equal is to use congruence of triangles, but we only have one triangle.

Or do we? Pappus' trick is to look at one triangle two ways: ABC and its reflection CBA. AB = CB (by our starting assumption), BA = BC (by the same assumption), and AC = CA (they're the same line). So ABC ≡ CBA (all three corresponding sides are equal, so they're congruent). Since they're congruent, the corresponding angles are equal: angle A equals angle B.

That's it. This trick allows us to prove the theorem in just a few steps.

Why do I think this elegant? By looking at one thing in two ways, we learnt something new, and remarkably quickly. And the great thing is that you can use this trick again, to prove the converse using angle-side-angle congruence. And a trick you can use more than once is a technique.

If you followed this, well done. You have crossed the Bridge of Asses.

Monday, March 3, 2008

Mathematics for Arithmeticians: Prime Numbers

I've decided to stop calling myself a computer science student — I got tired of being the computer guy — and go for mathematics instead.

People used to ask me to fix their computers. Now they ask me to calculate the tip and divide the bill when we go to restaurants, too.

Yeah, so people don't know what mathematics is any better than they know what computer science is.

Maths is not arithmetic (though arithmetic sometimes helps us do maths). Maths is not a set of facts you don't understand handed down from on high for you to remember. It's a world to explore and understand. To enjoy it, you'll want to explore yourself, but I'm going to show you some of the sights. Hopefully, a bit of high-school vocabulary and a bit of thought should be enough to understand this. ^*

As one of my lecturers pointed out the other day, one of the reasons the natural numbers (whole, positive numbers) are interesting is because they have two kinds of structure: additive and multiplicative. The building block of the additive structure is one: we can make any number by taking one, adding one, adding one again, etc.

But the multiplicative structure is a little more intricate. The building blocks are called "prime": the numbers you can't make by multiplying other numbers together. I'm sure you know them. 2, 3, 5, 7, 11, and so on.

And so on? Does that mean I'm too lazy to list them all. Nope... there are infinitely many primes. How do I know? I have a proof.

First off, what does it mean for there to be infinitely many primes? Clearly, it means for any prime, there is another one bigger than it. So how can we show that there is a prime bigger than p?

Take all the primes up to p, multiply them together, and add one: 2×3×5×...p + 1. This number isn't divisible by 2 (since the number is two times something plus 1, when you divide it by two you get a remainder of 1). Similarly, it isn't divisible by 3, 5, or any other prime up to p.

So either 2×3×5×...p + 1 is prime, or there is some prime number bigger than p which divides it.

So p is not the largest prime. But this argument works for any p, so there is no largest prime.

This proof is due to Euclid. We proceeded from an obvious starting point, by obvious steps (if any of them were not obvious, leave a comment and I'll explain in further detail), to a non-obvious, interesting conclusion.

I hope you think this is as beautiful as I do.

* The things I plan to present in this series are basically the subset of things that I've been thinking about and discussing with my friends which I can explain to my girlfriend, who is interested in maths but has never studied it beyond school.

Tuesday, January 22, 2008

Unmonad Tutorial: IO in Haskell without understanding monads

Forget monads!

These days, I hear a lot of "what kind of language makes you learn category theory to do IO" and "monads are a kludge to allow IO in a pure language."

Both of these ideas are rubbish. You can do IO without understanding monads, and monads aren't a way of doing IO. The usual way to demonstrate this is to write yet another monad tutorial. But I'm bored of monad tutorials: half of them don't make sense, and anyway, I just want to do some frickin' IO.

It's pretty simple. Haskell introduces a datatype for IO actions, and a DSL for creating them. The fact that IO is an instance of the Monad typeclass is irrelevant for our purposes. I just want to do some frickin' IO.

The DSL is simple: do introduces an IO action. IO actions are formatted either as brace-enclosed, semicolon-delimited (like C) or indented, newline-delimited statements (like Python). Each statement is one of the following:

let n = v binds n to the value v.

n <- a executes the action a and binds the name n to the result.

a, on its own, executes the action a.

This is all rather similar to an imperative language, except that two types of assignment are distinguished.

The result of running your newly constructed action is the result of the last action in your do block. Which leaves just one more thing you'll need return. This is not like the return of other languages: it's not a flow control statement. return x is an IO action that does nothing but yield x when executed.

Notice that the let n = v form is actually unneeded: it is equivalent to n <- return v. So there's really only one type of assignment, if you prefer to look at it that way.

You may be wondering how you pass arguments to an IO action. You don't. You make a function which takes arguments and returns an IO action.

As a short example, I'll write a program that reads in some integers, and outputs a running total.

totalLoop takes the old total, and produces an IO action which reads a line from standard input, converts it to an integer, prints the total, and then runs itself with the new total. It loops forever, so we give it the return type (), the empty tuple, which is used like void in Haskell.


> totalLoop :: Integer -> IO ()
> totalLoop oldTotal =
>   do
>       input <- getLine                    -- getLine :: IO String
>       let number = read input             -- read :: String -> Integer
>       let newTotal = oldTotal + number
>       print newTotal                      -- print :: Integer -> IO ()
>       totalLoop newTotal

main simply runs totalLoop with a zero total.


> main :: IO ()
> main =
>   do
>       totalLoop 0

And there you have it: an IO-performing (if uninteresting) Haskell program, without any understanding of what monads are.

Friday, January 18, 2008

First player wins Superghost

Inspired by the XKCD's blag post on the solution to Ghost, I first verified his results, and then moved onto the somewhat harder task of solving Ghost's wicked step-sister, Superghost.

Briefly, the rules of Superghost: the first player names a letter; then, each player in turn either adds a letter to the beginning ("conses" a letter) or to the end ("snocs" a letter; the words "cons" and "snoc" are not game terminology, but functional-programming jargon). The first player to create a string of letters which is not part of a real word, or completes a real word, loses. I consider only the two-player version.

The solver is written in Haskell, and uses the Ubuntu British English word list. Only words with four or more letters are considered. Words containing capital or accented letters are ignored.

The program took about 22.5 seconds to find the solution: The first player wins, by playing a, i, o, s, or v.

The winning responses for the second player:


('a',[])
('b',[Snoc 'w'])
('c',[Cons 'g',Cons 'w',Snoc 'd',Snoc 'q'])
('d',[Cons 'c',Cons 'f',Cons 'g',Snoc 'w'])
('e',[Snoc 'r'])
('f',[Cons 'f',Cons 'h',Cons 'l',Cons 'x',Snoc 'd',Snoc 'f',Snoc 'g',Snoc 'n',Snoc 'p',Snoc 'w'])
('g',[Cons 'f',Cons 'h',Cons 'l',Cons 'x',Snoc 'c',Snoc 'd',Snoc 'j',Snoc 'm',Snoc 'w',Snoc 'z'])
('h',[Cons 'm',Cons 'n',Cons 'x',Snoc 'f',Snoc 'g',Snoc 'k',Snoc 'q'])
('i',[])
('j',[Cons 'g',Cons 'k',Cons 'p',Cons 'r',Cons 'u'])
('k',[Cons 'h',Cons 'k',Cons 't',Cons 'w',Cons 'y',Snoc 'j',Snoc 'k'])
('l',[Cons 'w',Cons 'x',Snoc 'f',Snoc 'g'])
('m',[Cons 'g',Snoc 'h'])
('n',[Cons 'f',Cons 'x',Snoc 'h',Snoc 'w',Snoc 'x'])
('o',[])
('p',[Cons 'f',Snoc 'j'])
('q',[Cons 'c',Cons 'h',Cons 'x'])
('r',[Cons 'e',Cons 'r',Snoc 'j',Snoc 'r'])
('s',[])
('t',[Snoc 'k'])
('u',[Snoc 'j'])
('v',[])
('w',[Cons 'b',Cons 'd',Cons 'f',Cons 'g',Cons 'n',Cons 'w',Snoc 'c',Snoc 'k',Snoc 'l',Snoc 'w',Snoc 'y'])
('x',[Cons 'n',Snoc 'f',Snoc 'g',Snoc 'h',Snoc 'l',Snoc 'n',Snoc 'q'])
('y',[Cons 'w',Snoc 'k'])
('z',[Cons 'g'])

Byorgey wanted code... he gets code. Sorry about the lack of comments, but this is a for-fun hack.


module Main where

import qualified Data.Set as S
import qualified Data.Map as M
import qualified Data.List as L
import Data.Maybe (fromMaybe)
import Control.Applicative

type SuperghostDict = M.Map String (S.Set String)

data Play = Cons Char | Snoc Char deriving (Show, Eq, Ord)

getWords :: IO [String]
getWords = filter (all (`elem` ['a'..'z'])) <$>
            lines <$>
            readFile "/usr/share/dict/words"

makeDict :: [String] -> SuperghostDict
makeDict words = M.unionWith S.union
                 (M.fromListWith S.union $
                    concatMap (\word -> let tails = L.tails word in
                                        zip (tail tails) $ map S.singleton tails)
                               words)
                 (M.fromAscList $ map (\x -> (x, S.empty)) $ words)

wordsEnding :: String -> SuperghostDict -> S.Set String
wordsEnding word dict = (fromMaybe S.empty $ M.lookup word dict)

wordsStarting :: String -> SuperghostDict -> S.Set String
wordsStarting word dict = S.fromAscList $
                            (takeWhile (word `L.isPrefixOf`) $ map fst $
                                                M.toAscList $ snd $ M.split word dict)

wordsMiddling :: String -> SuperghostDict -> S.Set String
wordsMiddling word dict = (foldr S.union S.empty
                        (map snd $ takeWhile ((word `L.isPrefixOf`) . fst) $
                                                M.toAscList $ snd $ M.split word dict))

wordsWith :: String -> SuperghostDict -> S.Set String
wordsWith word dict = (wordsEnding word dict) `S.union`
                        (wordsStarting word dict) `S.union`
                            (wordsMiddling word dict)

plays :: String -> SuperghostDict -> S.Set Play
plays word dict = (S.map (\ w -> (Snoc $ w !! (length word))) $
                        wordsStarting word dict)
                    `S.union` (S.map (\ w -> Cons $ head w)
                                $ wordsEnding word dict)
                        `S.union` (S.map (\ w -> Cons $ head w)
                                    $ wordsMiddling word dict)

apply :: String -> Play -> String
apply word (Snoc c) = word ++ [c]
apply word (Cons c) = c:word

winnable :: SuperghostDict -> String -> Bool
winnable dict word = let moves = S.toList $ plays word dict in
                        if null moves then
                            True
                        else
                            any (not . (winnable dict) . (apply word)) moves

winningPlays :: SuperghostDict -> String -> [Play]
winningPlays dict word = let moves = S.toList $ plays word dict in
                            filter (not . (winnable dict) . (apply word)) moves

forever :: (Monad m) => m a -> m ()
forever x = x >> forever x

main = do
        dict <- makeDict <$> filter ((> 3) . length) <$> getWords
        print $ winningPlays dict ""

Wednesday, October 24, 2007

An update: results and research

I have seen the ICFP, and it is full of funky symbols. Yes, we won a prize; two of the twelve of us headed of to Freiburg-im-Breisgau; and we came second, beaten by Team Smartass of Google. Interestingly, of the four contestants in Freiburg to collect prizes, three were South African (myself and Marco from the United Coding Team, as well as Daniel Wright of Team Smartass).

A few months ago I posted an translation of Wolfram's (2, 3) Turing Machine into Fuun DNA. What I called "an open problem of Wolfram" is now closed. Today Wolfram announced that this machine has been proven universal by Alex Smith. Since there is known to be no (2, 2) UTM, Wolfram and Smith's machine is the smallest Universal Turing Machine.

Somewhat related to Wolfram and Smith, but *very* related to programmable Fuun DNA: programmable terrestial DNA — Programmed Mutagenesis Is a Universal Model of Computation by Khodor and Gifford. Perhaps Endo isn't as alien as we thought?

Wednesday, September 19, 2007

Haskelling the SACO 1

These blog entries is to record my early adventures in Haskell, to advertise the SACO, and show off the awesomeness of Literal Haskell (which means that this entire post can be pasted into a text file and compiled) by documenting my solutions to programming contest problems.

I've finally decided to learn Haskell. For about the third time.

But this time I'm actually doing it.

Since we've just hosted another successful South African Computer Olympiad^[1], I decided to code some Haskell solutions. Looking through the problems, I decided on Living Dead Parrots: I didn't code a check solution before the contest, so it wouldn't be too boring, but it was easy enough that I would spend more time learning the language than figuring out the algorithm. Most importantly, it was an interactive problem^[2], meaning I would be brought face-to-face with the IO monad, a fearful prospect for the novice Haskeller.

Well, it wasn't as scary as I'd feared. By no means.

In short, the problem we're solving is this: the evaluator has an array of N booleans. You can ask it for the number of true values between two indices (inclusive, base one). Less than a hundredth of the booleans are true. There is a limit on the number of queries you can make, but we ignore it, since it is always larger than the number needed by the correct algorithm.

This solution scores 100% (within the C++ time limit, when compiled with GHC), but is probably not the best Haskell ever written. I'd appreciate any advice you can give for improvements. That's why I'm posting, after all.

First, the preliminaries:


> module Main
>     where
>
> import System.IO
>
> rInt :: String -> Int
> rInt = read
>
> flush :: IO ()
> flush = hFlush stdout

Next, the function to query the evaluator. We output "Q a b", and read in an integer. The answer is obviously wrapped in the IO monad.


> query :: Int -> Int -> IO Int
> query a b = do putStrLn ("Q " ++ (show a) ++ " " ++ (show b))
>                flush
>                answer <- getLine
>                return $ rInt answer

I'm not particularly proud of this function, which wraps a value in the IO monad. If it's the right thing to do, there's a library function for it; if it's the wrong thing to do, I shouldn't do it. If you know which it is, leave a comment.


> coerceIO :: a -> IO a
> coerceIO x = do return x

Here comes the meaty part. Search takes a range, and the number of parrots (oh, yes, those booleans represent whether a parrot in a certain cage is pining for the fjords or has kicked the bucket) in that range, and determines where those parrots are.


> search :: Int -> Int -> Int -> IO [Bool]

There are three cases. The first is that the range contains no parrots.


> search start end 0 = coerceIO $ take (end-start+1) (repeat False)

The next case is that the range contains only one element, then we have found the location of a living parrot.


> search start end 1          | start == end  = coerceIO [True]

The interesting case is the remaining one, where we divide the list into two, and recurse.


> search start end numParrots = let middle = (end+start) `div` 2
>                                in do leftNum <- query start middle
>                                      left <- search start middle leftNum
>                                      right <- search (middle+1) end (numParrots-leftNum)
>                                      return (left ++ right)

All that remains is main, which contains my most suspicious use of coerceIO, to bind purely functional variables in a monad: I haven't figured out how to use let in a do.

It first reads a line containing two integers: the size of the parrot array, and the number of questions. We ignore the latter. We then find the number of live parrots in the array, and begin the search. The output format is "A 0 0 0 0 1 0 1 0 1 ...".


> main :: IO ()
> main = do line <- getLine
>           n <- coerceIO $ rInt $ head $ words line
>           numParrots <- query 1 n
>           parrots <- search 1 n numParrots
>           putStrLn $ "A " ++ (unwords $ map (\x -> if x then "1" else "0") parrots)
>           return ()

Notes on Haskell syntax, for non-Haskellers: f $ x is the same as f x, except that it binds more loosely. Its purpose is mostly to eliminate lots of irritating superfluous parentheses. (\x -> y) is an anonymous function mapping x to y

^[1]We now run a parallel contest online: same problems, but open to anyone. You should check it out this time next year. We might also be running the contests from the training camps online (training camps happen about three times a year, starting in February). Unfortunately, Haskell is not one of the supported languages, but after the contests you can download the evaluator programs and test data to score solutions in any language.

^[2]In the SACO, interactive problem solutions communicate through pipes with an evaluator, rather than reading an input file and writing an output file. This allows problems where access to the full dataset is unnecessary, and is either not allowed or too big to be read completely within the time limit.