Search And Destroy

Look out honey, ’cause I’m using technology!

Archive for the ‘Haskell’ Category

Heavy Lifting

Posted by kilfour on December 21, 2011

Yesterday I was having lunch with some co-workers of mine and I mentioned ‘lifting’.

I gave a very general description of what it is and what it’s used for and of them said : ‘It sounds beautiful, but I have no idea what you’re talking about’.

So here’s an example.

The ‘metaphysical’ description goes something like this :

A lifting function takes a function that operates in a certain context as it’s argument and returns a transformation of that function that can operate in a different context.

A C# example :

Consider the following class which wrappes an int and that I have imaginatively called Wrapper :

public class Wrapper
{
    public Wrapper(int value)
    {
        Value = value;
    }

    public int Value { get; private set; }
    public override string ToString()
    {
        return string.Format("{0} wrapped", Value);
    }
}

And then we have the following functions which operate on int’s.

public int Add(int a, int b)
{
    return a + b;
}

public int Multiply(int a, int b)
{
    return a * b;
}

We can then define a lifting function which unwraps the int’s, applies another function and then wraps up the result again.
public Wrapper Lift(Func<int, int, int> f, Wrapper a, Wrapper b)
{
    return new Wrapper(f(a.Value, b.Value));
}

Then we can use the two functions above and apply them to instances of Wrapper like so :

Console.WriteLine(Lift(Add, new Wrapper(41), new Wrapper(1)));

results in : Wrapper 42
Console.WriteLine(Lift(Multiply, new Wrapper(333), new Wrapper(2)));

results in : Wrapper 666

Now I can’t think of a lot of situations (well I can’t think of any actually) where code like that would be useful in C# but let’s see what we can do with it in Haskell.

Translating above code :

data Wrapper = Wrapper Int
  deriving Show 

It’s pretty identical to the C# Wrapper class although we don’t need all that boilerplate code.
The deriving Show part gives us a default ToString Implementation f.i.

Here’s the ‘lift’ function :

lift f (Wrapper a) (Wrapper b) = Wrapper (f a b)

Again similar to the C# example, although we can now use pattern matching and the compiler’s type inference capabilities to be ‘slightly’ less verbose.

We don’t have to define the add and multiply functions in Haskell as operators are functions as well, so there is no need to wrap them in yet more boilerplate code.

Let’s fire up the GHCI and verify that it behaves as expected.

>lift (+) (Wrapper 41) (Wrapper 1)
Wrapper 42
>lift (*) (Wrapper 2) (Wrapper 333)
Wrapper 666

Well that seems to work.

But even though everything is type checked at compile-time (Haskell being a static language just like C#), Haskell’s rich type system allows us to generalize even further to an extent that is simply not possible in C#.

The lift function does not touch the value of the Wrapper’s field, so it really could be anything and the function would still work.

Let’s change the definition of our Wrapper datatype to a more general version by introducing a type parameter.

data Wrapper a = Wrapper a
  deriving Show

The a is the type parameter, kind of like a generic in C#, albeit quite a bit more powerful.
Retrying the code again in the GHCI gives exactly the same result as before.
But now we can also do it for other types f.i. :

Doubles :

>lift (+) (Wrapper 41.5) (Wrapper 0.5)
Wrapper 42.0

Concatenating strings :

>lift (++) (Wrapper "hello ") (Wrapper "world")
Wrapper "hello world"

The types of the wrapped value don’t even have to be same as long as the function we’re lifting matches the parameters.
F.i. :

>let convertAndAppend a b = (read a):b 
>lift convertAndAppend (Wrapper "1") (Wrapper [2,3,4])
Wrapper [1,2,3,4]

‘read’ is similar to C#’s Parse(Int/Double/DateTime/…) functions and (:) is the operator for appending an element to a list.
So ‘a’ should be a string and ‘b’ should be a list of things that ‘a’ can be parsed into.

In Haskell the compiler is your friend, all this just works but when you try to do something nonsensical such as multiplying two strings you get a compile time error.

>lift (*) (Wrapper "hello ") (Wrapper "world")

<interactive>:1:6:
    No instance for (Num [Char])
      arising from a use of `*'
    Possible fix: add an instance declaration for (Num [Char])
    In the first argument of `lift', namely `(*)'
    In the expression: lift (*) (Wrapper "hello ") (Wrapper "world")
    In an equation for `it':
        it = lift (*) (Wrapper "hello ") (Wrapper "world")

Now even though I already think all that kind of parametric polymorphism is pretty kewl and useful, let’s look at why lifting is such an integral part of the Haskell language.

Take the following data structure :

data Session = Session ClockTime ClockTime

Just a simple struct with two fields.
The first representing the beginning of a session, the second the end.
I could have made that more explicit by using record syntax, but I’m trying to keep things simple.

And then we have the following function that calculates the total number of seconds elapsed for a given session.

sessionLength (Session begin end) =  tdSec $ diffClockTimes end begin

This is a ‘pure’ function.
Meaning it has no side-effects and every time we pass in the same Session parameter we can be sure we get the same result.

Now consider the following function :

timeIt = do
  begin <- getClockTime
  putStr "Timing, press any key to stop."
  getChar
  end <- getClockTime
  return $ Session begin end 

This function gets the current time (through getClockTime), then waits for the user to press a key gets the current time again and returns a Session structure with both times filled in.

It’s pretty easy to see that this function is not referentially transparent.
It will return a different result every time we call it.
So what happens when we want to apply our pure function to the result of this ‘impure’ function.

>sessionLength timeIt 

<interactive>:1:15:
    Couldn't match expected type `Session'
                with actual type `IO Session'
    In the first argument of `sessionLength', namely `timeIt'
    In the expression: sessionLength timeIt
    In an equation for `it': it = sessionLength timeIt

The compiler starts moaning.

The error message already explains it but looking at the actual type of our functions will make things clearer if you’re not used to Haskell’s compiler errors.

>:type sessionLength 
sessionLength :: Session -> Int
>:type timeIt 
timeIt :: IO Session

sessionLength takes a Session as a parameter and returns an Int (the number of seconds elapsed).
timeIt however, does not return a Session, but an IO Session.

So what’s this IO thing.

This is Haskell’s way of saying that we’re executing code that produces side-effects.

Seeing as the getClockTime function is impure, everything we do with the result of that will also be impure.

This kind of code is executed inside a specific context.
In this case the IO Monad.
And once something is inside a Monad it can never, ever, ever escape it.

This protects us from leaking side-effects into places where we don’t expect them.

It does however restricts us a little.

One solution would be to rewrite the sessionLength function to allow for side effects but that would deprive us of all of the advantages (such as ease of program verification for one) of pure code.

A better way to deal with this is lifting.

Although nothing is ever allowed out of the IO Monad, we _are_ allowed to bring things in.
More specifically we can lift pure functions into the IO Monad.

Similarly as in our Wrapper toy example the Control.Monad module defines a lifting function called liftM which does just that.
It takes a pure function and lifts it into the IO Monad context.

So this :

>liftM sessionLength timeIt 
Timing, press any key to stop.
2

Behaves exactly the way we want.

And now we can write most of our program and algorithms in a pure coding style and separate the side-effects in a very specific place that we then ofcourse unit-, or better yet, quickcheck-test like hell.

Posted in Haskell | Leave a Comment »

On Having Fun, Elaborating Part II

Posted by kilfour on September 13, 2011

On a sidenote :
String notation : “hello world”
A Char : ‘h’
An empty list : []
List containing the numbers 1, 2 and 3 : [1, 2, 3]

A String in Haskell is actually a list of Char’s.
I.e, :

"hello world" == ['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']

The ‘:’ operator is for appending an element to a list, I.e. :

"hello world" == 'h':['e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']

Or even :
"hello world" == 'h':'e':'l':'l':'o':' ':'w':'o':'r':'l':'d':[]

The compareLine function.

compareLine :: Int -> Int -> String -> String -> String

As we can tell from the signature, this is a function which takes two Int’s, two String’s and returns a string.
It is a recursive function which we use to compare two lines (the String parameters).
The first int parameter is the lineIndex which we just pass in from the calling function and thread around every time we make a recursive call.
The colIndex parameter is initially 0 (passed in by calling code, see CompareLines) and get’s incremented with every recursive call.

What follows the functions signature, might look like a bunch of overloads to most C# devs, but it is actually pattern-matching.
When we call this function the arguments are matched against the expression on the left of the assignment operator, and if a match is found the expression on the right is executed.
These patterns are evaluated in the order that they are defined, so let’s go over them, one by one.

End of line or empty strings

compareLine lineIndex colIndex [] []

The Int’s can be any value as there is no specific pattern defined.
They are bound to the argument names so they can be used in the right hand of the function, just like c# parameters.

Both Strings however _are_ matched.
[] is the empty list, and in the case of a Char List, also an empty string. So : “” == []

This pattern will only match if we pass in two empty strings. I.e. :

compareLine 42 666 "" ""

In the context of our recursive compareLine function this pattern evaluates when we have arrived at the end of the line(s), and no difference was found.

Different string length

compareLine lineIndex colIndex [] (y:ys) = 		failure lineIndex colIndex
compareLine lineIndex colIndex (x:xs) [] = 		failure lineIndex colIndex

The new part of this pattern uses the list append operator.

(x:xs) is a common way of writing :
an x (atleast one) followed by a bunch of xs (could be zero).

Or differently put, a list with atleast one element.

In above pattern (x:xs) will match :
– “hello world”, which could be written as ‘h’:”ello world”
– “h”, which could be written as ‘h’:[]
But not “”.
In the context of our recursive compareLine function this pattern evaluates when one line is longer than the other.

Different Char’s

compareLine lineIndex colIndex (x:xs) (y:ys) | x /= y = failure lineIndex colIndex

Ok, so we know this will match if we have two non empty strings (x:xs) (y:ys).
But there’s a ‘guard’ in the pattern.
The expression between the pipe token and the assignment operator is the guard : x /= y.

‘/=’ is Haskell’s inequality operator.

x and y are defined in the pattern left of the guard.
In the same way parameters are bound to argument names, values broken down by patterns are assigned to the names used in the pattern.
Which means x is the first element of the first string parameter and y is the first element of the other string parameter.

The guard is doing char comparison.

Above pattern evaluates if we have two non-empty strings, and the first character of the first string differs from the first character of the second string.

Recursion

compareLine lineIndex colIndex (x:xs) (y:ys) | x == y = compareLine lineIndex (colIndex + 1)  xs ys

Knowing what we know we can read the pattern now as : we have two non-empty strings, and the first character of the first string _equals_ the first character of the second string.
The interesting part here is the right hand expression.
This is where the recursion happens.
We keep the lineIndex, increment the colIndex counter and pass in the decomposed ‘tail’ lists (xs and ys).
F.i.

compareLine 0 0 "hello" "hello"

Will match, seeing as :
('h':"ello") ('h':"ello") | 'h' == 'h'

And results in :
compareLine 0 1 "ello" "ello"

Posted in Haskell | Leave a Comment »

On Having Fun, Elaborating Part I

Posted by kilfour on September 9, 2011

The module declaration.

module FileCompare (
    fileDiff
)

Modules are the primary way of encapsulation in Haskell.
The statement above exposes one function to the outside-world : ‘fileDiff’.
Everything else we use in the module can not be used anywhere else, it is ‘private’.

Import statement(s)

import System.IO (readFile)

Quite similar to the using statement in c#, this declares dependencies of one module on another.
An interesting thing to note though, is that I am able to restrict which functions of the System.IO module (namespace if you will) are imported.
In this case I’m only using readFile so I inform the compiler of this, which will allow it to optimize.

It also expresses intent quite clearly.
If I inadvertently use something else I _will_ get a compiler error.
I.e. the module’s declaration will give me some important clues of what is going on in the implementation.

The failure function

failure :: Int -> Int -> String
failure lineIndex colIndex  =
    "Files differ at line : " ++ (show lineIndex) ++ ", column : " ++ (show colIndex) ++ "."

The first part is the function’s signature.
It’s name followed by it’s arguments and ending with the return type.
So this one takes two Int’s and returns a String.

The slightly weird arrow notation actually makes sense if you take in to account currying.

The following (not very usefull) functions show this by example.

failureOnLineTwo :: Int -> String
failureOnLineTwo lineIndex colIndex = failure 2

failureOnLineTwoColumnThree :: String
failureOnLineTwoColumnThree = failureOnLineTwo 3

As you can tell from the signature ‘failureOnLineTwo’ is a function that takes an int and returns a string.
‘failureOnLineTwoColumnThree’ is just a string. All the free parameters have been filled in so it evaluates to a value.

‘++’ is Haskell’s string concatenation operator.

Posted in Haskell | Leave a Comment »

On Having Fun

Posted by kilfour on August 27, 2011

As mentioned (often) before, a while ago I took a break from professional programming.
During which I started honing my functional programming skills.
I reckon’ I was, and still am, quite proficient in OCaml.

After that I got really interested in Haskell, so I experimented.
But the unforgiving nature of the type system was really hard to deal with and I had a lot of trouble implementing anything usefull.
I managed to pull some things off, but really didn’t get why.

Recently I have picked it up again, and it’s all starting to make sense.

Still having trouble making stuff compile and getting things to work.
The fact that I’m using a different setup/IDE/OS than before, probably isn’t helping.
But I’m really, _really_ having fun with this.

Currently looking into Happstack/HStringTemplate/et-all in order to deliver a simple web app, and I was quite proud of myself that I got a ‘hello world’ working with minimal code, and that I actually understood all of what is going on ;-) .

Another thing that I’m currently investigating is Parsec, Haskell’s built in parser.
Delivering the power of regular expressions _without_ sacrificing readability.
It is not that much more verbose than that horrendous regex stuff, but I can actually program it without having to resort to some kind of tool that translates it for me.

Here’s a small (unrelated to all of the above) thing I cooked up :

module FileCompare (
    fileDiff
) where


import System.IO (readFile)

failure :: Int -> Int -> String
failure lineIndex colIndex  =
    "Files differ at line : " ++ (show lineIndex) ++ ", column : " ++ (show colIndex) ++ ".";

compareLine :: Int -> Int -> String -> String -> String
compareLine lineIndex colIndex [] [] = 				    ""
compareLine lineIndex colIndex (x:xs) [] = 			    failure lineIndex colIndex
compareLine lineIndex colIndex [] (y:ys) = 			    failure lineIndex colIndex
compareLine lineIndex colIndex (x:xs) (y:ys) | x /= y = failure lineIndex colIndex
compareLine lineIndex colIndex (x:xs) (y:ys) | x == y =	compareLine lineIndex (colIndex + 1)  xs ys

compareLines :: Int -> [String] -> [String] -> String
compareLines lineIndex (x:xs) (y:ys) = 	(compareLine lineIndex 0 x y) ++ (compareLines (lineIndex +1) xs ys)
compareLines lineIndex [] (y:ys) = 	"File two contains more lines then file one."
compareLines lineIndex (x:xs) [] = 	"File one contains more lines then file two."
compareLines lineIndex [] [] = 		"Files are identical."

diff :: String -> String -> String
diff a b = compareLines 0 (lines a) (lines b)

fileDiff :: FilePath -> FilePath -> IO String
fileDiff f1 f2 = do
  fileOne <- readFile f1
  fileTwo <- readFile f2
  return $ diff fileOne fileTwo

Performance equals a C++ version and it can easily handle huge files.

Tell me what you think.

For the imperative minded : start reading at the bottom and then move up ;-) .

Posted in Haskell, Rants | Leave a Comment »

One of Those Nights

Posted by kilfour on August 24, 2011

QuickGenerate suits my needs for the moment.

QuickDotNetCheck still isn’t working the way I want to, but I’m going to let it rest for a little while.
I’ll probably iron out a couple of bugs I have recently introduced, so that I can start using it full-time at work.
But after that I’m going to let the idea ripe in my head a little.

So what to do for fun at home.

And then there’s Haskell…

Particulary interested in getting some web-related stuff working.

Posted in Haskell, Rants | Leave a Comment »

 
Follow

Get every new post delivered to your Inbox.