Saturday, October 25, 2008

F# mindblow... simple math expression evaluator

One of the things I typically do when I learn a new programming language is, of course, write a program in it. I try to pick something that will help me understand not just the syntax of the language, but will also help me pick up the style and thinking patterns of the language.

One of the little tasks I typically force upon myself is a simple infix expression evaluator. For the uninitiated, 'infix notation' is the form of notation we humans commonly write our mathematical expression in - that is, the form "a + b * c" etc. Here's a wiki!

Anywho - I decided today was the time for F# to get its own infix evaluator...

Feel free to laugh. I'm laughing at myself, actually - way down inside, in the tiny places, where I harbor my self-loathing and self-doubt, down where I quietly plot my own demise... *ahem*

It took me about 7 hours today to write a total of 95 lines of code. That's ~13 lines of code an hour. Horrible, horrible. Though I'm really not beating myself up as badly as you might think - it took me as long to write this evaluator in VB.NET when I was learning that language (though not nearly as long in C# due to the syntax being the only difference between the languages). I'm just not used to thinking recursively. Sure, I've done recursion in C#, VB.NET, etc., but I've never really used it to replace what would otherwise be imperative-style constructs before. I've only used it when something *had* to be recursive. Also I wanted to adhere to the pure functional programming dogma - zero side effects. This code does indeed adhere to this self-imposed mandate.

Before I post the code, I want to stress that this is *not* optimal code. I know a little about tail recursion, etc., but I *clearly* didn't use it here. I don't even know if I *can* use it here. Also I'm not terribly knowledgeable about the various Seq and List functions that are available, so I probably flubbed them up as well. I don't really care - this works, and I'm proud of it. This is my first major attempt at writing something of any facility in a functional programming language. Also I'd like to mention that I am *very* open to suggestions. If I did something really, truly stupid in here, please call me on it. This is a learning exercise, and I'd like to learn as much as I can from it - part of that is peer review.

So, without further ado, here it is. (Apologies for the different theme - I'm doing this at home and haven't gotten around to installing the darker theme yet.)

    1 #light


    3 open System


    5 type operator =

    6   | Add

    7   | Subtract

    8   | Multiply

    9   | Divide


   11 type token =

   12   | Number of int

   13   | OpenParen

   14   | CloseParen

   15   | Operator of operator


   17 let tokenize i =

   18   let rec tk s =

   19     match s with

   20     | ' ' :: tail -> tk tail

   21     | '(' :: tail -> OpenParen :: tk tail

   22     | ')' :: tail -> CloseParen :: tk tail

   23     | '+' :: tail -> Operator Add :: tk tail

   24     | '-' :: tail -> Operator Subtract :: tk tail

   25     | '*' :: tail -> Operator Multiply :: tk tail

   26     | '/' :: tail -> Operator Divide :: tk tail

   27     | n :: tail when Char.IsNumber(n) ->

   28       let rec intListToValue (i:int list) acc c =

   29         match i with

   30         | [] -> acc

   31         | n :: tail -> intListToValue tail (acc + (n * int (10.0 ** float c))) (c + 1)

   32       let rec tc (i:int list) (n:char) (s:char list) =

   33         let value = Int32.Parse(string n)

   34         match s with

   35         | n :: tail when Char.IsNumber(n) -> tc (value::i) n tail

   36         | x :: tail -> (Number (intListToValue i value 1)) :: tk (x :: tail)

   37         | [] -> [Number (intListToValue i value 1)]

   38       tc [] n tail


   40     | [] -> []

   41     | n :: _ -> failwith ("Parse error - invalid character detected: " + n.ToString())

   42   tk (Seq.to_list i)


   44 let evalTokens input =

   45   let pop2 (stack:'a list) = (stack.Head, stack.Tail.Head, stack.Tail.Tail)


   47   let eval a b op =

   48     match op with

   49     | Operator Add -> b + a

   50     | Operator Subtract -> b - a

   51     | Operator Multiply -> b * a

   52     | Operator Divide -> b / a

   53     | _ -> failwith "error parsing input"


   55   let rec eval_rec input numstack (opstack:token list) =

   56     match input with

   57     | Number n :: tail -> eval_rec tail (n::numstack) opstack


   59     | Operator op :: tail ->

   60       if opstack.Length <> 0 && opstack.Head > (Operator op) then

   61         let firstNum, secondNum, numstackRem = pop2 numstack

   62         let e = eval firstNum secondNum opstack.Head

   63         eval_rec tail (e::numstackRem) (Operator op::opstack.Tail)

   64       else

   65         eval_rec tail numstack (Operator op::opstack)


   67     | OpenParen :: tail -> eval_rec tail numstack (OpenParen::opstack)


   69     | CloseParen :: tail ->

   70       match opstack with

   71       | Operator op :: opsTail ->

   72         let firstNum, secondNum, numstackRem = pop2 numstack

   73         let e = eval firstNum secondNum (Operator op)

   74         eval_rec input (e::numstackRem) opsTail

   75       | OpenParen :: _ ->

   76         eval_rec tail numstack opstack.Tail

   77       | _ -> failwith "error parsing input"


   79     | [] ->

   80       match opstack with

   81       | Operator op :: tail ->

   82         let firstNum, secondNum, numstackRem = pop2 numstack

   83         let e = eval firstNum secondNum (Operator op)

   84         eval_rec [] (e::numstackRem) tail

   85       | [] -> numstack.Head

   86       | _ -> failwith "error parsing input"


   88   eval_rec input [] []

You call this like so: evaluate "((1 + 2) * 3) / 3" -- this would, of course, evaluate to 3, as expected - it understands and follows the correct order of operations, and respects parenthesis correctly as expected. It doesn't yet handle floats, but I might get around to that at some point. It'll complicate the parsing, but that's not really that big of a deal. The way I'm handling it is weird anyway...

Well there you have it. My first real F# program of any value at all. Despite taking me 7 hours, I'm rather proud of this. It's taught me a great deal, especially that I need to be patient with myself while trying to learn to think in functions. Also, staying pure and functional can be *hard*, especially coming from an imperative / mutable state world. I should also note that my first C# implementation included around 4 times the number of lines of code...

All told, I really enjoyed my dive into F#. My advice to anyone looking into it as a real programming language (and not as a toy) would be to keep an open mind and give it a fair shot.


Steve Owens said...

I'm just beginning F# myself, so I appreciate your struggles. I can't follow all your code, but is the expression

(1 + 2 * 3) / 3

really supposed to evaluate to 3? Shouldn't "*" bind 2 and 3 before 1 is added?

If I knew just a bit more (and wasn't feeling lazy right now), I'd try to fiddle with the code to see what's going on.

Erik said...

Whoops, typo - you're absolutely right, I meant to add a second set of parenthesis, such that the expression would read: "((1+2)*3)/3" - I'll fix. Thanks for the catch! :)

DaWillie said...

F# is indeed mind blowing, in a good way. Since you said you are still learning F#, i though i'll shortly mention what i would have done differently. Your code is in general fine, though a fun mix of meaningfull names and abrevations.

First, distriminated unions can have members, so the following might be interesting to add to 'operator':
member this.Apply(x,y) =
match this with
| Add -> x + y
| Subtract -> x - y
| Multiply -> x * y
| Divide -> x / y

Second, the function pop2 is unnesesary since you can write patterns on the left side with 'let'. So you can just write:
let firstNum::secondNum::numstackRem = numstack

Third, don't write the wildcard in patterns if you already have given a complete pattern. If you later change the arguments type you won't get a 'incomple pattern' warning.

Lastly, you mentioned:
"...haven't gotten around to installing the darker theme yet"
In a future release of F#, if your write #dark instead of #light, you'll get a dark theme but still have the "white syntax".
While some (meaning not me) believe functinal programming should be done with a bright on dark theme, keeping the default F# look will help others (meaning me) read it faster.

Also a thing you might have missed, F# allows comments with the use of // or (* *) ; )

(note: first post code indent looked wierd)

Erik said...

Great feedback dawillie - thanks a lot!

First: adding members to unions - thank you, I didn't know that! That definitely cleans up the code a bit, and would make it easier to add new operators in the future.

Second: re pop2 - I tried as you suggested, but got a compiler warning stating that [_] would not be matched by the expression. I don't think that's an issue in a properly formed expression, but I don't know what would happen at runtime in such a case.

Third: re wildcard patterns - not sure where in the program you're referring to.

Forth: I thought the '#light' directive was in reference to ocaml compatability, rather than color themes. *shrugs*

Last: re comments - yeah, I know. =) I don't typically add comments as I code, and I was too lazy to go through and add them afterwards. =X

Thanks a bunch for your comments - I appreciate it. :)

DaWillie said...

@ pop2 vs pattern. The warning is there for a reason, and the use of a function is almost like turning a blind eye to the problem. If you try calling pop2 with a list of zero or one element, you'll get a KeyNotFoundException. So you don't solve the problem using the function, you're just not seeing the warning. Therefore ensure that the list has atleast 2 element before doing either.

@ wildcard patterns: '| _ -> ' This is what i mean by wildcard. It accepts anything. I took a second look at your code, and it isn't used unnesesary but for example the type of eval a b op is int -> int -> token -> int, where as one would expect int -> int -> operator -> int, making the wildcard redundant.

@ #light, this is actually used to specify you won't be writing OCaml compatible code.

Fixing tailrecursion for tk is as simple as for intListToValue, first four lines would be:
let rec tk (tkacc:token list) s =
match s with
| ' ' :: tail -> tk tkacc tail
| '(' :: tail -> tk (OpenParen :: tkacc) tail

and the last would be
----| [] -> tkacc
----| n :: _ -> failwith ("Parse error - invalid character detected: " + n.ToString())
List.rev (tk [] (Seq.to_list i))
Hope this didn't spoil the fun of finding out for yourself.