Elsewhere > OTP: a Functional Approach (or Three)
Posted 2015-01-29 on viget.com
I initially started the OTP
challenge
as a fun way to write some OCaml. It was, so much
so that I wrote solutions in two other functional languages,
Haskell and
Elixir. I structured all three sets of
programs the same so that I could easily see their similarities and
differences. Check out the encrypt
program in
all
three
languages
and then I’ll share some of my favorite parts. Go ahead, I’ll wait.
Don’t Cross the Streams
One tricky part of the OTP challenge is that you have to cycle over the key if it’s shorter than the plaintext. My initial approaches involved passing around an offset and using the modulo operator, like this:
let get_mask key index =
let c1 = List.nth key (index mod (List.length key))
and c2 = List.nth key ((index + 1) mod (List.length key)) in
int_from_hex_chars c1 c2
Pretty gross, huh? Fortunately, both Haskell and Elixir have built-in functionality for lazy, cyclical lists, and OCaml (with the Batteries library) has the Dllist (doubly-linked list) data structure. The OCaml code above becomes simply:
let get_mask key =
let c1 = Dllist.get key
and c2 = Dllist.get (Dllist.next key) in
int_of_hex_chars c1 c2
No more passing around indexes or using mod
to stay within the bounds
of the array – the Dllist handles that for us.
Similarly, a naïve Elixir approach:
def get_mask(key, index) do
c1 = Enum.at(key, rem(index, length(key)))
c2 = Enum.at(key, rem(index + 1, length(key)))
int_of_hex_chars(c1, c2)
end
And with streams activated:
def get_mask(key) do
Enum.take(key, 2) |> int_of_hex_chars
end
Check out the source code (OCaml, Haskell, Elixir) to get a better sense of cyclical data structures in action.
Partial Function Application
Most programming languages have a clear distinction between function
arguments (input) and return values (output). The line is less clear in
ML-derived
languages like Haskell and OCaml. Check this out (from Haskell’s ghci
interactive shell):
Prelude> let add x y = x + y
Prelude> add 5 7
12
We create a function, add
, that (seemingly) takes two arguments and
returns their sum.
Prelude> let add5 = add 5
Prelude> add5 7
12
But what’s this? Using our existing add
function, we’ve created
another function, add5
, that takes a single argument and adds five to
it. So while add
appears to take two arguments and sum them, it
actually takes one argument and returns a function that takes one
argument and adds it to the argument passed to the initial function.
When you inspect the type of add
, you can see this lack of distinction
between input and output:
Prelude> :type add
add :: Num a => a -> a -> a
Haskell and OCaml use a concept called currying or partial function application. It’s a pretty big departure from the C-derived languages most of us are used to. Other languages may offer currying as an option, but this is just how these languages work, out of the box, all of the time.
Let’s see this concept in action. To convert a number to its hex
representation, you call printf "%x" num
. To convert a whole list of
numbers, pass the partially applied function printf "%x"
to map
,
like
so:
hexStringOfInts nums = concat $ map (printf "%x") nums
For more info on currying/partial function application, check out Learn You a Haskell for Great Good.
A Friendly Compiler
I learned to program with C++ and Java, where gcc
and javac
weren’t
my friends – they were jerks, making me jump through a bunch of hoops
without catching any actual issues (or so teenage Dave thought). I’ve
worked almost exclusively with interpreted languages in the intervening
10+ years, so it was fascinating to work with Haskell and OCaml,
languages with compilers that catch real issues. Here’s my original
decrypt
function in Haskell:
decrypt ciphertext key = case ciphertext of
[] -> []
c1:c2:cs -> xor (intOfHexChars [c1, c2]) (getMask key) : decrypt cs (drop 2 key)
Using pattern matching, I pull off the first two characters of the ciphertext and decrypt them against they key, and then recurse on the rest of the ciphertext. If the list is empty, we’re done. When I compiled the code, I received the following:
decrypt.hs:16:26: Warning:
Pattern match(es) are non-exhaustive
In a case alternative: Patterns not matched: [_]
The Haskell compiler is telling me that I haven’t accounted for a list consisting of a single character. And sure enough, this is invalid input that a user could nevertheless use to call the program. Adding the following handles the failure and fixes the warning:
decrypt ciphertext key = case ciphertext of
[] -> []
[_] -> error "Invalid ciphertext"
c1:c2:cs -> xor (intOfHexChars [c1, c2]) (getMask key) : decrypt cs (drop 2 key)
Elixir’s |> operator
According to Programming
Elixir, the pipe
operator (|>
)
takes the result of the expression to its left and inserts it as the first parameter of the function invocation to its right.
It’s borrowed from F#, so it’s not an entirely novel concept, but it’s certainly new to me. To build our key, we want to take the first argument passed into the program, convert it to a list of characters, and then turn it to a cyclical stream. My initial approach looked something like this:
key = Stream.cycle(to_char_list(List.first(System.argv)))
Using the pipe operator, we can flip that around into something much more readable:
key = System.argv |> List.first |> to_char_list |> Stream.cycle
I like it. Reminds me of Unix pipes or any Western written language. Here’s how I use the pipe operator in my encrypt solution.
At the end of this process, I think Haskell offers the most elegant code and Elixir the most potential for us at Viget to use professionally. OCaml offers a good middle ground between theory and practice, though the lack of a robust standard library is a bummer, man.
I had a great time writing and refactoring these solutions. I encourage you to check out the code, fork the repo, and take the challenge yourself.