Year 2015 Day 4

This day introduces us to a little bit of FFI, linking to openssl to use it's MD5 functionality.


import Data.String
import Data.Bits
import System.FFI
import Prim.Array

FFI

We will be using openssl's MD5 function to do the hashing, so we will need to provide a couple of foreign function definitions.

First, we'll provide one for the C standard library's strlen. We could get this value from within idris, but we'll do it through FFI to flesh out the example a bit more.

The format of the specifier after %foreign is "language:function name,library name". Since strlen is the c standard library, we use libc as the library name.

Idris will automatically convert primitive types like Strings and Ints to their C equivalents at the FFI boundary.

PrimIO is a special, basic type of IO action based on linear types, we won't be getting into the specifics right now.

By convention, primitive FFI function names start with prim__ in idris.


%foreign "C:strlen,libc"
prim__strlen : String -> PrimIO Int

Now we need one for the MD5 function from openssl, the openssl objectfile calls itself libcrypto, so we'll use that as the library name.

The MD5 function actually returns a pointer to the hash value, but since the 3rd argument is the pointer it expects to write the hash to, we already have that value and we'll ignore it by having our prim__md5 return unit.

The second argument is the length of the input string in bytes.


%foreign "C:MD5,libcrypto"
prim__md5 : String -> Int -> Ptr Bits8 -> PrimIO ()

Now we glue these parts together in our higher level md5' function.

We use malloc from System.FFI to allocate a buffer for prim__md5 to write to. This returns an AnyPtr, when we need a Ptr Bits8, so we'll need to cast it, using prim__castPtr from the prelude.

We then use our prim__strlen function to get the length of our input string from within C, using primIO from the prelude to "lift" the resulting PrimIO Int into our io context.

We then feed the string, the calculated length, and our casted buffer into our prim__md5 function, then use prim__getArrayBits8 from the c-ffi library to extract each of the bytes, passing them into a pure idris helper function to convert into hex representation.

Since we are working with C FFI and performing a little bit of manual memory management here, we must remember to free our pointer, and then we can splice together our output string and return.


md5' : HasIO io => String -> io String
md5' str = do
  buffer <- malloc 16
  len <- primIO $ prim__strlen str
  raw_md5 <- primIO $ prim__md5 str len (prim__castPtr buffer)
  let bytes = map (\x => prim__getArrayBits8 (prim__castPtr buffer) x) [0..15]
  free buffer
  pure . joinBy "" . map byteToHex $ bytes
  where
    hexDigit : Bits8 -> Char
    hexDigit n = 
      if n < 10 
        then chr (cast n + ord '0')
             else chr (cast (n - 10) + ord 'a')
    byteToHex : Bits8 -> String
    byteToHex n =   
      pack [hexDigit (n `shiftR` 4), hexDigit (n .&. 0xF)]

Now, because foreign functions return a PrimIO by default, our md' still requires some sort of HasIO (like IO). This is undesirable here, after all, MD5 is a hash function, so it really ought to behave like a pure function. We know from taking a careful look at what FFI functions we are invoking that we aren't altering any global state.

Idris has an escape hatch for these types of situations, unsafePerformIO. This function is, as the name suggests, quite unsafe, so it shouldn't be used if you don't fully understand the implications, and only sparingly at that, but doing FFI to C is an intrinsically unsafe operation, so it's morally okay to use here, as long as we are careful to not violate referential transparency or type safety. Haskell programmers should note that unsafePerformIO is much easier to use correctly in Idris, largely as a result of Idris being strict by default.

Finally, we'll construct our top level md5 function, wrapping md5' in unsafePerformIO


md5 : String -> String
md5 str = unsafePerformIO $ md5' str

Solver functions

Count the leading zeros in a string

There is a more clever way to do this making use of a "view" strings have for interacting with them as if they were lazy lists, but we've already covered so much here that we'll save that one for another time and just do it the straight forward way, having a top level function that accepts a string, turns it into a list of chars, and then pass it into a tail recursive helper function to actually count the zeros.


countZeros : String -> Nat
countZeros str = countZeros' (unpack str) 0
  where
  countZeros' : (xs : List Char) -> (acc : Nat) -> Nat
  countZeros' [] acc = acc
  countZeros' ('0' :: xs) acc = countZeros' xs (acc + 1)
  countZeros' (x :: xs) acc = acc

Count the number of zeros for a specific input

Concatenate on our number to our secret key, hash it, and count the zeros


checkZeros : (key : String) -> (number : Nat) -> Nat
checkZeros key number = 
  let hash = md5 (key ++ show number)
  in countZeros hash

Find the first input with a specific number of zeros

Search numbers starting at the provided seed and counting up in a tail recursive helper function. This function is going to be effectively impossible to prove totality for, so we have removed our normal %default total annotation for this file


findFirst : (zeros : Nat) -> (key : String) -> (seed : Nat) -> Nat
findFirst zeros key current = 
  if checkZeros key current >= zeros 
    then current
    else findFirst zeros key (current + 1)

Part Functions

Part 1

Pass the input into our findFirst function and use the result as both our output and context value. Since part2 is searching for a longer run of 0s than part 1, numbers already checked by part 1 can not possibly be valid solutions for part 2, so we can skip them in part 2.


part1 : Eff (PartEff String) (Nat, Nat)
part1 = do
  input <- map trim $ askAt "input"
  let number = findFirst 5 input 0
  pure (number, number)

Part 2

Start the findFirst search from the point where part 1 left off.


part2 : Nat -> Eff (PartEff String) Nat
part2 seed = do
  input <- map trim $ askAt "input"
  let number = findFirst 6 input seed
  pure number