Year 2015 Day 4
This day introduces us to a little bit of FFI, linking to openssl to use it's
MD5
functionality.
import Data.String
import Data.Bits
import System.FFI
import Prim.Array
FFI
We will be using openssl's MD5
function to do the hashing, so we will need to
provide a couple of foreign function definitions.
First, we'll provide one for the C standard library's strlen
. We could get
this value from within idris, but we'll do it through FFI to flesh out the
example a bit more.
The format of the specifier after %foreign
is "language:function name,library
name". Since strlen
is the c standard library, we use libc
as the library
name.
Idris will automatically convert primitive types like String
s and Int
s to
their C equivalents at the FFI boundary.
PrimIO
is a special, basic type of IO action based on linear types, we won't
be getting into the specifics right now.
By convention, primitive FFI function names start with prim__
in idris.
%foreign "C:strlen,libc"
prim__strlen : String -> PrimIO Int
Now we need one for the MD5
function from openssl, the openssl objectfile
calls itself libcrypto
, so we'll use that as the library name.
The MD5
function actually returns a pointer to the hash value, but since the
3rd argument is the pointer it expects to write the hash to, we already have
that value and we'll ignore it by having our prim__md5
return unit.
The second argument is the length of the input string in bytes.
%foreign "C:MD5,libcrypto"
prim__md5 : String -> Int -> Ptr Bits8 -> PrimIO ()
Now we glue these parts together in our higher level md5'
function.
We use malloc
from System.FFI
to allocate a buffer for prim__md5
to write
to. This returns an AnyPtr
, when we need a Ptr Bits8
, so we'll need to cast
it, using prim__castPtr
from the prelude.
We then use our prim__strlen
function to get the length of our input string
from within C, using primIO
from the prelude to "lift" the resulting
PrimIO Int
into our io
context.
We then feed the string, the calculated length, and our casted buffer into our
prim__md5
function, then use prim__getArrayBits8
from the
c-ffi library to extract each of the
bytes, passing them into a pure idris helper function to convert into hex
representation.
Since we are working with C FFI and performing a little bit of manual memory
management here, we must remember to free
our pointer, and then we can splice
together our output string and return.
md5' : HasIO io => String -> io String
md5' str = do
buffer <- malloc 16
len <- primIO $ prim__strlen str
raw_md5 <- primIO $ prim__md5 str len (prim__castPtr buffer)
let bytes = map (\x => prim__getArrayBits8 (prim__castPtr buffer) x) [0..15]
free buffer
pure . joinBy "" . map byteToHex $ bytes
where
hexDigit : Bits8 -> Char
hexDigit n =
if n < 10
then chr (cast n + ord '0')
else chr (cast (n - 10) + ord 'a')
byteToHex : Bits8 -> String
byteToHex n =
pack [hexDigit (n `shiftR` 4), hexDigit (n .&. 0xF)]
Now, because foreign functions return a PrimIO
by default, our md'
still
requires some sort of HasIO
(like IO). This is undesirable here, after all,
MD5 is a hash function, so it really ought to behave like a pure function. We
know from taking a careful look at what FFI functions we are invoking that we
aren't altering any global state.
Idris has an escape hatch for these types of situations, unsafePerformIO
. This
function is, as the name suggests, quite unsafe, so it shouldn't be used if
you don't fully understand the implications, and only sparingly at that, but
doing FFI to C is an intrinsically unsafe operation, so it's morally okay to use
here, as long as we are careful to not violate referential transparency or type
safety. Haskell programmers should note that unsafePerformIO
is much easier
to use correctly in Idris, largely as a result of Idris being strict by default.
Finally, we'll construct our top level md5
function, wrapping md5'
in
unsafePerformIO
md5 : String -> String
md5 str = unsafePerformIO $ md5' str
Solver functions
Count the leading zeros in a string
There is a more clever way to do this making use of a "view" strings have for interacting with them as if they were lazy lists, but we've already covered so much here that we'll save that one for another time and just do it the straight forward way, having a top level function that accepts a string, turns it into a list of chars, and then pass it into a tail recursive helper function to actually count the zeros.
countZeros : String -> Nat
countZeros str = countZeros' (unpack str) 0
where
countZeros' : (xs : List Char) -> (acc : Nat) -> Nat
countZeros' [] acc = acc
countZeros' ('0' :: xs) acc = countZeros' xs (acc + 1)
countZeros' (x :: xs) acc = acc
Count the number of zeros for a specific input
Concatenate on our number to our secret key, hash it, and count the zeros
checkZeros : (key : String) -> (number : Nat) -> Nat
checkZeros key number =
let hash = md5 (key ++ show number)
in countZeros hash
Find the first input with a specific number of zeros
Search numbers starting at the provided seed and counting up in a tail recursive
helper function. This function is going to be effectively impossible to prove
totality for, so we have removed our normal %default total
annotation for this
file
findFirst : (zeros : Nat) -> (key : String) -> (seed : Nat) -> Nat
findFirst zeros key current =
if checkZeros key current >= zeros
then current
else findFirst zeros key (current + 1)
Part Functions
Part 1
Pass the input into our findFirst
function and use the result as both our
output and context value. Since part2 is searching for a longer run of 0s than
part 1, numbers already checked by part 1 can not possibly be valid solutions
for part 2, so we can skip them in part 2.
part1 : Eff (PartEff String) (Nat, Nat)
part1 = do
input <- map trim $ askAt "input"
let number = findFirst 5 input 0
pure (number, number)
Part 2
Start the findFirst
search from the point where part 1 left off.
part2 : Nat -> Eff (PartEff String) Nat
part2 seed = do
input <- map trim $ askAt "input"
let number = findFirst 6 input seed
pure number