The Great Theorem Prover Challenge II

7 minute read

This is the second of a series on Hillel Wayne’s Great Theorem Prover Showdown. Make sure to check out the first part! You can also follow my progress in this repository.

In the last installation of “Cam learns to use a theorem prover”, we got acquainted with Liquid Haskell, induction, and formal specifications while writing a verified version of the legendary LeftPad function.

Today, I’ll be going over my solution to the next challenge, “Unique”. This one actually gave me more trouble than LeftPad, contrary to Twitter’s opinion of it.

Our task is to, given a list L construct a list containing every unique element of L exactly once (in any order).

Let’s start with the code itself:

{[email protected] reflect elem @-}
elem :: Eq a => a -> [a] -> Bool
elem x [] = False
elem x (y:ys) = x == y || elem x ys

{[email protected] reflect unique @-}
unique :: Eq a => [a] -> [a]
unique [] = []
unique (x:xs)
  | x `elem` xs = unique xs
  | otherwise = x:unique xs

This is fairly straightforward – an element x is unique in x:xs if x is not present in xs, so we keep it. Now for the specification:

Takes a sequence of integers, returns the unique elements of that list. There is no requirement on the ordering of the returned values.

As usual, I think that our code is “obviously” correct. In fact, our code is even more general – we can take any Eq a, but it specializes to integers easily enough.

Let’s see if we can prove it.

The “Obvious” Part

The “main” property of Unique is that every element in xs appears exactly once in unique xs. And this is important, because it means that a) we didn’t drop any elements and b) we actually removed duplicate elements. We can do so like this:

{[email protected] reflect count @-}
count :: Eq a => a -> [a] -> Int
count _ [] = 0
count x (y:ys)
  | x == y = 1 + count x ys
  | otherwise = count x ys

{[email protected] thmUniqueNoDup :: xs:[a] -> {x:a | elem x xs} ->
    {count x (unique xs) == 1}
  @-}
thmUniqueNoDup :: Eq a => [a] -> a -> ()

Now, here is where I got stuck. No matter what I tried, I couldn’t quite come up with a way to make LH infer this proof automagically, the way we did before.

It turns out that LH does come with some hand-guidance tools. We can, in fact, step through code execution symbolically to show that it exhibits some property (correctness or otherwise) for all relevant inputs. While automating the entire process can be computationally intractable, we (as humans) have a fair bit more intuition and can “know” where to begin – we can write the proof, and the computer can check if our proof is correct1.

Here’s the case where y is an element of xs.

thmUniqueNoDup :: Eq a => [a] -> a -> Proof
thmUniqueNoDup (x:xs) y
  | y `elem` xs =
      if x == y then
              count y (unique (x:xs))
          ==. count y (unique xs)
          ==. 1 ? thmUniqueNoDup xs y
          *** QED   
      else
              count y (unique (x:xs))
          ==. count y (x:unique xs)
          ==. count y (unique xs)
          ==. 1 ? thmUniqueNoDup xs y
          *** QED

Let’s break this down a bit. ==. is the Liquid Haskell combinator for “extensionally equivalent to”, which means that the expressions on both sides will evaluate to the same value (or lack of value). When we use this, we’re asking LH to check whether using that equivalence as a step is justified2.

As before, though, sometimes LH can’t figure out that a given step is justified on its own. Instead, we can provide a justification for a step with the ? combinator. For example, in this step

count y (unique xs) ==. 1 ? thmUniqueNoDup xs y

By our assumption on y, we know y elem xs. Thus, by the types of thmUniqueNoDup (that is, by our inductive hypothesis), we know that count y (unique xs) evaluates to 1.

The final thing here is *** QED, which finalizes the “proof” in Haskell types. We need this to ensure that we can actually have a term of the Haskell type Proof provided by Liquid Haskell.

Next case might start looking like this:

thmUniqueNoDup :: Eq a => [a] -> a -> Proof
thmUniqueNoDup (x:xs) y
  | otherwise =
              count y (unique (x:xs))
          ==. 1 + count y (unique xs)
          ==. 1 ? someTheorem
          *** QED

So what can we use for someTheorem? We’d need to somehow show that count y (unique xs) is 0, but our inductive hypothesis can only tell us that count x (unique xs) is equal to 1!

It turns out that we don’t actually have enough to show that the other case is sound. To do that, we’ll need two auxiliary theorems:

{[email protected] reflect thmElemCount @-}
{[email protected] thmElemCount :: Eq a =>
      x:a -> xs:{[a] | not elem x xs} -> { count x xs == 0 }
  @-}
thmElemCount :: Eq a => a -> [a] -> ()
thmElemCount x [] = ()
thmElemCount x (y:ys) = thmElemCount x ys


{[email protected] thmUniqueExistInv ::
      xs:[a] ->
      {x:a | not elem x xs} ->
      { not elem x (unique xs) }
  @-}
thmUniqueExistInv :: Eq a => [a] -> a -> Proof
thmUniqueExistInv [] y = trivial *** QED
thmUniqueExistInv (x:xs) y
  =   elem y (unique (x:xs))
  ==. elem y (x:unique xs)
  ==. elem y (unique xs)
  ==. False ? thmUniqueExistInv xs y
  *** QED

The first simply creates and proves the “obvious” relation between elem and count. We’ll ultimately be invoking this theorem to get our 0. The second is slightly more interesting. Note that the precondition for thmElemCount is that x is not an element of xs. However, we haven’t actually shown that unique xs won’t contain any elements not in xs! So we need to prove it.

This gives us our final step:

...
==. 1 + count y (unique xs)
==. 1 ? thmUniqueExistInv xs y `seq` thmElemCount y (unique xs)
*** QED

Notice the seq here. In order to cite thmElemCount, we need to show that y is not an element of unique xs, so we must first use thmUniqueExistInv.

The Less Obvious Part

In most worlds, this would be enough. We’ve shown that, for every element in xs, unique xs contains that element exactly once. Incidentally, we also showed that unique xs didn’t “spawn” elements – if an element isn’t in xs, it can’t be in unique xs either.

However, we’re not done yet. The world of type theory (of which Liquid Haskell’s refinement types are a subworld) is one based on constructive logic (otherwise known as intuitionistic logic), which removes the law of the excluded middle. That is, . In this particular case, that means that we’re not done! We’ve only proven that every element in unique xs is not not in xs, which is not the same as saying that it’s in xs3.

For the sake of completeness, let’s show that x being in unique xs implies that x is in xs as well!

{[email protected] thmUniqueExist ::
      xs:[a] ->
      {x:a | elem x (unique xs)} ->
      { elem x xs }
  @-}
thmUniqueExist :: Eq a => [a] -> a -> Proof
thmUniqueExist (x:xs) y
  | y == x = trivial *** QED
  | otherwise
      =   elem y (x:xs)
      ==. elem y xs
      ==. True ? thmUniqueExist' (x:xs) y `seq` thmUniqueExist xs y
      *** QED

{[email protected] thmUniqueExist' ::
      xs:{[a] | len xs > 0} ->
      {x:a | elem x (unique xs) && x != hd xs} ->
      { elem x (unique (tl xs)) }
  @-}
thmUniqueExist' :: Eq a => [a] -> a -> Proof
thmUniqueExist' (x:xs) y
  =   elem y (unique (tl (x:xs)))
  ==. elem y (unique xs)
  ==. elem y (x:unique xs)
  ==. elem y (unique (x:xs))
  ==. True
  *** QED

I won’t go over every step, but an important step is in the derivation of thmUniqueExist', in which we actually step “backwards” from elem y (x:unique xs) to elem y (unique (x:xs)) (by the definition of unique). I’m actually not particularly satisfied with this proof; in particular I find the forced use of hd and tl to be extremely inelegant. If you know of a better way, please let me know!

Onwards to Fulcrum! This one may take a bit longer – I’ve been sitting on this solution to Unique for about a week, but Fulcrum has been giving me way more trouble.

  1. If you think this sounds similar to P vs NP, you’d be correct! Proof generation is actually coNP-complete, which means that it’s easy to check that a proof is wrong, but difficult to create a proof from scratch or verify that it’s correct! 

  2. Technically, LH doesn’t check every step. The official documentation states that LH will only check that the first and last steps are justified, but with some experimentation I’ve found that certain intermediate steps are mandatory. Who knows? 

  3. The difference may seem academic to you, and in this case it arguably is. For a better discussion of the difference between classical and logical systems, check out this awesome blog post by Professor Harper of CMU. 

Updated:

Leave a Comment