3 Examples3.3 A Program Verification3.4 Another Verification

3.4 Another Verification

Our final example uses another program verification in order to illustrate some more features of the system and subtle points of its use. The program to be verified represents the core of the binary search algorithm for finding an element x in an array a of integer numbers that is sorted in ascending order; the program sets a result value r to an index at which x occurs in a, respectively to -1, if x does not occur in a. The verification task is described by the following Hoare triple:

{ olda = a /\ oldx = x /\ (/\ j: 0 <= j < length(a)-1 => a[j] <= a[j+1]) /\
  r = -1 /\ low = 0 /\ high = length(a)-1 }
while r = -1 /\ low <= high do
  mid := |_(low+high)/2_|
  if a[mid] = x then
    r := mid
  else if a[mid] < x then
    low := mid+1
  else
    high := mid-1
{ a = olda /\ x = oldx /\
  ((r = -1 /\ (/\j: 0 <= j < length(a) => a[j] /= x)) \/
   (0 <= r < length(a) /\ a[r] = x)) }

The main difference of to the verification presented in the previous example is the requirement of the array to be sorted and the computation of the index mid to be investigated next: this computation makes use of the division operator / (whose result need not be an integer) and of the "floor" operator |_._| which maps a real number x to the largest integer number less than or equal x.

By the rules of the Hoare calculus, we derive the following five verification conditions A, B1, B2, B3, and C where B1, B2, and B3 describe the fact that the loop invariant is maintained by each of the three paths that may be taken through the body of the loop:

Input :<=>
  olda = a /\ oldx = x /\ r = -1 /\ low = 0 /\ high = length(a)-1 /\
  (/\j: 0 <= j < length(a)-1 => a[j] <= < a[j+1]);
Output :<=>
  a = olda /\ x = oldx /\
  ((r = -1 /\ (/\j: 0 <= j < length(a) => a[j] /= x)) \/
   (0 <= r /\ r < length(a) /\ a[r] = x));
Invariant :<=>
  a = olda /\ x = oldx /\
  (/\: 0 <= j < length(a)-1 => a[j] <= a[j+1]) AND
  -1 <= r /\ r < length(a) /\
  (r /= -1 => a[r] = x) /\
  0 <= low /\ low <= length(a) /\
  -1 <= high /\ high < length(a) /\
  low <= high+1 /\
  (/\j: 0 <= j < low /\ j < length(a) => a[j] < x) /\
  (/\j: high < j /\ j < length(a) => a[j] > x);

A :<=> Input => Invariant
B1:<=> Invariant /\ r = -1 /\ low <= high /\ 
         a[|_(low+high)/2_|] = x => Invariant[|_(low+high)/2_|/r]
B2:<=> Invariant /\ r = -1 /\ low <= high /\
         ~a[|_(low+high)/2_|] = x /\ a[|_(low+high)/2_|] < x 
         => Invariant[|_(low+high)/2_|+1/low]
B3:<=> Invariant /\ r = -1 /\ low <= high /\
         ~a[|_(low+high)/2_|] = x /\ ~a[|_(low+high)/2_|] < x 
         => Invariant[|_(low+high)/2_|-1/high]
C :<=> Invariant /\ ~(r = -1 /\ low <= high) => Output

A crucial part of the invariant says that "array a is sorted" by the formula / j: 0 <= j < length(a)-1 => a[j] <= a[j+1], a property that will become essential in the subsequent proof.

The software distribution contains in directory examples the text file binarysearch.pn with the corresponding declarations in the specification language of our system and a subdirectory binarysearch with the corresponding proof. In the following, we will focus on verification condition B3 whose proof is the most demanding one.

Since the specification language does not have a builtin "floor" function, we introduce a constant floor by the following declaration which is accompanied by the declaration of an axiom that characterizes this constant:

floor: REAL->INT;
floorAxiom: AXIOM
  FORALL(x:REAL): floor(x) <= x AND 
    NOT(EXISTS(y:INT): y <= x AND floor(x) < y);

Based on the already previously described definition of datatype "array", (which we specialize in this verification to "array of integers"), we introduce constants for the program variables and mathematical constants occurring in the Hoare triple:

a: ARR; olda: ARR; x: INT; oldx: INT;
low: INT; high: INT; mid: INT; r: INT;

The verification conditions use various versions of the predicate Invariant which is therefore correspondingly parameterized:

Invariant: (ARR, INT, INT, INT, INT) -> BOOLEAN =
  LAMBDA(a: ARR, x: INT, low: INT, high: INT, r: INT):
    a = olda AND x = oldx AND
    (FORALL (j:NAT): j < length(a)-1 => 
       get(a, j) <= get(a, j+1)) AND
    -1 <= r AND r < length(a) AND
    (r /= -1 => get(a, r) = x) AND
    0 <= low AND low <= length(a) AND 
    -1 <= high AND high < length(a) AND
    low <= high+1 AND
    (FORALL (j:NAT): j < low AND j < length(a) => 
      get(a,j) < x) AND
    (FORALL (j:NAT): high < j AND j < length(a) => 
      get(a,j) > x);

The pretty printed form of this invariant is shown below:

Binary Search Invariant

Proving a Lemma

As we will see below, the proof of B3 (and likewise the proof of B2) requires additional knowledge about sorted arrays. In particular, we need to infer that the fact that every pair of neighbor elements preserves the right order (the definition of "sorted") implies the fact that also every pair of non-neighbor elements preserves the order. We express this knowledge in the form of a declaration of a formula (lemma) L:

L: FORMULA
  (FORALL (j:NAT): j < length(a)-1 => 
    get(a, j) <= get(a, j+1)) => 
  (FORALL (j, k:NAT): j < length(a) AND 
      k < length(a) AND k <= j => 
    get(a, k) <= get(a, j));

The pretty printed form of this lemma is shown below:

Binary Search Lemma

The overall structure of the proof of this lemma is as follows:

Structure of Proof of Lemma

The root state [mca] of this proof is depicted below:

Proof of Lemma

The goal [odj] is an implication of two universally quantified formulas. We may thus be attempted to perform a predicate logic proof by pressing the "Scatter" button , but the resulting proof state is a dead end, as is also the result of the more conservative `Decompose" button . A closer investigation of state [mca] gives us the key idea to perform an induction proof on the conclusion of the goal, but for this purpose we first need to break up the implication. Consequently, we invoke the command flatten which leads to the state [c3f] whose goal [z42] is a universally quantified formula with natural number variables j and k:

Proof State after Flattening

We decide to perform the induction on the first variable of the goal and consequently execute induction j in z42 resulting in two child states [3da] and [4da].

The proof state [3da] represents the induction base:

Induction Base

Its goal [6nq] is still universally quantified; we thus apply the `Scatter" button which results in a proof state that is automatically closed.

The proof state [4da] represents the induction step:

Induction Step

Its goal [aw4] is also still universally quantified; we thus apply the `Scatter" button resulting in the following state [jdt]:

Proof State after Scattering

Closing this proof state apparently depends on the proper instantiation of the universally quantified assumptions [6hu], [iuh], and [cgs]. We thus try to find a suitable automatic instantiation by pressing the "Auto" button ; indeed the resulting proof state is automatically closed by the decision procedure.

Proving the Verification Condition B3

We will now describe the proof of the following verification condition:

B3: FORMULA
  Invariant(a, x, low, high, r) AND r = -1 AND 
  low <= high AND get(a, floor((low+high)/2)) /= x AND
  NOT get(a, floor((low+high)/2)) < x
  => Invariant(a, x, low, floor((low+high)/2)-1, r);

The pretty printed version of this condition is shown below:

Binary Search Verification Condition

The overall structure of the proof is as follows:

Structure of Proof of Condition B3

The root state [r1b] of the proof consists of an assumption [6hu] representing the axiomatization of the floor constant and the goal [yhd] representing the content of the verification condition S3:

Proof of Condition B3

Apparently, floor is only applied to the argument high/2+low/2 (three times). Therefore we instantiate already in the very beginning of the proof the assumption which represents the axiomatization of floor with this value by executing the command instantiate high/2+low/2 in 6hu. This yields the proof state [zrp]:

Proof State after Instantiation

This state has the specialized assumption [5cu]; the general assumption [6hu] will not play a role in the remainder of the proof any more. Now we execute expand Invariant to expose the goal formula. The resulting proof state [rui] (not depicted) looks quite daunting such that we immediately press the "Scatter" button which results in six child states of which three are automatically closed.

The first of the remaining three open proof states is labelled [zhg]:

First Proof State Resulting from Scattering

We guess that this proof state can be closed by a clever instantiation of the universally quantified assumptions. Perhaps this is even the case for the two open sibling states, thus we press the button which indeed closes this state and another one by automatic instantiation. The only remaining state is the one with label [2hg]:

Second Proof State Resulting from Scattering

Apparently this state requires deeper investigation. Its goal [zki] is to prove x<a[j0] for some index j0. So what do we know about x and j0? From assumption [noj], we know |_(low+high)/2_| <= j_0; since a is sorted in ascending order, thus also a[|_(low+high)/2_|] <= a[j_0] should hold (*). Furthermore, from assumptions [3kp] and [6he] we know x < a[|_(low+high)/2_|]; thus by transitivity of <=, the goal should hold.

But do we really know the formula marked (*) above? What we actually know is the property [iuh] which says that a[j] <= a[j+1] for arbitrary pairs of neighbor indices j, j+1, while (*) talks about non-neighbor indices |_(low+high)/2_| and j0. This is the place where we need the knowledge that preserving the order for neighbor indices also implies preserving the order for non-neighboring indices, i.e. we need the formula L that was previously proved. We thus execute the command lemma L that imports this formula as an assumption and yields the proof state [63o]:

Proof State after Lemma Introduction

This state contains an additional assumption [z42] which is the conclusion of the implication contained in lemma L; its hypothesis is identical to assumption [iuh] and is therefore automatically discharged by the system. Pressing the "Auto" button does not close the state thus we execute the command instantiate j_0, floor(low/2+high/2) in z42 in order to explicitly instantiate the variables j and k in the new assumption by j0 and |_(low+high)/2_|, respectively. This leads to the following state [35c]:

Proof State after Lemma Instantiation

We again press which generates a child state [g6q] which is automatically closed; this completes the proof.

Summary

Above proofs demonstrate two things: most notably, they show that even for rather simple verifications "critical" proof states arise that require deeper analysis and perhaps the use of additional knowledge that has to be "imported" into the proof. Either this knowledge is already available in the form of previously declared formulas or the current proof has to be suspended to establish the knowledge and then resume the proof (the command assume allows to introduce such knowledge and prove it within the current proof). If an imported formula is not (yet) verified by a proof, the current proof is marked as "relative" with respect to unverified formulas.

The success of another important feature is demonstrated by the fact that we did not notice it. In proof state [63o] we instantiated a natural number variable j by an integer value |_(low+high)/2_|. This was legal because the system created an additional proof state [25c] whose goal was to prove that this value is not negative; the system automatically then automatically discharged this goal by applying its decision procedure. The theoretical basis for this is a system of subtypes which treats natural numbers as a subdomain of integers and integers as a subdomain of reals. This greatly simplifies proofs that have to deal with various arithmetic domains compared to systems that have a less flexible type discipline.

Having studied the example specifications and proofs discussed in this and the previous sections, the user now should have a feeling for the "flavor" of interaction with the RISC ProofNavigator. The system is helpful in quickly decomposing a proof into the "interesting" (i.e. difficult) proof states; it also takes from the user the task of dealing with various kinds of low-level reasoning steps. On the other hand, the system leaves the user with the task of finding the "crucial" steps in a proof like finding non-trivial variable instantiations, imports of lemmas, etc. The system is definitely and deliberately not an automatic theorem prover rather than a proof assistant that attempts to be as helpful as possible but not clever beyond its abilities. Chapter 4 discusses known deficiencies of the system and potential for future improvements.


Wolfgang Schreiner

3 Examples3.3 A Program Verification3.4 Another Verification