In the last lecture we introduced the λ-calculus as providing not only a theoretical characterisation of computation but also the basis of a rigorous characterisation of a class of programming languages called functional programming languages. We alleged that it is possible to verify programs written in such languages in a simple way related to ordinary mathematical proof.
We can make our understanding of the role of the λ-calculus more precise as follows:
This means in particular that if we want to prove that a program is correct, then we must prove that the corresponding λ-calculus expression has a particular property. Such a "proof" does not guarantee that the program does in fact behave as required - but it does provide a basis for determining whether the error is in the user's program or in the language implementation.
In this lecture we will develop these ideas further. Being able to treat a program and the corresponding λ-calculus expression as equal is a great convenience, because it means that we can harness a whole range of mathematical expectations about equality in our understanding of program verification. But two expressions that ought to count as equal can be manifestly different in form, so developing an appropriate concept of equality is necessary.
For example, suppose we want to prove that a sorting function that we have written, say merge_sort, is correct. Then, within the rigorous functional paradigm, merge_sort should be exactly equivalent to an expression of the λ-calculus. What does it mean to say that merge_sort is correct? There are two criteria:
(merge_sort l) is sorted. How do we render this English requirement into a formalism? Well, it's fairly easy to write a function sorted that determines if a list is sorted - a list is sorted if it is empty, or contains one element, or if the first two elements are in order and the tail of the list (everything but the first element) is sorted. So our requirement can be expressed as requiring that sorted (merge_sort l) holds for every list l.
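As a minimal sketch of this idea (in OCaml, which we use for illustration throughout; the function name sorted and the choice of integer lists are our own assumptions, not part of the lecture), the predicate just described might be written:

(* sorted : int list -> bool
   A list is sorted if it is empty, or has one element, or its first two
   elements are in order and its tail is itself sorted. *)
let rec sorted = function
  | [] | [_] -> true
  | x :: (y :: _ as tail) -> x <= y && sorted tail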
In order to understand how to implement software that can reliably support proofs of facts like these, we need to characterise exactly many important aspects of the λ-calculus. Let's start by defining β-reduction so precisely that we can write a program to implement it.
That is to say, for the present we are restricting ourselves to the very simplest, most basic, idea of equality.
Informally, an expression E1 is said to occur in an expression E2 if it is a sub-expression of E2.
OCC1 E occurs in F if E = F
OCC2 If E occurs in F, and G is a term, then E occurs in FG
OCC3 If E occurs in F, and G is a term, then E occurs in GF
OCC4 If E occurs in F, and x is a variable, then E occurs in λx. F
[Note: Hindley allows that the bound variable x of a λ-abstraction λx. E occurs in the abstraction even if x does not occur in E. Barendregt does not. We have followed Barendregt.]
There can be several occurrences of E1 in E2. Note that if E1 is a variable v, say, then v appearing as the variable bound in a λ-abstraction does not constitute an occurrence. For example v does not occur in (λv. 2), while x has two occurrences in (+ x x).
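To make the definition concrete, here is a sketch in OCaml of a datatype of λ-terms together with an occurrence test following OCC1-OCC4. The type term and the function occurs_in are our own names, introduced purely for illustration; the lecture itself does not fix an implementation.

(* A λ-calculus term: variables, constants, applications and abstractions. *)
type term =
  | Var of string           (* variable              *)
  | Con of string           (* constant, e.g. "+"    *)
  | App of term * term      (* application   F G     *)
  | Lam of string * term    (* abstraction   λx. F   *)

(* occurs_in e f : does e occur in f, in the sense of OCC1-OCC4?
   Following Barendregt, the bound variable of Lam (x, f) is not by
   itself treated as an occurrence. *)
let rec occurs_in e f =
  e = f ||
  (match f with
   | Var _ | Con _ -> false
   | App (g, h) -> occurs_in e g || occurs_in e h    (* OCC2, OCC3 *)
   | Lam (_, body) -> occurs_in e body)              (* OCC4 *)

For example, occurs_in (Var "v") (Lam ("v", Con "2")) is false, while Var "x" occurs (twice) in App (App (Con "+", Var "x"), Var "x").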
When we defined β-reduction, we were not very precise about what we meant by substitution. Now is the time to remedy this.
Now we need to define the idea of a bound variable. This is essential to giving a precise sense to the scope of a variable, and thereby being able to avoid confusing what are essentially two different variables which happen to have the same name.
Let E be an expression of the λ-calculus, and let v be a variable. We say that an occurrence of v in E is bound if it is inside a sub-expression of E of the form λv.E2. An occurrence is said to be free otherwise. Thus v occurs bound in λv. x v and in (y λv. v) but it occurs free in λx. v x.
Note that we are speaking of an occurrence of a variable as being bound - a variable can occur both bound and free in a given expression. For example, in v (λv. v), the first occurrence of v is free, and the last is bound.
We can define a function FV which forms the set of free variables of an expression as follows:
FV1 FV(v) = {v}, for any variable v
FV2 FV(c) = {}, for any constant c
FV3 FV(E1E2) = FV(E1) ∪ FV(E2)
FV4 FV(λv. E) = FV(E) - {v}
Here FV1 says that the set of free variables of an expression that consists of a single variable is the set consisting of just that variable, while FV2 says that the set of free variables of a constant expression is empty. FV3 says that the free variables of an application E1E2 are the union of the free variables of the two expressions E1 and E2, while FV4 says that the set of free variables of a λ-abstraction is the set of free variables of its body minus the variable bound by the abstraction.
For example FV(λx. (f x y)) = {f, y}
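A direct transcription of FV1-FV4 into the OCaml sketch above might look as follows (the module name S and the function name fv are our own; we simply use the standard library's string sets):

module S = Set.Make (String)

(* fv : term -> S.t  -- the set of free variables of a term, per FV1-FV4. *)
let rec fv = function
  | Var v -> S.singleton v                    (* FV1 *)
  | Con _ -> S.empty                          (* FV2 *)
  | App (e1, e2) -> S.union (fv e1) (fv e2)   (* FV3 *)
  | Lam (v, e) -> S.remove v (fv e)           (* FV4 *)

So, for instance, fv (Lam ("x", App (App (Var "f", Var "x"), Var "y"))) yields the set {"f"; "y"}.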
An expression E is said to be closed if it has no free variables, i.e. FV(E) = {}.

It is clear that the variable used in a λ-abstraction ought to be regarded as arbitrary. Thus λx. + x 2 and λy. + y 2 are, intuitively, the same function.
There is indeed a rule of the calculus, called α-conversion, which allows the above two expressions to be treated as equivalent. It is a little tricky, however, since one does not want to convert λx. y x to λy. y y - the rule is that we may only replace the variable bound in a λ-abstraction by one that does not occur free in the body. The conversion rule is thus:
λv. E may be converted to λw. (E[v:=w]), provided that w does not occur free in E.
Now let's return to the problem of defining substitution.
Consider for example:
So we have to make sure that, as we explore the expression we are substituting inside, if we encounter an inner λ-abstraction that binds the same variable as the one we are replacing, we do not continue substituting inside that abstraction. This is embodied in rule S5 below.
However, there is yet another pitfall for the naive implementor of β-reduction. Suppose we want to β-reduce
To summarise our discussion above, we may say that substitution, forming E1[v:=E2], `E1 with E2 substituted for v', is a straightforward matter of rewriting sub-terms except when we come to a λ-abstraction which binds the variable v or a variable free in E2. We can define it by cases, using v, u for variables, E, E1, ... for expressions, and C for the set of constants:
S1 - v[v:=E] = E
S2 - u[v:=E] = u, when u ≠ v
S3 - c[v:=E] = c, for any c in C
S4 - (E1E2)[v:=E] = (E1[v:=E] E2[v:=E])
S5 - (λv. E)[v:=E1] = λv. E
S6 - (λu. E1)[v:=E] = λu. (E1[v:=E]), when u ≠ v and either u ∉ FV(E) or v ∉ FV(E1)
S7 - (λu. E1)[v:=E] = λw. ((E1[u:=w])[v:=E]), when u ≠ v, u ∈ FV(E) and v ∈ FV(E1), where w is a variable that does not occur in E or in E1
Firstly let's note that S6 follows the definition given in Hindley & Seldin in incorporating the condition v ∉ FV(E1). We could envisage leaving out the condition v ∉ FV(E1) from S6, forcing a change of variable (S7) in that case. This would slightly simplify the definition of substitution, though it does not appear to simplify the proofs given below.
Cases S1-S4 need no comment. Case S5 is the one we discussed earlier in which the variable we are substituting for is rebound in a λ-abstraction. Thus, inside the expression it no longer `means' the same thing - in some sense it is actually a different variable, so we should not substitute for it.
In case S6, the λ-abstraction introduces a new variable u different from v, but there is no problem of confusing it with any variable occurring in E: either u does not occur in E, or v does not occur in E1 (so that no substitution actually takes place inside the body).
However in case S7 there is a real problem - the new variable u introduced in the λ-abstraction is the same as a variable occurring free in E, and v occurs free in E1 (so that we will actually make a change in E1). The solution is to perform an α-conversion, replacing u throughout the λ-abstraction by a variable w that does not occur in either E or in E1. We can always choose such a w for S7 because we have an infinite supply of variables to choose from and any λ-calculus expression only contains finitely many.
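Putting S1-S7 together, a direct implementation of substitution over the term type sketched earlier might look as follows. This is only a sketch under our own assumptions: the helper fresh, which manufactures a variable name outside a given set, is invented here, and for S7 we only avoid the free variables of the two terms, which suffices for capture-avoidance even though the rule as stated avoids all their variables.

(* fresh avoid : a variable name not in the set avoid
   (a simple scheme of our own: keep priming a base name). *)
let fresh avoid =
  let rec go w = if S.mem w avoid then go (w ^ "'") else w in
  go "w"

(* subst e v f  computes  e[v:=f], following S1-S7. *)
let rec subst e v f =
  match e with
  | Var u when u = v -> f                                      (* S1 *)
  | Var _ | Con _ -> e                                         (* S2, S3 *)
  | App (e1, e2) -> App (subst e1 v f, subst e2 v f)           (* S4 *)
  | Lam (u, _) when u = v -> e                                 (* S5 *)
  | Lam (u, body)
    when not (S.mem u (fv f)) || not (S.mem v (fv body)) ->
      Lam (u, subst body v f)                                  (* S6 *)
  | Lam (u, body) ->                                           (* S7: rename u to a fresh w *)
      let w = fresh (S.union (fv f) (fv body)) in
      Lam (w, subst (subst body u (Var w)) v f)

For example, subst (Lam ("y", App (Var "x", Var "y"))) "x" (Var "y") goes through S7 and yields Lam ("w", App (Var "y", Var "w")), avoiding the capture of the free y.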
We now have our first opportunity to prove some results relevant to developing a theory of equality in the λ-calculus. If we take nothing for granted we find that some rather "obvious" properties of substitution take quite a lot of effort to prove. This arises from the fact that there are seven clauses in the definition of substitution, so that if we are starting from the definition we have to consider all seven cases in our proof. Later, with a decent repertoire of lemmas under our belt, we'll have less labour in dealing with substitution.
Our proofs are by induction on the height of terms, and, since S7 requires us to change the variable of a λ-abstraction, we'll need a little lemma which says that height is unchanged thereby.
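For reference, height can be computed over the term sketch given earlier as follows. The clause names H1-H3 used in the proofs below refer to a definition that we assume is the usual one (leaves have height 0, and each application or abstraction adds one level); the function height here is our own illustration of it.

(* height : term -> int
   Variables and constants have height 0; applications and
   abstractions add one level above their subterms. *)
let rec height = function
  | Var _ | Con _ -> 0
  | App (e1, e2) -> 1 + max (height e1) (height e2)
  | Lam (_, e) -> 1 + height e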
The lemma in question says that, for any term E and any variables x' and y', height(E[x':=y']) = height(E).
Inductive Step
Suppose that, for a given n, we have for any term E with height(E) ≤ n and for all variables x', y':
height(E) = height(E[x':=y'])
Thus, using H3 and S4, we have
Let us choose a variable w for which w ∉ FV(G) and w ∉ FV(y).
The main result we want connects substitution and free variables: for any terms E and F and any variable x,
[A] If x ∈ FV(E) then FV(E[x:=F]) = (FV(E) - {x}) ∪ FV(F)
[B] If x ∉ FV(E) then E[x:=F] = E (and so, in particular, FV(E[x:=F]) = FV(E))
Base Case n=0
Suppose E is a variable u. If u = x then x ∈ FV(E), so we are in case [A]; by S1, E[x:=F] = F, and FV(F) = (FV(E) - {x}) ∪ FV(F) since FV(E) - {x} = {}. So the result is satisfied for this case.
If u ≠ x then x ∉ FV(E), so we are in case [B]. Now E[x:=F] = u[x:=F] = u by S2, so we obtain:
E[x:=F] = u = E
Suppose E = c ∈ C. Then FV(E) = FV(c) = {} by FV2, and hence x ∉ FV(E), that is we have case [B].
Now E[x:=F] = c[x:=F] = c by S3, so again
E[x:=F] = c = E
Inductive Step
Suppose that, for a given n, we have for any term E with height(E) ≤ n (and for any term F and variable x):
[A] If x ∈ FV(E) then FV(E[x:=F]) = (FV(E) - {x}) ∪ FV(F)
[B] If x ∉ FV(E) then E[x:=F] = E
Suppose first that E = E1E2, where height(E) = n + 1. Note that by H3, height(E1) ≤ n and height(E2) ≤ n.
There are 4 sub-cases, according to whether x occurs free in neither of E1 and E2, in both, or in exactly one of them.
Sub-case 1: x ∉ FV(E). In this sub-case, it follows that x ∉ FV(E1) and x ∉ FV(E2). Then
E1[x:=F] = E1
by the inductive hypothesis. Also
E2[x:=F] = E2
by the inductive hypothesis. Hence, by S4, E[x:=F] = (E1[x:=F] E2[x:=F]) = E1E2 = E, which is what case [B] requires.
Sub-case 2: x ∈ FV(E1) and x ∈ FV(E2). In this sub-case, applying S4, we have:
FV(E[x:=F]) = FV(E1[x:=F] E2[x:=F])
= FV(E1[x:=F]) ∪ FV(E2[x:=F])
by FV3. And now, applying the inductive hypothesis and the associativity of union,
= ((FV(E1) - {x}) ∪ FV(F)) ∪ ((FV(E2) - {x}) ∪ FV(F))
= ((FV(E1) ∪ FV(E2)) - {x}) ∪ FV(F)
by a little elementary set-theory. We now use FV3 again:
= (FV(E) - {x}) ∪ FV(F)
which is what case [A] requires.
Sub-case 3: x ∉ FV(E1) and x ∈ FV(E2). In this sub-case, applying S4, we have:
FV(E[x:=F]) = FV(E1[x:=F] E2[x:=F])
= FV(E1[x:=F]) ∪ FV(E2[x:=F])
by FV3. And now, applying the inductive hypothesis, remembering x ∉ FV(E1),
= FV(E1) ∪ ((FV(E2) - {x}) ∪ FV(F))
= ((FV(E1) ∪ FV(E2)) - {x}) ∪ FV(F)
by a little elementary set-theory, remembering again that x ∉ FV(E1). We now use FV3 again:
= (FV(E) - {x}) ∪ FV(F)
as case [A] requires, since x ∈ FV(E2) gives x ∈ FV(E).
Sub-case 4: x ∈ FV(E1) and x ∉ FV(E2). This sub-case is symmetric with sub-case 3.
Now suppose E = λx.G, that is, the abstraction binds the very variable x we are substituting for. In this case FV(E) = FV(G) - {x}, so x ∉ FV(E). Using S5 we have
E[x:=F] = (λx. G)[x:=F] = λx. G = E
Also
FV(E[x:=F]) = FV(E)
thus the lemma holds for this case.
Next suppose E = λu.G with u ≠ x, and suppose that S6 applies, that is, u ∉ FV(F) or x ∉ FV(G). We have, by S6,
E[x:=F] = λu. (G[x:=F])
and by FV4
FV(E[x:=F]) = FV(G[x:=F]) - {u}
There are 2 sub-cases.
Sub-case x ∉ FV(G). In this sub-case, by the inductive hypothesis,
FV(G[x:=F]) = FV(G)
and G[x:=F] = G
So,
FV(E[x:=F]) = FV(G[x:=F]) - {u} = FV(G) - {u} = FV(E)
by FV4. Moreover E[x:=F] = λu. (G[x:=F]) = λu. G = E.
So, since x ∉ FV(E) = FV(G) - {u}, the result holds in this case.
Sub-case x ∈ FV(G), and hence, by the condition on S6, u ∉ FV(F). In this sub-case, by the inductive hypothesis,
FV(G[x:=F]) = (FV(G) - {x}) ∪ FV(F)
So,
FV(E[x:=F]) = ((FV(G) - {x}) ∪ FV(F)) - {u}
by FV4. But {u} ∩ FV(F) = {}, so FV(F) - {u} = FV(F). Hence, remembering that u ≠ x,
FV(E[x:=F]) = ((FV(G) - {u}) - {x}) ∪ FV(F)
= (FV(E) - {x}) ∪ FV(F)
So our result holds for this sub-case, since x ∈ FV(G) and x ≠ u give x ∈ FV(E).
Finally suppose E = λu.G with u ≠ x, where u ∈ FV(F) and x ∈ FV(G), so that S7 applies. Let us choose a variable w for which w ∉ FV(G) and w ∉ FV(F). We have
E[x:=F] = λw. ((G[u:=w])[x:=F])    [1]
and by FV4 and [1]
FV(E[x:=F]) = FV((G[u:=w])[x:=F]) - {w}    [2]
Note that, by the height lemma, height(G[u:=w]) = height(G) ≤ n, so the inductive hypothesis may be applied to G[u:=w].
There are 2 sub-cases.
Sub-case u ∉ FV(G). In this sub-case, by the inductive hypothesis, FV(G[u:=w]) = FV(G). So, again by the inductive hypothesis, since x ∈ FV(G)
FV((G[u:=w])[x:=F]) = (FV(G) - {x}) ∪ FV(F)
since u ∉ FV(G). So that, substituting in [2], we obtain
FV(E[x:=F]) = ((FV(G) - {x}) ∪ FV(F)) - {w}
= (FV(G) - {x}) ∪ FV(F)
since w has been chosen not to be in either set of free variables. On the other hand, FV(E) = FV(G) - {u} = FV(G), since in this sub-case u ∉ FV(G). So we have
FV(E[x:=F]) = (FV(E) - {x}) ∪ FV(F)
proving the result for this sub-case, since x ∈ FV(G) and x ≠ u implies x ∈ FV(E).
Sub-case u ∈ FV(G). In this sub-case, by the inductive hypothesis and FV1,
FV(G[u:=w]) = (FV(G) - {u}) ∪ {w}
Thus x ∈ FV(G[u:=w]). Applying the inductive hypothesis again we obtain
FV((G[u:=w])[x:=F]) = (FV(G[u:=w]) - {x}) ∪ FV(F)
= (((FV(G) - {u}) ∪ {w}) - {x}) ∪ FV(F)
Now w ≠ x, since x in this case is a free variable of G, while w is chosen not to be. Hence we have by [2]
FV(E[x:=F]) = ((((FV(G) - {u}) ∪ {w}) - {x}) ∪ FV(F)) - {w}
= ((FV(G) - {u}) - {x}) ∪ FV(F)
= (FV(E) - {x}) ∪ FV(F)
establishing the result in this sub-case.
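At this point it may be reassuring to check the lemma on particular terms. A small sanity-check, again a sketch of our own built on the fv and subst functions above, simply compares the free variables of a substitution instance with the set the lemma predicts:

(* check_fv_lemma e x f : does FV(e[x:=f]) equal the set predicted by
   case [A] or [B] of the lemma for this particular e, x and f? *)
let check_fv_lemma e x f =
  let predicted =
    if S.mem x (fv e)
    then S.union (S.remove x (fv e)) (fv f)   (* case [A] *)
    else fv e                                 (* case [B] *)
  in
  S.equal (fv (subst e x f)) predicted

Running it on a few terms, e.g. check_fv_lemma (Lam ("y", App (Var "x", Var "y"))) "x" (Var "y"), should of course return true.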
Another little lemma we shall need says that substituting a variable for itself changes nothing: for any term E and any variable x, E[x:=x] = E. Again the proof is by induction on the height of E.
Base Case n=0
Suppose E is a variable u. If u = x then E[x:=x] = x = E by S1, and if u ≠ x then E[x:=x] = u = E by S2. Suppose E = c ∈ C. Then, by S3, E[x:=x] = c = E. So the result is satisfied for this case.
Inductive Step
Suppose that, for a given n, we have E[x:=x] = E for any term E with height(E) ≤ n.
and we see the result holds for this case.