Computer Science 591i
Theoretical Foundations of Programming Languages
and their Practical Realisation.


The Lambda Calculus Provides our Basic Formalism
The Syntax of the λ-calculus is simple
Functional languages don't have to look like LISP
β-reduction is our basic computation mechanism
The Pure Lambda Calculus is Theoretically Adequate for any Computation
Translating from the λ-calculus to the Java Virtual Machine
Proving that Programs are correct

Computer Systems generally inhabit a less challenging environment than bridges do, so one might reasonably expect that they would fall down less often. The central challenge of Computer Science is that the contrary is the case. Experience in other engineering disciplines would suggest that what is required is the specification and construction of computational systems within a framework that supports analysis.

Given the need for computational systems to be analysable, the question arises of how to strike a balance between what is analysable and what is practicable. For example, a vast corpus of material is written in the C language, which supports analysis very poorly indeed; however, C has proved to be a practical basis for the construction of systems at an early stage in the development of our discipline.

In this seminar, by contrast with C, we will look at one particular approach to creating software which makes analysability central, namely the Functional approach to Programming Language Design. This approach is based on designing languages which have a close formal relation to the λ-calculus.

So, to elaborate: generally the application of computers to engineering design is somewhat ahead of their application to software design. In particular it is common for engineers to use a tool such as Mathematica or Maple to assist in performing an analysis of the behaviour of a system before it is implemented. In this seminar we'd like to explore how functional languages can be used to address this gap by supporting detailed mathematical analysis of program behaviour.

In implementing programs it is common enough to employ what might be called "macroscopic" tools to analyse the behaviour of a program at a rather coarse level of detail. The most widely used such tool is the concept of data type which is built into most programming languages. Type-analysis will tell you that x+"fred" is an illegal expression (in most languages) because you can't add a number to a string. It won't tell you that you have made an error if you have written x+1 when you meant x-1.

On the other hand, Mathematica can tell you precisely that sort of thing. For example, given the equation x^2 + 2x + 1 = 0, Mathematica can determine that x = 1 is not a solution of that equation (or of course it can solve the equation for you, giving x = -1).
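For readers without Mathematica to hand, the same check can be sketched in a few lines of Python. This is a numerical spot-check, not the symbolic analysis a computer-algebra system performs:

```python
# Checking candidate solutions of x^2 + 2x + 1 = 0 numerically,
# in the spirit of what a computer-algebra system does symbolically.
p = lambda x: x * x + 2 * x + 1

print(p(1))   # 4, so x = 1 is not a solution
print(p(-1))  # 0, so x = -1 is a solution
```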

But computer programs in general are opaque to the sort of analysis that Mathematica and Maple can perform. Before we can even begin to prove that a program is correct, we have to characterise the behaviour of the programming language in which the program is written. Functional programming languages are characterised in a particularly simple way: every component of a program is treated as a mathematical function whose meaning follows the simple rule that within a given context identical expressions have identical values.

To give an example of what we mean, in a pure functional language a mathematical identity like:

         fred(x) + fred(x) = 2*fred(x)
should hold. This is not necessarily the case for a non-functional language: in Pascal or C, for example, fred might be a procedure which has a side-effect, so that calling it twice has a different effect from calling it once. For example fred might contain the assignment g := g+1 where g is a global variable. Thus the value of fred(x) varies with the number of times it has been evaluated.
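The difference can be made concrete in Python. The names fred_pure and fred_impure below are our own illustrative choices, with fred_impure mimicking the Pascal/C procedure that performs the g := g+1 side-effect:

```python
# A pure function: identical expressions have identical values,
# so fred(x) + fred(x) == 2 * fred(x) always holds.
def fred_pure(x):
    return x * x + 1

# An impure "function" in the style of the Pascal/C example:
# it increments a global g as a side-effect, so its value
# depends on how many times it has been called.
g = 0
def fred_impure(x):
    global g
    g = g + 1
    return x + g

assert fred_pure(3) + fred_pure(3) == 2 * fred_pure(3)        # identity holds
assert fred_impure(3) + fred_impure(3) != 2 * fred_impure(3)  # identity broken
```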

The Lambda Calculus Provides our Basic Formalism

If you ask a mathematician "what, fundamentally, is a function", he is likely to talk about a set of (argument,value) pairs. He may well characterise the relationship between arguments and values in some way that he regards as adequate, without any guarantee that there is any way, given an argument, of computing the corresponding value. For example, mathematicians can (and do) speak of a function f defined on the real numbers for which f(x)=0 when x is rational, and f(x)=1 when x is irrational. But any attempt to implement such a function computationally would require us to solve undecidable problems.

[Note that, while some programming languages refer to floating-point numbers as "real" numbers, they are in fact only crude approximations, quite inadequate for representing this particular function f.

An accurate representation of reals would require a given real to be represented as an infinite sequence of rationals. One can only implement such a thing computationally as a program, so deciding whether a particular real, so represented, is equal to a given rational amounts to deciding whether the program that represents the real produces the same output as a program that produces a constant rational. This is an instance of a class of problems known to be undecidable. ]
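The "real number as a program" idea can be sketched in Python, under the assumption that we represent a real by a function from an iteration count to an exact rational approximation. Here sqrt(2) is approximated via Newton's method on exact Fractions; the representation is our own illustration, not a full theory of computable reals:

```python
from fractions import Fraction

# A sketch of representing a real number by a program: given n, perform
# n Newton steps and return an exact rational approximation to sqrt(2).
def sqrt2(n):
    x = Fraction(2)
    for _ in range(n):
        x = (x + 2 / x) / 2   # Newton step; the result stays rational
    return x

approx = sqrt2(6)
# The approximation squares to very nearly 2, but no finite number of
# queries to such a program can decide whether the real it represents
# is itself rational.
assert abs(approx * approx - 2) < Fraction(1, 10**6)
```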

This kind of approach is fine for mathematicians, who are usually not too concerned about whether they can actually construct the entities of which they speak. As computer scientists we need some mathematical way of characterising functions which, as a formalism, guides us exactly to something we can realise as a computation.

The generally agreed solution to this requirement of constructability of functions is to use as our basic formalism the λ-calculus, invented by Alonzo Church, the American logician, in the 1930s. It was developed as a branch of mathematics, rather than as a programming language (it antedates all programming languages), and was thus defined essentially as marks on a piece of paper. However it is very easily represented in a computer. The λ-calculus is capable of expressing any computation, like a Turing machine, but, unlike a Turing machine, it can be made the basis of elegant and powerful computer languages.

One of the conceptual problems that beginners in formal logic face (and the λ-calculus is a formal logic) is the `bootstrapping conundrum', namely that we are trying to build a very formal system using our less formal ideas of ordinary mathematics, which were, in our teens, built using the even less formal ideas expressible in natural language. There is probably little that can be said except to point out that there is a computational analogy: if for example we want to build a super-secure computer system, it will depend on using less-secure tools, and ultimately on hand-crafted machine code which is not secure at all. Nevertheless, by careful checking and cross-checking, and incorporating tests for consistency, we can build systems in which we trust.

Expressions in the λ-calculus

Most programming languages have a quite complex syntax that can take days or even weeks to learn. The λ-calculus has an exceedingly simple syntax. The syntactic definition given below is intended to emphasise the role of the λ-calculus as a basis for programming languages. We use a bold font for non-terminal symbols of the syntax (for example E1).

The λ-calculus consists of a set of expressions. An expression E takes one of four forms:

    E ::= v               a variable
        | c               a constant
        | (E1 E2)         an application
        | (λ v . E1)      an abstraction

Basically, that's it! We can begin to give some meaning to this syntax, providing some explanation of what a given expression denotes. Doing this systematically will take quite a bit of work, but roughly we can say:

    a variable v denotes whatever it is bound to in the enclosing context;
    a constant c denotes a fixed object or operation, such as 5 or +;
    an application (E1 E2) denotes the result of applying the function denoted by E1 to the argument denoted by E2;
    an abstraction (λ v . E1) denotes that function which maps the variable v to the expression E1.
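This syntax is indeed very easily represented in a computer. Here is one possible Python rendering of the four expression forms as a tree data type; the class names Var, Const, App and Lam are our own choice, not fixed by the calculus:

```python
from dataclasses import dataclass

# A direct representation of λ-calculus syntax as a Python data type.
@dataclass(frozen=True)
class Var:            # a variable v
    name: str

@dataclass(frozen=True)
class Const:          # a constant c, e.g. 5 or +
    value: object

@dataclass(frozen=True)
class App:            # an application (E1 E2)
    fun: object
    arg: object

@dataclass(frozen=True)
class Lam:            # an abstraction (λ v . E1)
    var: str
    body: object

# (λx. (+ 5 x)) built as a tree:
example = Lam("x", App(App(Const("+"), Const(5)), Var("x")))
```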

As defined above, complex expressions in the λ-calculus would have a great many parentheses. To avoid the confusion to the human reader that all these parentheses create, it is conventional to write an expression of the form

    (...((E1 E2) E3)... En)

as (E1 E2 E3... En), and to write (λ v.(E1 E2...En)) as (λ v.E1 E2...En), or even to omit the outer parentheses if that can be done without ambiguity. The convention about the syntactic scope of a λ-abstraction is that it extends as far to the right as possible. Thus λ v.E1 E2 means λ v.(E1 E2) and not (λ v.E1) E2.

Note that application is not an associative operation, so that E1 (E2 E3) is not the same as (E1 E2) E3. Note also that in the λ-calculus every function takes exactly one argument, which is why the basic form of the abstraction is λ v.E, with just one variable v. However, provided we remember this fact, it is syntactically convenient to write

    λ v1 v2.E

to mean λ v1.(λ v2.E), and similarly for more variables.
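This convention, currying, can be mirrored directly in Python, where a "two-argument" function is written as a lambda that returns a lambda:

```python
# λ x y. (+ x y) is shorthand for λx. λy. (+ x y): every function
# takes exactly one argument, and "two-argument" application is
# really two applications in a row.
add_curried = lambda x: lambda y: x + y

add5 = add_curried(5)        # partial application: the "inner" function
assert add5(22) == 27
assert add_curried(2)(3) == 5
```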

LISP and the λ-calculus

A word of warning is needed here for readers who are familiar with the LISP language, or any derivative such as Scheme: LISP appears to have been inspired by the λ-calculus, but it is certainly not an exact implementation of it, despite the significant syntactic similarity. Thus the LISP S-expression (u v w) does not mean ``apply u to v and then apply the result to w'', as it would in the λ-calculus.

However more modern functional languages, which on the surface appear more remote from the λ-calculus, are in fact much closer. For example the SML expression:

    fn x => x*x;

is quite a faithful representation of the λ-calculus expression λ x. (* x x).

β-reduction: Our Basic Computation Mechanism

Above we stated that the abstraction λ v.E could be understood as meaning "that function which maps the variable v to the expression E". We can give a practical meaning to this statement by defining exactly what it means to apply a λ-expression to an argument. Consider the application

    (λ x. (+ 5 x)) 22

We've said that λ x. (+ 5 x) means "that function which maps x to (+ 5 x)". So, if we give it an argument 22 we should expect to get (+ 5 22) as the result. That is to say, we have substituted 22 for x in the body of the expression.

To make this precise we have to define what we mean by substitution. We need a little notation first. We'll say that

    E1[E2/x]

means "the expression E1 with E2 substituted for x". Roughly what this means is "go through the expression E1 and replace every occurrence of x by E2". However substitution is not quite as simple as that. For example the variable x might be used (or, as we say, rebound) in a λ-abstraction occurring inside E1; and this isn't the only complication. We'll defer the full definition of substitution until later.

The process of converting the application of a λ-abstraction to an argument, by substituting the argument for the variable of the abstraction in the body of the abstraction, is called "β-reduction". We write

    (λ v.E1) E2  →  E1[E2/v]

For example, (λ x. (+ 5 x)) (+ 2 3)  →  (+ 5 (+ 2 3))

There is no requirement that the argument of the application should be "worked out" before a β-reduction is done. So, in the above example, we don't work out that (+ 2 3) is 5 before doing the β-reduction - indeed such "working out" is treated by another mechanism, δ-reduction, discussed later.
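A minimal sketch of substitution and β-reduction can be written in Python over a tuple encoding of terms (our own encoding, chosen for brevity). Note that the substitution below is deliberately naive: it respects rebinding but does not rename variables to avoid capture, which the full definition must:

```python
# Tuple encoding of λ-terms:
#   ('var', v) | ('lam', v, body) | ('app', f, a); anything else is a constant.
def subst(e, x, a):
    """e[a/x]: replace occurrences of variable x in e by a (capture-naive)."""
    if isinstance(e, tuple):
        tag = e[0]
        if tag == 'var':
            return a if e[1] == x else e
        if tag == 'lam':
            if e[1] == x:                 # x is rebound here: stop substituting
                return e
            return ('lam', e[1], subst(e[2], x, a))
        if tag == 'app':
            return ('app', subst(e[1], x, a), subst(e[2], x, a))
    return e                              # a constant

def beta(e):
    """One β-step on a top-level redex ((λv. body) arg) -> body[arg/v]."""
    if isinstance(e, tuple) and e[0] == 'app' and \
       isinstance(e[1], tuple) and e[1][0] == 'lam':
        _, v, body = e[1]
        return subst(body, v, e[2])
    return e

# (λx. (+ 5 x)) (+ 2 3)  ->  (+ 5 (+ 2 3)): the argument is NOT worked out first.
plus = lambda e1, e2: ('app', ('app', '+', e1), e2)
redex = ('app', ('lam', 'x', plus(5, ('var', 'x'))), plus(2, 3))
assert beta(redex) == plus(5, plus(2, 3))
```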

The Pure Lambda Calculus

If we take the λ-calculus as specified above and throw out the constants, we obtain the pure λ-calculus. It turns out that the pure calculus is theoretically adequate to specify any computation in much the same sense that a Turing machine is adequate. This is achieved by using β-reduction successively to transform a computation expressed as an expression of the λ-calculus into another expression which is the result of the computation.

Mathematicians, who as a class have swallowed Occam's razor hook, line and sinker, may be satisfied with the pure calculus. [The author trusts that his mixed metaphors will not give his readers indigestion.] However, as a model for practical computation, it is inadequate. In the pure calculus, for example, one can represent the number 3 as the λ-abstraction λ f x. f(f(f(x))). This is similar to the use of the Peano postulates in your Discrete Mathematics course to do arithmetic. You'll remember that 2+1 = 3 is rendered as:

    s(s(0)) + s(0) = s(s(s(0)))

Neither is much help if we want to calculate our income tax! However, since the pure calculus is nice and simple, we'll make some use of it in our theoretical discussions.
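The encoding of 3 as λ f x. f(f(f(x))), a Church numeral, can be played with directly in Python, since Python lambdas nest the same way:

```python
# Church numerals in Python: the number n is the function that applies
# f to x exactly n times. No built-in integers are needed in the calculus.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add  = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

three = lambda f: lambda x: f(f(f(x)))          # λ f x. f(f(f(x)))

# Decode a Church numeral by applying "+1" to 0:
to_int = lambda n: n(lambda k: k + 1)(0)

assert to_int(three) == 3
assert to_int(add(succ(succ(zero)))(succ(zero))) == 3    # 2 + 1 = 3
```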

Functional languages and the λ-calculus

The λ-calculus serves as a mathematical specification for the design of functional programming languages. However, design of an actual language gives rise to complications, syntactic and semantic. The syntactic complications serve one main purpose, namely to assist human comprehension of a program written in a language. We have already seen two such concessions to human frailty in the suppression of unnecessary parentheses and λ's.

Another feature of programming languages that is something of a historical accident is that a program is usually composed using only ASCII characters.

Within the constraint of using ASCII, most functional languages make further concessions to perceived human need by adopting syntactic conventions that approximate to normal mathematical usage. For example most functional languages will allow the user to render:

    (+ u (* v w))
as
    u + v*w

Provided that the translation of such syntax back to the λ-calculus is rigorously defined, there is no objection to its use.

There are a number of conventions for rendering λ-abstractions in ASCII. The one we will prefer is based on that used in Haskell (though Haskell itself writes \v -> E). The abstraction λ v.E is rendered in ASCII as \v.E' where E' is the ASCII rendering of E. For example our old friend λ x. (+ 5 x) is rendered as

      \x. 5+x

How do we do "permanent" definitions?

One feature of the λ-calculus that might seem to be an occasion of distress is that there is no apparent way of creating named procedures as we can in most programming languages. For example in C we can write:

int f(int i) { return 5 + i; }

creating a procedure named "f" that we can use elsewhere in our program. Actually, we can do this kind of thing in the calculus, but in a rather inside-out kind of way. Suppose we wanted to use the C function defined above to compute f(3)+f(4), which we'd do as:

int main() { printf("answer %d\n", f(3) + f(4)); }

This can be rendered in the λ-calculus as

    (λ f. (f 3) + (f 4)) (λ i. 5+i)

Leaving aside the fact that we print the answer in C (how you do input/output in a functional way is a subject to be discussed later), this works fine, but it has an unfortunate form in which the use of f is placed between the introduction of the variable f and the λ-abstraction that is its value. Practical functional languages provide at least one syntactic form that addresses this problem. In our case we'll write:

    let f = \i. 5+i in (f 3) + (f 4) end let
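As a sanity check, the λ-calculus expansion underlying this let form can be transcribed into Python, which has a lambda of its own:

```python
# "let f = \i. 5+i in (f 3) + (f 4)" is just sugar for applying a
# λ-abstraction whose variable is f to the definition of f:
answer = (lambda f: f(3) + f(4))(lambda i: 5 + i)
assert answer == 17      # (5+3) + (5+4)
```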

Translating from the λ-calculus to the JVM

The Java Virtual Machine, while intended primarily to support the Java language, provides something of a generic target architecture for compilers. In this seminar we will aim to create an implementation of a simple functional language which runs on the JVM. While we will create Java classes directly, without using the Java compiler, we can readily specify what these are by making use of Java syntax itself.

Our language will have the basic data-types of the Scheme language together with certain types which are designed to support proof of correctness of programs written in the language. However, while we will support Scheme data-types, the language itself will be a more rigorous implementation of the λ-calculus than Scheme is. The main data-types will each be a Java class.

The following is an interim specification of some of the types. It is to be understood that these will be suitably located in the Java class hierarchy.


Proving that Programs are Correct

The λ-calculus was intended as a kind of Mathematical Logic, and we shall use it as such to construct proofs about programs in our language within our language. Essentially the idea is that since the λ-calculus provides the definition of our language, we can require that any implementation obeys the laws of the calculus, together with additional laws which characterise the behaviour of the built-in System capabilities.

For example, suppose we want to prove that a sorting function that we have written, say merge_sort, is correct. Then since we are operating within the functional paradigm, merge_sort should be exactly equivalent to an expression of the λ-calculus. What does it mean to say that merge_sort is correct? There are two criteria: the output list must be in order, and it must be a permutation of the input list.
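The two criteria lend themselves to mechanical spot-checking. Below is an illustrative Python merge sort (our own, not the merge_sort of the text) checked against both criteria on random inputs; such testing gives evidence of correctness, not the proof we are after:

```python
from collections import Counter
import random

# An illustrative merge sort, checked against the two correctness
# criteria: the output is ordered, and is a permutation of the input.
def merge_sort(xs):
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    return merge(merge_sort(xs[:mid]), merge_sort(xs[mid:]))

def merge(a, b):
    out = []
    while a and b:
        out.append(a.pop(0) if a[0] <= b[0] else b.pop(0))
    return out + a + b

for _ in range(100):
    xs = [random.randint(0, 9) for _ in range(random.randint(0, 10))]
    ys = merge_sort(xs)
    assert all(ys[i] <= ys[i + 1] for i in range(len(ys) - 1))  # ordered
    assert Counter(ys) == Counter(xs)                           # a permutation
```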

This is a hard problem

Proving that programs are correct at the level of detail to which we aspire is an undecidable problem, for our approach is in effect to show that two different programs always produce the same result when given the same data, which is known to be undecidable if taken over a wide enough class of programs.

However, there is nothing in the undecidability result which says that it is impossible to prove that a particular program is correct. So, it is reasonable to attack the problem of proof with a combination of human ingenuity controlling a collection of inference rules provided as functions in the language itself.

The Abstract Data-type "Theorem"

Practically, this can be realised by using an abstract data-type Theorem with the property that the only ways a language-user can get hold of a theorem are guaranteed to ensure the validity of the theorem.

A theorem has two essential components, its premises and its conclusion. Our theorems will also carry a history with them. A theorem prints as

    premises |- conclusion
For example, we might have
    Integer.? x, Integer.? y |- x+y = y+x
which states that it doesn't matter whether you add x to y or y to x. If there are no premises, then we write simply
    |-  conclusion
For example we require:
    |-  x = x
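This discipline, in the style of the LCF family of theorem provers, can be sketched in Python. Everything here (the _MAKE token trick, the refl and sym rules) is our own illustration, not the actual rule set of the language being described:

```python
# An LCF-style sketch: Theorem objects can only be produced by trusted
# inference rules, so every Theorem in existence is valid by construction.
class Theorem:
    def __init__(self, premises, conclusion, _token=None):
        if _token is not _MAKE:            # reject construction from outside
            raise TypeError("Theorems come only from inference rules")
        self.premises = tuple(premises)
        self.conclusion = conclusion

    def __repr__(self):                    # prints as  premises |- conclusion
        pre = ", ".join(self.premises)
        return (pre + " " if pre else "") + "|- " + self.conclusion

_MAKE = object()                           # private capability token

def refl(term):
    """Axiom:  |- term = term"""
    return Theorem((), f"{term} = {term}", _token=_MAKE)

def sym(thm):
    """Rule: from  premises |- a = b  infer  premises |- b = a"""
    a, b = thm.conclusion.split(" = ")
    return Theorem(thm.premises, f"{b} = {a}", _token=_MAKE)

print(refl("x"))      # |- x = x
```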

The meaning of theorems is a bit tricky to define. Essentially, the premises are a sequence of things that look like terms of the -calculus, while the conclusion is one of these term-like things. Notionally, the premises and conclusion "evaluate" to one of the values "true" or "false", and we require of our theorems that if all the premises "evaluate" to true, so does the conclusion.

The tricky bit is that such evaluation is not constructive. This has to be so because we are interested in proving theorems about equality of functions. We'd like, for example, to be able to prove that

    |-  \x. x+3 = \x. 3+x

The conclusion \x. x+3 = \x. 3+x looks like a term of the λ-calculus, but we can't evaluate it in the ordinary way we'd evaluate a program. Instead, we have to employ a mathematical concept of evaluation. In this particular case, the meaning of the equality can be taken to be that if each of these λ-abstractions is given any value from a countably infinite universe of values, the results are computably identical.
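We can, of course, gather finite evidence for such an equality by sampling, though this falls well short of the mathematical evaluation just described:

```python
# Sampling both functions over a finite range: this can refute a claimed
# equality or lend it confidence, but it is evidence, not a proof.
f = lambda x: x + 3
g = lambda x: 3 + x

assert all(f(x) == g(x) for x in range(-100, 100))
```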

Theorems provide arbitration between users and implementors. If a user can exhibit a theorem which says that his program ought to work, and it doesn't, then he has a cast-iron case for complaining to the implementor of the language that his implementation is broken.

Obtaining Theorems

There are 3 ways that a user can obtain a theorem.