This document provides a basic introduction to ML as implemented in POPLOG. Below is an annotated list of the available HELP files (on-line documentation) relating to POPLOG ML. The rest of this document is a HTML version of help PML and help EXAMPLES. Note that the name of the help file can be typed in upper or lower case.
CONTENTS
POPLOG ML (PML) is an implementation of the programming language Standard ML (SML). SML can be characterised as a strongly-typed, statically-scoped functional language with a strict semantics. "Functional" means that functions are first-class data objects - they may be passed as parameters, returned as results and embedded in data structures - and that the principal mode of computation is recursive function application. "Strongly typed" means that every expression has a type which is determined automatically by the compiler without the need for type declarations. Types can be polymorphic so that functions and data constructors can be defined which work uniformly on arguments of many types; the type system guarantees that any expression which can be typed will not generate type errors at run time. Other features of the language include a type-secure exception mechanism (for handling run-time errors) and a powerful module facility for the support of large-scale programming projects. For a proper definition of the Standard ML language and some guides to ML programming see the references cited below.
PML is provided as a POPLOG saved-image file; it can be run by typing to the shell
pml (on UNIX systems)
pml %x (on UNIX systems with X windows)
or
pop11 /pml (on VMS systems)
After a short delay, PML will print up a version banner to announce its presence, followed by the word "Setml" and the primary prompt "- ". For example
POPLOG ML (Version 1.0) Wed Sep 9 10:08:17 BST 1987 Setml -
"Setml" is printed whenever the PML system is reset - on start-up, or after an error or an interrupt. The primary prompt indicates that PML is waiting for new input from the terminal; a secondary prompt, "= ", is used whenever the system is expecting something extra, i.e. when an input line is incomplete. You can quit from PML by typing the "end-of-file" character to the primary prompt; on most systems this will be either CTRL-Z or CTRL-D. You can interrupt it by typing the interrupt or break character, usually a CTRL-C. During an evaluation, this generates an interrupt exception which can be caught and handled by an ML program. The default behaviour if the exception is not caught is to abandon the evaluation, reset the system and return to top-level; outside of an evaluation (during compilation) this will always be the case as it is not possible to catch interrupts arising at compile time.
The PML environment can be tailored by means of two initialisation files which are compiled whenever PML is run. The two files provide for both system-wide and per-user initialisation. Both files have the name init.ml but are sought in different directories. The system-wide file should be placed in the directory $popsys; this is compiled first. The per-user file should be placed either in the directory $poplib (which is distinct for each user) or else in the user's home directory; PML will look for the file in $poplib first, but will go to the home directory if the file can't be found there. This initialisation behaviour is the same as that of other parts of the POPLOG system - VED and POP-11 itself for example. It is described in more detail (with an explanation of the variables $popsys and $poplib) in the file HELP * INITIAL.
You should also copy ~pop/vedinit.p to your home directory to give you emacs compatibility.
PML is an interactive language. Declarations terminated by semicolons are typed to the prompt; these are read, analysed, compiled and evaluated and any resultant bindings are printed out together with their types. A useful form of input allows an expression
exp;
to be entered as shorthand for the binding
val it = exp;
In this case, just the value of the expression is printed with its type. HELP * SYNTAX gives a semi-formal description of the syntax of ML declarations and expressions. The PML top-level will also accept any of a small set of commands in place of a declaration; these commands give access to some of the extra features provided by the POPLOG system. The commands currently available are
cd help im load pop11 pwd showdef teach ved
They are described in the file HELP * COMMANDS.
Errors occurring at compilation time fall into two main categories: syntax errors, which cover all violations of the grammar rules, and type errors which cover all errors arising from the static analysis of the program done before evaluation starts. The principal type error arises when an expression or declaration is badly typed and the type-checker is unable to unify the types of two of its sub-components, but other faults (such as undefined names) will also give rise to type errors. A third (unnamed) category of errors arises from miscellaneous events such as trying to load a non-existent file. Error messages will identify the category of the error and its context. Wherever possible, the context will include the name of the file being compiled and the line number. Once the parsing phase is over, however, this information is lost and the context is given instead as the chain of declarations in which the error occurred, starting with the most local. Static type checking means that the majority of programming errors will be caught at compile time. Most of the unavoidable run-time errors (such as division by zero) and external interrupts manifest themselves as built-in exceptions which (if not caught by the user program) will escape to top-level and print an EXCEPTION message. There are also other errors which arise from deeper inside the POPLOG system and which cannot be handled by the user: when the heap space becomes exhausted, for example, or the call stack overflows. Such errors appear as standard POPLOG mishaps.
PML runs as a subsystem of the POPLOG system with the same status as the other POPLOG languages: POP-11, Common Lisp and Prolog. The PML command ved gives access to the VED editor which supports the incremental compilation and immediate mode evaluation of ML programs; on-line documentation can be read with the commands help and teach. Access to the language POP-11 (and from there to other languages) is provided by the command pop11; it is not at the moment possible to write mixed language programs which include ML, but this facility should be available soon. Refer to HELP * MLVED and HELP * COMMANDS for more details.
PML can be considered as a full implementation of the ML standard as described in reference [1] below. Inevitably, however, there are certain minor differences and extensions. These are as follows: (1) Equality: PML's equality function
val op = : 'a * 'a -> bool
is a true polymorphic function, i.e. it can be applied to items of any type. This is for convenience. Its meaning on functions is an accident of implementation; on abstract data types it is structural equality. (Future versions of PML should allow abstract types to include their own definitions of equality.) (2) Variables in type constraints: These respect the scoping rules of section 9 of the standard, but may be constrained by the type-checker. (3) The structure ExtendedIO Only part of this structure is provided by PML. The names missing are:
val execute : string -> instream * outstream val can_input : instream * int -> bool
These should be provided in a later release. (4) Pervasiveness Only those primitive names which are built-in to SML are normally visible inside the bodies of structures and signatures. PML includes a new declaration form pervasive which allows this set of primitive names to be extended. (See HELP * PERVASIVE). (5) Comments In addition to the SML comment brackets, PML also supports the standard POPLOG end-of-line comment introduced by the symbol ";;;" i.e. all text after this symbol up to the end of the same line is ignored by the lexical analyser.
HELP * HELPFILES A list of all the PML help files currently available HELP * EXAMPLES Some simple examples of programming in ML
[1] Robert Harper, David MacQueen, Robin Milner STANDARD ML University of Edinburgh technical report, LFCS-86-2, March 1986
[2] Robert Harper INTRODUCTION TO STANDARD ML University of Edinburgh technical report, LFCS-86-14, November 1986
[3] A. Wikstrom FUNCTIONAL PROGRAMMING USING ML Prentice-Hall, 1987
%HELP EXAMPLES Robert Duncan, Sept 1987. Examples of the use of ML
Some simple examples demonstrating how to use the language ML. The examples illustrate only uses of the core language; examples of modules are not yet available.
For an introduction to the PML system see HELP * PML and for a full grammar see HELP * SYNTAX. CONTENTS of CHAPTER
PML is an interactive language. It prompts for input initially with a single dash, "-", but if input spreads over more than one line a secondary prompt, "=", is used on lines after the first. Everything typed in to the prompt is a declaration. For example, the name x can be declared to have the integer value 3 by typing:
val x = 3;
Note the keyword val indicating that this is a value declaration (rather than a type or exception declaration - see below) and the terminating semicolon. For each such declaration, PML will respond with a list of the bindings it has made: the response from the above input would be:
val x = 3 : int
This confirms the nature of the declaration (val), and indicates the name bound, its value and its type. In the examples which follow, the output produced by PML will be shown immediately after the example, just as it would appear in a real session. Thus:
val x = 3; val x = 3 : int
You can confirm that the output shown is correct by marking and loading the example using the load marked range facility provided by VED (and described in HELP * MLVED). Once a name has been bound to a value (such as x above) it can then be used to stand for that value in subsequent declarations. So we could declare the value y as being twice that of x by doing
val y = 2 * x; val y = 6 : int
Often it is convenient not to have to give explicit names to values computed at top-level, and a shorthand form of declaration is provided for that: an expression typed in on its own, E say, with no introductory val, is interpreted by PML as being short for the declaration
val it = E;
i.e. a binding to the name it. So, for example, the input
(x + y) * (x + y);
is exactly equivalent to
val it = (x + y) * (x + y);
but the output produced by this kind of binding is shortened too, with only the value being printed. The result of the example above is just
81 : int
A consequence of this is that the name it is always bound to the result of the last expression evaluated:
it; 81 : int
Every expression in ML has a type. If the type of some expression E is ty say, then we write
E : ty
The colon symbol (:) is pronounced "has type" (thus: "E has type ty"). We have seen some examples of this already in the output from the declarations above, as PML displays the type of every value it computes. The type int (short for integer of course) is one of the basic (or primitive) types of ML and is built in to the language. The complete set of basic types is as follows:
1 ~1 256 ~300000
The prefix "~" indicates a negative number. Integers in PML can be of any magnitude.
1.0 ~1.0 0.3333 1E5 2.35E~4
Real numbers are always distinct from integers as they must include either a decimal point or the exponent symbol E (or both). The two types can't be mixed directly in expressions either; the built-in functions real and floor must be used to convert between them. E.g.
floor(22.0 / 7.0); 3 : int real it; 3.0 : real
"" "a" "More matter, with less art" "2 + 2 = 4"
Operators on strings include size, which returns the length of a string, and "@", which joins two strings:
val message = "Hello world\n"; val message = "Hello world
" : string size(message ^ message ^ message); 36 : int
true false
Built-in predicates such as "=" (general equality) and "<" (less-than ordering) return values of this type, and there is a conditional expression form which can choose between alternative values based on a boolean test. For example:
if x = y then "yes" else "no"; "no" : string
(); () : unit
This type is analagous to the void types which exist in other languages; its principal use is to provide a value in those cases where no value is really required. For example, the built-in function output, which prints on an output stream, is called purely for its side-effect and so returns "()" as a result:
output(std_out, message); Hello world () : unit
These are all constant (or nullary) types, but ML also provides a set of built-in type constructors for building complex types from more basic ones. The first example of these is the list type constructor. A list, as a value, is some arbitrary-length sequence of values of a given type. The list type constructor thus takes a single type as an argument (the type alpha, say) and constructs a new type which is the type of lists of values of type alpha. For every type alpha in ML, there is a type of lists of alphas. There are an infinite number of such types, but here are some particular examples:
string list (lists of strings) unit list (lists of units) int list list (lists of lists of integers)
List values of known length can be written with a special syntax using brackets ("[" and "]") and commas:
["More", "matter", "with", "less", "art"] : string list [(), (), (), ()] : unit list [[1, 2], [3, 4], [5, 6, 7]] : int list list
Lists can also be constructed dynamically using the operator "::" (pronounced cons, short for construct) which adds an item to the front of a list. For example:
val numlist = [3, 4, 5]; val numlist = [3, 4, 5] : int list 1 :: 2 :: numlist; [1, 2, 3, 4, 5] : int list
In fact, if we were to start always with an empty list, written as "[]" (or sometimes called "nil"), then we could construct every list using "::", as in:
"More" :: "matter" :: "with" :: "less" :: "art " :: []; ["More", "matter", "with", "less", "art "] : string list 1 :: 2 :: 3 :: 4 :: 5 :: nil; [1, 2, 3, 4, 5] : int list
Just as the type constructor list constructs new types from more basic ones, the so-called value constructors nil and "::" construct complex lists from more basic values. This relationship between type constructors and value constructors becomes more apparent with user-defined types discussed below. The type of the empty list is of interest:
[]; [] : 'a list
The identifier 'a (pronounced "alpha") is a type variable which can stand for any ML type. Because there are an infinite number of types constructed with list but only one empty list, all the list types must share the same empty list. The type of the empty list is thus a general type which can be instantiated to any particular list type. For non-empty lists, every item in the list must be of the same type. There is no legal type which could be attached to an expression such as
[1, "elephant"] : ???
Any expression which tries to mix types in the same list will cause a type error:
val bad_list = [1, "elephant"]; ;;; ML TYPE ERROR - Cannot unify the following types ;;; FOUND : string ;;; WANTED : int ;;; CONTEXT : val bad_list
This error message explains that in the context of compiling the declaration of bad_list, PML has discovered a wrongly-typed expression. Specifically, it has encountered a string where it was expecting another integer (since a list starting with the value 1 should contain nothing else but integers). This is no less of an error than (for example) trying to add together a string and an integer:
val bad_sum = 1 + "elephant"; ;;; ML TYPE ERROR - Cannot unify the following types ;;; FOUND : string ;;; WANTED : int ;;; CONTEXT : val bad_sum
To construct aggregate types whose components are of different types, ML provides the tuple type constructor, written as an infix "*". This constructor takes two or more types as arguments. For example:
int * real (a 2-tuple) int * real * string (a 3-tuple) int * real * string * bool (a 4-tuple)
The syntax for writing tuple-typed values uses parentheses ("(" and ")") and commas:
(1, 1.0) : int * real (1, 1.0, "a") : int * real * string (1, 1.0, "a", true) : int * real * string * bool
There is no mechanism by which tuples can be constructed dynamically (i.e. there are no value constructors for tuples): lists can be dynamic only because they contain objects of a single type; tuples may contain different types but must be of fixed length. Other type constructors built in to ML are:
The type of a value is normally deduced by the compiler (as in all the above examples) without the need for explicit declarations. An exception to this arises in the case of functions which are overloaded (this concept is explained later). Where a type declaration is needed, it can be attached to any expression or binding in the manner we have seen already. For example:
val pair : bool * int = (true, 0) and n : int = 1; val pair = (true, 0) : bool * int val n = 1 : int
Such type declarations (or constraints as they are usually called) can be useful too for documentation purposes, particularly where expressions involve complex, user-defined types. Well-placed type constraints can add useful information for the reader of a program.
Functions in ML are simply values, with the same status as values of other types. The most general type of a function is
'a -> 'b
(pronounced alpha arrow beta): this represents a function which takes an argument of some type alpha and returns a result of some (possibly different) type beta. Particular functions will have the type variables 'a and 'b instantiated in different ways. Thus:
real; fn : int -> real floor; fn : real -> int
Because functional values have no meaningful printed representation, they are always displayed by PML just as the word fn. There are no value constructors for function types, so a special syntax must be used to create new functional values. Here is an example of a function which doubles its argument (the function has a single formal parameter x and computes a result which is 2 times the value of x):
fn x => 2 * x; fn : int -> int it 4; 8 : int
We could give this a name using a standard value declaration:
val double = fn x => 2 * x; val double = fn : int -> int double 4; 8 : int
Named functions can be made recursive by use of the keyword rec:
val rec fac = fn n => if n = 0 then 1 else n * fac(n-1); val fac = fn : int -> int fac 6; 720 : int
We can see from this last example (and from some earlier ones) how function application is represented simply by the juxtaposition of function and argument. Parentheses aren't necessary except to change the order of evaluation: function application binds more tightly than any other expression construct in ML and so will always be evaluated first unless parentheses are used to change this. Compare:
double 4 + 4; 12 : int double (4 + 4); 16 : int
An important point to note about functions is that every function takes only a single argument (this is apparent from the function type, which specifies only one argument and one result). Functions which are to operate on more than one value may be written in either of two ways. Firstly, the two (or more) values needed may be provided to the function as a tuple. Here, for example, is the maximum function on integer pairs:
val max : int * int -> int = fn (n, m) => if n >= m then n else m; val max = fn : int * int -> int max(0, 1); 1 : int
This concept can cause confusion to those used to programming in other languages, as max here looks very much like a two-argument function would if declared in, say, POP-11 or Pascal. However, the following example emphasises that max does indeed take a single argument, a tuple, by constructing the tuple separately. This example also demonstrates the use of a let expression which localises a declaration to an expression; the binding of arg is not visible at top-level.
let val arg = (0, 1) in max arg end; 1 : int
Many of the built-in functions we have seen already (such as +, *, ^) are actually defined in this way. Their applications are normally written differently however, as they are all declared to be infix operators which means that they can be written between the two components of their arguments. Such operators can only be written in the usual way if the keyword op is used to turn off their infix status. For example:
op ^; fn : string * string -> string op *(4, 2); 8 : int
We could declare max as infix too if we wished:
infix max; 0 max 1; 1 : int
An alternative solution for functions which require multiple arguments is to have functions which return functions as results. Here's a function which adds an item to the end of a list (rev is a built-in function which reverses the order of items in a list):
val add_to_end = fn item => fn list => rev (item :: (rev list)); val add_to_end = fn : 'a -> 'a list -> 'a list
Note the type: the argument to add_to_end is an alpha-value and the result is a function from alpha-lists to alpha-lists (the function arrow -> associates to the right). Using the function requires two applications:
add_to_end 4 [1, 2, 3]; [1, 2, 3, 4] : int list
but the advantage of this is that we can apply it just once, and then save the resulting function for use later.
val add_4_to_end = add_to_end 4; val add_4_to_end = fn : int list -> int list add_4_to_end [1, 2, 3]; [1, 2, 3, 4] : int list add_4_to_end it; [1, 2, 3, 4, 4] : int list
ML provides a more concise syntax for function declarations, introduced by the keyword fun. We would normally define add_to_end as follows:
fun add_to_end item list = rev (item :: (rev list)); val add_to_end = fn : 'a -> 'a list -> 'a list
The effect is exactly the same as with the previous definition, but the declaration is shorter and clearer and the function created will be more efficient to run. Also, functions declared with fun are implicitly recursive, so there is no need for the rec keyword. As a further example of a simple function declaration, we will define the function iota which, when applied to an integer argument n, constructs a list of all the integers from 1 to n. (The name iota is taken from the language APL.) The definition makes use of another let expression to define a local function Iota which generates the list of integers from n1 to n2.
fun iota n = let fun Iota n1 n2 = if n1 > n2 then [] else n1 :: Iota (n1+1) n2 in Iota 1 n end; val iota = fn : int -> int list iota 10; [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] : int list
A complexity in the ML type system arises when the same function name is used to stand for functions of different types. This concept is called overloading. Many of the built-in maths functions have this property. For example:
2 + 3; 5 : int 2.0 + 3.0; 5.0 : real
Here the name "+" is used for both the integer and real addition functions. The relational operators are like this too:
"a" < "b"; true : bool 1 < 2; true : boolSuch overloaded names must always be used at a particular type, and type constraints are sometimes necessary to indicate which type is intended. For example, an alternative definition of double might be:
fun double x = x + x;but this will fail with the message:
;;; ML TYPE ERROR - Cannot determine a type for overloaded operator ;;; INVOLVING: + ;;; CONTEXT : fun doubleas there is no indication of whether the parameter x is meant to be integer or real. A type constraint is necessary:
fun double x : int = x + x; val double = fn : int -> intThis concept is quite distinct from that of polymorphism, where the same function may be used on objects of different types. List concatenation (written "@") is an example:
fun dup x = x @ x; val dup = fn : 'a list -> 'a list dup [2]; [2, 2] : int list dup [2.0]; [2.0, 2.0] : real listPolymorphism is the rule and overloading the exception in ML.
fun fac 0 = 1 | fac n = n * fac(n-1); val fac = fn : int -> int fac 6; 720 : intIn any particular application of the function only one of the alternative clauses can be used; the process by which a particular alternative is chosen is called pattern matching. The formal parameters in each clause (0 and n in this case) are called patterns. Patterns mirror the structure of values, and their function is to decompose values into their constituent parts. We say that a pattern matches a value if the pattern has the same structure as the value; whenever a pattern matches a value, the value can be decomposed according to the pattern. For example, the pattern 0 matches the integer value 0 only; the pattern n however, being a variable, matches any value (although in the context of the definition of fac, the value must be an integer). A pattern such as (0, n) would match any 2-tuple whose first component was 0. Variables which occur in patterns (such as n in fac) are so-called binding occurrences, i.e. the match, if it succeeds, binds the variables in the pattern to the corresponding components of the argument value, and the bindings remain visible throughout the right hand side of the chosen clause. Thus matching n against, say, 3, would bind n to the value 3, as would matching (0, n) against (0, 3). Applying fac to the value 0 we get:
fac 0; 1 : intThe argument 0 matches the pattern in the first clause of fac, and so that clause is chosen. The right hand side of the clause, 1, is returned straight away. Conversely, the application:
fac 3; 6 : intmatches the second rule of fac, which, by the binding rules described above, expands the right hand side of the rule to
3 * fac(3-1);By following through the recursive call (and a further two calls) the answer
3 * 2 * 1 * 1 = 6is computed. It's not only constants, variables and tuples which can be used to build patterns: the value constructors of types can be used too. This means that we can pattern-match on lists (but not on functions, which have no value constructors). In a context where we have defined:
a : 'a x : 'a listthen we can construct the value
(a :: x) : 'a listConversely, we can use the pattern (a :: x) to do the opposite task: to decompose any non-empty list into two parts a and x, a being the fist item in the list (the head of the list) and x being the list of all items but the first (the tail of the list). The function length uses pattern matching on lists to measure the length of a list: the length of the empty list is obviously zero, while the length of any non-empty list must be 1 greater than the length of its tail.
fun length [] = 0 | length (a::x) = 1 + length x; val length = fn : 'a list -> int length []; 0 : int length numlist; 3 : int
It's important that for every function definition, at least one clause should match for every possible value to which the function might be applied. If this were not the case, the function would be undefined for some values and a run-time error could arise. The ML compiler will give a warning about any declaration which is not complete in this sense. For example, this definition of sumlists which adds together two integer lists component-wise is incomplete:
fun sumlists [] [] : int list = [] | sumlists (n :: ns) (m :: ms) = (n + m) :: sumlists ns ms; ;;; ML WARNING - Clauses of function declaration are non-exhaustive ;;; CONTEXT : fun sumlists val sumlists = fn : int list -> int list -> int listThe definition leaves an expression such as
sumlists [1] [];undefined. Some additional constructs are available for patterns which make possible some sophisticated effects. The next function, removedups, removes duplicate items from a sorted list; the restriction to sorted lists allows us to assume that duplicate items will always be adjacent to one another in the list. Two new pattern constructs are used: the wildcard pattern, written using the underscore as "_", matches any value, but creates no new bindings so that the matched value is effectively discarded; the layered pattern, written var as pattern, both binds the matching value to the variable var, but also matches it against the subsidiary pattern which may decompose it further. With this construct, both the whole value and its component parts can be made accessible.
fun removedups (a :: (l as b :: _)) = if a = b then removedups l else a :: removedups l | removedups l = l; val removedups = fn : 'a list -> 'a list removedups [1, 1, 2, 3, 4, 4, 5, 6, 6, 6, 7]; [1, 2, 3, 4, 5, 6, 7] : int list
datatype day = Sun | Mon | Tue | Wed | Thu | Fri | Sat; datatype day constructor Fri : day constructor Mon : day constructor Sat : day constructor Sun : day constructor Thu : day constructor Tue : day constructor Wed : dayThis declaration creates a new type day together with a set of seven value constructors, each of which is a value of the new type (these are in fact the only values of the new type). The value constructors may be used in both patterns and expressions:
fun weekend Sat = true | weekend Sun = true | weekend _ = false; val weekend = fn : day -> bool weekend Wed; false : bool weekend Sun; true : boolValue constructors may be declared to take arguments, in which case they become functions from their argument types to the newly created type. This supports the creation of new record types such as complex numbers:
datatype complex = Complex of (real * real); datatype complex constructor Complex : real * real -> complexThe keyword of introduces the argument type of the constructor; the resulting function Complex converts 2-tuples of real numbers into complex numbers. The Complex constructor can be used for pattern matching also:
fun realpart (Complex(r, _)) = r and imagpart (Complex(_, i)) = i; val realpart = fn : complex -> real val imagpart = fn : complex -> real realpart (Complex(1.0, 0.0)); 1.0 : real imagpart (Complex(~1.5, 0.5)); 0.5 : realFor more sophisticated structures, we might choose to use a labelled record as the argument to a value constructor. This is similar to a tuple, in that it has a fixed number of fields of mixed type, but each field is labelled with an identifier; the labels are made explicit both in the type itself and in every value of the type. With large structures this can make programs much clearer, and there is a special pattern matching construct available for selecting particular labelled fields. Labelled records are written inside braces, "\{" ... "\}"; all the labels in a record must be distinct, but as a consequence the fields may be written in any order. This next datatype uses a labelled record to describe a book:
datatype book = Book of { title : string, author : string, date : int }; datatype book constructor Book : {author:string, date:int, title:string} -> book val good_book = Book { title = "Standard ML", author = "Harper et.al.", date = 1986 }; val good_book = Book({author = "Harper et.al.", date = 1986, title = "Standard ML"}) : bookNote how the colon symbol ":" is used in the type specification to separate field names and their types while the equality symbol "=" is used in expressions to associate values with fields. Fields are sorted internally into a standard order, so the output doesn't always match the input exactly. Unwanted fields of the record can be ignored in pattern matches by use of the record wildcardpattern, written "..." (this is another keyword of the language):
fun written_by name (Book {author, ...}) = author = name; val written_by = fn : string -> book -> bool written_by "Harper et.al." good_book; true : bool written_by "Henry James" good_book; false : boolAs a further demonstration of the flexibility of datatype declarations, here is a declaration of the type of simple binary trees (we can define a simple binary tree as EITHER an empty (Null) tree OR a Node consisting of two binary trees):
datatype tree = Null | Node of (tree * tree); datatype tree constructor Node : tree * tree -> tree constructor Null : treeAn example tree might be:
val tree = Node(Node(Null, Null), Null); val tree = Node(Node(Null, Null), Null) : treeThe function height measures the height of a tree:
fun height Null = 0 | height (Node(left, right)) = let val hl = height left and hr = height right in (if hl > hr then hl else hr) + 1 end; val height = fn : tree -> int height tree; 2 : intIt is also possible to create parameterised type constructors. For example, a more useful binary tree would include some kind of information, perhaps at each interior node. We can parameterise the definition of trees as follows:
datatype 'a tree = Null | Node of ('a tree * 'a * 'a tree); datatype 'a tree constructor Node : 'a tree * 'a * 'a tree -> 'a tree constructor Null : 'a treeNow the Node constructor takes a 3-tuple as an argument: a pair of trees plus some information of type alpha. We can instantiate alpha in any way we choose:
Node(Null, 0, Null); Node(Null, 0, Null) : int tree Node(Node(Null, "a", Null), "d", Null); Node(Node(Null, "a", Null), "d", Null) : string treebut, as ever, we cannot mix integers and strings in the same tree. Some of the built-in types of ML are no more than datatypes. Lists, for example:
datatype 'a list = nil | op :: of ('a * 'a list);This declaration won't actually compile however, as it is forbidden to redefine built-in types. The datatype declaration creates new, unique types which match only themselves; even if a declaration is repeated word for word, the two types thus created are distinct. An alternative declaration form, the type declaration, creates what is called a type synonym or a type abbreviation, i.e. it defines a new name for some existing type. This offers no more type security, as types with the new name will match correctly with the original type, but it can be extremely useful in allowing more descriptive names to be given to types for documentation purposes. Thus in geometry, we might declare:
type point = real * real; type point type line = point * point; type lineThere are no value constructors associated with such types: the values of type point are precisely those values of type real * real. Because of this equivalence, the ML compiler can never deduce for itself that any particular pair of real numbers is meant to represent a point. Values must be explicitly constrained with the new types to show any effect. For example,
val origin : point = (0.0, 0.0); val origin = (0.0, 0.0) : point val x_axis : line = (origin, (1.0, 0.0)) and y_axis : line = (origin, (0.0, 1.0)); val x_axis = ((0.0, 0.0), (1.0, 0.0)) : line val y_axis = ((0.0, 0.0), (0.0, 1.0)) : line fun gradient (((x1, y1), (x2, y2)) : line) : real = (y2-y1)/(x2-x1); val gradient = fn : line -> real gradient x_axis; 0.0 : realIt's worth stressing again that although a function such as gradient has been typed to take an argument of type line, it will in fact work on any type whose underlying structure is
(real * real) * (real * real)For example, we might choose to represent a circle as two points (its centre, plus an arbitrary point on the circumference):
type circle = point * point; type circle val unit_circle : circle = (origin, (1.0, 0.0)); val unit_circle = ((0.0, 0.0), (1.0, 0.0)) : circleWe can now say (without getting a type error):
gradient unit_circle;the result of which is probably meaningless. This would be impossible had we defined points, lines and circles as datatypes (in a manner analagous to the definition of complex numbers above). A third form of type declaration provided by ML, the abstype declaration, allows the creation of abstract data types. This will not be covered here.
1 div 0; ;;; ML EXCEPTION: div (Division by zero)or, more realistically, by using the gradient function defined above:
gradient y_axis; ;;; ML EXCEPTION: / (Real overflow)The exception message prints the name of the exception which was raised (such as div or /); this is normally the same as the name of the function which failed, so as to give an indication of what has gone wrong. Some exceptions will also print an extra message to provide more information, as with Real overflow above. The default effect of an exception is to abort the current evaluation and return to the top-level prompt. This need not be the case however, as exceptions may be caught by means of the handle construct:
1 div 0 handle div => 999; 999 : intThe exception raised by the bad div expression is trapped by the exception handler
div => 999This tries matching the name of the raised exception against the name specified in the handler; in this case they are the same, so the value of the handler is returned as the result of the expression. In the case where the exception names don't match, the raised exception passes through the handler either up to the next handler or to top level. This next example tries catching the wrong exception, without success:
gradient y_axis handle div => 999.0; ;;; ML EXCEPTION: /The / exception raised by the application of gradient passes through the div handler and up to top-level. Note how the handler has been changed to return 999.0 (a real) instead of 999 (an integer): this is because we have
(gradient y_axis) : realand the result returned by an exception handler must have the same type as that of the expression being handled. New exceptions can be declared with the exception keyword and then raised to signal error conditions in user functions. Here the function subscript picks the nth item from a list, but raises an exception if the list is not long enough:
exception subscript; exception subscript : unit fun subscript 1 (item :: _) = item | subscript n (_ :: list) = subscript (n-1) list | subscript n [] = raise subscript; val subscript = fn : int -> 'a list -> 'a val days = [Sun, Mon, Tue, Wed, Thu, Fri, Sat]; val days = [Sun, Mon, Tue, Wed, Thu, Fri, Sat] : day list subscript 5 days; Thu : day subscript 8 days; ;;; ML EXCEPTION: subscript subscript 8 days handle subscript => Sun; Sun : dayExceptions can also be used to communicate information between the point at which they are raised and the point at which they are handled. In this case the exception has to be declared with a type; when it is raised, it is raised with a value of that type, and the handler for it must use a pattern-matching construct to determine a course of action depending on the value.
exception errcond : string; exception errcond : string fun try x = if x < 10 then raise errcond with "too low" else if x > 19 then raise errcond with "too high" else x * x and keep_trying x = try x handle errcond with "too low" => keep_trying (x+1) | "too high" => keep_trying (x-1) | msg => raise errcond with msg; val try = fn : int -> int val keep_trying = fn : int -> int keep_trying 15; 225 : int keep_trying 1; 100 : int keep_trying 20; 361 : intThe simple declaration
exception subscript;is actually just a contracted form of
exception subscript : unit;so that exceptions default to the unit type. Similarly, the handler
subscript => Sun;is equivalent to the full handler form
subscript with () => Sun;the superfluous "with ()" being elided. A special built-in exception interrupt is raised by interrupt signals generated from the keyboard. This can be caught in the same way as any other exception, allowing ML programs to keep control after interrupts.