In abstraction we draw out (from the Latin verb traho, "I pull" (cf. tractor), and the preposition ab, meaning "from" or "away from") an essential aspect of an idea, allowing it to be applied to more than the particular set of circumstances in which we first encountered it. We have already seen this at work when we considered the sum function and abstracted it to obtain the reduce function.
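As a reminder of that step, here is a minimal sketch of how sum generalises to reduce. The argument order (f base l) and the helper names sum_by_reduce and product are assumptions made for this illustration, and may not match the definitions used elsewhere in these notes.

```scheme
; A sum over a list, written directly.
(define (sum l)
  (if (null? l)
      0
      (+ (car l) (sum (cdr l)))))

; Drawing out the combining function and the base value gives reduce.
; The argument order (f base l) is an assumption of this sketch.
(define (reduce f base l)
  (if (null? l)
      base
      (f (car l) (reduce f base (cdr l)))))

; sum is now one particular use of reduce; product is another.
(define (sum_by_reduce l) (reduce + 0 l))
(define (product l) (reduce * 1 l))

(sum '(1 2 3 4))            ; 10
(sum_by_reduce '(1 2 3 4))  ; 10
(product '(1 2 3 4))        ; 24
```

The essential aspect drawn out here is "combine the elements of a list pairwise"; addition is just one way of combining.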
An important principle in the design of computational systems is to provide a measure of isolation of the implementation of a capability from its users. Thus a user is required to employ some kind of standard interface in accessing a capability. In doing this we are abstracting the essence of the capability from the point of view of its users.
For example, in computer hardware, a standard bus such as the VME bus can be used to connect modules. In operating systems, access to backing store is mediated via system-calls.
This isolation offers two primary advantages:
The implementation can be improved or changed without affecting how it is used. Provided the user adheres to the standard interface, (s)he need not alter how (s)he uses the capability.
Thus for example, in hardware, a larger memory module can be plugged into the standard bus and can be immediately usable. In operating systems, a file system local to a particular machine can be replaced by a distributed file system with minimal disruption to users.
Safety features can be built into the implementation. Generally it is true that not all states of a resource are legal. For example, in an operating system, each block on the disc should either belong to one named file, or should be known to be free. Ensuring that this remains true can remain the responsibility of the operating system (OS) provided that the user only accesses the disc via the abstraction that the OS provides, namely the file.
Sometimes a mechanism is provided to police the safety features. For example in the Unix operating system, it is impossible for a user program to issue an input-output instruction to access the disc directly. Any such instruction will be trapped by the machine hardware and referred to the kernel of the OS. On the other hand, in the DOS operating system, there is no such protection, so that correct usage of the disc is dependent on programmer discipline.
Likewise, in the Scheme language, any access to the machine's store is mediated by the car, cdr and cons functions. This prevents certain kinds of illegality from occurring. For example, it is impossible in a Scheme system for a piece of store to be regarded as free when in fact it forms part of a user's data-structure.
By contrast, in the C language, the user has free access to her entire virtual machine, so that it is possible for a piece of store to be used in two contradictory ways by a single program.
However, there is often a performance penalty associated with using a standard interface. During the evolution of computer hardware, many bus standards have become obsolete as technology has advanced. For example, memory is now supplied as SIMMs which plug directly into a processor board. Likewise the writers of computer games are notorious for employing direct access to graphics hardware, rather than employing the standard interface, in order to achieve the necessary speed.
Likewise, the use of the car, cdr and cons functions in Scheme may carry a performance penalty compared with the more direct access offered by C. Not all storage configurations that can be created in C can be created in Scheme. Moreover, the storage integrity demanded by Scheme can require these primitive functions to check that car and cdr are being applied to lists. The issue of efficiency is a complex one, however, and it does not follow that languages like Scheme are always less efficient than C, especially for large programs.
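The kind of check mentioned above can be pictured with a small sketch. The name checked_car is hypothetical: a real Scheme system builds the test into car itself, usually at a much lower level than this, and the use of error assumes a Scheme that provides it (most do).

```scheme
; checked_car is a hypothetical name used only for illustration.
; It refuses to look inside anything that is not a pair, which is the
; extra work a safe car has to do on every call.
(define (checked_car x)
  (if (pair? x)
      (car x)
      (error "checked_car: argument is not a pair:" x)))

(checked_car '(1 2 3))   ; 1
```

Applying checked_car to a number signals an error instead of reading an arbitrary piece of store, which is exactly the protection that direct access in C does not give.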
Any engineered system of any complexity exhibits abstraction layered into levels. In computer hardware, levels of abstraction are imposed by technology - there is at least a chip level, a board level and a system level. Within the chip level, there are further levels of abstraction - at least a device level and a register level. Within a processor chip there will be larger functional units, e.g. an ALU or a cache.
For each level of abstraction there are two separate problems to be addressed
While car, cdr and cons provide a base abstract view of the machine store, they operate at a low conceptual level, not related to what most programs are about. We can use them, and other facilities of Scheme, as building blocks to provide abstractions appropriate to the requirements of a given program by
Within a purely functional approach, implementation requires us to define constructor functions to build representations of the objects and selector functions to access such representations. At the base level, cons is the constructor function and car and cdr are the selectors. For example, points can be implemented as follows:
(define (mk_point x y)   ; the basic constructor function
  (list x y))
(define x_point car)     ; two selector functions
(define y_point cadr)
From now on we use only mk_point, x_point and y_point to construct and access points. For example:
(define (mid_point p1 p2)
  (mk_point
    (/ (+ (x_point p1) (x_point p2)) 2)
    (/ (+ (y_point p1) (y_point p2)) 2)))
(example '(mid_point (mk_point 1 2) (mk_point 5 8)) (mk_point 3 5))
(define (diff_point p1 p2)
  (mk_point
    (- (x_point p1) (x_point p2))
    (- (y_point p1) (y_point p2))))
So any code we write to manipulate points is quite independent of the implementation of points. For example, if we change the implementation so that each point carries a tag:
(define (mk_point x y)
  (list 'point x y))
(define x_point cadr)
(define y_point caddr)
our mid_point function will still work!
(example '(mid_point (mk_point 1 2) (mk_point 5 8)) (mk_point 3 5))
example: (mid_point (mk_point 1 2) (mk_point 5 8)) = (point 3 5), ok!
Note, however, that the printout is different here: the result is shown as (point 3 5) rather than (3 5).
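The changed printout is a reminder that only code which goes through the selectors is insulated from a change of representation. The sketch below puts the two implementations side by side under the hypothetical names mk_point_a, x_point_a, mk_point_b and x_point_b (the suffixes are not part of the notes; they only let both versions coexist in one session).

```scheme
; Representation A: a point is the list (x y).
(define (mk_point_a x y) (list x y))
(define x_point_a car)

; Representation B: a point is the tagged list (point x y).
(define (mk_point_b x y) (list 'point x y))
(define x_point_b cadr)

; Code that respects the abstraction gets the same answer either way.
(x_point_a (mk_point_a 3 5))   ; 3
(x_point_b (mk_point_b 3 5))   ; 3

; Code that bypasses the selector and uses car directly does not.
(car (mk_point_a 3 5))         ; 3
(car (mk_point_b 3 5))         ; the symbol point, not a coordinate
```

A function written with car in place of x_point would have worked by accident under the first representation and silently broken under the second.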
We can build one abstraction on top of another. For example, we can use points to define lines:
(define (mk_line x0 y0 x1 y1)
  (let ((p_0 (mk_point x0 y0))
        (p_1 (mk_point x1 y1)))
    (list p_0 p_1)))
(define point_0_line car)
(define point_1_line cadr)
(define (length_line l)
  (let* ((p0 (point_0_line l))
         (p1 (point_1_line l))
         (p  (diff_point p0 p1))
         (x  (x_point p))
         (y  (y_point p)))
    (sqrt (+ (* x x) (* y y)))))
(define l1 (mk_line 0 1 6 7))
(define l2 (mk_line 0 4 7 8))
(length_line l1)
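To see the layering at work, the sketch below gathers the point and line definitions into one self-contained piece and adds one further operation, mid_line. The name mid_line is hypothetical and does not appear elsewhere in these notes; the sketch also assumes the untagged list representation of points, although any representation with the same constructor and selectors would do.

```scheme
; Points: the untagged list representation (an assumption of this sketch).
(define (mk_point x y) (list x y))
(define x_point car)
(define y_point cadr)

(define (mid_point p1 p2)
  (mk_point (/ (+ (x_point p1) (x_point p2)) 2)
            (/ (+ (y_point p1) (y_point p2)) 2)))

(define (diff_point p1 p2)
  (mk_point (- (x_point p1) (x_point p2))
            (- (y_point p1) (y_point p2))))

; Lines, built only from the point abstraction.
(define (mk_line x0 y0 x1 y1)
  (list (mk_point x0 y0) (mk_point x1 y1)))
(define point_0_line car)
(define point_1_line cadr)

(define (length_line l)
  (let* ((p (diff_point (point_0_line l) (point_1_line l)))
         (x (x_point p))
         (y (y_point p)))
    (sqrt (+ (* x x) (* y y)))))

; mid_line is hypothetical: the mid-point of a line, written without
; touching car or cdr on points at all.
(define (mid_line l)
  (mid_point (point_0_line l) (point_1_line l)))

(define l1 (mk_line 0 1 6 7))
(length_line l1)            ; the square root of 72, roughly 8.485
(x_point (mid_line l1))     ; 3
(y_point (mid_line l1))     ; 4
```

Because length_line and mid_line mention only the point and line selectors, neither needs to change if the representation of points is replaced.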
UMASS Scheme provides opaque records as an option. The function call
(record-class class_info spec)
will return a list-structure containing record access functions. Here class_info is a symbol or other structure that is common to all members of the class. The parameter spec is a list of field specifiers. A field specifier says what kind of data can be held in a field. The only kind of specifier we will use is the symbol 'object, which creates a field able to hold any Scheme object.
The record-class function returns a list of four items. For example we might do:
(define class_point (record-class 'point '(object object)))
(define mk_point (car class_point))       ; first item: the constructor
(define sel_point (caddr class_point))    ; third item: the list of selectors
(define x_point (car sel_point))
(define y_point (cadr sel_point))
(define dest_point (cadr class_point))    ; second item, named here as a destructor
(define point? (cadddr class_point))      ; fourth item: the recognizer predicate
Starting to write an object-oriented capability for Scheme.
The record-class capability allows us to create opaque data-structures which can only be accessed by the appropriate selector functions. However, the selector functions as we have used them just live in the global name-space. This is a problem if we try to build a big system out of software components written by disparate authors, since we can't be sure that different people won't use the same name for different functions. This, of course, is a problem for the C language, which also has a big global name-space.
Another problem is that we may want to have a class of objects that in some way extends another class. For example, if we were modelling a university, then we would want to have a basic person class that was extended to a student class. That is, a student has all the attributes of a person (name, age, sex let's say), plus some others, for example a list of courses that he or she is taking.
And finally we may want to say that a particular class implements some kind of abstractly-defined capability. For example, we might have a notion of what software to manipulate sets ought to provide - membership, union and intersection operations, say.
The record-class facility of UMASS Scheme provides a basic tool-kit for addressing the above issues; however "spilling out" the package of capabilities provided by record-class into the global name-space is not a good basis for maintaining encapsulation. One paradigm that supports encapsulation is the usual object-oriented paradigm, in which the capabilities associated with a class of objects remain encapsulated in a class-structure which is accessible primarily via objects of the class. One Scheme view of this might be to implement object-orientation in terms of a call to a function "send" which passes a message to an object. So, instead of writing (x_point p), we would instead write:
(send p 'x)
Implementing this kind of capability is something of a doddle using record-class, since we can use the first argument of the call to provide class-common information. However, actually doing this will have to wait until we know about the imperative paradigm in Scheme.
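In the meantime, the flavour of send can be conveyed by a purely functional sketch that does not use record-class at all. This is an alternative technique (an object represented as a procedure that dispatches on its message), not the record-class implementation deferred above. The name mk_point_obj and the message names 'x and 'y are assumptions made for the illustration.

```scheme
; An object is represented as a procedure mapping a message to a value.
; mk_point_obj and the messages 'x and 'y are hypothetical names.
(define (mk_point_obj x y)
  (lambda (message)
    (cond ((eq? message 'x) x)
          ((eq? message 'y) y)
          (else (error "point: unknown message" message)))))

; send simply hands the message to the object.
(define (send object message)
  (object message))

(define p (mk_point_obj 3 5))
(send p 'x)   ; 3
(send p 'y)   ; 5
```

Only mk_point_obj and send occupy the global name-space here; the names x and y are messages understood by the object, so another author's class can use the same message names without any clash.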
EXTRACT FROM POPLOG ON-LINE MANUAL - ref keys
The material below is not required reading. It explains more fully
how to use field specifiers.
6 Field Specifiers for Poplog Data
This section lists the permissible field type specifiers for Poplog
data, i.e. that can appear in the SPEC or SPEC_LIST argument to conskey
(SPEC_LIST is a list of type specifiers for a record class, and SPEC is
a single one for a vector class).
(N.B. For upward compatibility with earlier versions of Poplog, the
words "ddecimal" and "decimal" are also allowed, and are synonymous
with "dfloat" and "float" respectively. Note that "decimal" equals
"float", NOT "sfloat".)
Type        Meaning
"full"      Holds a single Poplog item, and occupies one 'natural'
            machine word (32 bits in all current implementations -
            except the DEC ALPHA).
(N.B. All vector classes constructed on the types "byte" and "sbyte"
are special insofar as they are guaranteed to be null-terminated,
i.e. to have a 0 byte following the last actual byte of the vector.
This costs on average an extra byte per vector, but allows data such
as standard strings to be passed to external C functions without
modification.)
Type        Meaning
"int"       Signed integer of the 'natural' machine wordsize (32 bits
            in all current implementations, range -2**31 <= I < 2**31).
"uint"      Unsigned integer of the 'natural' machine wordsize (32 bits
            in all current implementations, range 0 <= I < 2**32).
"long"      Signed 'long' integer (same as "int" in all current
            implementations).
"ulong"     Unsigned 'long' integer (same as "uint" in all current
            implementations).
"short"     Signed 'short' integer (16 bits in all current
            implementations, range -2**15 <= I < 2**15).
"ushort"    Unsigned 'short' integer (16 bits in all current
            implementations, range 0 <= I < 2**16).
"sbyte"     Signed byte (8 bits in all current implementations,
            range -2**7 <= I < 2**7).
"byte"      Unsigned byte (8 bits in all current implementations,
            range 0 <= I < 2**8).
-N          Signed bitfield of N bits, where 1 <= N <= 32
            (range -2**(N-1) <= I < 2**(N-1)).
N           Unsigned bitfield of N bits, where 1 <= N <= 32
            (range 0 <= I < 2**N).
"pint"      Same as "int", but declares the field as holding only
            values within the range of a Poplog simple integer
            (pop_min_int <= I <= pop_max_int). When this is known for
            an "int" field, using "pint" instead gives faster
            access/update.
Type        Meaning
"dfloat"    A double-length floating-point number in machine format,
            occupying two 'natural' machine words (64 bits in all
            current implementations).
"sfloat"    A single-length floating-point number in machine format,
            occupying one 'natural' machine word (32 bits in all
            current implementations).
"float"     Identical to "sfloat", EXCEPT when specified as an external
            function result - see 'Additional Field Specifiers for
            External Data' below.