Aaron Sloman
School of Cognitive and Computing Sciences (COGS), University of Sussex
At the University of Birmingham since 1991
http://www.cs.bham.ac.uk/~axs
This file is http://tinyurl.com/BhamCog/personal-ai-sloman-1988.html
It is also available as PDF: http://tinyurl.com/BhamCog/personal-ai-sloman-1988.pdf
This was originally published as the Preface to
Computers and Thought: A Practical Introduction to Artificial Intelligence,
(Explorations in Cognitive Science)
By Mike Sharples, David Hogg, Chris Hutchinson, Steve Torrance, David Young
MIT Press, 20 Oct 1989 - 433 pages
Available without diagrams online here, both browsable and as a zip package:
http://www.cs.bham.ac.uk/research/projects/poplog/computers-and-thought
http://www.cs.bham.ac.uk/research/projects/poplog/computers-and-thought.zip
Related teaching material for use with Poplog/Pop11
http://www.cs.bham.ac.uk/research/projects/poplog/contrib/pop11/ct_book
http://www.cs.bham.ac.uk/research/projects/poplog/freepoplog.html
This preface has also been available since about 1988 as a 'TEACH' file in
the Poplog system: TEACH AITHEMES
See also
http://tinyurl.com/thinky-ex
Thinky programming and other kinds
http://tinyurl.com/thinkyprog
Tips on how to teach thinky programming:
http://tinyurl.com/PopVidTut
Video tutorials on some of this material
_______________________________________________________________________
CONTENTS
-- Introduction
-- What then is AI?
-- Goals of AI: the trinity of science
-- But what is intelligence? Three key features:
-- Intentionality
-- Flexibility
-- Productive laziness
-- Sub areas of AI
-- A simple architecture
-- Sketch of a not very intelligent system
-- Limitations of the model
-- Less ambitious projects
-- Key ideas in AI models
-- Computers vs brains
-- "Non-cognitive" (?) states and processes
-- Conceptual analysis
-- Tools for AI
-- An example of the expressive power of an AI language
-- Horses for courses: multi-language, multi-paradigm systems
-- Conclusion
-- Bibliography
-- Introduction
There are many books, newspaper reports and conferences providing
information and making claims about Artificial Intelligence and its
lusty baby, the field of Expert Systems. Reactions range from one lunatic
view that all our intellectual capabilities will be exceeded by
computers in a few years' time to the slightly more defensible opposite
extreme view that computers are merely lumps of machinery that simply do
what they are programmed to do and therefore cannot conceivably emulate
human thought, creativity or feeling. As an antidote for these extremes,
I'll try to sketch a sane middle-of-the-road view.
In the long term, AI will have enormously important consequences for
science and engineering and our view of what we are. But it would be
rash to speculate in detail about this.
In the short to medium term there are extremely difficult problems. The
main initial practical impact of AI will arise not so much from
intelligent machines as from the use of AI techniques to build
'intelligence amplifiers' for human beings. Even if machines have not
advanced enough to be capable of designing complex systems, discovering
new concepts and theories, understanding speech at cocktail parties and
taking all our important economic, political and military decisions for
us, AI systems may nevertheless be able to help people to learn, plan,
take decisions, solve problems, absorb information, find information,
design things, communicate with one another or even just brain-storm
when confronted with a new problem.
Besides helping human thought processes, AI languages, development tools
and techniques can also be used for improving and extending existing
types of automation, for instance: cataloguing, checking software,
checking consistency of data, checking plans or configurations,
formatting documents, analysing images, and many kinds of monitoring and
controlling activities.
But there is no sharp boundary between such AI applications and computer
science generally. Indeed the boundary is not only fuzzy but shifts with
time, for established AI techniques and solved AI problems are simply
absorbed into mainstream computer science. A striking example is
compiling: once only human beings could understand algebraic
expressions, and making a machine do likewise was a problem in AI. Now
any humdrum compiler for a programming language can do it (apart from
some quirky languages, like simpler versions of the most widely used AI
language, namely LISP!).
-- What then is AI?
Some people give it a very narrow definition as an applied sub-field of
computer science. I prefer a definition that reflects the range of work
reported at AI conferences, in AI journals, and the interests and
activities of some of the leading practitioners, including founders of
the subject. From this viewpoint AI is a very general investigation of
the nature of intelligence and the principles and mechanisms required
for understanding or replicating it. Like all scientific disciplines it
has three main types of goal, theoretical, empirical, and practical.
-- Goals of AI: the trinity of science
The long term goals of AI include: finding out what the world is like,
understanding it, and changing it, or, in other words:
(a) empirical study and modelling of existing intelligent systems
(mainly human beings);
(b) theoretical analysis and exploration of possible intelligent systems
and possible mechanisms, architectures or representations usable by
such systems;
(c) solving practical problems in the light of (a) and (b), namely:
(c.1) attempting to deal with problems of existing intelligent
systems (e.g. problems of human learning or emotional
difficulties) and
(c.2) designing new useful intelligent or semi-intelligent machines.
In the course of these activities AI generates new sub-problems, and
these lead to new concepts, new formalisms, and new techniques.
Some people restrict the term 'Artificial Intelligence' to a subset of
this wide-ranging discipline. For example those who think of it as
essentially a branch of engineering restrict it to (c.2). This does not
do justice to the full range of work done in the name of AI.
In any case, it is folly to try to produce engineering solutions without
either studying general underlying principles or investigating the
existing intelligent systems on which the new machines are to be
modelled or with which they will have to interact. Trying to build
intelligent systems without trying to understand general principles
would be like trying to build an aeroplane without understanding
principles of mechanics or aerodynamics. Trying to build them without
studying how people or other animals work would be like trying to build
machines without ever studying the properties of any naturally occurring
object.
The need to study general principles of thought, and the ways in which
human beings perceive, think, understand language, etc. means that AI
work has to be done in close collaboration with work in psychology,
linguistics, and even philosophy, the discipline that examines some of
the most general presuppositions of our thought and language.
This is why, at some Universities, AI has not been restricted to an
engineering department. In fact it is now often to be found in several
different areas of a University. E.g. at Sussex University it is in
several different Schools including the School of Cognitive Sciences.
The term 'Cognitive Science' can also be used to cover the full range of
goals specified above, though it too is ambiguous, and some of its more
narrow-minded practitioners tend to restrict it to (a) and (c.1).
-- But what is intelligence? Three key features:
The goals of AI have been defined in terms of the notion of
intelligence. I don't pretend to be able to offer a definition of
'intelligence'. However, most, if not all, the important work in AI
arises out of the attempt to understand three key characteristics of the
kind of intelligence found in people and, to different degrees, other
animals. The features are intentionality, flexibility, and productive
laziness.
-- Intentionality
This is the ability to have internal states that refer to or are
ABOUT entities or situations more or less remote in space or time, or
even non-existent or wholly abstract things.
So intentional states include contemplating clouds, dreaming you are
a duke, exploring equations, pondering a possible action, seeing a
snake or wanting to win someone's favours. These are all cases of
awareness or consciousness of something, including hypothetical or
impossible objects or situations. A sophisticated mind may also have
thoughts or desires about its own state - various forms of SELF
consciousness are also cases of intentionality.
Particular categories of intentional states include:
- perceiving something
- believing or knowing something
- wanting something, or having something as a goal
- considering or imagining a possibility
- asking a question about something
- having a plan or strategy
All intentional states seem to require the existence of some kind of
REPRESENTATION of the content of the state: some representation of
whatever is believed, perceived, desired, imagined, etc. A major theme
in AI is therefore investigation of different kinds of representations
and their implementation and uses. This is a very tricky topic, since
there are many different kinds of representational forms: sentences,
logical symbols, computer data-bases, maps, diagrams, arrays, images,
etc. It is very likely that there are still important forms of
representation waiting to be discovered.
Moreover, many representations are themselves abstractions that are not
necessarily explicitly or directly embodied in physical structures, for
example a very large sparse array that is encoded in a compact form. It
is therefore useful to talk about 'virtual representations' as opposed
to physical representations.
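For concreteness, here is a rough sketch in Python (all names invented
for illustration) of one kind of virtual representation: an array of a
million million cells that exists physically only as a small table of
the cells that have been given non-default values.

    class SparseArray:
        """A huge 'virtual' array stored as a table of its exceptions."""
        def __init__(self, default=0):
            self.default = default
            self.cells = {}     # only explicitly set cells are stored

        def __getitem__(self, index):
            return self.cells.get(index, self.default)

        def __setitem__(self, index, value):
            self.cells[index] = value

    a = SparseArray()
    a[123456, 654321] = 99
    print(a[123456, 654321], a[0, 0])   # 99 0 -- yet no huge block of
                                        # physical storage exists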
A particularly important case involves the use of inference procedures.
If new conclusions can be drawn from what is represented, then besides
the information stored explicitly there is additional information that
can be DERIVED when needed. Thus we all have knowledge of arithmetic
that goes beyond the tables we have learnt explicitly, since we know how
to derive new facts from them. A different example is using an old map
to work out a new route. Different kinds of representations require
different kinds of inference mechanisms.
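The arithmetic example can be put in a few lines of Python. This is only
an illustrative sketch: the table holds the facts learnt explicitly, and
the procedure derives the rest when needed.

    # Multiplication facts up to 10 x 10 are stored explicitly, as in
    # the tables we learn; everything else is DERIVED on demand.
    table = {(a, b): a * b for a in range(1, 11) for b in range(1, 11)}

    def product(a, b):
        if (a, b) in table:
            return table[(a, b)]
        # inference rule: a*b = (a-10)*b + 10*b  (assumes a >= 1 and
        # 1 <= b <= 10, enough for the illustration)
        return product(a - 10, b) + 10 * b

    print(product(7, 8))     # looked up directly: 56
    print(product(47, 8))    # not stored, derived: 376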
One reason why computers are powerful tools for exploring intentional
systems is that they can very rapidly construct or change virtual
representations, whereas mechanical construction would often be too slow
to deal with a world that waits for no man or machine. Brains also seem
to have this ability, though exactly how they do it remains largely
unexplained. Perhaps new kinds of machines will one day exhibit new
kinds of rapid structural variability enabling new kinds of intelligence
to be automated.
-- Flexibility
This has to do with the breadth and variety of intentional contents,
for instance the variety of types of goals, objects, problems, plans,
actions, environments etc. with which an individual can cope,
including the ability to deal with new situations using old resources
combined and transformed in new ways.
Flexibility in this sense is required for understanding a sentence you
have never heard before, seeing a familiar object from a new point of
view, coping with an old problem in a new situation, dealing with
unexpected obstacles to a plan. A kind of flexibility important in human
intelligence involves the ability to raise a wide range of questions.
A desirable kind of flexibility often missing in computer programs is
'graceful degradation'. Often, if the input to a computer deviates at all
from what is expected, the result is simply an error message and abort,
or worse in some cases. Graceful degradation on the other hand would
imply being able to try to cope with the unexpected by re-interpreting
it, or modifying one's strategies, or asking for help, or monitoring
actions more carefully. Instead of total failure, degradation might
include taking longer to solve a problem, reducing the accuracy of the
solution, reducing the frequency of success, and so on.
One of the factors determining the degree of flexibility will be the
range of representations available. A system that can merely represent
things using a vector of numerical measures, for example, will have a
narrower range of possible intentional states than a system that can
build linguistic descriptions of unlimited complexity, like:
the man
the old man
the old man in the corner
the old man sitting on a chair in the corner
the sad old man sitting on a chair with a broken leg in the corner
etc.
So flexible control systems of the future will have to go far beyond
using numerical measures, and will have to be able to represent goals or
functions, and relationships between structures, resources, processes,
constraints, and so on.
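To make the contrast concrete, here is an illustrative sketch in Python
(the vocabulary and probabilities are invented): a single recursive rule
generates noun phrases of unbounded length and variety, something no
fixed-length vector of numerical measures can match.

    import random

    ADJECTIVES = ["old", "sad", "broken"]
    NOUNS = ["man", "chair", "corner", "leg"]
    PREPOSITIONS = ["in", "on", "with"]

    def noun_phrase(depth=0):
        """Determiner, optional adjectives, noun, optional embedded phrase."""
        words = ["the"] + random.sample(ADJECTIVES, random.randint(0, 2))
        words.append(random.choice(NOUNS))
        if depth < 4 and random.random() < 0.6:
            # recursion: a phrase may contain another phrase
            words += [random.choice(PREPOSITIONS)] + noun_phrase(depth + 1)
        return words

    print(" ".join(noun_phrase()))
    # e.g. "the sad old man on the chair with the broken leg"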
Another requirement for flexibility is non-rigid control structures. In
most machines behaviour is pre-determined by structure. Computer
programs with conditional instructions allow more flexibility. Even
greater flexibility is achieved by turning the whole program into a set
of condition-action rules, as is done in some AI programming languages
known as 'production systems'. Then, instead of the programmer having to
determine in advance a good order in which tests should be made and
actions attempted, the rule interpreter can examine the applicable rules
and decide in the light of the context at 'run time'. If the program can
change the set of rules, yet more flexibility is available.
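Here, as an illustrative sketch only (the rules and database contents
are invented), is the skeleton of such a rule interpreter in Python:

    database = {"hungry", "have bread"}

    rules = [
        # (conditions, action name, facts added, facts deleted)
        ({"hungry", "have bread"}, "eat bread",
         {"fed"}, {"hungry", "have bread"}),
        ({"hungry"}, "find food", {"have bread"}, set()),
    ]

    def run(database, rules, limit=10):
        for _ in range(limit):
            for conditions, action, added, deleted in rules:
                # the interpreter, not the programmer, decides at run
                # time which rule is applicable in the current context
                if conditions <= database:
                    print("firing:", action)
                    database = (database - deleted) | added
                    break
            else:
                break            # no rule applicable: halt
        return database

    print(run(database, rules))  # firing: eat bread ... then {'fed'}

Adding, removing, or reordering rules changes the behaviour without
rewriting the interpreter, which is the flexibility at issue.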
However, an excess of flexibility can cause its own problems, notably a
lack of control. That leads to the idea of a layered process
architecture where some kind of higher level supervisor program watches
over the actions of lower level programs and decides when they need to
be suspended, modified, or aborted. This kind of flexibility is not much
in evidence in AI programs yet, but will become increasingly feasible as
computer power becomes cheaper and more readily available.
Different kinds of flexibility are to be found in different organisms.
For example, birds that can build only one sort of nest may nevertheless
be very flexible and adaptive in relation to availability of materials
and sites for such nests. Many aspects of human intelligence range over
a potentially infinite variety of structures - for instance infinitely
many sentences, dance movements, algebraic equations, or social
situations. To account for this we need to study the generative power of
the underlying mechanisms and representations, as well as mechanisms
that allow major changes of direction in the light of new information.
-- Productive laziness
It is not enough to achieve results: intelligence is partly a matter
of HOW they are achieved. Productive laziness involves avoiding
unnecessary work.
A calculator blindly follows the rules for multiplication or addition.
It cannot notice short cuts. If you tell it to work out 200 factorial
minus 200 factorial, it will do a lot of unnecessary computation, and
perhaps produce an overflow error. The intelligent solution is a far
more lazy one. A chess champion who wins by working through all the
possible sequences of moves several steps ahead and choosing the optimal
one is not as intelligent as the player who avoids explicitly examining
so many cases because he notices some higher level pattern that points
directly to the best move.
The implications of this kind of laziness are profound. In particular,
noticing short cuts often requires using a far more complex conceptual
structure, such as might be needed to discern high level symmetries in
the problem space. Compare trying to answer the question 'Is there a
prime number bigger than a billion?' by searching for one, with Euclid's
lazy approach of proving in a few lines that there is no largest prime
number.
Why is laziness important? Given any solvable task for which a finite
solution is recognizable, it is possible in principle to find a solution
by enumerating all possible actions (or all possible computer programs)
and checking them exhaustively until the right one turns up. In practice
this is useless because the set of possibilities is too great.
This is the 'combinatorial explosion'. Any construction involving many
choices from a set of options has a potentially huge array of possible
constructs to choose from. If you have four choices, each with two
options, the total number of combinations is sixteen. If you have twenty
choices, each with six options, the total shoots up to
3,656,158,440,062,976.
Clearly exhaustive enumeration is not a general solution. The tree of
possible moves in chess is larger than the number of electrons in the
Universe (if we are to believe the physicists). So lazy short cuts have
to be found.
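The arithmetic behind those figures is easy to check (a two-line sketch
in Python): with k independent choices of n options each there are n**k
combinations.

    print(2 ** 4)     # four choices, two options each: 16
    print(6 ** 20)    # twenty choices, six options each: 3656158440062976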
For example, a magic square is an array of numbers all of whose rows,
columns, and diagonals add up to the same total. Here is a 3 by 3 magic
square made of the digits 1 to 9.
    6 7 2
    1 5 9
    8 3 4
If you try to construct an N by N magic square by trying all possible
ways of assigning the NxN numbers to the locations in the square, then
the number of possible combinations is the factorial of NxN. In the case
of the 3x3 square that makes 362,880 combinations. Trying them all would
not be intelligent. A sensible procedure would involve testing partial
combinations to see whether they can possibly be extended
satisfactorily, and, if not, rejecting at one blow all the combinations
with that initial sequence.
It is also sensible to look for symmetries in the problem. Having found
that you can't have the number 5 in the top left corner, reject all
combinations that involve 5 in any corner.
Yet more subtle arguments can be used to prune the possibilities
drastically. For example, since eight different triples with the same
total are needed, it is easy to show that large and small numbers must
be spread evenly over the triples, and that they must in fact add up to
15. So the central number has to be in four different triples adding up
to 15, the corner numbers in three triples each, and the mid-side
numbers in two each. For each number we can work out how many different
triples it can occur in, and this immediately restricts the locations to
which they can be assigned. E.g. 1 and 9 must go into locations in the
middle of a side, and the only candidate for the central square is 5. In
fact, a high-level symmetry (replacing each number n by 10-n turns one
magic square into another) shows that you need do this analysis only for
the numbers 1 to 4. You can then construct the square
in a few moves, without any trial and error. What about a two by two
magic square containing the numbers 1, 2, 3 and 4? Think about it!
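Returning to the 3 by 3 square: the following sketch in Python (the
function names are invented) contrasts blind enumeration of all 362,880
complete squares with a lazier search that rejects each unextendable
partial combination, and with it all its completions, at one blow.

    from itertools import permutations

    TOTAL = 15  # 1+2+...+9 = 45 shared over three rows: each line sums to 15

    def lines(s):
        """All eight lines of a 3x3 square stored row by row."""
        return [s[0:3], s[3:6], s[6:9],                  # rows
                s[0::3], s[1::3], s[2::3],               # columns
                (s[0], s[4], s[8]), (s[2], s[4], s[6])]  # diagonals

    def blind_search():
        """Unintelligent: test all 9! = 362,880 complete assignments."""
        return [p for p in permutations(range(1, 10))
                if all(sum(ln) == TOTAL for ln in lines(p))]

    def pruned_search(partial=()):
        """Lazier: extend cell by cell, rejecting a partial assignment
        (and so all its extensions at one blow) as soon as any fully
        determined line fails to sum to TOTAL."""
        if len(partial) == 9:
            return [partial]
        solutions = []
        for n in range(1, 10):
            if n not in partial:
                candidate = partial + (n,)
                padded = candidate + (None,) * (9 - len(candidate))
                if all(sum(ln) == TOTAL
                       for ln in lines(padded) if None not in ln):
                    solutions += pruned_search(candidate)
        return solutions

    assert blind_search() == pruned_search()  # one square, 8 orientations

The pruned search examines far fewer combinations than the 362,880
complete squares, and the symmetry arguments above would cut the work
further still.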
These examples show that the ability to detect short cuts requires the
ability to DESCRIBE the symmetries, relationships, and implications in
the structure of the task. It also requires the ability to NOTICE them
and perceive their relevance, even though they are not mentioned in the
statement of the task. This kind of productive laziness therefore
depends on intentionality and flexibility, but motivates their
application. Discovering relevant relationships not mentioned in the
task specification (e.g. "location X occurs in fewer triples than
location Y") requires the use of a generative conceptual system and
notation.
An intelligent problem solver therefore requires a rich enough
representation language to express the constraints and describe relevant
features, and a powerful inference system to work out the implications
for choices. Being lazy in this way is often harder than doing the
stupid exhaustive search. But it may be very much faster. This points to
a need for an analysis of the notion of intellectual difficulty.
Productive laziness often means applying previously acquired knowledge
about the problem or some general class of problems. So it requires
learning: the ability to form new concepts and to acquire and store new
knowledge for future application. Sometimes it involves creating a new
form of representation, as has happened often in the history of science
and mathematics.
Laziness motivates a desire for generality -- finding one solution for a
wide range of cases can save the effort of generating new solutions.
This is one of the major motivations for all kinds of scientific
research. It can also lead to errors of over-generalisation, prejudice,
and the like. A more complete survey would discuss the differences
between avoiding mental work (saving computational resources) and
avoiding physical work.
-- Sub areas of AI
So far I have given a very general characterisation of intelligence and
the goals of AI. Most work in the field necessarily focuses on a sub-
area, and each area has its own literature growing too fast for anyone
to keep up with.
The topic can be divided up in a number of ways. One form of division
reflects the supposed architecture of an autonomous intelligent system.
Thus people study components like vision, language understanding,
memory, planning, learning, motor control, and so on. These include
empirical studies of people and other animals as well as exploratory
engineering designs.
There are also attempts to address what appear to be general issues, for
instance about suitable representational formalisms, inference
strategies, search algorithms, or suitable hardware mechanisms to
support intelligent systems. A second order debate concerns whether
there are any generally useful formalisms or inference engines. Some who
oppose the notion argue that different kinds of expertise require their
own representations and algorithms, and indeed early attempts to produce
general problem solvers showed that they often had a tendency to get
bogged down in combinatorial searching.
Until recently computer power has been expensive and scarce, so hardly
anybody has been able to do anything about assembling integrated
systems. Increasingly, however, we can expect to see attempts to
produce robots with a collection of computers working together. This
will lead to investigations of different kinds of global architectures
for intelligent systems. In particular, whereas most AI systems in the
past have been based on a single sequential process, it will
increasingly be appropriate for different subsystems to work
asynchronously in parallel.
-- A simple architecture
Initially it is to be expected that systems will be designed with the
following main components:
(a) Perceptual mechanisms
These mechanisms analyse (e.g. parse) and interpret information taken
in by the 'senses' and store the interpretations in a database.
(b) A database of information.
This is not just a store of facts, for a database can also store
procedural information, about how to do things, in a form accessible
by planning procedures. It may include both particular facts provided
by the senses and generalisations formed over a period of time.
(c) Analysis and interpretation procedures
These are procedures which examine the data provided by the senses,
break them up into meaningful chunks, build descriptions, match the
descriptions, etc. Analysis involves describing what is presented in
the data. Interpretation involves describing something else,
possibly lying behind the data, for instance constructing a 3-D
description on the basis of 2-D images, or inferring someone's
intentions from his actions.
(d) Reasoning procedures.
These use information in the database to derive further information
which can also be stored in the database. For instance if a lot of
information about lines is in the database, inference procedures can
work out where there are junctions. If you know that Socrates is a
man, and that all men are mortal, you can infer something new about
Socrates.
(e) A database of goals.
These just represent possible situations which it is intended should
be made ACTUAL. There may also be policies, preferences, ideals, and
the like.
(f) Planning procedures.
These take a goal, and a database of information, and construct a
plan which will achieve the goal, assuming the correctness of the
information in the database.
(g) Executive mechanisms and motors
These translate plans into action.
Often the divisions will not be very clear. For instance, is 'this
situation is painful' a fact or a goal concerned with the need to change
the situation?
This sort of model can be roughly represented by the following diagram.
-- Sketch of a not very intelligent system
We use curly braces to represent {PROCESSES}, square brackets to
represent stored [STRUCTURES], and parentheses to indicate (PROCEDURES)
which generate processes.
--> {parsing sentences} ----->|
(parsing procedures) |
|
--> {analysing images} ------>|
(visual procedures) |
|
--> {other kinds of sensory |
analysis} (analysis and |--> [database of beliefs]
interpretation procedures) | /|\ |
| | |
\|/ | |
[goals] {reasoning} |
| (inference rules) |
\|/ |
{planning} <----------------------------+
(problem solvers)
|
\|/
<--{motors} <---[plans]
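The same model can also be put as a minimal program skeleton, sketched
here in Python. Every component is an invented stub standing in for
what would really be a major subsystem in its own right.

    beliefs = set()                  # (b) database of information
    goals = ["reach charger"]        # (e) database of goals (invented example)

    def perceive():                  # (a) perceptual mechanisms
        return {"battery low", "charger visible"}

    def interpret(percepts):         # (c) analysis and interpretation
        return percepts              # stub: store percepts unchanged as beliefs

    def infer(beliefs):              # (d) reasoning procedures
        if "battery low" in beliefs:
            beliefs.add("need power")
        return beliefs

    def plan(goal, beliefs):         # (f) planning procedures
        if "charger visible" in beliefs:
            return ["move to charger", "dock"]   # canned plan, not real planning
        return ["search for charger"]

    def execute(actions):            # (g) executive mechanisms and motors
        for action in actions:
            print("doing:", action)

    beliefs |= interpret(perceive())
    beliefs = infer(beliefs)
    for goal in goals:
        execute(plan(goal, beliefs))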
-- Limitations of the model
This sort of diagram conceals much hidden complexity. Each of the named
sub-processes may have a range of internal structures and sub-processes,
some relatively permanent, some very short term.
However, even this kind of complexity does not do justice to the kind of
intelligence that we find in human beings and many animals. For example,
there is a need for internal self-monitoring processes as well as
external sensory processes. A richer set of connections may be needed
between sub-processes. For example perception may need to be influenced
by beliefs, current goals, and current motor plans. It is also necessary
to be able to learn from experience, and that requires processes that do
some kind of retrospective analysis of past successes and failures. The
goals of an autonomous intelligent system are not static, but are
generated dynamically in the light of new information and existing
policies, preferences, and the like. There will also be conflicts
between different sorts of goals that need to be resolved. Thus 'goal-
generators' and 'goal-comparators' will be needed, and mechanisms for
improving these in the light of experience.
In the case of real-time intelligent systems further complexities arise
from the need to be able to deal with new information and new goals by
interrupting, modifying, temporarily suspending, or aborting current
processes. I believe that these are the kinds of requirements that
explain some kinds of emotional states in human beings, and we can
expect similar states in intelligent machines.
It is possible that full replication and understanding of the types of
intelligence found in people (and other animals) will require the
development of new physical designs for computers. Already there is work
investigating highly parallel "connectionist" architectures loosely
modelled on current theories about the brain as an assembly of richly
interconnected neurons that compute by exciting and inhibiting one
another. Such machines might be specially useful for long term
associative memory stores, and for low level sensory processing.
However, the hardest problem will be knowing how to 'program' such
machines.
It may also turn out that we need to discover entirely new kinds of
formalisms or representations. For example, at present it is very hard
to give machines a good grasp of spatial structures and relationships of
kinds that we meet in everyday natural environments. It isn't too
difficult for a computer to represent a shape bounded entirely by plane
or simply curved surfaces. But we, and other animals, have visual
systems without that restriction. Similar comments apply to the
representation of motion, e.g. in a ballet, or the non-rigid
transformations of a woollen jumper as you take it out of a drawer and
put it on.
-- Less ambitious projects
Much AI work is concerned with subsystems of an intelligent system,
rather than trying to design a complete autonomous intelligent robot.
In most cases the hardest problems involve identifying the knowledge
that is required to perform a task, and finding good ways to represent
it. As already hinted, in vision there is a largely unsolved problem of
representing shapes and motion in sufficient generality to accommodate
the range of objects we all perceive effortlessly. In designing speech
understanding systems a key question is what features in the acoustic
signal are significant in identifying the meaningful units in
utterances. In designing fault diagnosis systems it is often extremely
difficult to identify the clues actually used by an expert, the
inference strategies used in drawing conclusions from the clues, and the
control strategies used in deciding what to do next when the problem is
difficult. The difficulties are compounded when the expert needs to be
able to combine different sorts of knowledge in a new way, for example
knowledge about electrical properties of components, the mechanical and
spatial properties, the thermal properties, and the functional design of
the system.
One reason these tasks are so difficult is that much human expertise is
below the level of consciousness. People are quite unable simply to
write down the grammatical rules they use in generating and
understanding their native language, despite many years of use. The same
applies to most areas of human expertise, though paradoxically it is the
most advanced and specialised forms, usually learnt late in life, that
are easiest to articulate. This is often partly because they are less
rich and complex than more common, superficially unimpressive abilities
shared by all and sundry. This has led to techniques for 'knowledge
elicitation', a process that often has much in common with methods by
which philosophers probe hidden assumptions underlying our conceptual
systems. (See below.)
For those who wish to apply AI in such a way as to avoid these difficult
research issues, it is generally advisable to tackle much simpler
problems, for example fault-diagnosis problems where there is already a
lot of clearly articulated reliable information on how to track down the
causes of malfunctions.
-- Key ideas in AI models
Several important concepts and techniques keep cropping up in work in
AI, including the following:
(a) Structural description (e.g. list, database). This generally depends
on analysis of a structure, e.g. segmenting and recognising parts,
properties and relationships, which may then be described.
(b) Matching (e.g. see the TEACH *MATCHES and *SCHEMATA files.)
(c) Canonical form (to simplify matching, searching in database, etc.).
An example is trying to represent seen objects in terms of their
internal structure rather than in terms of their appearance from one
viewpoint.
(d) Domain (a class of structures, with its laws of well-formedness).
E.g. sentences of English form a domain, logical proofs form a domain,
3-D polyhedra form a domain, and 2-D line drawings form a domain.
Domains can overlap, and one can include another.
(e) Interpretation of a structure (building another structure which it
is taken to represent). For instance interpreting a 2-D image by
building a description of the 3-D scene depicted.
(f) A search space. The structure of a class of problems and possible
solutions to those problems is often thought of geometrically.
(g) Search strategy (controlling search).
(h) Inference. (Deduction, reasoning.)
(i) Alternative representations of the same thing (e.g. turtle picture
vs database).
(j) Indexing and addressing. E.g. how do you recognise and complete the
following so quickly: 'A ---- in time saves ----', 'Birds do it, bees
do it, even -----', etc., when you have hundreds of thousands,
probably millions of items of stored information in your mind. It
can't be that you search LINEARLY through the lot.
(k) Structure sharing. This is a very important and general notion which
can be found in recognition processes, problem-solving and planning
processes, inference processes, etc. The basic idea is that if
different alternatives have something in common, you should not have
to repeat the exploration of the common parts. This can considerably
reduce the amount of backtracking required in a search process, for
instance. (TEACH VIEWS describes a package that uses structure
sharing.)
(l) Heuristic evaluation and search-pruning procedures.
(m) The transition from matching to inference. A search for a good
match can often be CONTROLLED in part by restrictions on the
variables, e.g. pattern elements like:
??X:NP
where the procedure NP checks that what is matched against X is a
noun-phrase. (LIB GRAMMAR uses MATCH in this way). In general, a
process which we would ordinarily call matching, for instance
matching a 3-D scene against a 2-D image may include a great deal of
inference, in addition to checks for correspondences between parts.
An extreme case would be the notion of matching a GOAL against a PLAN
to achieve the goal. The notion of 'match' here has been considerably
stretched. How does one check that a plan will or can achieve a goal?
One of the current debates in AI concerning the importance of what
are called 'SCRIPTS' or 'FRAMES' can be interpreted as being
concerned with how far inference can be kept to a minimum
during much of the matching required for perception, understanding, and
planning.
(n) Trade-offs. Closely connected with several of the previously
mentioned ideas is the idea of a trade-off. By doing more work at the
time you build up a structure you may be able to use it later with
less effort: e.g. a trade-off between compile time and execution
time. Converting descriptions into a 'canonical' form to simplify
matching and recognition is an example. A more familiar trade-off
concerns time against space. Another is generality or flexibility
against efficiency.
Does the transition from Roman to Arabic numerals involve a trade-
off, or is it pure gain? What about using a new symbol for every
word, versus building words out of simpler symbols?
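The compile-time/execution-time trade-off can be sketched in a few
lines of Python (the word list is invented): sorting the list once
costs extra work when the structure is built, but makes every later
lookup logarithmic rather than linear.

    import bisect

    words = ["spoon", "cat", "horse", "ape", "flea"]
    index = sorted(words)       # extra work once, when building the structure

    def present(word):
        """Binary search: each later lookup is O(log n), not O(n)."""
        i = bisect.bisect_left(index, word)
        return i < len(index) and index[i] == word

    print(present("horse"), present("zebra"))   # True False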
-- Computers vs brains
Whether or not the model sketched above is accurate, concepts like
these, which have proved essential for exploring the model, may also be
essential for developing correct theories about how the mind works. This
may be so even if the human mind is embodied in a physical system whose
fundamental computational architecture is very different from a modern
digital computer: e.g. it seems to be more like a huge network of
communicating computers each connected to thousands of others in the
net.
Computer models like this are sometimes called "connectionist" models.
-- "Non-cognitive" (?) states and processes
One of the standard objections to AI is that although it may say
something useful about COGNITIVE processes, such as perception,
inference and planning, it says nothing about other aspects of mind such
as motivation and emotions. In particular, AI programs tend to be given
a single 'top-level' goal, and everything they do is subservient to
this, whereas people have a large number of different wishes, likes,
dislikes, hopes, fears, principles, ambitions, all of which can interact
with the processes of deciding and planning, and even such processes as
seeing physical objects or understanding a sentence. This is correct
and important.
There are ways of extending the model so as to begin to cope with this
sort of complexity, without leaving a computational framework. For
example, what sorts of processes can produce new motives? How would
motives be represented? What sorts of processes could select motives for
action? How would one motive (e.g. a fear or preference) interact with
the process of trying to achieve another? In order to answer these
questions we must clarify what we understand by the key terms. This
requires conceptual analysis.
-- Conceptual analysis
This involves taking familiar concepts, like 'knowledge', 'belief',
'explanation', 'anger', and exploring their structure. What sorts of
things can they be applied to, how are they related to other concepts,
and what is their role in our thinking and communication? To meet the
above criticism of AI in full, it is necessary to engage in extensive
analysis of many concepts which refer to mental states and processes of
kinds which AI work does not at present say much about, concepts like
'want', 'like', 'enjoy', 'prefer', 'intend', 'afraid', 'sad',
'pleasure', 'pain', 'embarrassed', 'disgusted', 'exultation', and the
like.
This is not an easy task, since we are largely unconscious of how our
own concepts work. However, by showing how motives of many kinds might
co-exist in a single system, generating many different kinds of
processes, some of which disturb or disrupt others, we may begin to see
how, for example, emotional states might be accounted for. This would
require considerable extension of the model outlined above, and would
make use of concepts used not so much in AI work as in computer science,
especially in the design of operating systems, for instance concepts
like 'interrupt', 'priority' and 'communication between concurrent
processes'. But such modelling is still some way off.
-- Tools for AI
Anyone who has spent much time programming will appreciate that getting
computers to perform AI tasks is not easy. Moreover, most of the widely
used programming languages were not designed for this sort of purpose,
and the programming support tools, such as editors, compilers and
debuggers, are not adequate for projects that are not concerned with
implementing well-understood algorithms worked out in advance on the
basis of mathematical analysis.
AI development work requires languages that support a wide range of
representations including things like verbal descriptions, logical rules
of inference, plans, definitions of concepts, images and speech wave-
forms. This requires the use of languages that make it easy to build and
manipulate non-numerical as well as numerical structures. Examples of
such highly expressive languages are LISP, the oldest AI language,
Prolog, a language based on logical inference, and POP-11, developed
first at Edinburgh University (as POP-2) then at Sussex. POP-11 has the
power of LISP but a far more readable syntax and a range of additional
features.
Moreover, since the process of building a program is often a tentative
exploratory task, part of whose goal is to find out precisely what the
constraints and requirements for the program are, it is necessary to
provide languages and compilers that support 'rapid prototyping' and
very flexible experimentation. Compilers for conventional languages such
as C, Ada, Fortran, and Pascal do not allow you to define new
experimental procedures or modify old ones without re-linking the whole
system, which can be very slow and wasteful of human and computer time
if the system is already big. So AI development tools include
interpreters and incremental compilers and editors that are linked in
with the compilers so that there is no need for continual switching
between the two. The best development environments for LISP, Prolog and
POP-11 provide such integrated support tools.
-- An example of the expressive power of an AI language
I'll give one example to illustrate the kind of thing that AI languages
provide to simplify programming tasks. Suppose you have to store lists
of lists of words and for some reason need a program to find a sublist
containing a pair of given words and produce a list of the words in
between. For example given the pair of words "cat" "horse" and the list
of lists:
[[book cat chair spoon][ape cat dog flea horse shark][castle house tower]]
it should produce the list: [dog flea]. Writing a program like this in a
language like C or PASCAL would require the use of three nested loops
and rather complicated constructs for back-tracking if you find a false
clue like "cat" in the first list. The POP-11 a pattern matcher enables
you to write a single line instruction:
list_of_lists --> [== [== cat ??wanted horse ==] ==]
(or a more general form replacing "cat" and "horse" with variables), to
solve this problem.
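For comparison, here is a rough Python rendering of the same task (the
procedure name and its argument order are inventions for illustration,
not anything in POP-11): even in a high-level language the loops and the
recovery from the false clue are explicit, where the matcher simply
states the pattern.

    def between(first, last, list_of_lists):
        """Words between `first` and `last` in the first sublist that
        contains both in order, or None if there is no such sublist."""
        for sublist in list_of_lists:
            for i, word in enumerate(sublist):
                # a "false clue": `first` may occur without `last` after it
                if word == first and last in sublist[i + 1:]:
                    return sublist[i + 1 : sublist.index(last, i + 1)]
        return None

    data = [["book", "cat", "chair", "spoon"],
            ["ape", "cat", "dog", "flea", "horse", "shark"],
            ["castle", "house", "tower"]]

    print(between("cat", "horse", data))   # ['dog', 'flea']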
Having expressive constructs tailored to the requirements of the task
enables programmers to get things right first time far more often. This
is one reason why many AI systems include "macro" facilities for
extending the syntax of the language to suit new applications. Similarly
it is often useful to try one method to solve a task and if that fails
try others, where each method itself involves trial and error
strategies. Programming this back-tracking control structure yourself is
tedious, and you may not do it efficiently, whereas Prolog provides a
very general form of it built into the language.
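The skeleton of the hand-rolled version looks like this (an
illustrative sketch in Python, names invented): each method may fail,
and failure is reported back so the caller can go on to its own next
alternative.

    def try_in_turn(task, methods):
        """Return the first method's successful result, or None if all fail."""
        for method in methods:
            result = method(task)
            if result is not None:
                return result     # success: commit to this choice
        return None               # exhausted: report failure, so that the
                                  # caller can backtrack to its next option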
-- Horses for courses: multi-language, multi-paradigm systems
Which language is best for AI? This is a misguided question. Different
languages are needed for different problems or different sub-problems,
and for that reason a good AI development environment should make a
range of languages available in such a way as to make it easy to
integrate programs written in different styles. Also, even if one
language is ideal for a particular project, it may be that there is
software readily available in another language. Duplicating the
development could be very wasteful. So a system that makes it easy to
link in a program written in another language is desirable.
POPLOG attempts to meet this requirement. It includes all three of the
languages mentioned above, all incrementally compiled into a common
portable "virtual machine", which runs on a range of computers and
operating systems (in 1986 these are: VMS, UNIX System V, Berkeley UNIX
4.2, on VAX, DEC 8000 series, Hewlett-Packard 9000/200 and 9000/300,
SUN-2, SUN-3, Bleasdale, GEC-63, Apollo Domain - and probably more
later). It also allows programs written in conventional languages to be
linked in and unlinked dynamically, and provides facilities for
developing new special-purpose sub-languages suited to particular sub-
tasks. (The detailed mechanisms are described in REF *SYSCOMPILE and
REF *VMCODE.) The Alvey Real-time Expert Systems Club, for example, made
good use of this language-extension facility, which is also used to
implement all the POPLOG languages.
It is very likely that other systems will become available offering some
or all of the POPLOG features. Already there are some LISP systems that
include a PROLOG subset. POPLOG itself is being used in many countries
including the UK, the USA, Scandinavia, Europe, India, Japan and
Australia. E.g. it is the core teaching system in a Masters degree at
the University of New South Wales.
-- Conclusion
This is by no means a complete overview of AI and its tools. At best I
hope I have whetted the appetites of those for whom it is a new topic.
The bibliography includes pointers to books and papers that extend the
points made in this article.
As readers may have discerned, my own interests are mainly in the use of
AI to explore philosophical and psychological problems about the nature
of the human mind, by designing and testing models of human abilities,
analysing the architectures, representations and inferences required,
and so on. These are long term problems.
In the short run, my guess is that the most important practical
applications will be in the design of relatively simple expert systems,
and in the use of AI tools for non-AI programming, since the advantages
of such tools are not restricted to AI projects. In principle, AI
languages and tools could also have a profound effect on teaching by
making new kinds of powerful teaching and learning environments
available, giving pupils a chance to explore a very wide range of
subjects by playing with or building appropriate programs. But since our
culture does not attach much importance to education as an end in
itself, I fear that this potential will not be realised. Instead
millions will be spent on military applications of AI.
-- Bibliography
R. Barrett, A. Ramsay and A. Sloman, POP-11: A Practical Language for AI,
Ellis Horwood and John Wiley, 1985, reprinted 1986.
Margaret Boden, Artificial Intelligence and Natural Man,
Harvester Press, 1977.
E. Charniak and D. McDermott, Introduction to Artificial Intelligence,
Addison Wesley, 1985.
William S. Clocksin and C.S. Mellish, Programming in Prolog,
Springer-Verlag, 1981.
John Gibson, 'POP-11: an AI Programming Language' in Yazdani 1984.
David Marr, Vision,
Freeman 1982.
Tim O'Shea and Marc Eisenstadt, editors, Artificial Intelligence: Tools,
Techniques, and Applications,
Harper and Row, 1984.
Allan Ramsay and Rosalind Barrett, AI in practice: examples in POP-11,
Ellis Horwood and John Wiley, forthcoming 1987.
Elaine Rich, Artificial Intelligence,
McGraw Hill, 1983.
A. Sloman, The Computer Revolution in Philosophy,
Humanities Press and Harvester Press, 1978.
A. Sloman, 'Why we need many knowledge representation formalisms', in
Research and Development in Expert Systems,
ed M. Bramer, Cambridge University Press, 1985.
A. Sloman, 'Real-time multiple-motive expert systems' in Martin Merry
(ed), Expert Systems 85,
Cambridge University Press, 1985.
A. Sloman and Graham Thwaites, 'POPLOG: a unique collaboration' in Alvey
News, June 1986.
G. J. Sussman, A Computational Model of Skill Acquisition,
American Elsevier, 1975.
P. H. Winston and B. K. Horn, LISP, Addison-Wesley, 1981.
Terry Winograd, Language as a Cognitive Process: Syntax,
Addison Wesley, 1983.
Patrick H. Winston, Artificial Intelligence,
Second Edition, Addison-Wesley, 1984.
Masoud Yazdani, editor, New Horizons in Educational Computing,
Ellis Horwood and John Wiley, 1984.