How might we go about this? And in particular, how might we do so in a way that is consistent with what we believe we know about human cognitive capacities? One way might be to attempt to put in a database all the grammatical sentences of English. Any possible sentence is then bound to be included in the database and can be checked against it, for example, by the POP-11 pattern matcher. Non-sentences, such as those in (2), will not be included, and so will not be recognized as grammatical English.
This method has some severe drawbacks. In the first place, since we have already claimed that the number of sentences in a language is infinite, it follows that no finite database could ever be complete. For example, given a database containing N sentences, we could create an (N+1)th sentence by joining all the sentences so far with the word `and', then an (N+2)th sentence by joining them with `or', and so on. More importantly, the grammars of natural languages are recursive (a concept to which we shall return later in the chapter), allowing syntactic units to be embedded to any depth, as in the sentence ``The house the surveyor the property developer called valued fell down'' or the noun phrase ``The last seven small white glazed earthenware mugs''.
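Both points can be made concrete with a short sketch, again in Python and again under stated assumptions: the function names `extend` and `centre_embed` and the sample sentences are hypothetical, chosen only to illustrate the argument. The first function builds a sentence that a given finite database cannot already contain; the second nests relative clauses inside one another to whatever depth is asked for.

```python
def extend(database, connective="and"):
    """Return a copy of the database with one new sentence added.

    The new sentence joins every stored sentence with the connective.
    Assumes at least two distinct sentences, so the result is longer
    than any stored sentence and therefore cannot already be present.
    """
    if len(database) < 2:
        raise ValueError("need at least two sentences to join")
    addition = f" {connective} ".join(sorted(database))
    return database | {addition}


def centre_embed(head, clauses, predicate):
    """Nest each (subject, verb) clause inside the one before it.

    Subjects appear in order; their verbs appear in reverse order,
    so the innermost clause is completed first.
    """
    subjects = [subject for subject, _ in clauses]
    verbs = [verb for _, verb in reversed(clauses)]
    return " ".join([head, *subjects, *verbs, predicate])


database = {"the cat sat on the mat", "the house fell down"}
bigger = extend(database)
print(len(database), len(bigger))

print(centre_embed(
    "the house",
    [("the surveyor", "valued"), ("the property developer", "called")],
    "fell down",
))
```

Calling `extend` on its own output yields a still larger database, and adding further pairs to `clauses` yields a still deeper sentence, so neither process ever reaches a point at which the list of sentences is complete.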
In the second place, simply putting a large number of sentences into a database does not indicate what it is that distinguishes them from non-sentences; it does not account for our intuitions, on seeing a novel string of words, as to whether that string is grammatical English or not.
So a `database of all the grammatical sentences' is both impossible and unsatisfactory. The two disadvantages listed above converge on a single general observation, which is relevant to our requirement that our account of language be cognitively plausible. The human brain contains only a finite, even if awesomely vast, number of neurons, so human beings have a strictly limited memory. No single human brain, nor even the totality of all human brains, could hold in memory all the sentences of a language, since a finite space cannot contain an infinite number of objects. By the same token, we could not hope to put an infinite number of sentences into a computer database.