Language Structure
What makes the human species special? There are two basic hypotheses about
why people are intellectually different from other species. In the past few chapters,
I indulged my favorite theory, which is that we have unmatched abilities to solve
problems and reason about our world, owing in large part to the enormous development
of our prefrontal cortices. However, there is another theory at least as popular in cognitive
science, which is that humans are special because they alone possess a language.
This chapter and the next will analyze in more detail what language is, how people
process language, and what makes human language so special. This chapter will focus
primarily on the nature of language in general, whereas the next chapter will contain
more detailed analyses of how language is processed. We will consider some of the
basic linguistic ideas about the structure of language and evidence for the psychological
reality of these ideas, as well as research and speculation about the relation between
language and thought. We will also look at the research on language acquisition. Much
of the evidence both for and against claims about the uniqueness of human language
comes from research on the way in which children learn the structure of language.
In this chapter, we will answer the questions:
• What does the field of linguistics tell us about how language is processed?
• What distinguishes human language from the communication systems of other species?
• How does language influence the nature of human thought?
• How are children able to acquire a language?
•Language and the Brain
The human brain has features strongly associated with language. For almost all
of the 92% of people who are right-handed, language is strongly lateralized in
the left hemisphere. About half of the 8% of people who are left-handed still
have language left lateralized. So 96% of the population has language largely
in the left hemisphere. Findings from studies with split-brain patients (see
Chapter 1) have indicated that the right hemisphere has only the most rudimentary
language abilities. It was once thought that the left hemisphere was
larger, particularly in areas taking part in language processing, and that this
greater size accounted for the greater linguistic abilities associated with the left
hemisphere. However, neuroimaging techniques have suggested that the differences
in size are negligible, and researchers are now looking to see whether
there are differences in neural connectivity or organization (Gazzaniga, Ivry, &
Mangun, 2002) in the left hemisphere. It remains largely a mystery what differences
between the left and the right hemispheres could account for why language
is so strongly left lateralized.
Certain regions of the left hemisphere are specialized for language, and
these are illustrated in Figure 12.1. These areas were initially identified in studies
of patients who suffered aphasias (losses of language function) as a consequence
of stroke. The first such area was discovered by Paul Broca, the French surgeon
who, in 1861, examined the brain of such a patient after the patient’s death (the
brain is still preserved in a Paris museum). This patient was basically incapable
of producing speech, although he understood much of what was spoken to him.
He had a large region of damage in a prefrontal area that came to be known
as Broca’s area. As can be seen in Figure 12.1, it is next to the motor region that
controls the mouth. Shortly thereafter, Carl Wernicke, a German physician,
identified patients with severe deficits in understanding speech who had damage
in a region in the superior temporal cortex posterior to the primary auditory
cortex. This area came to be known as Wernicke’s area. Parietal regions
close to Wernicke’s area (the supramarginal gyrus and angular gyrus) have
also been found to be important to language.
FIGURE 12.1 A lateral view of the left hemisphere. Brain areas implicated in language are labeled: Broca’s area, Wernicke’s area, the supramarginal gyrus, the angular gyrus, the motor face area, and the primary auditory area. (From Dronkers, Redfern, & Knight, 2000.)

Two of the classic aphasias, now known as Broca’s aphasia and Wernicke’s
aphasia, are associated with damage to these two regions. Chapter 1 gave
examples of the kinds of speech problems suffered by patients with these two
aphasias. The severity of the damage determines whether patients with Broca’s
aphasia will be unable to generate almost any speech (like Broca’s original
patient) or be capable of generating meaningful but ungrammatical speech.
Patients with Wernicke’s aphasia, in addition to having problems with comprehension,
sometimes produce grammatical but meaningless speech. Another
kind of aphasia is conduction aphasia; in this condition, patients have difficulty
repeating speech and problems producing spontaneous speech.
Conduction aphasia is sometimes associated with damage to the parietal regions
shown in Figure 12.1.
Although the importance of these left-cortical areas to speech is well documented
and there are many well-studied cases of aphasia resulting from damage
in these regions, it has become increasingly apparent that there is no simple
mapping of damaged areas onto types of aphasia. Current research has focused
on more detailed analyses of the deficits and of the regions damaged in each
aphasic patient.
Although there is much to understand, it is a fact that human evolution and
development have selected certain left-cortical regions as the preferred locations
for language. It is not the case, however, that language has to be left lateralized.
Some left-handers have language in the right hemisphere, and
young children who suffer left-brain damage may develop language in the right
hemisphere, in regions that are homologous to those depicted in Figure 12.1 for
the left hemisphere.
Language is preferentially localized in the left hemisphere in prefrontal
regions (Broca’s area), temporal regions (Wernicke’s area), and parietal
regions (supramarginal and angular gyri).
•The Field of Linguistics
The academic field of linguistics attempts to characterize the nature of language.
It is distinct from psychology in that it studies the structure of natural
languages rather than the way in which people process natural languages.
Despite this difference, the work from linguistics has been extremely influential
in the psychology of language. As we will see, concepts from linguistics play
an important role in theories of language processing. As noted in Chapter 1, the
influence from linguistics was important to the decline of behaviorism and the
rise of modern cognitive psychology.
Productivity and Regularity
The linguist focuses on two aspects of language: its productivity and its
regularity. The term productivity refers to the fact that an infinite number of
utterances are possible in any language. Regularity refers to the fact that these
utterances are systematic in many ways. We need not seek far to convince
ourselves of the highly productive and creative character of language. Pick a
random sentence from this book or any other book of your choice and enter it
as an exact string (quoting it) in Google. If Google can find the sentence in all
of its billions of pages, the match will probably be either a copy of the book or a
quote from the book. In fact, these sorts of methods are used by programs to
catch plagiarism. Most sentences you will find in books were created only once in
human history. And yet it is important to realize that the components that make
up sentences are quite small in number: English uses only 26 letters, 40 phonemes
(see the discussion in the Speech Recognition section of Chapter 2), and some
tens of thousands of words. Nevertheless, with these components, we can and do
generate trillions of novel sentences.
A look at the structure of sentences makes clear why this productivity is
possible. Natural language has facilities for endlessly embedding structures within
structures and coordinating structures with structures. A mildly amusing party
game starts with a simple sentence and requires participants to keep adding to
the sentence:
• The girl hit the boy.
• The girl hit the boy and he cried.
• The big girl hit the boy and he cried.
• The big girl hit the boy and he cried loudly.
• The big girl hit the boy who was misbehaving and he cried loudly.
• The big girl with authoritarian instincts hit the boy who was misbehaving and he cried loudly.
And so on until someone can no longer extend the sentence.
The fact that an infinite number of word strings can be generated would not
be particularly interesting in itself. If we have tens of thousands of words for
each position and if sentences can be of any length, it is not hard to see that
a very large (in fact, an infinite) number of word strings is possible. However,
if we merely combine words at random, we get “sentences” such as
• From runners physicians prescribing miss a states joy rests what thought most.
In fact, very few of the possible word combinations are acceptable sentences.
The speculation is often jokingly made that, given enough monkeys working at
typewriters for a long enough time, some monkey will type a best-selling book.
It should be clear that it would take a lot of monkeys a long time to type just
one acceptable *R@!#s.
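To get a feel for the scale involved, a quick back-of-the-envelope calculation helps. The vocabulary size and sentence length below are illustrative assumptions, not counts from any real corpus:

```python
# Rough estimate: how many distinct 10-word strings can be formed from a
# 20,000-word vocabulary? Both numbers are illustrative assumptions.
vocabulary_size = 20_000
sentence_length = 10
possible_strings = vocabulary_size ** sentence_length
print(f"{possible_strings:.2e}")  # about 1.02e+43 possible word strings
```

Even at many keystrokes per second, the monkeys would sample only a vanishing fraction of this space, and almost none of the strings sampled would be acceptable sentences.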
So, balanced against the productivity of language is its highly regular character.
One goal of linguistics is to discover a set of rules that will account for
both the productivity and the regularity of natural language.
Such a set of rules is referred to as a grammar. A grammar should be able
to prescribe or generate all the acceptable utterances of a language and be able
to reject all the unacceptable sentences in the language. A grammar consists
of three types of rules—syntactic, semantic, and phonological. Syntax concerns
word order and inflection. Consider the following examples of sentences that
violate syntax:
• The girls hits the boys.
• Did hit the girl the boys?
• The girl hit a boys.
• The boys were hit the girl.
These sentences are fairly meaningful but contain some mistakes in word combinations
or word forms.
Semantics concerns the meaning of sentences. Consider the following sentences
that contain semantic violations, even though the words are correct in
form and syntactic position:
• Colorless green ideas sleep furiously.
• Sincerity frightened the cat.
These constructions are called anomalous sentences in that they are syntactically
well formed but nonsensical.
Phonology concerns the sound structure of sentences. Sentences can be correct
syntactically and semantically but be mispronounced. Such sentences are
said to contain phonological violations. Consider this example:
The Inspector opened his notebook. “Your name is Halcock, is’t no?” he began.
The butler corrected him. “H’alcock,” he said, reprovingly. “H, a, double-l?”
suggested the Inspector. “There is no h’aich in the name, young man. H’ay is
the first letter, and there is h’only one h’ell.” (Sayers, 1968, p. 73)
The butler, wanting to hide his cockney dialect, which drops the letter h, is systematically
mispronouncing every word that begins with a vowel.
The goal of linguistics is to discover a set of rules that captures the structural
regularities in a language.
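To make the idea of a generative grammar concrete, here is a minimal sketch of a toy context-free grammar in Python. The rewrite rules and vocabulary are invented for illustration; grammars proposed by linguists are vastly larger and more subtle:

```python
import random

# A toy grammar: each symbol rewrites as one of several sequences of symbols.
# The rule S -> S and S is recursive, which is what makes the language infinite.
GRAMMAR = {
    "S":   [["NP", "VP"], ["S", "and", "S"]],
    "NP":  [["the", "N"], ["the", "Adj", "N"]],
    "VP":  [["V", "NP"], ["V", "NP", "Adv"]],
    "N":   [["girl"], ["boy"], ["dog"]],
    "Adj": [["big"], ["brave"]],
    "V":   [["hit"], ["saved"]],
    "Adv": [["loudly"]],
}

def generate(symbol="S", depth=0):
    """Expand a symbol by randomly choosing one of its rewrite rules."""
    if symbol not in GRAMMAR:               # a terminal: an actual word
        return [symbol]
    rules = GRAMMAR[symbol]
    if depth > 3:                           # cap recursion so the demo terminates
        rules = [r for r in rules if symbol not in r]
    rule = random.choice(rules)
    return [word for part in rule for word in generate(part, depth + 1)]

print(" ".join(generate()))  # e.g., "the brave girl hit the dog loudly"
```

Even these seven rules generate unboundedly many sentences (through the recursive "and" rule), yet they never produce random word salad: productivity and regularity in miniature.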
Linguistic Intuitions
A major goal of linguistics is to explain the linguistic intuitions of speakers of a
language. Linguistic intuitions are judgments about the nature of linguistic utterances
or about the relations between linguistic utterances. Speakers of the language
are often able to make these judgments without knowing how they do so.
As such, linguistic intuition is another example of implicit knowledge, a concept
introduced in Chapter 7. Among these linguistic intuitions are judgments about
whether sentences are ill-formed and, if ill-formed, why. For instance, we can
judge that some sentences are ill-formed because they have bad syntactic structure
and that other sentences are ill-formed because they lack meaning. Linguists require
that a grammar capture this distinction and clearly express the reasons for it.
Another kind of intuition is about paraphrase. A speaker of English will judge that
the following two sentences are similar in meaning and hence are paraphrases:
• The girl hit the boy.
• The boy was hit by the girl.
Yet another kind of intuition is about ambiguity. The following sentence has
two meanings:
• They are cooking apples.
This sentence can either mean that some people are cooking some apples or
that the apples can be used for cooking. Moreover, speakers of the language can
distinguish this type of ambiguity, which is called structural ambiguity, from
lexical ambiguity, as in
• I am going to the bank.
where bank can refer either to a monetary institution or to a riverbank. Lexical
ambiguities arise when a word has two or more distinct meanings; structural
ambiguities arise when an entire phrase or sentence has two or more meanings.
Linguists try to account for the intuitions we have about paraphrases, ambiguity,
and the well-formedness of sentences.
Competence versus Performance
Our everyday use of language does not always correspond to the prescriptions
of linguistic theory. We generate sentences in conversation that, upon reflection,
we would judge to be ill-formed and unacceptable. We hesitate, repeat
ourselves, stutter, and make slips of the tongue. We misunderstand the meaning
of sentences. We hear sentences that are ambiguous but do not note their
ambiguity.
Another complication is that linguistic intuitions are not always clear-cut.
For instance, we find the linguist Lakoff (1971) telling us that, in the following
case, the first sentence is not acceptable but the second sentence is:
• Tell John where the concert’s this afternoon.
• Tell John that the concert’s this afternoon.
People are not always reliable in their judgments of such sentences and certainly
do not always agree with Lakoff.
Considerations about the unreliability of human linguistic behavior and
judgment led linguist Noam Chomsky (1965) to make a distinction between
linguistic competence, a person’s abstract knowledge of the language, and
linguistic performance, the actual application of that knowledge in speaking
or listening. In Chomsky’s view, the linguist’s task is to develop a theory of
competence; the psychologist’s task is to develop a theory of performance.
The exact relation between a theory of competence and a theory of performance
is unclear and can be the subject of heated debates. Chomsky has
argued that a theory of competence is central to performance—that our
linguistic competence underlies our ability to use language, if indirectly.
Others believe that the concept of linguistic competence is based on a rather
unnatural activity (making linguistic judgments) and has very little to do with
language use.
Linguistic performance does not always correspond to linguistic competence.
•Syntactic Formalisms
A major contribution of linguistics to the psychological study of language has
been to provide a set of concepts for describing the structure of language. The
most frequently used ideas from linguistics concern descriptions of the syntactic
structure of language.
Phrase Structure
A great deal of emphasis in linguistics has been given to understanding the syntax
of natural language. One central linguistic concept is phrase structure.
Phrase-structure analysis is not only significant in linguistics, but also important
to an understanding of language processing. Therefore, coverage of this
topic here is partly a preparation for material in the next chapter. Those of you
who have had a certain kind of training in high-school English will find the
analysis of phrase structure to be similar to parsing exercises.
The phrase structure of a sentence is the hierarchical division of the sentence
into units called phrases. Consider this sentence:
• The brave dog saved the drowning child.
If asked to divide this sentence into two major parts in the most natural way,
most people would provide the following division:
• (The brave dog) (saved the drowning child).
The parentheses distinguish the two separate parts. The two parts of the
sentence correspond to what are traditionally called subject and predicate or
noun phrase and verb phrase. If asked to divide the second part, the verb phrase,
further, most people would give
• (The brave dog) (saved [the drowning child]).
Often, analysis of a sentence is represented as an upside-down tree, as
in Figure 12.2. In this phrase-structure tree, sentence points to its subunits, the
noun phrase and the verb phrase, and each of these units points to its subunits.
Eventually, the branches of the tree terminate in the individual words. Such tree-structure
representations are common in linguistics. In fact, the term phrase
structure is often used to refer to such tree structures.

FIGURE 12.2 An example of the phrase structure of the sentence The brave dog saved the drowning child, in bracket form: [Sentence [Noun phrase [Article The] [Adj brave] [Noun dog]] [Verb phrase [Verb saved] [Noun phrase [Article the] [Adj drowning] [Noun child]]]]. The tree structure illustrates the hierarchical division of the sentence into phrases.
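In computational work, such a tree is just a nested data structure. Here is a minimal sketch of the tree in Figure 12.2, using nested Python tuples (the nesting convention is one common choice, adopted here for illustration):

```python
# The phrase-structure tree of Figure 12.2 as nested (label, children...) tuples.
tree = ("Sentence",
        ("Noun phrase",
         ("Article", "The"), ("Adj", "brave"), ("Noun", "dog")),
        ("Verb phrase",
         ("Verb", "saved"),
         ("Noun phrase",
          ("Article", "the"), ("Adj", "drowning"), ("Noun", "child"))))

def words(node):
    """Read the sentence back off the tree's leaves, from left to right."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]                # a leaf: (part of speech, word)
    return [w for child in children for w in words(child)]

print(" ".join(words(tree)))  # The brave dog saved the drowning child
```

Walking the leaves from left to right recovers the original word string; everything above the leaves is the hierarchical phrase structure.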
An analysis of phrase structure can point up structural ambiguities. Consider
again the sentence
• They are cooking apples.
Whether cooking is part of the verb with are or part of the noun phrase with apples
determines the meaning of the sentence. Figure 12.3 illustrates the phrase
structure for these two interpretations. In Figure 12.3a, cooking is part of the
verb, whereas in Figure 12.3b, it is part of the noun phrase.

FIGURE 12.3 The phrase structures illustrating the two possible meanings of the ambiguous sentence They are cooking apples: (a) that those people (they) are cooking apples; (b) that those apples are for cooking.
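Written in the same nested notation as before, the two readings are two different trees over the very same word string (the labels follow Figure 12.3 and are for illustration):

```python
# Parse (a): "cooking" is part of the verb (they are cooking something).
parse_a = ("Sentence",
           ("Noun phrase", ("Pronoun", "They")),
           ("Verb phrase",
            ("Verb", ("Aux", "are"), ("Verb", "cooking")),
            ("Noun phrase", ("Noun", "apples"))))

# Parse (b): "cooking" modifies "apples" (they are apples for cooking).
parse_b = ("Sentence",
           ("Noun phrase", ("Pronoun", "They")),
           ("Verb phrase",
            ("Verb", "are"),
            ("Noun phrase", ("Adj", "cooking"), ("Noun", "apples"))))
```

Applying the words function from the previous sketch to either tree yields the identical string They are cooking apples; the ambiguity lives entirely in the structure.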
Phrase-structure analysis is concerned with the way that sentences are broken
up into linguistic units.
Pause Structure in Speech
Abundant evidence supports the argument that phrase structures play a key
role in the generation of sentences.1 When a person produces a sentence, he or
she tends to generate it a phrase at a time, pausing at the boundaries between
large phrase units. For instance, although no tape recorders were available in
Lincoln’s time, one might guess that he produced the first sentence of “The
Gettysburg Address” with brief pauses at the end of each of the major phrases
as follows:
Four score and seven years ago (pause)
our fathers brought forth on this continent (pause)
a new nation (pause)
conceived in liberty (pause)
and dedicated to the proposition (pause)
that all men are created equal (pause)

1 In Chapter 13, we will examine the role of phrase structures in language comprehension.
Although Lincoln’s speeches are not available for auditory analysis, Boomer
(1965) analyzed examples of spontaneous speech and found that pauses did
occur more frequently at junctures between major phrases and that these
pauses were longer than pauses at other locations. The average pause time
between major phrases was 1.03 s, whereas the average pause within phrases
was 0.75 s. This finding suggests that speakers tend to produce sentences a
phrase at a time and often need to pause after one phrase to plan the next.
Other researchers (Cooper & Paccia-Cooper, 1980; Grosjean, Grosjean, &
Lane, 1979) looked at participants producing prepared sentences rather than
spontaneous speech. The pauses of such participants tend to be much shorter,
about 0.2 s. Still, the same pattern holds, with longer pauses at the major
phrase boundaries.
As Figures 12.2 and 12.3 illustrate, there are multiple levels of phrases
within phrases within phrases. What level do speakers choose for breaking up
their sentences into pause units? Gee and Grosjean (1983) argued that speakers
tend to choose the smallest level above the word that bundles together coherent
semantic information. In English, this level tends to be noun phrases (e.g., the
young woman), verbs plus pronouns (e.g., will have been reading it), and
prepositional phrases (e.g., in the house).
People tend to pause briefly after each meaningful unit of speech.
Speech Errors
Other research has found evidence for phrase structure by looking at errors in
speech. Maclay and Osgood (1959) analyzed spontaneous recordings of speech
and found a number of speech errors that suggested that phrases do have a psychological
reality. They found that, when speakers repeated themselves or corrected
themselves, they tended to repeat or correct a whole phrase. For instance,
the following kind of repeat is found:
• Turn on the heater/the heater switch.
and the following pair constitutes a common type of correction:
• Turn on the stove/the heater switch.
In the preceding example, the noun phrase is repeated. In contrast, speakers do
not produce repetitions in which part, but not all, of the verb phrase is repeated,
such as
• Turn on the stove/on the heater switch.
Other kinds of speech errors also provide evidence for the psychological reality
of constituents as major units of speech generation. For instance, some research
has analyzed slips of the tongue in speech (Fromkin, 1971, 1973; Garrett, 1975).
One kind of speech error is called a spoonerism, after the English clergyman
William A. Spooner to whom are attributed some colossal and clever errors of
speech. Among the errors of speech attributed to Spooner are:
• You have hissed all my mystery lectures.
• I saw you fight a liar in the back quad; in fact, you have tasted the whole worm.
• I assure you the insanitary spectre has seen all the bathrooms.
• Easier for a camel to go through the knee of an idol.
• The Lord is a shoving leopard to his flock.
• Take the flea of my cat and heave it at the louse of my mother-in-law.
As illustrated here, spoonerisms consist of exchanges of sound between words.
There is some reason to suspect that the preceding errors were deliberate
attempts at humor by Spooner. However, people do generate genuine spoonerisms,
although they are seldom as funny.
By patient collecting, researchers have gathered a large set of errors made by
friends and colleagues. Some of these errors are simple sound anticipations and
some are sound exchanges as in spoonerisms:
• Take my bike → bake my bike [an anticipation]
• night life → nife lite [an exchange]
• beast of burden → burst of beaden [an exchange]
One that gives me particular difficulty is
• coin toss → toin coss
The first error in the preceding list is an example of an anticipation, where
an early phoneme is changed to a later phoneme. The others are examples of
exchanges in which two phonemes switch. The interesting feature about these
kinds of errors is that they tend to occur within a single phrase rather than
across phrases. So, we are unlikely to find an anticipation, like the following,
which occurs between subject and object noun phrases:
• The dancer took my bike. → The bancer took my dike.
Also unlikely are sound exchanges where an exchange occurs between the initial
prepositional phrase and the final noun phrase, as in the following:
• At night John lost his life. → At nife John lost his lite.
Garrett (1990) distinguished between errors in simple sounds and those in
whole words. Sound errors occur at what he called the positional level, which
basically corresponds to a single phrase, whereas word errors occur at what he
called the functional level, which corresponds to a larger unit of speech such as
a full clause. Thus, the following word error has been observed:
• That kid’s mouse makes a great toy. → That kid’s toy makes a great mouse.
whereas the following sound error would be unlikely:
• That kid’s mouse makes a great toy. → That kid’s touse makes a great moy.
In Garrett’s (1980) corpus, 83% of all word exchanges extended beyond phrase
boundaries, but only 13% of sound errors did. Word and sound errors are
generally thought to occur at different levels in the speech-production process.
Words are inserted into the speech plan at a higher level of planning, and so a
larger distance is possible for the substitution.
An experimental procedure has been developed for artificially producing
spoonerisms in the laboratory (Baars, Motley, & MacKay, 1975; Motley, Camden,
& Baars, 1982). This involves presenting a series of word pairs like
Big Dog
Bad Deal
Beer Drum
**Darn Bore**
House Coat
Whale Watch
and asking the participants to speak certain words such as the asterisked Darn
Bore in the above series. When they have been primed with a series of word
pairs with the opposite order of first consonants (the preceding three all are
B—— D——), they show a tendency to reverse the order of the first consonants,
in this case producing Barn Door. Interestingly, participants are much
more likely to produce such an error if it produces real words, as it does in the
above case, than if it does not (as in the case of Dock Boat, which if reversed
would become Bock Doat). Participants are also sensitive to a host of other
facts such as whether the pair is grammatically appropriate and whether it is
culturally appropriate (e.g., they are more likely to convert cast part into past
cart than they are to convert fast part into past fart). This research has been
taken as evidence that we combine multiple factors into the selection of speech
items.
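The logic of the exchange can be sketched in a few lines of code. The onset-splitting heuristic and the tiny lexicon below are simplifications invented for illustration; the real experiments used spoken word pairs and human speakers:

```python
# Swap the initial consonant clusters of a word pair (a spoonerism), then check
# whether the outcome happens to consist of real words (the "lexical bias").
LEXICON = {"barn", "door", "darn", "bore", "dock", "boat"}  # toy stand-in
VOWELS = set("aeiou")

def split_onset(word):
    """Split a word into its initial consonant cluster and the remainder."""
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[:i], word[i:]
    return word, ""

def spoonerize(w1, w2):
    (o1, r1), (o2, r2) = split_onset(w1), split_onset(w2)
    return o2 + r1, o1 + r2

slip = spoonerize("darn", "bore")                 # -> ('barn', 'door')
print(slip, all(w in LEXICON for w in slip))      # real words: slip more likely

slip = spoonerize("dock", "boat")                 # -> ('bock', 'doat')
print(slip, all(w in LEXICON for w in slip))      # nonwords: slip less likely
```

The finding that errors yielding real words (barn door) are more common than errors yielding nonwords (bock doat) is what suggests that lexical knowledge is consulted as speech is assembled.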
Speech errors involving substitutions of sounds and words suggest that words
are selected at the clause level, whereas sounds are inserted at a lower phrase
level.
Transformations
A phrase structure describes a sentence hierarchically as pieces within larger
pieces. There are certain types of linguistic constructions that some linguists
think violate this strictly hierarchical structure. Consider the following pair of
sentences:
1. The dog is chasing Bill down the street.
2. Whom is the dog chasing down the street?
In sentence 1, Bill, the object of the chasing, is part of the verb phrase. On the
other hand, in sentence 2, whom, the object of the verb phrase, is at the beginning
of the sentence. The object is no longer part of the verb-phrase structure
to which it would seem to belong. Some linguists have proposed that, formally,
such questions are generated by starting with a phrase structure that has the
object whom in the verb phrase, such as
3. The dog is chasing whom down the street?
This sentence is somewhat strange but, with the right questioning intonation
of the whom, it can be made to sound reasonable. In some languages, such
as Japanese, the interrogative pronoun is normally in the verb phrase, as in
sentence 3. However, in English, the proposal is that there is a movement transformation
that moves the whom into its more normal position. Note that this
proposal is a linguistic one concerning the formal structure of language and
may not describe the actual process of producing the question.
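As a purely formal illustration of what such a movement transformation does, here is a toy sketch that fronts whom and inverts the auxiliary. Linguists state transformations over phrase-structure trees, not flat word strings, so this string-based version (with the auxiliary hardwired as is) is a deliberate simplification:

```python
def wh_movement(words):
    """Front "whom" and move the auxiliary "is" ahead of the subject."""
    assert "whom" in words and "is" in words
    rest = [w for w in words if w != "whom"]
    rest.remove("is")                    # subject-aux inversion, toy version
    return ["whom", "is"] + rest

echo = "the dog is chasing whom down the street".split()
print(" ".join(wh_movement(echo)) + "?")
# whom is the dog chasing down the street?
```

Running the operation on the echo form of sentence 3 yields sentence 2, which is the relation the transformation is meant to capture.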
Some linguists believe that a satisfactory analysis of language requires such
transformations, which move elements from one part of the sentence to another
part. Transformations can also operate on more complicated sentences.
For instance, we can apply one to sentences of the form
4. John believes that the dog is chasing Bill down the street.
The corresponding question forms are
5. John believes that the dog is chasing whom down the street?
6. Whom does John believe that the dog is chasing down the street?
Sentence 5 is strange even with a questioning intonation for whom, but still
some linguists believe that sentence 6 is transformationally derived from it,
even though we would never produce sentence 5.
An intriguing concern to linguists is that there seem to be real limitations
on just what things can be moved by transformations. For instance, consider
the following set of sentences:
7. John believes the fact that the dog is chasing Bill down the street.
8. John believes the fact that the dog is chasing whom down the street?
9. Whom does John believe the fact that the dog is chasing down the street?
As sentence 7 illustrates, the basic sentence form is acceptable, but one cannot
move whom from question form 8 to produce question form 9. Sentence 9
just sounds bizarre. We will return later to the restrictions on movement
transformations.
In contrast with the abundant evidence for phrase structure in language
processing, the evidence that people actually compute anything analogous to
transformations in understanding or producing sentences is very poor. How
people process such transformationally derived sentences remains very much
an open question. It is also the case that there is a lot of controversy within linguistics
about how to conceive of transformations. The role of transformations
has been deemphasized in many proposals.
Transformations move elements from their normal positions in the phrase
structure of a sentence.
•What Is So Special about Human Language?
We have reviewed some of the features of human language, with the implicit
assumption that no other species has anything like such a language. What gives
us this conceit? How do we know that other species do not have their own
languages? Perhaps we just do not understand the languages of other species.
Certainly, all social species communicate with one another and, ultimately,
whether we call their communication systems languages is a definitional matter.
However, human language is fundamentally different from these other systems,
and it is worth identifying some of the features (Hockett, 1960) that are considered
critical to human language.
Semanticity and arbitrariness of units. Consider, for instance, the communication
system of dogs. They have a nonverbal system that is very effective in
communication. The reason that dogs are such successful pets is thought to be
that their nonverbal communication system is so much like that of humans.
Besides being nonverbal, canine communication has more fundamental limitations.
Unlike human language, in which the relation between signs and meaning
is arbitrary (there is no reason why “Good dog” and “Bad dog” should mean
what they do), dogs’ signs are directly related to their meanings—a snarl for aggression
(which often reveals the dog’s sharp incisors), exposing the neck (a vulnerable
part of the dog’s body) for submission, and so on. However, although canines
have a nonarbitrary communication system, it is not the case that all species do.
For instance, the vocalizations of some species of monkeys have this property of
arbitrary meaning (Marler, 1967). One species, the Vervet Monkey, has different
warning calls for different types of predators—a “chutter” for snakes, a “chirp”
for leopards, and a “kraup” for eagles.
Displacement in time and space. A critical feature of the monkey warning
system is that the monkeys use it only in the presence of a danger. They do not
use it to “discuss” the day’s events at a later time. An enormously important
feature of human language (exemplified by this book) is that it can be used to
communicate over time and distance. Interestingly, the “language” of honeybees
satisfies the properties of both arbitrariness and displacement (von Frisch,
1967). When a honeybee returns to a nest after finding a food source, it will
engage in a dance to communicate the location of the food source. The “dance”
consists of a straight run followed by a turn to the right to circle back to the
starting point, another straight run, followed by a turn and circle to the left, and
so on, in an alternating pattern. The length of the run indicates the distance of
the food and the direction of the run relative to vertical indicates the direction
relative to the sun.
Discreteness and productivity. Language contains discrete units, which
would serve to disqualify the bee language system, although the monkey warning
system meets this criterion. Requiring a language to have discrete units is
not just an arbitrary regulation to disqualify the dance of the bees. This discreteness
enables the elements of the language to be combined into an almost
infinite number of phrase structures and for these phrase structures to be
transformed, as already described. As will become more and more apparent
in these chapters, this ability to combine symbols makes human language different
from the communication systems of all other species.
It is a striking fact that all people in the world, even those in isolated communities,
speak a language. No other species, not even genetically close apes,
spontaneously use a communication system anything like human language.
However, many people have wondered whether apes such as chimpanzees could
be taught a language. Early in the 20th century, there were attempts to teach
chimpanzees to speak that failed miserably (C. Hayes, 1951; Kellogg & Kellogg,
1933). It is now clear that the human vocal apparatus has undergone special
evolutionary adaptations to enable speech, and it was a hopeless goal to try to
teach chimps to speak. However, apes have considerable manual dexterity and,
more recently, there have been some well-publicized attempts to teach chimpanzees
and other apes manual languages.
Some of the studies have used American Sign Language (e.g., Gardner &
Gardner, 1969), which is a full-fledged language and makes the point that
language need not be spoken. These attempts were only modest successes
(e.g., Terrace, Petitto, Sanders, & Bever, 1979). Although the chimpanzees could
acquire vocabularies of more than a hundred signs, they never used them with
the productivity typical of humans in using their own language. Some of the
more impressive attempts have actually used artificial languages consisting of
“words” called lexigrams, made from plastic shapes, that can be attached to a
magnetic board (e.g., Premack & Premack, 1983).
Perhaps the most impressive example comes from a bonobo great ape called
Kanzi (Savage-Rumbaugh et al., 1993; see Figure 12.4). Bonobos are considered
even closer genetically to humans than chimpanzees are, but are rare. Kanzi’s
mother was a subject of one of these efforts, and Kanzi simply came along with
his mother and observed her training sessions.

FIGURE 12.4 Kanzi, a bonobo, listening to English. A number of videos of Kanzi can be found on YouTube by searching with his name. (The Language Research Center.)

Implications
Ape language and the ethics of experimentation

The issue of whether apes can be taught human languages interlinks in complex ways with issues about the ethical treatment of animals in research. The philosopher Descartes believed that language was what separated humans from animals. According to this view, if apes could be shown capable of acquiring a language, they would have human status and should be given the same rights as humans in experimentation. One might even ask that they give informed consent before participating in an experiment. Certainly, any procedure that involved injury would not be acceptable. There has been a fair amount of research involving invasive brain procedures with primates, but most of this has involved monkeys, not the great apes. Interestingly, it has been reported that studies with linguistic apes found that they categorized themselves with humans and separate from other animals (Linden, 1974). It has been argued that it is in the best interests of apes to teach them a language because this would confer on them the rights of humans. However, others have argued that teaching an ape a human language deadens its basic nature and that the real issue is that humans have lost the ability to understand apes.

The very similarity of primates to humans is what makes them such attractive subjects for research. There are severe restrictions on research on apes in many countries, and in 2008 the Great Ape Protection Act, which would have prohibited any invasive research involving great apes, was introduced in the U.S. Congress. Much of the concern is with the use of apes to study human disease, where the potential benefits are great but the moral issues of infecting an animal are also severe. From this perspective, most cognitive research with apes, such as that on language acquisition, is quite benign. From a cognitive perspective, they are the only creatures that have thought processes close to those of humans, and they offer potential insights we cannot get from other species. Nonetheless, many have argued that all research that removes them from their natural setting, including language acquisition research, should be banned.

However, he spontaneously
started to use the lexigrams, and the experimenters began working with their
newfound subject. His spontaneous constructions were quite impressive, and it
was discovered that he had also acquired a considerable ability to understand
spoken language. When he was 5.5 years of age, his comprehension of spoken
English was determined to be equivalent to that of a 2-year-old human.
As in other things, it seems unwise to conclude that human linguistic abilities
are totally discontinuous from the abilities of genetically close primates.
However, the human propensity for language is remarkable in the animal
world. Steven Pinker (1994b) coined the phrase “language instinct” to describe
the propensity for every human to acquire language. In his view, it is something
wired into the human brain through evolution. Just as songbirds are born with
the propensity to learn the song of their species, so we are born with the
propensity to learn the language of our society. Just as humans might try to
imitate the song of birds and partly succeed, other species, like the bonobo,
may partly succeed at mastering the language of humans. However, bird song
is special to songbirds and language is special to humans.
Only humans show the propensity or the ability to acquire a complex
communication system that combines symbols in a multitude of ways like
natural language.
•The Relation Between Language and Thought
All reasonable people would concede that there is some special connection between
language and humans. However, there is a lot of controversy about why
there is such a connection. Many researchers, like Steven Pinker and Noam
Chomsky, believe that humans have some special genetic endowment that enables
them to learn language. However, others argue that what is special is general
human intellectual abilities and that these abilities enable us to shape our
communication system to be something as complex as natural language. I confess
to leaning toward this alternate viewpoint. It raises the question of what
might be the relation between language and thought. There are three possibilities
that have been considered:
1. Thought depends in various ways on language.
2. Language depends in various ways on thought.
3. They are two independent systems.
We will go through each of these ideas in turn, starting with the proposal that
thought depends on language. There have been a number of different versions
of this proposal, including the radical behaviorist proposal that thought is just
speech and a more modest proposal called linguistic determinism.
The Behaviorist Proposal
As discussed in Chapter 1, John B. Watson, the father of behaviorism, held that
there was no such thing as internal mental activity at all. All that humans do,
Watson argued, is to emit responses that have been conditioned to various stimuli.
This radical proposal, which, as noted in Chapter 1, held sway in America for
some time, seemed to fly in the face of the abundant evidence that humans
can engage in thinking behavior (e.g., do mental arithmetic) that entails no
response emission. To deal with this obvious counterevidence, Watson proposed that
thinking was just subvocal speech—that, when people were engaged in such
“thinking” activities, they were really talking to themselves. Hence, Watson’s
proposal was that a very important component of thought is simply subvocal
speech. (The philosopher Herbert Feigl once said that Watson “made up his
windpipe that he had no mind.”)
Watson’s proposal stimulated a research program that took recordings to see
whether evidence could be found for subvocal activity
of the speech apparatus during thinking. Indeed, often when a participant is
engaged in thought, it is possible to get recordings of subvocal speech activity.
However, the more important observation is that, in some situations, people
engage in various silent thinking tasks with no detectable vocal activity. This
finding did not upset Watson. He claimed that we think with our whole bodies—
for instance, with our arms. He cited the fascinating evidence that deaf mutes
actually make signs while asleep. (Speaking people who have done a lot of communication
in sign language also sign while sleeping.)
The decisive experiment addressing Watson’s hypothesis was performed by
Smith, Brown, Toman, and Goodman (1947). They used a curare derivative
that paralyzes the entire voluntary musculature. Smith was the participant for
the experiment and had to be kept alive by means of an artificial respirator.
Because his entire musculature was completely paralyzed, it was impossible for
him to engage in subvocal speech or any other body movement. Nonetheless,
under curare, Smith was able to observe what was going on around him, comprehend
speech, remember these events, and think about them. Thus, it seems
clear that thinking can proceed in the absence of any muscle activity. For our
current purposes, the relevant additional observation is that thought is not just
implicit speech but is truly an internal, nonmotor activity.
Additional evidence that thought is not to be equated with language comes
from the research on memory for meaning that was reviewed in Chapter 5.
There, we considered the fact that people tend to retain not the exact words of a
linguistic communication, but rather a more abstract representation of the
meaning of the communication. Thought might be identified, at least in part,
with this abstract, nonverbal propositional code. As mentioned there in regard
to the perceptual symbol system hypothesis, the abstractness of human thought
is under reconsideration. However, even the perceptual symbol system proposal
holds that thought is more than just subvocal speech and that, rather, it consists
of rich internal perceptual representations.
Additional evidence that thought is more than subvocal speech comes from
the occasional person who has no apparent language at all but who certainly
gives evidence of being able to think. Additionally, it seems hard to claim that
nonverbal animals such as apes are unable to think. Recall, for instance, the
problem-solving exploits of Sultan in Chapter 8. It is always hard to determine
the exact character of the “thought processes” of nonverbal participants and
the way in which these processes differ from the thought processes of verbal
participants, because there is no language with which nonverbal participants
can be interrogated. Thus, the apparent dependence of thought on language
may be an illusion that derives from the fact that it is hard to obtain evidence
about thought without using language.
The behaviorists believed that thought consists only of covert speech and
other implicit motor actions, but evidence has shown that thought can
proceed in the absence of any motor activity.
The Whorfian Hypothesis of Linguistic Determinism
Linguistic determinism is the claim that language determines or strongly
influences the way that a person thinks or perceives the world. This proposal is
much weaker than Watson’s position because it does not claim that language
and thought are identical. The hypothesis has been advanced by a good many
linguists but has been most strongly associated with Whorf (1956). Whorf was
quite an unusual character himself. He was trained as a chemical engineer at
MIT, spent his life working for the Hartford Fire Insurance Company, and studied
North American Indian languages as a hobby. He was very impressed by the
fact that different languages emphasize in their structure rather different aspects
of the world. He believed that these emphases in a language must have a
great influence on the way that speakers of that language think about the world.
For instance, he claimed that Eskimos have many different words for snow, each
of which refers to snow in a different state (wind-driven, packed, slushy, and so
on), whereas English speakers have only a single word for snow.2 Many other
examples exist at the vocabulary level: The Hanunoo people in the Philippines
supposedly have 92 different names for varieties of rice. The Arabic language
has many different ways of naming camels. Whorf felt that such a rich variety
of terms would cause the speaker of the language to perceive the world differently
from a person who had only a single word for a particular category.
Deciding how to evaluate the Whorfian hypothesis is very tricky. Nobody
would be surprised to learn that Eskimos know more about snow than average
English speakers. After all, snow is a more important part of their life experience.
The question is whether their language has any effect on the Eskimos’ perception
of snow beyond the effect of experience. If speakers of English went through the
Eskimo life experience, would their perception of snow be any different from
that of the Eskimo-language speakers? (Indeed, ski bums have a life experience
that includes a great deal of exposure to snow; they have a great deal of knowledge
about snow and, interestingly, have developed new terms for snow.)
One fairly well-researched test of the issue uses color words. English has 11
basic color words—black, white, red, green, yellow, blue, brown, purple, pink,
orange, and gray—a large number. These words are called basic color words
because they are short and are used frequently, in contrast with such terms as
saffron, turquoise, and magenta. At the other extreme is the language of the
Dani, a Stone Age agricultural people of Indonesian New Guinea. This language
has just two basic color terms: mili for dark, cold hues and mola for bright,
warm hues. If the categories in language determine perception, the Dani should
perceive color in a less refined manner than English speakers do. The relevant
question is whether this speculation is true.
Speakers of English, at least, judge a certain color within the range referred
to by each basic color term to be the best—for instance, the best red, the best
blue, and so on (see Berlin & Kay, 1969). Each of the 11 basic color terms in
English appears to have one generally agreed upon best color, called a focal
color. English speakers find it easier to process and remember focal colors than
nonfocal colors (e.g., Brown & Lenneberg, 1954). The interesting question is
whether the special cognitive capacity for identifying focal colors developed
because English speakers have special words for these colors. If so, it would be a
case of language influencing thought.

2 There have been challenges to Whorf’s claims about the richness of Eskimo vocabulary for snow (L. Martin, 1986; Pullum, 1989). In general, there is a feeling that Whorf exaggerated the variety of words in various languages.
To test whether the special processing of focal colors was an instance of
language influencing thought, Rosch (who published some of this work under
her former name, Heider) performed an important series of experiments on the
Dani. The point was to see whether the Dani processed focal colors differently
from English speakers. One experiment (Rosch, 1973) compared Dani and
English speakers’ ability to learn nonsense names for focal colors with that for
nonfocal colors. English speakers find it easier to learn arbitrary names for focal
colors. Dani participants also found it easier to learn arbitrary names for
focal colors than for nonfocal colors, even though they have no names for these
colors. In another experiment (Heider, 1972), participants were shown a color
chip for 5 s; 30 s after the presentation ended, they were required to select the
color from among 160 color chips. Both English and Dani speakers perform
better at this task when they are trying to locate a focal color chip rather than a
nonfocal color chip. The physiology of color vision suggests that many of these
focal colors are specially processed by the visual system (de Valois & Jacobs,
1968). The fact that many languages develop basic color terms for just these
colors can be seen as an instance of thought determining language.3
However, more recent research by Roberson, Davies, and Davidoff (2000)
does suggest an influence of language on ability to remember colors. They
compared British participants with another Papua New Guinea group who
speak Berinmo, a language that has five basic color terms. Color Plate 12.1
compares how the Berinmo cut up the color space with how English speakers
cut up the color space. Replicating the earlier work, they found that there was
superior memory for focal colors regardless of language. However, there were
substantial effects of the color boundaries as well. The researchers examined
distinctions that were important in one language versus another. For instance,
the Berinmo make a distinction between the colors wor and nol in the middle
of the English green category, whereas English speakers make their yellow-green
distinction in the middle of the Berinmo wor category.
Participants from both languages were asked
to learn to sort stimuli at these two boundaries into
two categories. Figure 12.5 shows the amount of
effort that the two populations put into learning the
two distinctions. English speakers found it easiest to
sort stimuli at the yellow-green boundary, whereas
Berinmo found it easiest to sort stimuli at the nol-wor
distinction.
Note that both populations are capable of making
distinctions that are important to the other
population. Thus, it is not that their language has
made them blind to color distinctions. However,
they definitely find it harder to see the distinctions
not signaled in their language and learn to make them consistently. Thus,
although language does not completely determine how we see the color space,
it does have an influence.

FIGURE 12.5 Mean trials to criterion for the two populations learning distinctions at the nol-wor boundary and at the yellow-green boundary. (From Roberson et al., 2000.)

3 For further research on this topic, read Lucy and Shweder (1979, 1988) and Garro (1986).
Language can influence thought, but it does not totally determine the types
of concepts that we can think about.
Does Language Depend on Thought?
The alternative possibility is that the structure of language is determined by the
structure of thought. Aristotle argued 2500 years ago that the categories of
thought determined the categories of language. There are some reasons for
believing that he was correct, but most of these reasons were not available to
Aristotle. So, although the hypothesis has been around for 2500 years, we have
better evidence today.
There are numerous reasons to suppose that humans’ ability to think (i.e.,
to engage in nonlinguistic cognitive activity such as remembering and problem
solving) appeared earlier evolutionarily and occurs sooner developmentally than
the ability to use language. Many species of animals without language appear to
be capable of complex cognition. Children, before they are effective at using
their language, give clear evidence of relatively complex cognition. If we accept
the idea that thought evolved before language, it seems natural to suppose that
language arose as a tool whose function was to communicate thought. It is generally
true that tools are shaped to fit the objects on which they must operate.
Analogously, it seems reasonable to suppose that language has been shaped to
fit the thoughts that it must communicate.
We saw in Chapter 5 that propositional structures constitute a very important
type of knowledge structure in representing information both derived
from language and derived from pictures. This propositional structure is
manifested in the phrase structure of language. The basic phrase units of a
language tend to convey propositions. For instance, the tall boy conveys the
proposition that the boy is tall. This phenomenon itself—the existence of
a linguistic structure, the phrase, designed to accommodate a thought structure,
the proposition—seems to be a clear example of the dependence of
language on thought.
Another example of the way in which thought shapes language comes from
Rosch’s research on focal colors. As stated earlier, the human visual system is
maximally sensitive to certain colors. As a consequence, languages have special,
short, high-frequency words with which to designate these colors. Thus, the
visual system has determined how the English language divides up the color
space.
We find additional evidence for the influence of thought on language when
we consider word order. Every language has a preferred word order for expressing
subject (S), verb (V), and object (O). Consider this sentence, which exhibits
the preferred word order in English:
• Lynne petted the Labrador.
English is referred to as an SVO language. In a study of a diverse sample of
the world’s languages, Greenberg (1963) found that only four of the six possible
orders of S, V, and O are used in natural languages, and one of these four orders
is rare. The six possible word orders and the frequency of each order in the
world’s languages are as follows (the percentages are from Ultan, 1969):
SOV 44% VOS 2%
SVO 35% OVS 0%
VSO 19% OSV 0%
The important feature is that the subject almost always precedes the object.
This order makes good sense when we think about cognition. An action starts
with the agent and then affects the object. It is therefore natural that the subject
of a sentence, when it expresses the agent, comes first.
In many ways, the structure of language corresponds to the structure of how
our minds process the world.
Modularity of Language
We have considered the possibility that thought might depend on language and
the possibility that language might depend on thought. A third logical possibility
is that language and thought might be independent. A special version of this
independence principle is called the modularity position (Chomsky, 1980;
Fodor, 1983). This position holds that important language processes function
independently from the rest of cognition. Fodor argued that a separate linguistic
module first analyzes incoming speech and then passes this analysis on
to general cognition. Fodor thought that this linguistic module was similar in
this respect to early visual processing, which largely proceeds in response to
the visual stimulus independent of higher-level intentions.4 Similarly, in language
generation, the linguistic module takes the intentions to be spoken and
produces the speech. This position does not deny that the linguistic module
may have been shaped to communicate thought. However, it argues that it
operates according to different principles from the rest of cognition and is
“encapsulated” such that it cannot be influenced by general cognition. In essence,
the claim is that language’s communication with other mental processes is
limited to passing its products to general cognition and receiving the products
of general cognition.
One piece of evidence for the independence of language from other cognitive
processes comes from research on people who have substantial deficits in
language but not in general cognition or vice versa. Williams syndrome, a rare
genetic disorder, is an example of a mental retardation that seems not to affect
linguistic fluency (Bellugi, Wang, & Jernigan, 1994). On the other side, there are
people who have severe language deficits without accompanying intellectual
deficits, including both some aphasics and some with developmental problems.
4 However, as reviewed in Chapter 3, there are some effects of visual attention in primary visual cortex—for example, see the discussion of Figure 3.10.
Specific language impairment (SLI) is a term used to describe a pattern of deficit
in the development of language that cannot be explained by hearing loss, mental
retardation, or other nonlinguistic factors. It is a diagnosis of exclusion and
probably has a number of underlying causes; in some cases, these causes appear
to be genetic (Stromswold, 2000). Recently, a mutation in a specific gene, called
FOXP2, has been linked to specific language deficits, a finding that drew wide
attention in the popular press (e.g., Wade, 2003), although there appear to be
other cognitive deficits associated with this mutation as well (Vargha-Khadem,
Watkins, Alcock, Fletcher, & Passingham, 1995). The FOXP2 gene is very similar in all mammals, although
the human FOXP2 is distinguished from that of other primates by two amino
acids (out of 715). Mutations in the FOXP2 gene are associated with vocal
deficits and other deficits in many species. For instance, such mutations result in
incomplete acquisition of song imitation in birds (Haesler et al., 2007). It has been claimed
that the human form of the FOXP2 gene became established in the human population
about 50 thousand years ago when, according to some proposals, human
language emerged (Enard et al., 2002). However, more recent evidence suggests
these changes are shared with Neanderthals and occurred 300 to 400 thousand
years ago (Krause et al., 2007). Although the FOXP2 gene does play an important
role in language, it does not appear to provide strong evidence for a genetic
basis for a unique language ability.
The modularity hypothesis has turned out to be a major dividing issue in the
field, with different researchers lining up in support or in opposition. Two domains
of research have played a major role in evaluating the modularity proposal:
1. Language acquisition. Here, the issue is whether language is acquired
according to its own learning principles or whether it is acquired like
other cognitive skills.
2. Language comprehension. Here, the issue is whether major aspects of
language processing occur without utilization of any general cognitive
processes.
We will consider some of the issues with respect to comprehension in the
next chapter. In this chapter, we will look at what is known about language
acquisition. After an overview of the general course of language acquisition by
young children, we will turn to the implications of the language-acquisition
process for the uniqueness of language.
The modularity position holds that the acquisition and processing of language
is independent from other cognitive systems.
•Language Acquisition
Having watched my two children acquire a language, I understand how easy it
is to lose sight of what a remarkable feat it is. Days and weeks go by with little
apparent change in their linguistic abilities. Progress seems slow. However,
something remarkable is happening. With very little and often no deliberate
instruction, children by the time they reach age 10 have accomplished implicitly
what generations of Ph.D. linguists have not accomplished explicitly. They have
internalized all the major rules of a natural language—and there appear to be
thousands of such rules with subtle interactions. No linguist in a lifetime has
been able to formulate a grammar for any language that will identify all and only
the grammatical sentences. However, as we progress through childhood, we do
internalize such a grammar. Unfortunately for the linguist, our knowledge of the
grammar of our language is not something that we can articulate. It is implicit
knowledge (see Chapter 7), which we can only display in using the language.
The process by which children acquire a language has some characteristic
features that seem to hold no matter what their native language is (and
languages throughout the world differ dramatically): Children are notoriously
noisy creatures from birth. At first, there is little variety in their speech. Their
vocalizations consist almost totally of an ah sound (although they can produce
it at different intensities and with different emotional tones). In the months
following birth, a child’s vocal apparatus matures. At about 6 months, a change
takes place in children’s utterances. They begin to engage in what is called
babbling, which consists of generating a rich variety of speech sounds with
interesting intonation patterns. However, the sounds are generally totally meaningless
to the listeners.
An interesting feature of early childhood speech is that children produce
sounds that they will not use in the particular language that they will learn.
Moreover, they can apparently make acoustic discriminations among sounds
that will not be used in their language. For instance, Japanese infants can discriminate
between /l/ and /r/, a discrimination that Japanese adults cannot
make (Tsushima et al., 1994). Similarly, English infants can discriminate among
variations of the /t/ sound, which are important in the Hindi language of India,
that English adults cannot discriminate (Werker & Tees, 1999). It is as if the
children enter the world with speech and perceptual capabilities that constitute
a block of marble out of which will be carved their particular language, discarding
what is not necessary for that language.
When a child is about a year old, the first words appear, always a point of
great excitement to the child’s parents. The very first words are recognizable only to
the ears of very sympathetic parents and caretakers, but soon the child develops a
considerable repertoire of words, which are recognizable to the untrained ear
and which the child uses effectively to make requests and to describe what is
happening. The early words are concrete and refer to the here and now. Among
my children’s first words were Mommy, Daddy, Rogers (for Mister Rogers),
cheese, ’puter (for computer), eat, hi, bye, go, and hot. One remarkable feature of
this stage is that the speech consists only of one-word utterances; even though
the children know many words, they never put them together to make multiple-word
phrases. Children’s use of single words is quite complex. They often use a
single word to communicate a whole thought. Children will also overextend
their words. Thus, the word dog might be used to refer to any furry four-legged
animal.
The one-word stage, which lasts about 6 months, is followed by a stage in
which children will put two words together. I can still remember our excitement
as parents when our son said his first two-word utterance at 18 months—more
gee, which meant for him “more brie”—he was a connoisseur of cheese.
Table 12.1 illustrates some of the typical two-word utterances generated by
children at this stage. All their utterances are one or two words. Once their
utterances extend beyond two words, they are of many different lengths.
There is no corresponding three-word stage. The two-word utterances correspond
to about a dozen or so semantic relations, including agent-action,
agent-object, action-object, object-location, object-attribute, possessor-object,
negation-object, and negation-event. The order in which the children
place these words usually corresponds to one of the orders that would
be correct in adult speech in the children’s linguistic community.

TABLE 12.1 Two-Word Utterances (from Bowerman, 1973)
Kendall swim     pillow fell
doggie bark      Kendall book
see Kendall      Papa door
writing book     Kendall turn
sit pool         towel bed
shoe off         there cow
Even when children leave the two-word stage and speak in sentences
ranging from three to eight words, their speech retains a peculiar quality, which
is sometimes referred to as telegraphic. Table 12.2 contains some of these longer
multiword utterances. The children speak somewhat as people used to write
telegrams (and as people currently do when text messaging), omitting such
unimportant function words as the and is. In fact, it is
rare to find in early-childhood speech any utterance
that would be considered to be a well-formed sentence.
Yet, out of this beginning, grammatical sentences eventually
appear. One might expect that children would
learn to speak some kinds of sentences perfectly, then
learn to speak other kinds of sentences perfectly, and
so on. However, it seems that children start out speaking
all kinds of sentences and all of them imperfectly.
Their language development is characterized not by
learning more kinds of sentences but by their sentences
becoming gradually better approximations of
adult sentences.

TABLE 12.2 Multiword Utterances (from Brown, 1973)
Put truck window        My balloon pop
Want more grape juice   Doggie bit me mine boot
Sit Adam chair          That Mommy nose right there
Mommy put sack          She’s wear that hat
No I see truck          I like pick dirt up firetruck
Adam fall toy           No pictures in there
Besides the missing words, there are other dimensions in which children’s
early speech is incomplete. A classic example concerns the rules for pluralization
in English. Initially, children do not distinguish in their speech between
singular and plural, using a singular form for both. Then, they will learn the
add s rule for pluralization but overextend it, producing foots or even feets.
Gradually, they learn the pluralization rules for the irregular words. This learning
continues into adulthood. Cognitive scientists have to learn that the plural
of schema is schemata (a fact that I spared the reader from having to deal with
when schemas were discussed in Chapter 5).
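The three-step sequence can be mimicked by a toy generator that first produces bare stems, then applies the add s rule across the board, and finally consults a stored list of exceptions before applying the rule. The sketch below is only an illustration of that logic; the staging flags and the one-item lexicon are invented for the example:

    def pluralize(noun, irregulars=None, use_rule=True):
        """Toy pluralizer mirroring the developmental stages described above."""
        if irregulars and noun in irregulars:
            return irregulars[noun]            # stage 3: exceptions checked first
        return noun + "s" if use_rule else noun

    print(pluralize("foot", use_rule=False))               # "foot": no plural marking yet
    print(pluralize("foot"))                               # "foots": the rule overextended
    print(pluralize("foot", irregulars={"foot": "feet"}))  # "feet": irregular mastered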
Another dimension in which children have to perfect their language is word
order. They have particular difficulties with transformational movements of
terms from their natural position in the phrase structure (see the earlier discussion
in this chapter). So, for instance, there is a point at which children form
questions without moving the verb auxiliary from the verb phrase:
• What me think?
• What the doggie have?
Even later, when children’s spontaneous speech seems to be well formed, they
will display errors in comprehension that reveal that they have not yet captured
all the subtleties in their language. For instance, Chomsky (1970) found that
children had difficulty comprehending sentences such as John promised Bill to
leave, interpreting Bill as the one who leaves. The verb promise is unusual in this
respect—for instance, compare John told Bill to leave, which children will properly
interpret.
By the time children are 6 years old, they have mastered most of their language,
although they continue to pick up details at least until the age of 10. In
that time, they have learned tens of thousands of special case rules and tens of
thousands of words. Studies of the rate of word acquisition by children produced
an estimate of more than five words a day (Carey, 1978; E.V. Clark, 1983).
A natural language requires more knowledge to be acquired for mastery than do
any of the domains of expertise considered in Chapter 9. Of course, children also
put an enormous amount of time into the language-acquisition process—easily
10,000 hr must have been spent practicing speaking and understanding speech
before a child is 6 years old.
Children gradually approximate adult speech by producing ever larger and
more complex constructions.
The Issue of Rules and the Case of Past Tense
A controversy in the study of language acquisition concerns whether children
are learning what might be considered rules such as those that are part of
linguistic theory. For instance, when a child learning English begins to inflect a
verb such as kick with ed to indicate past tense, is that child learning a past-tense
rule or is the child just learning to associate kick and ed? A young child
certainly cannot explicitly articulate the add ed rule, but this inability may just
mean that this knowledge is implicit. An interesting observation in this regard
is that children will generalize the rule to new verbs. If they are introduced to
a new verb (e.g., told that the made-up verb wug means dance) they will
spontaneously generate this verb with the appropriate past tense (wugged in
this example).
Some of the interesting evidence on this score concerns how children learn
to deal with irregular past tenses—for instance, the past tense of sing is sang.
The order in which children learn to inflect verbs for past tense follows the
characteristic sequence noted for pluralization. First, children will use the irregular
correctly, generating sang; then they will overgeneralize the past-tense rule
and generate singed; finally, they will get it right for good and return to sang.
The existence of this intermediate stage of overgeneralization has been used to
argue for the existence of rules, because it is claimed there is no way that the
child could have learned from direct experience to associate ed to sing. Rather,
the argument goes, the child must be overgeneralizing a rule that has been
learned.
This conventional interpretation of the acquisition of past tense was severely
challenged by Rumelhart and McClelland (1986a). They simulated a neural network
as illustrated in Figure 12.6 and had it learn the past tenses of verbs. In the
network, one inputs the root form of a verb (e.g., kick, sing) and, after a number
of layers of association, the past-tense form should appear.

FIGURE 12.6 A network for past tense. The phonological representation of the root is converted into a distributed feature representation. This representation is converted into the distributed feature representation of the past tense, which is then mapped onto a phonological representation of the past tense. (From Rumelhart & McClelland, 1986a.)
The computer model was trained with a set of 420 pairs of the root with the
past tense. It simulated a neural learning mechanism to acquire the pairs. Such a
system learns to associate features of the input with features of the output. Thus,
it might learn that words beginning with “s” are associated with past tense endings
of “ed,” thus leading to the “singed” overgeneralization (but things can be
more complex in such neural models). The model mirrored the standard developmental
sequence of children, first generating correct irregulars, then overgeneralizing,
and finally getting it right. It went through the intermediate stage of
generating past-tense forms such as singed because of generalization from regular
past-tense forms. With enough practice, the model, in effect, memorized the
past-tense forms and was not using generalization. Rumelhart and McClelland
concluded:
We have, we believe, provided a distinct alternative to the view that children
learn the rules of English past-tense formation in any explicit sense. We have
shown that a reasonable account of the acquisition of past tense can be
provided without recourse to the notion of a “rule” as anything more than a
description of the language. We have shown that, for this case, there is no
induction problem. The child need not figure out what the rules are, nor even
that there are rules. (p. 267)
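To get a feel for how such a pattern associator works, consider the minimal sketch below. It is emphatically not Rumelhart and McClelland’s model: roots are coded here as bags of letter bigrams rather than their Wickelfeature triples, the verb list is invented, and the network merely chooses between applying the regular pattern and retrieving a stored irregular form instead of generating phonology. Still, it shows the two properties at issue: the regular pattern generalizes to novel verbs (the phenomenon behind wugged), and an exception such as sing must be carved out of features it shares with regular verbs, which is the pressure that can produce singed-style blends when regulars dominate the training diet.

    import numpy as np

    # 1 = regular ("add -ed"); 0 = irregular (use a stored form). The verb
    # list is invented for the example; the real model used 420 verbs.
    TRAIN = [("kick", 1), ("walk", 1), ("jump", 1), ("look", 1),
             ("wink", 1), ("bang", 1), ("sip", 1), ("sing", 0), ("ring", 0)]

    bigrams = sorted({r[i:i + 2] for r, _ in TRAIN for i in range(len(r) - 1)})

    def encode(root):  # 1/0 vector marking which letter bigrams the root contains
        return np.array([1.0 if b in root else 0.0 for b in bigrams])

    w = np.zeros(len(bigrams))
    rng = np.random.default_rng(0)
    for _ in range(500):  # delta-rule training on root-feature/inflection pairs
        for i in rng.permutation(len(TRAIN)):
            root, label = TRAIN[i]
            p = 1.0 / (1.0 + np.exp(-w @ encode(root)))  # logistic output unit
            w += 0.5 * (label - p) * encode(root)

    def p_regular(root):
        return 1.0 / (1.0 + np.exp(-w @ encode(root)))

    print(p_regular("kick"))  # high: produces "kicked"
    print(p_regular("sing"))  # low: the exception has been carved out
    print(p_regular("lick"))  # novel verb: shares features with regulars,
                              # so the regular pattern generalizes to it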
Their claims drew a major counter-response from Pinker and Prince (1988).
Pinker and Prince pointed out that the ability to produce the initial stage of
correct irregulars depended on Rumelhart and McClelland’s using a disproportionately
large number of irregulars at first—more so than the child experiences.
They had a number of other criticisms of the model, including the fact
that it sometimes produced utterances that children never produce—for instance,
it produced membled as the past tense of mail.
Another of their criticisms had to do with whether it was even possible to
really learn past tense as the process of associating root form with past-tense
form. It turns out that the way a verb is inflected for past tense does not depend
just on its root form but also on its meaning. For instance, the word ring has
two meanings as a verb—to make a sound or to encircle. Although it is the
same root, the past tense of the first is rang, whereas the past tense of the latter
is ringed, as in
• He rang the bell.
• They ringed the fort with soldiers.
It is unclear how fundamental any of these criticisms are, and there are now a
number of more adequate attempts to come up with such associative models
(e.g., MacWhinney & Leinbach, 1991; Daugherty, MacDonald, Petersen, &
Seidenberg, 1993; and, for a rejoinder, see Marcus et al., 1995).
Marslen-Wilson and Tyler (1998) argued that the debate between rule-based
and associative accounts will not be settled by focusing only on children’s
language acquisition. They suggest that more decisive evidence will come from
examining properties of the neural system that implements adult processing of
past tense. They cite two sorts of evidence, which seem to converge in their
implications about the nature of the processing of past tense. First, they cite
evidence that some patients with aphasias have deficient processing of regular
past tense, whereas others have deficient processing of irregular past tenses. The
patients with deficient processing of regular past tense have severe damage to
Broca’s area, which is generally associated with syntactic processing. In contrast,
the patients with deficient processing of irregular past tenses have damage to
their temporal lobes, which are generally associated with associative learning.
Second, they cite the PET imaging data of Jaeger et al. (1996), who studied the
processing of past tense by unimpaired adults. Jaeger et al. found activation in
the region of Broca’s area only during the processing of regular past tense and
found temporal activation during the processing of irregular past tenses. On
the basis of the data, Marslen-Wilson and Tyler concluded that regular past
tense may be processed in a rule-based manner, whereas the irregular may be
processed in an associative manner.
Irregular past tenses are produced associatively, and there is debate about
whether regular past tenses are produced associatively or by rules.
Quality of Input
An important difference between a child’s first-language acquisition and the
acquisition of many skills (including typical second-language acquisition) is
that the child receives little if any instruction in acquiring his or her first
language. Thus, the child’s task is one of inducing the structure of natural language
from listening to parents, caretakers, and older children. In addition to
not receiving any direct instruction, the child does not get much information
about what are incorrect forms in natural language. Many parents do not correct
their children’s speech at all, and those who do correct their children’s
speech appear to do so without any effect. Consider the following well-known
interaction recorded between a parent and a child (McNeill, 1966):
Child: Nobody don’t like me.
Mother: No, say, “Nobody likes me.”
Child: Nobody don’t like me.
Mother: No, say, “Nobody likes me.”
Child: Nobody don’t like me.
[dialogue repeated eight times]
Mother: Now listen carefully; say, “Nobody likes me.”
Child: Oh! Nobody don’t likeS me.
This lack of negative information is puzzling to theorists of natural language
acquisition. We have seen that children’s early speech is full of errors. If they are
never told about their errors, why do children ever abandon these incorrect
ways of speaking and adopt the correct forms?
Because children do not get much instruction on the nature of language
and ignore most of what they get, their learning task is one of induction—they
must infer from the utterances that they hear what the acceptable utterances in
their language are. This task is very difficult under the best of conditions, and
children often do not operate under the best of conditions. For instance, children
hear ungrammatical sentences mixed in with the grammatical. How are
they to avoid being misled by these sentences? Some parents and caregivers
are careful to make their utterances to children simple and clear. This kind of
speech, consisting of short sentences with exaggerated intonation, is called
motherese (Snow & Ferguson, 1977). However, not all children receive the
benefit of such speech, and yet all children learn their native languages. Some
parents speak to their children in only adult sentences, and the children learn
(Kaluli, studied by Schieffelin, 1979); other parents do not speak to their children
at all, and still the children learn by overhearing adults speak (Piedmont
Carolinas, studied by Heath, 1983). Moreover, among more typical parents,
there is no correlation between the degree to which motherese is used and the rate of
linguistic development (Gleitman, Newport, & Gleitman, 1984). So the quality
of the input cannot be that critical.
Another curious fact is that children appear to be capable of learning a
language in the absence of any input. Goldin-Meadow (2003) summarized
research on the deaf children of speaking parents who chose to teach their
children by the oral method. It is very difficult for deaf children to learn to
speak but quite easy for children to learn sign language, which is a perfectly fine
language. Despite the fact that the parents of these children were not teaching
them sign language, they proceeded to invent their own sign language to communicate
with their parents. These invented languages have the structure of
normal languages. Moreover, the children in the process of invention seem to
go through the same periods as children who are learning a language of their
community. That is, they start out with single manual gestures, then progress to
a two-gesture period, and continue to evolve a complete language more or less
at the same points in time as those of their hearing peers. Thus, children seem
to be born with a propensity to communicate and will learn a language no
matter what.
The very fact that young children learn a language so successfully in almost
all circumstances has been used to argue that the way that we learn language
must be different from the way that we learn other cognitive skills. Also pointed
out is the fact that children learn their first language successfully at a point in
development when their general intellectual abilities are still weak.
Children master language at a very young age and with little direct instruction.
A Critical Period for Language Acquisition
A related argument has to do with the claim that young children appear to
acquire a second language much faster than older children or adults do. It is
claimed that there is a certain critical period, from 2 to about 12 years of age,
when it is easiest to learn a language. For a long time, the claim that children
learn second languages more readily than adults was based on informal observations
of children of various ages and of adults in new linguistic communities—
for example, when families move to another country in response to a corporate
assignment or when immigrants move to another country to reside there
permanently. Young children are said to acquire a facility to get along in the
new language more quickly than their older siblings or their parents. However,
there are a great many differences between the adults, the older children, and
the younger children in amount of linguistic exposure, type of exposure (e.g.,
whether the stock market, history, or Nintendo is being discussed), and willingness
to try to learn (McLaughlin, 1978; Nida, 1971). In careful studies in which
situations have been selected that controlled for these factors, a positive relation
is exhibited between children’s ages and rate of language development
(Ervin-Tripp, 1974). That is, older children (older than 12 years) learn faster
than younger children.
Even though older children and adults may learn a new language more rapidly
than younger children initially, they seem not to acquire the same level of final
mastery of the fine points of language, such as the phonology and morphology
(Lieberman, 1984; Newport, 1986). For instance, the ability to speak a second
language without an accent severely deteriorates with age (Oyama, 1978). In
one study, Johnson and Newport (1989) looked at the degree of proficiency
in speaking English achieved by Koreans and Chinese as a function of the age
at which they arrived in America. All had been in the United States for about
10 years. In general, it seems that the later they came to America, the poorer
their performance was on a variety of measures of syntactic facility. Thus,
although it is not true that language learning is fastest for the youngest, it does
seem that the greatest eventual mastery of the fine points of language is
achieved by those who start very young.
Figure 12.7 shows some data from Flege, Yeni-Komshian, and Liu (1999) looking
at the performance of 240 Korean immigrants to the United States. For measures
of both foreign accent and syntactic errors, there is a steady decrease in performance
with age of arrival in the United States. The data give some suggestion
of a more rapid drop around the age of 10—which would be consistent with the
hypothesis of a critical period in language acquisition. However, age of arrival
turns out to be confounded with many other things, and one critical factor is the
relative use of Korean versus English. Based on questionnaire
data, Flege et al. rated these participants with respect to the
relative frequency with which they used English versus
Korean. Figure 12.8 displays these data and shows that there
is a steady decrease in use of English to about the point of
the critical period at which participants reported approximately
equal use of the two languages. Perhaps the decrease
in English performance reflects this difference in amount
of use. To address this question, Flege et al. created two
matched groups (subsets of the original 240) who reported
equal use of English, but one group averaged 9.7 years
when they arrived in the United States and the other group
averaged 16.2. The two groups did not differ on measures
of syntax, but the later arriving group still showed a
stronger accent. Thus, it seems that there may not be a critical
period for acquisition of syntactic knowledge but there
may be one for acquisition of phonological knowledge.
FIGURE 12.7 Mean language scores of 24 native English speakers and 240 native Korean participants as a function of age of arrival in the United States: (a) scores on a test of foreign accent (lower scores mean stronger accent) and (b) scores on tests of morphosyntax (lower scores mean more errors). (From Flege et al., 1999.)
FIGURE 12.8 Relative use of English versus Korean as a function of age of arrival in the United States. (From Flege et al., 1999.)
FIGURE 12.9 ERP patterns produced in response to grammatical anomalies in English in left and right hemispheres, for bilinguals who acquired English at ages 1–3, 4–6, or 11–13. (From Weber-Fox & Neville, 1996.)
Weber-Fox and Neville (1996) presented an interesting analysis of the effects
of age of acquisition of language processing. They compared Chinese-English
bilinguals who had learned English as a second language at different ages. One
of their tests included an ERP measurement of sensitivity to syntactic violations
in English. English monolinguals show a strong left lateralization in their
response to such violations, which is a sign of the left lateralization of language.
Figure 12.9 compares the two hemispheres in these adult bilinguals as a function
of the age at which they acquired English. Adults who had learned English
in their first years of life show strong left lateralization like those who learn
English as a first language. If their acquisition was delayed to ages between
11 and 13, they show almost no lateralization. Those who had acquired
English at an intermediate age show an intermediate amount of lateralization.
Interestingly, Weber-Fox and Neville reported no such critical period for lexical
or semantic violations. Learning English as late as 16 years of age had almost no
effect on the lateralization of their responses to semantic violations. Thus,
grammar seems to be more sensitive to a critical period.
Most studies on the effect of age of acquisition have naturally concerned
second languages. However, an interesting study of first-language acquisition
was done by Newport and Supalla (1990). They looked at the acquisition of
American Sign Language, one of the few languages that is acquired as a first
language in adolescence or adulthood. Deaf children of speaking parents are
sometimes not exposed to the sign language until late in life and consequently
acquire no language in their early years. Adults who acquire sign language
achieve a poorer ultimate mastery of it than children do.
There are age-related differences in the success with which children can acquire
a new language, with the strongest effects on phonology, intermediate
effects on syntax, and weakest effects on semantics.
Language Universals
Chomsky (1965) argued that special innate mechanisms underlie the acquisition
of language. Specifically, his claim is that the number of formal possibilities for a
natural language is so great that learning the language would simply be impossible
unless we possessed some innate information about the possible forms of
natural human languages. It is possible to prove formally that Chomsky is correct
in his claim. Although the formal analysis is beyond the scope of this book, an
analogy might help. In Chomsky’s view, the problem that child learners face is to
discover the grammar of their language when only given instances of utterances
of the language. The task can be compared to trying to find a matching sock
(language) from a huge pile of socks (set of possible languages). One can use
various features (utterances) of the sock in hand to determine whether any particular
sock in the pile is the matching one. If the pile of socks is big enough and
the socks are similar enough, this task would prove to be impossible. Likewise,
enough formally possible grammars are similar enough to one another to make it
impossible to learn every formally possible language. However, because
language learning obviously occurs, we must, according to Chomsky, have
special innate knowledge that allows us to substantially restrict the number of
possible grammars that we have to consider. In the sock analogy, it would be like
knowing ahead of time which part of the pile to inspect. So, although we cannot
learn all possible languages, we can learn a special subset of them.
Chomsky proposed the existence of language universals that limit the possible
characteristics of a natural language and a natural grammar. He assumes
that children can learn a natural language because they possess innate knowledge
of these language universals. A language that violated these universals
would simply be unlearnable, which means that there are hypothetical languages
that no humans could learn. Languages that humans can learn are referred to
as natural languages.
As already noted, we can formally prove that Chomsky’s assertion is
correct—that is, constraints on the possible forms of a natural language must
exist. However, the critical issue is whether these constraints are due to any
linguistic-specific knowledge on the part of children or whether they are simply
general cognitive constraints on learning mechanisms. Chomsky would argue
that the constraints are language specific. It is this claim that is open to serious
question. The issue is: Are the constraints on the form of natural languages
universals of language or universals of cognition?
In speaking of language universals, Chomsky is concerned with a competence
grammar. Recall that a competence analysis is concerned with an abstract
specification of what a speaker knows about a language; in contrast, a performance
analysis is concerned with the way in which a speaker uses language.
Thus, Chomsky is claiming that children possess innate constraints about the
types of phrase structures and transformations that might be found in a natural
language. Because of the abstract, nonperformance-based character of these
purported universals, one cannot simply evaluate Chomsky’s claim by observing
the details of acquisition of any particular language. Rather, the strategy is to
look for properties that are true of all languages or of the acquisition of all
languages. These universal properties would be manifestations of the language
universals that Chomsky postulates.
Although languages can be quite different from one another, some clear uniformities,
or near-uniformities, exist. For instance, as we saw earlier, virtually no
language favors the object-before-subject word order. However, as noted, this
constraint appears to have a cognitive explanation (as do many other limits on
language form).
Often, the uniformities among languages seem so natural that we do not
realize that other possibilities might exist. One such language universal is that
adjectives appear near the nouns that they modify. Thus, we translate The brave
woman hit the cruel man into French as
• La femme brave a frappé l’homme cruel
and not as
• La femme cruel a frappé l’homme brave
although a language in which the adjective beside the subject noun modified
the object noun and vice versa would be logically possible. Clearly, however,
such a language design would be absurd in regard to its cognitive demands. It
would require that listeners hold the adjective from the beginning of the sentence
until the noun at the end. No natural language has this perverse structure.
If it really needed showing, I showed with artificial languages that adult participants
were unable to learn such a language (Anderson, 1978b). Thus, many of
the universals of language seem cognitive in origin and so do not really support
Chomsky’s position. In the next subsections, we will consider some universals
that seem more language specific.
There are universal constraints on the kinds of languages that humans can learn.
The Constraints on Transformations
A set of peculiar constraints on movement transformations (refer to the subsection
on transformations on page 332) has been used to argue for the existence of
linguistic universals. One of the more extensively discussed of these constraints
is called the A-over-A constraint. Compare sentence 1 with sentence 2:
1. Which woman did John meet who knows the senator?
2. Which senator did John meet the woman who knows?
Linguists would consider sentence 1 to be acceptable but not sentence 2. Sentence
1 can be derived by a transformation from sentence 3. This transformation
moves which woman forward:
3. John met which woman who knows the senator?
4. John met the woman who knows which senator?
Sentence 2 could be derived by a similar transformation operating on which
senator in sentence 4, but apparently transformations are not allowed that move
a noun phrase such as which senator if it is embedded within another noun
phrase (in this case, which senator is part of the clause modifying the woman
and so is part of the noun phrase associated with the woman). Transformations
can move deeply embedded nouns if these nouns are not in clauses modifying
other nouns. So, for instance, sentence 5, which is acceptable, is derived transformationally
from sentence 6:
5. Which senator does Mary believe that Bill said that John likes?
6. Mary believes that Bill said that John likes which senator?
Thus, we see that the constraint on the transformation that forms which questions
is arbitrary. It can apply to any embedded noun unless that noun is part
of another noun phrase. The arbitrariness of this constraint makes it hard to
imagine how a child would ever figure it out—unless the child already knew it
as a universal of language. Certainly, the child is never explicitly told this fact
about language.
The existence of such constraints on the form of language offers a challenge
to any theory of language acquisition. The constraints are so peculiar that it is
hard to imagine how they could be learned unless a child was especially prepared
to deal with them.
There are rather arbitrary constraints on the movements that transformations
can produce.
Parameter Setting
With all this discussion about language universals, one might get the impression
that all languages are basically alike. Far from it. On many dimensions, the
languages of the world are radically different. They might have some abstract
properties in common, such as the transformational constraint discussed above,
but there are many properties on which they differ. As already
mentioned, different languages prefer different orders for subject, verb, and
object. Languages also differ in how strict they are about word order. English is
very strict, but some highly inflected languages, such as Finnish, allow people to
say their sentences with almost any word order they choose. There are languages
that do not mark verbs for tense and languages that mark verbs for the flexibility
of the object being acted on.
Another example of a difference, which has been a focus of discussion, is
that some languages, such as Italian or Spanish, are what are called pro-drop
languages: They allow one to optionally drop the pronoun when it appears in
the subject position. Thus, whereas in English we would say, I am going to the
cinema tonight, Italians can say, Vado al cinema stasera, and Spaniards, Voy al
cine esta noche—in both cases, just starting with the verb and omitting the first-person
pronoun. It has been argued that pro-drop is a parameter on which natural
languages vary, and although children cannot be born knowing whether
their language is pro-drop or not, they can be born knowing it is one way or the
other. Thus, knowledge that the pro-drop parameter exists is one of the purported
universals of natural language.
Knowledge of a parameter such as pro-drop is useful because a number of
features are determined by it. For instance, if a language is not pro-drop, it requires
what are called expletive pronouns. In English, a non-pro-drop language,
the expletive pronouns are it and there when they are used in sentences such
as It is raining or There is no money. English requires these rather semantically
empty pronouns because, by definition, a non-pro-drop language cannot have
empty slots in the subject position. Pro-drop languages such as Spanish and
Italian lack such empty pronouns because they are not needed.
Hyams (1986) argued that children starting to learn any language, including
English, will treat it as a pro-drop language and optionally drop pronouns
even though doing so may not be correct in the adult language. She noted that
young children learning English tend to omit subjects. They will also not use
expletive pronouns, even when they are part of the adult language. When
children in a non-pro-drop language start using expletive pronouns, they
simultaneously optionally stop dropping pronouns in the subject position.
Hyams argued that, at this point, they learn that their language is not a pro-drop
language. For further discussion of Hyams’s proposal and alternative
formulations, read R. Bloom (1994).
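Hyams’s account can be caricatured as a "trigger" learner: start with the parameter set to pro-drop and flip it on hearing decisive evidence, namely an expletive subject. The sketch below is only that caricature; the default setting and the choice of trigger follow her proposal, but the word lists and the one-shot flipping are invented for the example:

    EXPLETIVES = {"it", "there"}  # semantically empty subjects (English examples)

    def set_pro_drop(utterances, pro_drop=True):
        """Start with pro-drop = True; an expletive subject is decisive
        evidence that the language is not pro-drop."""
        for utt in utterances:
            words = utt.lower().split()
            if words and words[0] in EXPLETIVES:
                return False
        return pro_drop

    print(set_pro_drop(["vado al cinema stasera"]))       # True: stays pro-drop
    print(set_pro_drop(["it is raining", "I am going"]))  # False: English-like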
It is argued that much of the variability among natural languages can be accommodated
by setting 100 or so parameters, such as the pro-drop parameter,
and that a major part of learning a language is learning the setting of these
parameters (of course, there is a lot more to be learned than just this setting—
e.g., an enormous vocabulary). This theory of language acquisition is called
the parameter setting proposal. It is quite controversial, but it provides us with
one picture of what it might mean for a child to be prepared to learn a language
with innate, language-specific knowledge.
Learning the structure of language has been proposed to include learning the
setting of 100 or so parameters on which natural languages vary.
•Conclusions: The Uniqueness of Language: A Summary
Although it is clear that human language is a unique communication
system relative to those of other species, the jury is still very much out on the
issue of whether language is really a system different from other human cognitive
systems. The status of language is a major issue for cognitive psychology.
The issue will be resolved by empirical and theoretical efforts more detailed
than those reviewed in this chapter. The ideas here have served to define the
context for the investigation. The next chapter will review the current state of
our knowledge about the details of language comprehension. Careful experimental
research on such topics will finally resolve the question of the uniqueness
of language.
Questions for Thought

1. There have emerged a number of computer-based
approaches to representing meaning that are based on
having these programs read through large sets of documents
and representing the meaning of a word in terms
of what other words also occurred with it in these
documents. One interesting feature of these efforts is
that they have no knowledge of the physical world
and what these words refer to. Perhaps the most
well-known system is called Latent Semantic Analysis
(LSA—Landauer, Foltz, & Laham, 1998). The authors
of LSA describe the knowledge in their system as
“analogous to a well-read nun’s knowledge of sex, a
level of knowledge often deemed a sufficient basis for
advising the young” (p. 5). Based on this knowledge,
LSA was able to pass the vocabulary test from the Educational
Testing Service’s Test of English as a Foreign
Language. The test requires that one choose which of
four alternatives best matches the meaning of a word,
and LSA was able to do this by comparing its meaning
representation of the word (based on what documents
the word appeared in) with its meaning representation
of the alternatives (again based on the same information).
Why do you think such a program is so successful?
How would you devise a vocabulary test to expose
aspects of meaning that it does not represent? (A toy
version of the LSA idea is sketched after these questions.)
2. In addition to the pauses and speech errors discussed in
the chapter, spontaneous speech contains fillers like uh
and um in English (different languages use different
fillers). Clark and Fox Tree (2002) report that um tends
to be associated with a longer delay in speech than uh.
In terms of phrase structure, where would you expect
to see uh and um located?
3. Some languages assign grammatical genders to words
that do not have inherent genders and appear to do so
arbitrarily. So, for instance, the German word for key
is masculine and the Spanish word for key is feminine.
Boroditsky, Schmidt, and Phillips (2003) report that
when asked to describe a key, German speakers are
more likely to use words like hard and jagged, whereas
Spanish speakers are more likely to use words like
shiny and tiny. What does evidence like this say about
the relationship between language and thought?
4. When two linguistic communities come into contact,
such as in trade, they often develop simplified languages,
called pidgins, for communicating. These languages are
generally considered not full natural languages. However,
if these language communities live together, the
pidgins will evolve into full-fledged new languages
called creoles. This can happen in one generation, in
which the parents who first made contact with the new
linguistic community continue to use pidgin, whereas
their children are speaking full-fledged creoles. What
does this say about the possible role of a critical period
in language acquisition?
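For readers who want to see the gist of question 1 in miniature, here is a toy version of the LSA idea: build a term-by-document count matrix, compress it with singular value decomposition, and compare words by the similarity of their compressed document patterns. The four "documents" are invented for the example, and real LSA uses many thousands of documents plus additional weighting steps:

    import numpy as np

    docs = ["the doctor treated the patient in the hospital",
            "the nurse helped the patient at the clinic",
            "the lawyer argued the case in court",
            "the judge decided the case in court"]
    vocab = sorted({w for d in docs for w in d.split()})

    # term-by-document count matrix
    A = np.array([[d.split().count(w) for d in docs] for w in vocab], float)

    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    word_vecs = U[:, :2] * s[:2]  # keep 2 latent dimensions

    def sim(w1, w2):  # cosine similarity of two words' latent vectors
        a, b = word_vecs[vocab.index(w1)], word_vecs[vocab.index(w2)]
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    print(sim("doctor", "nurse"))  # higher: similar document contexts
    print(sim("doctor", "judge"))  # lower: different document contexts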
Key Terms
competence
grammar
language universals
linguistic determinism
linguistic intuitions
linguistics
modularity
natural languages
parameter setting
performance
phonology
phrase structure
productivity
regularity
semantics
syntax
transformation
13
Language Comprehension
A favorite device in science fiction is the computer or robot that can understand
and speak language—whether evil like HAL in 2001, or beneficent like C3PO in
Star Wars. Workers in artificial intelligence have been trying to develop computers
that understand and generate language. Progress is being made, but Stanley Kubrick
was clearly incorrect when he projected HAL for the year 2001. Language-processing
AI programs are still rudimentary compared with what is portrayed in science fiction.
An enormous amount of knowledge and intelligence underlies the successful use of
language.
This chapter will look at language use and, in particular, at language comprehension
(as distinct from language generation). This focus will enable us to look where
the light is—more is known about language comprehension than about language generation.
Language comprehension will be considered in regard to both listening and
reading. The listening process is often thought to be the more basic of the two.
However, many of the same factors apply to both listening and reading. Researchers’
choice between written or spoken material is determined by what is easier to do
experimentally. More often than not, written material is used.
We will consider a detailed analysis of the process of language comprehension,
breaking it down into three stages. The first stage involves the perceptual
processes that encode the spoken (acoustic) or written message. The second stage
is termed the parsing stage. Parsing is the process by which the words in the message
are transformed into a mental representation of the combined meaning of the
words. The third stage is the utilization stage, in which comprehenders use the
mental representation of the sentence’s meaning. If the sentence is an assertion,
listeners may simply store the meaning in memory; if it is a question, they may
answer; if it is an instruction, they may obey. However, listeners are not always so
compliant. They may use an assertion about the weather to make an inference
about the speaker’s personality, they may answer a question with a question, or
they may do just the opposite of what the speaker asks. These three stages—
perception, parsing, and utilization—are by necessity partly ordered in time; however,
they also partly overlap. Listeners can make inferences from the first part of a
sentence while they are perceiving a later part. This chapter will focus on the two
higher-level processes—parsing and utilization. (The perceptual stage was discussed
in Chapter 2.)
In this chapter, we will answer the questions:
• How are individual words combined into the meaning of phrases?
• How is syntactic and semantic information combined in sentence interpretation?
• What inferences do comprehenders make as they hear a sentence?
• How are meanings of individual sentences combined in the processing of larger units of discourse?

Comprehension consists of a perceptual stage, a parsing stage, and a utilization stage, in that order.
•Brain and Language Comprehension
Figure 12.1 highlighted the classic language-processing regions that are active
in the parsing stage that involves single sentences. However, when we consider
the utilization stage and the processing involved in larger portions of
text, we find many other regions of the brain active. Figure 13.1 illustrates
some of the regions identified by Mason and Just (2006) in discourse processing
(for a richer representation of all the areas, see Color Plate 13.1). One can
take the union of Figures 12.1 and 13.1 as something closer to the total brain
network involved in language processing. These figures make clear the fact
that language comprehension involves much of the brain and many cognitive
processes.
FIGURE 13.1 A representation of some of the brain regions involved in discourse processing: a coherence monitoring network, a text integration network, a coarse semantic processing network, and a spatial imagery network. (From Mason & Just, 2006.)
•Parsing
Constituent Structure
Language is structured according to a set of rules that tell us how to go from a
particular string of words to an interpretation of that string’s meaning. For
instance, in English we know that if we hear a sequence of the form A noun
action a noun, the speaker means that an instance of the first noun performed
the action on an instance of the second noun. In contrast, if the sentence is of
the form A noun was action by a noun, the speaker means that an instance of the
second noun performed the action on an instance of the first noun. Thus, our
knowledge of the structure of English allows us to grasp the difference between
A doctor shot a lawyer and A doctor was shot by a lawyer.
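As a caricature of such pattern-based rules, the sketch below maps the two sentence frames just described onto agent-action-patient interpretations. It is a deliberately crude template matcher, not how comprehension actually works (real parsing operates over phrase structures), and the restriction to regular -ed verbs and indefinite articles is an assumption of the example:

    import re

    ACTIVE = re.compile(r"a (\w+) (\w+)ed a (\w+)", re.I)          # A noun actioned a noun
    PASSIVE = re.compile(r"a (\w+) was (\w+)ed by a (\w+)", re.I)  # A noun was actioned by a noun

    def interpret(sentence):
        m = PASSIVE.match(sentence)
        if m:                        # second noun is the agent
            patient, action, agent = m.groups()
            return (agent, action, patient)
        m = ACTIVE.match(sentence)
        if m:                        # first noun is the agent
            agent, action, patient = m.groups()
            return (agent, action, patient)
        return None                  # frame not recognized

    print(interpret("A doctor kicked a lawyer"))         # ('doctor', 'kick', 'lawyer')
    print(interpret("A doctor was kicked by a lawyer"))  # ('lawyer', 'kick', 'doctor')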
In learning to comprehend a language, we acquire a great many rules that
encode the various linguistic patterns in language and relate these patterns to
meaningful interpretations. However, we cannot possibly learn rules for every
possible sentence pattern—sentences can be very long and complex. A very
large (probably infinite) number of patterns would be required to encode all
possible sentence forms. Although we have not learned to interpret all possible
full-sentence patterns, we have learned to interpret subpatterns, or phrases, of
these sentences and to combine, or concatenate, the interpretations of these
subpatterns. These subpatterns correspond to basic phrases, or units, in a sentence’s
structure. These phrase units are also referred to as constituents. From
the late 1950s to the early 1980s, a series of studies were performed that established
the psychological reality of phrase structure (or constituent structure) in
language processing. Chapter 12 reviewed some of the research documenting
the importance of phrase structure in language generation. Here, we review
some of the evidence for the psychological reality of this constituent structure
in comprehension.
We might expect that the more clearly identifiable the constituent structure
of a sentence is, the more easily the sentence can be understood. Graf and
Torrey (1966) presented sentences to participants a line at a time. The passages
could be presented in form A, in which each line corresponded to a major constituent
boundary, or in form B, in which there was no such correspondence.
Examples of the two types of passages follow:
Form A:
During World War II
even fantastic schemes
received consideration
if they gave promise
of shortening the conflict.

Form B:
During World War
II even fantastic
schemes received
consideration if they gave
promise of shortening the conflict.
Participants showed better comprehension of passages in form A. This finding
demonstrates that the identification of constituent structure is important to the
parsing of a sentence.
When people read such passages, they naturally pause at boundaries between
clauses. Aaronson and Scarborough (1977) asked participants to read sentences
displayed word by word on a computer screen. Participants would press a key
each time they wanted to read another word. Figure 13.2 illustrates the pattern of
reading times for a sentence that participants were reading for later recall. Notice
the U-shaped patterns with prolonged pauses at the phrase boundaries. With the
completion of each major phrase, participants seemed to need time to process it.

FIGURE 13.2 Word-by-word reading times for a sample sentence, with prolonged pauses at the breaks between phrase structures. (Adapted from Aaronson & Scarborough, 1977.)
After one has processed the words in a phrase in order to understand it,
there is no need to make further reference to these exact words. Thus, we might
predict that people would have poor memory for the exact wording of a constituent
after it has been parsed and the parsing of another constituent has
begun. The results of an experiment by Jarvella (1971) confirm this prediction.
He read to participants passages with interruptions at various points. At each
interruption, participants were instructed to write down as much of the passage
as they could remember. Of interest were passages that ended with 13-word
sentences such as the following one:
1 2 3 4 5 6
Having failed to disprove the charges,
7 8 9 10 11 12 13
Taylor was later fired by the president.
After hearing the last word, participants were prompted with the first word of
the sentence and asked to recall the remaining words. Each sentence was
composed of a 6-word subordinate clause followed by a 7-word main clause.
Figure 13.3 plots the probability of recall for each of the remaining 12 words
in the sentence (excluding the first, which was used as a prompt). Note the
sharp rise in the function at word 7, the beginning of the main clause. These
data show that participants have best memory for the last major constituent, a
result consistent with the hypothesis that they retain a verbatim representation
of the last constituent only.

FIGURE 13.3 Probability of recalling a word as a function of its position in the last 13 words in a passage. (Adapted from Jarvella, 1971.)
An experiment by Caplan (1972) also presents evidence for the use of constituent
structure, but this study used a reaction-time methodology. Participants
were presented aurally first with a sentence and then with a probe word; they
then had to indicate as quickly as possible whether the probe word was in the
sentence. Caplan contrasted pairs of sentences such as the following pair:
1. Now that artists are working fewer hours oil prints are rare.
2. Now that artists are working in oil prints are rare.
Interest focused on how quickly participants would recognize oil in these two
sentences when probed at the ends of the sentences. The sentences were cleverly
constructed so that, in both sentences, the word oil was fourth from the end
and was followed by the same words. In fact, by splicing tape, Caplan arranged
the presentation so that participants heard the same recording of these last four
words whichever full sentence they heard. However, in sentence 1, oil is part of
the last constituent, oil prints are rare, whereas, in sentence 2, it is part of the
first constituent, now that artists are working in oil. Caplan predicted that participants
would recognize oil more quickly in sentence 1 because they would
still have active in memory a representation of this constituent. As he predicted,
the probe word was recognized more rapidly if it was in the last constituent.
Participants process the meaning of a sentence one phrase at a time and
maintain access to a phrase only while processing its meaning.
Immediacy of Interpretation
An important principle to emerge in more recent studies of language processing
is called the principle of immediacy of interpretation. Basically, this principle
says that people try to extract meaning out of each word as it arrives and
FIGURE 13.3 Probability of recalling a word (proportion correct recall) as a function of its position in the last 13 words in a passage. (Adapted from Jarvella, 1971.)
do not wait until the end of a sentence or even the end of a phrase to decide
how to interpret a word. For instance, Just and Carpenter (1980) studied the
eye movements of participants as they read a sentence. While reading a sentence,
participants will typically fixate on almost every word. Just and Carpenter
found that the time spent fixating on a word is proportional to the amount of
information provided by the word. Thus, if a sentence contains an unfamiliar
or a surprising word, participants pause on that word. They also pause
longer at the end of the phrase containing that word. Figure 13.4 illustrates
the eye fixations of one of their college students reading a scientific passage.
The circles are above the words the student fixated on, and in each circle is the
duration of that fixation. The order of the gazes is left to right except for
the three gazes above engine contains, where the order of gazes is indicated.
Note that unimportant function words such as the and to may be skipped or,
if not skipped, receive relatively little processing. Note the
amount of time spent on the word flywheel. The participant
did not wait until the end of the sentence to think
about this word. Again, look at the amount of time spent
on the highly informative adjective mechanical—the participant
did not wait until the end of the noun phrase to
think about it.
Eye movements have also been used to study the comprehension
of spoken language. In one of these studies
(Allopenna, Magnuson, & Tanenhaus, 1998), participants
were shown computer displays of objects like that in Figure
13.5 and processed instructions such as
Pick up the beaker and put it below the diamond.
Participants would perform this action by selecting the
object with a mouse and moving it, but the experiment
was done to study their eye movements that preceded any
FIGURE 13.4 The time spent by a college reader on the words in the opening two sentences of a technical article about flywheels: "Flywheels are one of the oldest mechanical devices known to man. Every internal-combustion engine contains a small flywheel that converts the jerky motion of the pistons into the smooth flow of energy that powers the drive shaft." The times, indicated above each fixated word, are expressed in milliseconds. This reader read the sentences from left to right, with one regressive fixation to an earlier part. (Adapted from Just & Carpenter, 1980.)
FIGURE 13.5 An example of a computer display used in the study of Allopenna et al. (1998).
mouse action. Figure 13.6 shows the probabilities
that participants fixate on various items in the
display as a function of time since the beginning of
the articulation of “beaker.” It can be seen that participants
are beginning to look to the two items that
start with the same sound (“beaker” and “beetle”)
even before the articulation of the word finishes.
It takes about 400 msec to say the word. Almost
immediately upon offset of the word, their fixations
on the wrong item (“beetle”) decrease and their
fixations on the correct item (“beaker”) shoot up.
Given that it takes about 200 msec to program an
eye movement, this study provides evidence that
participants are processing the meaning of a word
even before it completes.
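One way to picture this result is as a cohort of candidate words that shrinks as each successive sound of "beaker" arrives. The following sketch is only an illustration, not Allopenna et al.'s model: it treats letters as stand-ins for phonemes and simply splits fixation probability evenly among the candidates still consistent with the input.

```python
# Illustrative cohort narrowing (letters stand in for phonemes).
CANDIDATES = ["beaker", "beetle", "carriage"]

def cohort(prefix, candidates=CANDIDATES):
    """Return the words still consistent with the speech heard so far."""
    return [w for w in candidates if w.startswith(prefix)]

# Simulate hearing "beaker" one segment at a time; assume (purely for
# illustration) that fixation probability is split evenly among the
# still-active candidates.
for t in range(1, len("beaker") + 1):
    heard = "beaker"[:t]
    active = cohort(heard)
    print(f"heard {heard!r:<9} active={active} p(each)={1 / len(active):.2f}")
```

Until the third segment, "beaker" and "beetle" are both in the cohort, which is why looks to the beetle rise before the word is over; once the input diverges, only the referent remains.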
This immediacy of processing implies that we
will begin to interpret a sentence even before we encounter
the main verb. Sometimes we are aware of
wondering what the verb will be as we hear the sentence. We are likely to experience
something like this in constructions that put the verb last. Consider what
happens as we process the following sentence:
• It was the most expensive car that the CEO bought.
Before we get to bought, we already have some idea of what might be happening
between the CEO and the car. Although this sentence structure with the verb at
the end is unusual for English, it is not unusual for languages such as German.
Listeners of these languages do develop strong expectations about the sentence
before seeing the verb (see Clifton & Duffy, 2001, for a review).
If people process a sentence as each word comes in, why is there so much
evidence for the importance of phrase-structure boundaries? The evidence reflects
the fact that the meaning of a sentence is defined in terms of the phrase
structure, and, even if listeners try to extract all they can from each word, they
will be able to put some things into place only when they reach the end of a
phrase. Thus, people often need extra time at a phrase boundary to complete
this processing. People have to maintain a representation of the current phrase
in memory because their interpretation of it may be wrong, and they may have
to reinterpret the beginning of the phrase. Just and Carpenter (1980) in their
study of reading times found that participants tend to spend extra time at the
end of each phrase in wrapping up the meaning conveyed by that phrase.
In processing a sentence, we try to extract as much information as possible from
each word and spend some additional wrap-up time at the end of each phrase.
The Processing of Syntactic Structure
The basic task in parsing a sentence is to combine the meanings of the individual
words to arrive at a meaning for the overall sentence. There are two basic
FIGURE 13.6 Probability of fixating each item in the display (the referent, e.g., "beaker"; its cohort competitor, e.g., "beetle"; and an unrelated item, e.g., "carriage") as a function of time from the onset of the critical word beaker. The average target offset is marked. (From Allopenna et al., 1998.)
sources of syntactic information that can guide us in this task. One source is
word order and the other is inflectional structure. The following two sentences,
although they have identical words, have very different meanings:
1. The dog bit the cat.
2. The cat bit the dog.
The dominant syntactic cue in English is word order. Other languages rely less
on word order and instead use inflections of words to indicate semantic role.
There is a small remnant of such an inflectional system in some English pronouns.
For instance, he and him, I and me, and so on, signal subject versus
object. McDonald (1984) compared English with German, which has a richer
inflectional system. She asked her English participants to interpret sentences
such as
3. Him kicked the girl.
4. The girl kicked he.
The word-order cue in these sentences suggests one interpretation, whereas the
inflection cue suggests an alternative interpretation. English speakers use the
word-order cue, interpreting sentence 3 with him as the subject and the girl
as the object. German speakers, judging comparable sentences in German, do
just the opposite. Bilingual speakers of both German and English tend to interpret
the English sentences more like German sentences; that is, they assign him
in sentence 3 to the object role and girl to the subject role.
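Such cue competition is often described with weighted cues, as in Bates and MacWhinney's Competition Model. The sketch below is a hypothetical illustration of that idea; the weights and function are invented for this example, not McDonald's fitted values.

```python
# Invented cue weights: English leans on word order, German on inflection.
WEIGHTS = {
    "English": {"word_order": 0.9, "inflection": 0.3},
    "German":  {"word_order": 0.3, "inflection": 0.9},
}

def choose_subject(language, order_votes, case_votes):
    """Combine weighted cue 'votes' (candidate -> 1 if the cue favors it)."""
    w = WEIGHTS[language]
    candidates = set(order_votes) | set(case_votes)
    scores = {c: w["word_order"] * order_votes.get(c, 0)
                 + w["inflection"] * case_votes.get(c, 0)
              for c in candidates}
    return max(scores, key=scores.get)

# "Him kicked the girl": first position favors 'him' as the subject, but
# the accusative form of 'him' favors 'the girl'.
for lang in ("English", "German"):
    subject = choose_subject(lang, {"him": 1}, {"the girl": 1})
    print(f"A {lang} listener takes {subject!r} as the subject")
```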
An interesting case of combining word order and inflection in English
requires the use of relative clauses. Consider the following sentence:
5. The boy the girl liked was sick.
This sentence is an example of a center-embedded sentence: One clause, the
girl liked (the boy), is embedded in another clause, The boy was sick. As we will
see, there is evidence that people have difficulty with such clauses, perhaps in
part because the beginning of the sentence is ambiguous. For instance, the sentence
could have concluded as follows:
6. The boy the girl and the dog were sick.
To prevent such ambiguity, English offers relative pronouns, which are effectively
like inflections, to indicate the role of the upcoming words:
7. The boy whom the girl liked was sick.
Sentences 5 and 7 are equivalent except that in sentence 5 whom is deleted,
which indicates that the upcoming words are part of an embedded clause.
One might expect that it is easier to process sentences if they have relative
pronouns to signal the embedding of clauses. Hakes and Foss (1970; Hakes,
1972) tested this prediction by using the phoneme-monitoring task. They used
double-embedded sentences such as
8. The zebra which the lion that the gorilla chased killed was running.
9. The zebra the lion the gorilla chased killed was running.
The only difference between sentences 8 and 9 is whether there are relative
pronouns. Participants were required to perform two simultaneous tasks. One
task was to comprehend and paraphrase the sentence. The second task was
to listen for a particular phoneme—in this case a /g/ (in gorilla). Hakes and
Foss predicted that the more difficult a sentence was to comprehend, the more
time participants would take to detect the target phoneme, because they would
have less attention left over from the comprehension task with which to perform
the monitoring. In fact, the prediction was confirmed; participants did take
longer to indicate hearing /g/ when presented with sentences such as sentence 9,
which lacked relative pronouns.
Although the use of relative pronouns facilitates the processing of such sentences,
there is evidence that center-embedded sentences are quite difficult even
with the relative pronouns. In one experiment, Caplan, Alpert, Waters, and
Olivieri (2000) compared center-embedded sentences such as
10. The juice that the child enjoyed stained the rug.
with comparable sentences that are not center-embedded such as
11. The child enjoyed the juice that stained the rug.
They used PET brain-imaging measures to detect processing differences and
found greater activation in Broca’s area with center-embedded sentences. Broca’s
area is usually found to be more active when participants have to deal with more
complex sentence structures (Martin, 2003).
People use the syntactic cues of word order and inflection to help interpret
a sentence.
Semantic Considerations
People use syntactic patterns, such as those illustrated in the preceding subsection,
for understanding sentences, but they can also make use of the meanings
of the words themselves. A person can determine the meaning of a string of
words simply by considering how they can be put together so as to make sense.
Thus, when Tarzan says, Jane fruit eat, we know what he means even though
this sentence does not correspond to the syntax of English. We realize that a
relation is being asserted between someone capable of eating and something
edible.
Considerable evidence suggests that people use such semantic strategies in
language comprehension. Strohner and Nelson (1974) had 2- and 3-year-old
children use animal dolls to act out the following two sentences:
• The cat chased the mouse.
• The mouse chased the cat.
In both cases, the children interpreted the sentence to mean that the cat chased
the mouse, a meaning that corresponded to their prior knowledge about cats
and mice. Thus, these young children were relying more heavily on semantic
patterns than on syntactic patterns.
Fillenbaum (1971, 1974) had adults paraphrase sentences, among which
were “perverse” items such as • John was buried and died.
More than 60% of the participants paraphrased the sentences in a way that gave
them a more conventional meaning; for example, that John died first and then
was buried. However, the normal syntactic interpretation of such constructions
would be that the first activity occurred before the second, as in
• John had a drink and went to the party.
in contrast with
• John went to the party and had a drink.
So, when a semantic principle is placed in conflict with a syntactic principle, the
semantic principle will sometimes (but not always) determine the interpretation
of the sentence. If you have any doubt about the power of semantics to
dominate syntax, consider the following sentence:
No head injury is too trivial to be ignored.
If you interpreted this sentence to mean that no head injury should be ignored,
you are in the vast majority (Wason & Reich, 1979). However, a careful inspection
of the syntax will indicate that the “correct” meaning is that all head
injuries should be ignored—consider “No missile is too small to be banned”—
which means all missiles should be banned.
Sometimes people rely on the plausible semantic interpretation of words in
a sentence.
The Integration of Syntax and Semantics
Listeners appear to combine both syntactic and semantic information in comprehending
a sentence. Tyler and Marslen-Wilson (1977) asked participants to
try to continue fragments such as
1. If you walk too near the runway, landing planes are
2. If you’ve been trained as a pilot, landing planes are
The phrase landing planes, by itself, is ambiguous. It can mean either “planes
that are landing” or “to land planes.”However, when followed by the plural verb
are, the phrase must have the first meaning. Thus, the syntactic constraints
determine a meaning for the ambiguous phrase. The prior context in fragment 1
is consistent with this meaning, whereas the prior context in fragment 2 is not.
Participants took less time to continue fragment 1, which suggests that they
were using both the semantics of the prior context and the syntax of the current
phrase to disambiguate landing planes. When these factors are in conflict,
the participant’s comprehension is slowed.1
1 The original Tyler and Marslen-Wilson experiment drew methodological criticisms from Townsend and
Bever (1982) and Cowart (1983). For a response, read Marslen-Wilson and Tyler (1987).
Bates, McNew, MacWhinney, Devescovi, and Smith (1982) looked at the
matter of combining syntax and semantics in a different paradigm. They had
participants interpret word strings such as
• Chased the dog the eraser
If you were forced to, what meaning would you assign to this word string? The
syntactic fact that objects follow verbs seems to imply that the dog was being
chased and the eraser did the chasing. The semantics, however, suggest the opposite.
In fact, American speakers prefer to go with the syntax but will sometimes
adopt the semantic interpretation—that is, most say The eraser chased the
dog, but some say The dog chased the eraser. On the other hand, if the word
string is
• Chased the eraser the dog
listeners agree on the interpretation—that is, that the dog chased the eraser.
Another interesting part of the study by Bates et al. compared Americans
with Italians. When syntactic cues were put in conflict with semantic cues,
Italians tended to go with the semantic cues, whereas Americans preferred the
syntactic cues. The most critical case concerned sentences such as
• The eraser bites the dog
or its Italian translation:
• La gomma morde il cane
Americans almost always followed the syntax and interpreted this sentence to
mean that the eraser is doing the biting. In contrast, Italians preferred to use the
semantics and interpret that the dog is doing the biting. Like English, however,
Italian has a subject-verb-object syntax.
Thus, we see that listeners combine both syntactic and semantic cues in
interpreting the sentence. Moreover, the weighting of these two types of cues
can vary from language to language. This evidence and other results indicate
that speakers of Italian weight semantic cues more heavily than do speakers of
English.
People integrate semantic and syntactic cues to arrive at an interpretation
of a sentence.
Neural Indicants of Syntactic and Semantic Processing
Researchers have found two indicants of sentence processing in event related
potentials (ERPs) recorded from the brain. First, there is the N400, which is
an indicant of difficulty in semantic processing. It was originally identified as
a response to semantic anomaly, although it is more general than that. Kutas
and Hillyard (1980a, 1980b) discovered the N400 in their original experiments
when participants heard semantically anomalous sentences such as “He spread
the warm bread with socks.” About 400 ms after the anomalous word (socks),
ERP recordings showed a large negative amplitude shift. Second, there is the
P600, which occurs in response to syntactic violations. For instance, Osterhout
and Holcomb (1992) presented their participants with sentences such as “The
broker persuaded to sell the stock” and found a positive wave at about 600 ms
after the word to, which was the point at which there was a violation of the
syntax. Of particular interest in this context is the relation between the N400
and the P600.
Ainsworth-Darnell, Shulman, and Boland (1998) studied how these two
effects combined when participants heard sentences such as
Control: Jill entrusted the recipe to friends before she suddenly disappeared.
Syntactic anomaly: Jill entrusted the recipe friends before she suddenly
disappeared.
Semantic anomaly: Jill entrusted the recipe to platforms before she suddenly
disappeared.
Double anomaly: Jill entrusted the recipe platforms before she suddenly
disappeared.
The last sentence combines a semantic and a syntactic anomaly. Figure 13.7 contrasts
the ERP waveforms obtained from midline and parietal sites in response to
the various types of sentences. An arrow in the ERPs points to the onset of the
FIGURE 13.7 ERP recordings from (a) central (Cz) and (b) parietal (Pz) sites for the control (CTRL), syntactic-anomaly (SYN), semantic-anomaly (SEM), and double-anomaly (Both) sentences; the N400 and P600 components are marked. The arrows point to the onset of the critical word. (From Ainsworth-Darnell, Shulman, & Boland, 1998.)
critical word ( friends or platforms). The two types of sentences containing a
semantic anomaly evoked a negative shift (N400) at the midline site about 400 ms
after the critical word. In contrast, the two types of sentences containing a syntactic
anomaly were associated with a positive shift (P600) in the parietal area about
600 ms after the onset of the critical word. Ainsworth-Darnell et al. used the fact that each
process—syntactic and semantic—affects a different brain region to argue that
the syntactic and semantic processes are separable.
ERP recordings indicate that syntactic and semantic violations elicit different
responses in different locations in the brain.
Ambiguity
Many sentences can be interpreted in two or more ways because of either
ambiguous words or ambiguous syntactic constructions. Examples of such
sentences are
• John went to the bank.
• Flying planes can be dangerous.
It is also useful to distinguish between transient ambiguity and permanent ambiguity.
The preceding examples are permanently ambiguous. That is, the ambiguity
remains to the end of the sentence. Transient ambiguity refers to ambiguity
in a sentence that is resolved by the end of the sentence; for example, consider
hearing a sentence that begins as follows:
• The old train . . .
At this point, whether old is a noun or an adjective is ambiguous. If the sentence
continues as follows,
• . . . left the station.
then old is an adjective modifying train. On the other hand, if the sentence continues
as follows,
• . . . the young.
then old is the subject of the sentence and train is a verb. This is an example of
transient ambiguity—an ambiguity in the middle of a sentence for which the
resolution depends on how the sentence ends.
Transient ambiguity is quite prevalent in language, and it leads to a serious
interaction with the principle of immediacy of processing described earlier.
Immediacy of processing implies that we commit to an interpretation of a word
or a phrase right away, but transient ambiguity implies that we cannot always
know the correct interpretation immediately. Consider the following sentence:
• The horse raced past the barn fell.
Most people do a double take on this sentence: they first read one interpretation
and then a second. Such sentences are called garden-path sentences because we
are “led down the garden path” and commit to one interpretation at a certain
point only to discover that it is wrong at another point. For instance, in the
preceding sentence, most readers interpret raced as the main verb of the sentence.
The existence of such garden-path sentences is considered to be one of the
important pieces of evidence for the principle of immediacy of interpretation.
People could postpone interpreting such sentences at points of ambiguity until
the ambiguity is resolved, but they do not.
When one comes upon a point of syntactic ambiguity in a sentence, what
determines its interpretation? A powerful principle is the principle of minimal
attachment. This principle basically says that one interprets a sentence in a way
that causes minimal complication of its phrase structure. Because all sentences
must have a main verb, the simple interpretation would be to include raced in the
main sentence rather than creating a relative clause to modify the noun horse.
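To make the commit-then-reanalyze behavior concrete, here is a toy sketch, hard-coded to this one sentence rather than a general parser, of an immediate minimal-attachment reading that breaks down at fell:

```python
# Toy garden-path walk-through (not a real parser).
def parse_incrementally(words):
    interpretation = None
    for i, w in enumerate(words, start=1):
        if w == "raced" and interpretation is None:
            # Minimal attachment: take 'raced' as the main verb right away.
            interpretation = "main verb: the horse raced past the barn"
        elif w == "fell" and interpretation and "main verb" in interpretation:
            # 'fell' has no role under the main-verb reading, so back up
            # and reanalyze 'raced' as a reduced relative clause.
            print(f"word {i} ({w!r}): reanalysis forced")
            interpretation = ("reduced relative: the horse "
                              "[that was] raced past the barn fell")
    return interpretation

print(parse_incrementally("the horse raced past the barn fell".split()))
```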
Many times we are not aware of the ambiguities that exist in sentences. For
instance, consider the following sentence:
• The woman painted by the artist fell.
As we will see, people seem to have difficulty with this sentence (temporarily
interpreting the woman as the one doing the painting), just like the earlier horse
raced sentence. However, people tend not to be aware of taking a garden path in
the way that they are with the horse raced sentence.
Why are we aware of a reinterpretation in some sentences, such as the horse
raced example, but not in others, such as the woman painted example? If a
syntactic ambiguity is resolved quickly after we encounter it, we seem to be
unaware of ever considering two interpretations. Only if resolution is postponed
substantially beyond the ambiguous phrase are we aware of the need to
reinterpret it (Ferreira & Henderson, 1991). Thus, in the woman painted example,
the ambiguity is resolved immediately after the verb painted, and thus most
people are not aware of the ambiguity. In contrast, in the horse raced example,
the sentence seems to successfully complete as The horse raced past the barn
only to have this interpretation contradicted by the last word fell.
When people come to a point of ambiguity in a sentence, they adopt one
interpretation, which they will have to retract if it is later contradicted.
Neural Indicants of the Processing of Transient Ambiguity
Brain-imaging studies reveal a good deal about how people process ambiguous
sentences. In one study, Mason, Just, Keller, and Carpenter (2003) compared
three kinds of sentences:
Unambiguous: The experienced soldiers spoke about the dangers of the
midnight raid.
Ambiguous preferred: The experienced soldiers warned about the dangers
before the midnight raid.
Ambiguous unpreferred: The experienced soldiers warned about the dangers
conducted the midnight raid.
The verb spoke in the first sentence is unambiguous, but the verb warned in the
last two sentences has a transient ambiguity of just the sort described in the preceding
subsection: Until the end of the sentence, one cannot know whether the
soldiers are doing the warning or are being warned.
As noted, participants prefer the first interpretation.
Mason et al. collected fMRI measures of activation
in Broca’s area as participants read the sentences.
These data are plotted in Figure 13.8 as a function of
time since the onset of the sentences (which lasted
approximately 6–7 s). As is typical of fMRI measures,
the differences among conditions show up only
after the processing of the sentences, corresponding
to the lag in the hemodynamic response. As can be
seen, the unambiguous sentence results in the least
activation, owing to the greater ease in processing
that sentence. However, in comparing the two ambiguous
sentences, we see that activation is greater
for the sentence that ends in the unpreferred way.
FMRI measures such as those in Figure 13.8 can
localize areas in the brain in which processing is taking
place, in this case confirming the critical role of
Broca’s area in the processing of sentence structure. However, these measures
do not identify the fine-grained temporal structure of the processing. An ERP
study by Frisch, Schlesewsky, Saddy, and Alpermann (2002) investigated the
temporal aspect of how people deal with ambiguity. Their study was with German
speakers and took advantage of the fact that some German nouns are ambiguous
in their role assignment. They looked at German sentences that begin
with either of two different nouns and end with a verb. In the following examples,
each German sentence is followed by a word-by-word translation and then
the equivalent English sentence:
1. Die Frau hatte den Mann gesehen.
The woman had the man seen
The woman had seen the man.
2. Die Frau hatte der Mann gesehen.
The woman had the man seen
The man had seen the woman.
3. Den Mann hatte die Frau gesehen.
The man had the woman seen
The woman had seen the man.
4. Der Mann hatte die Frau gesehen.
The man had the woman seen
The man had seen the woman.
Note that, when participants read Die Frau at the beginning of sentences 1 and 2,
they do not know whether the woman is the subject or the object of the sentence.
Only when they read den Mann in sentence 1 can they infer that man is an
object (because of the determiner den) and hence that woman must be the subject.
Similarly, der Mann in sentence 2 indicates that man is the subject and
FIGURE 13.8 The average activation change in Broca's area (percent change from fixation) for the three types of sentences (unambiguous, ambiguous preferred, ambiguous unpreferred) as a function of time from the beginning of the sentence. (From Mason et al., 2003.)
therefore woman must be the object. Sentences 3 and 4, because they begin with
Mann and its inflected article, do not have this transient ambiguity. The difference
in when one can interpret these sentences depends on the fact that the masculine
article is inflected for case in German but the feminine article is not.
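The determiner cues at issue can be summarized in a few lines. The sketch below covers only the forms appearing in sentences 1 through 4; the case facts are standard German, but the code itself is just an illustration:

```python
# Role signaled by the determiner of a German noun phrase (masculine
# articles mark case; the feminine article does not).
ROLE_BY_DETERMINER = {"der": "subject", "den": "object", "die": "ambiguous"}

def role_of(noun_phrase):
    determiner = noun_phrase.split()[0].lower()
    return ROLE_BY_DETERMINER.get(determiner, "unknown")

for np in ("Die Frau", "der Mann", "den Mann"):
    print(np, "->", role_of(np))
# 'Die Frau' stays ambiguous until a case-marked masculine phrase arrives.
```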
Frisch et al. used the P600 (already described with respect to Figure 13.7) to
investigate the syntactic processing of these sentences. They found that the
ambiguous first noun in sentences 1 and 2 was followed by a stronger P600
than were the unambiguous sentences 3 and 4. The contrast between sentences
1 and 2 also is interesting. Although German allows for either subject-object or
object-subject ordering, the subject-object structure in sentence 1 is preferred. For
the unpreferred sentence (2), Frisch et al. found that the second noun was followed
by a greater P600. Thus, when participants reach a transient ambiguity, as in
sentences 1 and 2, they seem to immediately have to work harder to deal with the
ambiguity. They commit to the preferred interpretation and have to do further
work when they learn that it is not the correct interpretation, as in sentence 2.
Activity in Broca’s area increases when participants encounter a transient
ambiguity and when they have to change an initial interpretation of a
sentence.
Lexical Ambiguity
The preceding discussion was concerned with how participants deal with syntactic
ambiguity. In lexical ambiguity, where a single word has two meanings,
there is often no structural difference in the two interpretations of a sentence.
A series of experiments beginning with Swinney (1979) helped to reveal how
people determine the meaning of ambiguous words. Swinney asked participants
to listen to sentences such as
• The man was not surprised when he found several spiders, roaches, and
other bugs in the corner of the room.
Swinney was concerned with the ambiguous word bugs (meaning either insects or
electronic listening devices). Just after hearing the word, participants would be
presented with a string of letters on the screen, and their task was to judge whether
that string made a correct word. Thus, if they saw ant, they would say yes; but
if they saw ont, they would say no. This is the lexical-decision task described
in Chapter 6 in relation to the mechanisms of spreading activation. Swinney was
interested in how the word bugs in the passage would prime the lexical judgment.
The critical contrasts involved the relative times to judge spy, ant, or sew,
following bugs. The word ant is related to the primed meaning of bugs, whereas
spy is related to the unprimed meaning. The word sew defines a neutral control
condition. Swinney found that recognition of either spy or ant was facilitated if
that word was presented within 400 ms of the prime, bugs. Thus, the presentation
of bugs immediately activates both of its meanings and their associations.
If Swinney waited more than 700 ms, however, only the related word ant was
facilitated. It appears that a correct meaning is selected in this time and the
other meaning becomes deactivated. Thus, two meanings of an ambiguous
word are momentarily active, but context operates very rapidly to select the appropriate
meaning.
When an ambiguous word is presented, participants select a particular
meaning within 700 ms.
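The time course of Swinney's result can be sketched as a simple rule: within roughly 400 ms of the prime, probes related to either meaning are facilitated; beyond about 700 ms, only the contextually selected meaning is. The values below are illustrative, not Swinney's data.

```python
# Illustrative pattern of cross-modal priming after hearing 'bugs'
# in an insect-supporting context.
RELATED_MEANING = {"ant": "insect", "spy": "device", "sew": None}  # sew = control

def facilitated(probe, ms_after_bugs):
    meaning = RELATED_MEANING[probe]
    if meaning is None:
        return False                   # neutral control word: never primed
    if ms_after_bugs <= 400:
        return True                    # both meanings still active
    return meaning == "insect"         # context has selected the insect sense

for delay in (300, 800):
    print(delay, {p: facilitated(p, delay) for p in ("ant", "spy", "sew")})
```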
Modularity Compared with Interactive Processing
There are two bases by which people can disambiguate ambiguous sentences.
One possibility is the use of semantics, which is the basis for disambiguating
the word bugs in the sentence given in the preceding subsection. The other possibility
is the use of syntax. Advocates of the language-modularity position (see
Chapter 12) have argued that there is an initial phase in which we merely
process syntax, and only later do we bring semantic factors to bear. Thus, initially
only syntax is available for disambiguation, because syntax is part of a
language-specific module that can operate quickly by itself. In contrast, to bring
semantics to bear requires using all of one’s world knowledge, which goes far
beyond anything that is language specific. Opposing the modularity position is
that of interactive processing, the proponents of which argue that syntax and
semantics are combined at all levels of processing.
Much of the debate between these two positions has concerned the processing
of transient syntactic ambiguity. In the initial study of what has become a
long series of studies, Ferreira and Clifton (1986) asked participants to read
sentences such as
1. The woman painted by the artist was very attractive to look at.
2. The woman that was painted by the artist was very attractive to look at.
3. The sign painted by the artist was very attractive to look at.
4. The sign that was painted by the artist was very attractive to look at.
Sentences 1 and 3 are called reduced relatives because the relative pronoun that is
missing. There is no local syntactic basis for deciding whether the noun-verb
combination is a relative clause construction or an agent-action combination.
Ferreira and Clifton argued that, because of the principle of minimal attachment,
people have a natural tendency to encode noun-verb combinations such as The
woman painted as agent-action combinations. Evidence for this tendency is that
participants take longer to read by the artist in the first sentence than in the
second. The reason is that they discover that their agent-action interpretation is
wrong in the first sentence and have to recover, whereas the syntactic cue that was
in the second sentence prevents them from ever making this misinterpretation.
The real interest in the Ferreira and Clifton experiments is in sentences 3
and 4. Semantic factors should rule out the agent-action interpretation of
sentence 3, because a sign cannot be an animate agent and engage in painting.
Nonetheless, participants took just as long to read by the artist in sentence 3 as
in sentence 1 and longer than in unambiguous sentences 2 or 4. Thus, argued
Ferreira and Clifton, participants first use only syntactic factors and so misinterpret
the phrase The sign painted and then use the syntactic cues in the phrase
by the artist to correct that misinterpretation. Thus, although semantic factors
could have done the job and prevented the misinterpretation, participants
seemingly do all their initial processing by using syntactic cues.
Experiments of this sort have been used to argue for the modularity of language.
The argument is that our initial processing of language makes use of
something specific to language—namely, syntax—and ignores other general,
nonlinguistic knowledge that we have of the world, for example, that signs cannot
paint. However, Trueswell, Tanenhaus, and Garnsey (1994) argued that
many of the sentences in the Ferreira and Clifton study were not like sentence 3.
Specifically, although the sentences were supposed to have a semantic basis for
disambiguation, many did not. For instance, among the Ferreira and Clifton
sentences were sentences such as
5. The car towed from the parking lot was parked illegally.
Here car towed was supposed to be unambiguous, but it is possible for car to be
the subject of towed as in
6. The car towed the smaller car from the parking lot.
When Trueswell et al. used sentences that avoided these problems, they found
that participants did not have any difficulty with the sentences. For instance,
participants showed no more difficulty with
7. The evidence examined by the lawyer turned out to be unreliable.
than with
8. The evidence that was examined by the lawyer turned out to be unreliable.
Thus, people do seem to be able to select the correct interpretation when it is not
semantically possible to interpret the noun (evidence) as an agent of the verb.
Thus, the initial syntactic decisions are not made without reference to semantic
factors.
Additionally, McRae, Spivey-Knowlton, and Tanenhaus (1998) showed that
the relative plausibility of the noun as agent of the verb affects the difficulty of
the construction. They compared the following pairs of sentences:
9. The cop arrested by the detective was guilty of taking bribes.
10. The cop that was arrested by the detective was guilty of taking bribes.
and
11. The crook arrested by the detective was guilty of taking bribes.
12. The crook that was arrested by the detective was guilty of taking bribes.
They found that participants suffered much greater difficulty with the reduced
relatives in sentence 9, where the subject cop is plausible as the agent for arresting,
than in sentence 11, where the subject crook is not.
Participants appear to be able to use semantic information immediately
to guide syntactic decisions.
Implications
Intelligent chatterboxes

Being able to communicate successfully in language is very much tied to our conception of human intelligence. For instance, Alan Turing (a famous British logician who broke the secret communication code of the Nazis during World War II) proposed in 1950 (Turing, 1950) that we decide whether a machine is intelligent by whether it can engage in a conversation that convinces the listener that it is a human. In what has come to be known as the Turing Test, a judge would interact with a human and a computer over a chat system (to eliminate visual cues). If, after conversing with both, the judge could not determine which was human and which was computer, the computer would be declared to be intelligent. Turing predicted that by the year 2000 a computer would be able to pass this test.

In 1990, the Loebner Prize was created for the first computer that could pass the Turing test. Each year a contest is held in which various computer entries are judged. A bronze prize is awarded yearly to the program that gives the most convincing conversation, but so far no machine has been able to fool a majority of the judges, which would result in the silver prize (the gold prize is reserved for something that even looks like a human). The winner in 2008, a program called Elbot, came close to winning the silver prize, fooling 3 of the 12 judges. It even deceived reporter Will Pavia of The Times (http://technology.timesonline.co.uk/tol/news/tech_and_web/article4934858.ece). Below are two small snippets of conversation between an interrogator with a human and with Elbot. I have not identified which is the human and which is the computer. Try to judge which snippet involves the human and which involves Elbot.

Snippet A
Interrogator: What do you think of this year's Turner Prize shortlist?
Respondent: Difficult question. I will have to work on that and get back to you tomorrow.
Interrogator: You sound rather like Sarah Palin.
Respondent: Did this Palin ever write a telephone book?
Interrogator: I doubt it. Does anyone really write telephone books? I don't suppose you caught the football last night?
Respondent: What a shame. There are many talented authors who waste their energies on prose or run-of-the-mill poetry. But to write a telephone book is highest art!

Snippet B
Interrogator: What do you think of this year's Turner Prize shortlist?
Respondent: good I think. Better than the years before i herad
Interrogator: Which was your favourite?
Respondent: It's my first time, so I can't compare
Interrogator: No, I mean of the entries this time around.
Respondent: Not really sure
•Utilization
After a sentence has been parsed and mapped into a representation of its meaning,
what then? A listener seldom passively records the meaning. If the sentence
is a question or an imperative, for example, the speaker will expect the listener
to take some action in response. Even for declarative sentences, moreover, there
is usually more to be done than simply registering the sentence. Fully understanding
a sentence requires making inferences and connections. In Chapter 6,
we considered the way in which such elaborative processing leads to better
memory. Here, we will review some of the research on how people make such
inferences.
Bridging versus Elaborative Inferences
In understanding a sentence, the comprehender must make inferences that go
beyond what is stated. Researchers typically distinguish between bridging inferences
(also called backward inferences) and elaborative inferences (also called
forward inferences). Bridging inferences reach back in the text to make connections
with earlier parts of the text. Elaborative inferences add new
information to the interpretation of the text and often predict what will be
coming up in the text. To illustrate the difference between bridging and elaborative
inferences, contrast the following pairs of sentences used by Singer (1994):
1. Direct statement: The dentist pulled the tooth painlessly. The patient
liked the method.
2. Bridging inference: The tooth was pulled painlessly. The dentist used a
new method.
3. Elaborative inference: The tooth was pulled painlessly. The patient liked
the new method.
Having been presented with these sentence pairs, participants were asked
whether it was true that A dentist pulled the tooth. This is explicitly stated in example
1, but it is also highly probable in examples 2 and 3, even though it is not
stated. The inference that the dentist pulled the tooth in example 2 is required
to connect dentist in the second sentence to the first and so would be classified
as a backward bridging inference. The inference in example 3 is an elaboration
(because a dentist is not mentioned in either sentence) and so would be classified
as a forward elaborative inference. Participants verified A dentist pulled the
tooth just as quickly in the bridging-inference condition of example 2 as in the
direct condition of example 1, indicating that they made the bridging
inference. However, they were about a quarter of a second slower to verify
the sentence in the elaborative-inference condition of example 3, indicating
that they had not made the elaborative inference.
The problem with elaborative inferences is that there are no bounds on how
many such inferences can be made. Consider the sentence The tooth was pulled
painlessly. In addition to inferring who pulled the tooth, one could make
inferences about what instrument was used to make the extraction, why the
tooth was pulled, why the procedure was painless, how the patient felt, what
happened to the patient afterward, which tooth was pulled (e.g., an incisor or a
molar), how easy the extraction was, and so on. Considerable research has been
undertaken in trying to determine exactly which elaborative inferences are
made (Graesser, Singer, & Trabasso, 1994). In the Singer (1994) study just described,
the elaborative inference seems not to have been made. As an example
of a study in which an elaborative inference seems to have been made, consider
the experiment reported by Long, Golding, and Graesser (1992). They had participants
read a story that included the following critical sentence:
• A dragon kidnapped the three daughters.
After reading this sentence, participants made a lexical decision about the
word eat (a lexical decision task, discussed earlier in this chapter and in
Chapter 6, involves deciding whether a string of letters makes a word). Long et
al. found that participants could make the lexical decision more rapidly after
reading this sentence than in a neutral context. From this data, they argued
that participants made the inference that the dragon’s goal was to eat the
daughters (which had not been directly stated or even suggested in the story).
Long et al. argued that, when reading a story, we normally make inferences
about a character’s goals.
Although bridging inferences are made automatically, it is optional whether
people will make elaborative inferences. It takes effort to make these inferences,
and readers need to be sufficiently engaged in the text they are reading to make
them. It also appears to depend on reading ability. For instance, in one study
Murray and Burke (2003) had participants read passages like
Carol was fed up with her job waiting on tables. Customers were rude, the
chef was impossibly demanding, and the manager had made a pass at her just
that day. The last straw came when a rude man at one of her tables complained
that the spaghetti she had just served was cold. As he became louder
and nastier, she felt herself losing control.
The passage then ended with one of the following two sentences:
Experimental: Without thinking of the consequences, she picked up the plate
of spaghetti and raised it above the customer’s head.
Or
Control: To verify the complaint, she picked up the plate of spaghetti and
raised it above the customer’s head.
After reading this sentence, participants were presented with a critical word like
"dump," which is related to an elaborative inference that readers would make
only in the experimental condition. They simply had to read the word. Participants
classified as having high reading ability read the word “dump” faster in
the experimental condition, indicating they had made the inference. However,
low-reading-ability participants did not. Thus, it would appear that high-ability
readers had made the elaborative inference that Carol was going to dump the
spaghetti on the customer’s head, whereas the low-ability readers had not.
In understanding a sentence, listeners make bridging inferences to connect
it to prior sentences but only sometimes make elaborative inferences that
connect to possible future material.
Inference of Reference
An important aspect of making a bridging inference consists of recognizing
when an expression in the sentence refers to something that we should already
know. Various linguistic cues indicate that an expression is referring to something
that we already know. One cue in English turns on the difference between
the definite article the and the indefinite article a. The tends to be used to signal
that the comprehender should know the reference of the noun phrase,
whereas a tends to be used to introduce a new object. Compare the difference
in meaning of the following sentences:
1. Last night I saw the moon.
2. Last night I saw a moon.
Sentence 1 indicates a rather uneventful fact—seeing the same old moon as
always—but sentence 2 carries the clear implication of having seen a new moon.
There is considerable evidence that language comprehenders are quite sensitive
to the meaning communicated by this small difference in the sentences. In one
experiment, Haviland and Clark (1974) compared participants’ comprehension
time for two-sentence pairs such as
3. Ed was given an alligator for his birthday. The alligator was his favorite
present.
4. Ed wanted an alligator for his birthday. The alligator was his favorite
present.
Both pairs have the same second sentence. Pair 3 introduces in its first sentence
a specific antecedent for the alligator. On the other hand, although alligator is
mentioned in the first sentence of pair 4, a specific alligator is not introduced.
Thus, there is no antecedent in the first sentence of pair 4 for the alligator. The
definite article the in the second sentence of both pairs supposes a specific
antecedent. Therefore, we would expect that participants would have difficulty
with the second sentence in pair 4 but not in pair 3. In the Haviland and Clark
experiment, participants saw pairs of such sentences one at a time. After they
comprehended each sentence, they pressed a button. The time was measured
from the presentation of the second sentence until participants pressed a button
indicating that they understood that sentence. Participants took an average
of 1031 ms to comprehend the second sentence in pairs, such as pair 3, in
which an antecedent was given, but they took an average of 1168 ms to comprehend
the second sentence in pairs, such as pair 4, in which there was no
antecedent for the definite noun phrase. Thus, comprehension took more than
a tenth of a second longer when there was no antecedent.
The results of an experiment done by Loftus and Zanni (1975) showed that
choice of articles could affect listeners’ beliefs. These experimenters showed
participants a film of an automobile accident and asked them a series of questions.
Some participants were asked,
5. Did you see a broken headlight?
Other participants were asked,
6. Did you see the broken headlight?
In fact, there was no broken headlight in the film, but question 6 uses a definite
article, which supposes the existence of a broken headlight. Participants were
more likely to answer “Yes” when asked the question in form 6. As Loftus and
Zanni noted, this finding has important implications for the interrogation of
eyewitnesses.
Comprehenders take the definite article the to imply the existence of a
reference for the noun.
Pronominal Reference
Another aspect of processing reference concerns the interpretation of pronouns.
When one hears a pronoun such as she, deciding who is being referenced
is critical. A number of people may have already been mentioned, and all
are candidates for the reference of the pronoun. As Just and Carpenter (1987)
noted, there are a number of bases for resolving the reference of pronouns:
1. One of the most straightforward is to use number or gender cues. Consider
• Melvin, Susan, and their children left when (he, she, they) became sleepy.
Each possible pronoun has a different referent.
2. A syntactic cue to pronominal reference is that pronouns tend to refer to
objects in the same grammatical role (e.g., subject versus object). Consider
• Floyd punched Bert and then he kicked him.
Most people would agree that the subject he refers to Floyd and the object him
refers to Bert.
3. There is also a strong recency effect such that the most recent candidate
referent is preferred. Consider
• Dorothea ate the pie; Ethel ate cake; later she had coffee.
Most people would agree that she probably refers to Ethel.
4. Finally, people can use their knowledge of the world to determine
reference (a toy sketch combining these cues follows this list). Compare
• Tom shouted at Bill because he spilled the coffee.
• Tom shouted at Bill because he had a headache.
Most people would agree that he in the first sentence refers to Bill because you
tend to scold people who make mistakes, whereas he in the second sentence
refers to Tom because people tend to be cranky when they have headaches.
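As promised in the list above, here is a toy way to picture how these cues might combine: each cue casts a weighted vote for a candidate referent. The cue names and weights are invented for illustration and are not a model proposed by Just and Carpenter.

```python
# Invented weights: hard grammatical cues outrank soft preferences.
CUE_WEIGHTS = {"gender_number": 3.0, "same_role": 1.0,
               "recency": 0.5, "world_knowledge": 2.0}

def resolve(candidates, cue_votes):
    """cue_votes maps a cue name to the set of candidates it favors."""
    scores = {c: 0.0 for c in candidates}
    for cue, favored in cue_votes.items():
        for c in favored:
            scores[c] += CUE_WEIGHTS[cue]
    return max(scores, key=scores.get)

# 'Tom shouted at Bill because he spilled the coffee.'
print(resolve(["Tom", "Bill"],
              {"same_role": {"Tom"},          # 'he' is a subject, like Tom
               "recency": {"Bill"},           # Bill was mentioned more recently
               "world_knowledge": {"Bill"}})) # you scold the one who erred
# -> 'Bill'
```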
In keeping with the immediacy-of-interpretation principle articulated earlier,
people try to determine who a pronoun refers to immediately upon encountering
it. For instance, in studies of eye fixations (Carpenter & Just, 1977; Ehrlich &
Rayner, 1983; Just & Carpenter, 1987), researchers found that people fixated on a
pronoun longer when it was harder to determine its reference. Ehrlich and Rayner
(1983) also found that participants’ resolution of the reference tends to spill over
into the next fixation, suggesting they are still processing the pronoun while reading
the next word.
Corbett and Chang (1983) found evidence that participants consider multiple
candidates for a referent. They had participants read sentences such as
• Scott stole the basketball from Warren and he sank a jumpshot.
After reading the sentence, participants saw a probe word and had to decide
whether the word appeared in the sentence. Corbett and Chang found that time
to recognize either Scott or Warren decreased after reading such a sentence.
They also asked participants to read the following control sentence, which did
not require the referent of a pronoun to be determined:
• Scott stole the basketball from Warren and Scott sank a jumpshot.
In this case, only recognition of Scott was facilitated. Warren was facilitated
only in the first sentence because, in that sentence, participants had to consider
it a possible referent of he before settling on Scott as the referent.
The results of both the Corbett and Chang study and the Ehrlich and Rayner
study indicate that resolution of pronoun reference lasts beyond the reading of
the pronoun itself. This finding indicates that processing is not always as immediate
as the immediacy-of-processing principle might seem to imply. The processing
of pronominal reference spills over into later fixations (Ehrlich & Rayner,
1983), and there is still priming for the unselected reference at the end of the
sentence (Corbett & Chang, 1983).
Comprehenders consider multiple possible candidates for the referent of a
pronoun and use syntactic and semantic cues to select a referent.
Negatives
Negative sentences appear to suppose a positive sentence and then ask us to infer
what must be true if the positive sentence is false. For instance, the sentence
John is not a crook supposes that it is reasonable to assume John is a crook but
asserts that this assumption is false. As another example, imagine the following
four replies from a normally healthy friend to the question How are you feeling?
1. I am well.
2. I am sick.
3. I am not well.
4. I am not sick.
Replies 1 through 3 would not be regarded as unusual linguistically, but reply 4
does seem peculiar. By using the negative, reply 4 is supposing that thinking of
our friend as sick is reasonable. Why would we think our friend is sick, and
what is our friend really telling us by saying it is not so? In contrast, the negative
in reply 3 is easy to understand, because supposing that the friend is normally
well is reasonable and our friend is telling us that this is not so.
Clark and Chase (Chase & Clark, 1972; H. H. Clark, 1974; Clark & Chase,
1972) conducted a series of experiments on the verification of negatives (see
also Carpenter & Just, 1975; Trabasso, Rollins, & Shaughnessy, 1971). In a typical
experiment, they presented participants with a card like that shown in
Figure 13.9 and asked them to verify one of four sentences about this card:
1. The star is above the plus—true affirmative.
2. The plus is above the star—false affirmative.
3. The plus is not above the star—true negative.
4. The star is not above the plus—false negative.
The terms true and false refer to whether the sentence is true of the picture; the
terms affirmative and negative refer to whether the sentence structure has a
negative element. Sentences 1 and 2 are simple assertions, but sentences 3 and
4 contain a supposition plus a negation of the supposition. Sentence 3 supposes
that the plus is above the star and asserts that this supposition is false; sentence 4
supposes that the star is above the plus and asserts that this supposition is false.
Clark and Chase assumed that participants would check the supposition first
and then process the negation. In sentence 3, the supposition does not match the
picture, but in sentence 4, the supposition does match the picture. Assuming
that mismatches would take longer to process, Clark and Chase predicted that
participants would take longer to respond to sentence 3, a true negative, than
to sentence 4, a false negative. In contrast, participants should take longer to
process sentence 2, the false affirmative, than sentence 1, the true affirmative,
because sentence 2 does not match the picture. In fact, the difference between
sentences 2 and 1 should be identical with the difference between sentences 3
and 4, because both differences correspond to the extra time due to a mismatch
between the sentence and the picture.
Clark and Chase developed a simple and elegant mathematical model for such
data. They assumed that processing sentences 3 and 4 took N time units longer
than did processing sentences 1 and 2 because of the more complex supposition-
plus-negation structure of sentences 3 and 4. They also assumed that processing
sentence 2 took M time units longer than did processing sentence 1 because of the
mismatch between picture and assertion. Similarly, they assumed that processing
sentence 3 took M time units longer than did processing sentence 4 because of the
mismatch between picture and supposition. Finally, they assumed that processing
a true affirmative such as sentence 1 took T time units. The time T refers to the
time used in processes exclusive of negation or the picture mismatch. Let us consider
the total time that participants should spend processing a sentence such as
sentence 3: This sentence has a complex supposition-and-negation structure,
which costs N time units, and a supposition mismatch, which costs M time units.
Therefore, total processing time should be T + M + N. Table 13.1 shows both the
observed data and the reaction-time predictions that can be derived for the Clark
and Chase experiment. The best predicting values for T, M, and N for this
experiment can be estimated from the data as T = 1469 ms, M = 246 ms, and
N = 320 ms. As you can confirm, the predictions match the observed times
remarkably well. In particular, the difference between true negatives and false
negatives is close to the difference between false affirmatives and true affirmatives.
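Because the model is purely additive, its predictions are easy to recompute. The short sketch below uses the parameter estimates just given to reproduce the predicted times shown in Table 13.1:

```python
# Clark and Chase's additive reaction-time model.
T, M, N = 1469, 246, 320   # base time, mismatch cost, negation cost (ms)

predicted = {
    "true affirmative":  T,          # no mismatch, no negation
    "false affirmative": T + M,      # picture mismatch
    "true negative":     T + M + N,  # supposition mismatch plus negation
    "false negative":    T + N,      # negation only
}
observed = {"true affirmative": 1463, "false affirmative": 1722,
            "true negative": 2028, "false negative": 1796}

for condition, rt in predicted.items():
    print(f"{condition:<17} predicted {rt} ms, observed {observed[condition]} ms")
```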
FIGURE 13.9 A card like that
presented to participants in Clark
and Chase’s sentence-verification
experiments. Participants were to
say whether simple affirmative
and negative sentences correctly
described these patterns.
This finding supports the hypothesis that participants do extract the suppositions
of negative sentences and match them to the picture.
Comprehenders process a negative by first processing its embedded supposition
and then the negation.
•Text Processing
So far, we have focused on the comprehension of single sentences in isolation.
Sentences are more frequently processed in larger contexts; for example, in the
reading of a textbook. Texts, like sentences, are structured according to certain
patterns, although these patterns are perhaps more flexible than those for sentences.
Researchers have noted that a number of recurring relations serve to
organize sentences into larger parts of a text. Some of the relations that have
been identified are listed in Table 13.2. These structural relations specify the way
in which a sentence should be related to the overall text. For instance, the first
text structure (response) in Table 13.2 directs the reader to relate one set of
sentences as part of the solution to problems posed by other sentences. These
relations can be at any level of a text. That is, the main relation organizing a
TABLE 13.1
Observed and Predicted Reaction Times in the Sentence-Verification Experiment

Condition           Observed Time   Equation    Predicted Time
True affirmative    1463 ms         T           1469 ms
False affirmative   1722 ms         T + M       1715 ms
True negative       2028 ms         T + M + N   2035 ms
False negative      1796 ms         T + N       1789 ms
TABLE 13.2
Possible Types of Relations among Sentences in a Text
Type of Relation Description
1. Response A question is presented and an answer follows or a problem is
presented and a solution follows.
2. Specific Specific information is given subsequent to a more general point.
3. Explanation An explanation is given for a point.
4. Evidence Evidence is given to support a point.
5. Sequence Points are presented in their temporal sequence as a set.
6. Cause An event is presented as the cause of another event.
7. Goal An event is presented as the goal of another event.
8. Collection A loose structure of points is presented. (This is perhaps a case
in which there is no real organizing relation.)
paragraph might be any of the eight in Table 13.2. Subpoints in a paragraph
also may be organized according to any of these relations.
To see how the relations in Table 13.2 might be used, consider Meyer’s
(1974) now-classic analysis of the following paragraph:
Parakeet Paragraph
The wide variety in color of parakeets that are available on the market today resulted
from careful breeding of the color mutant offspring of green-bodied and
yellow-faced parakeets. The light green body and yellow face color combination
is the color of the parakeets in their natural habitat, Australia. The first living
parakeets were brought to Europe from Australia by John Gould, a naturalist, in
1840. The first color mutation appeared in 1872 in Belgium; these birds were
completely yellow. The most popular color of parakeets in the United States is
sky-blue. These birds have sky-blue bodies and white faces; this color mutation
occurred in 1878 in Europe. There are over 66 different colors of parakeets listed
by the Color and Technical Committee of the Budgerigar Society. In addition to
the original green-bodied and yellow-faced birds, colors of parakeets include
varying shades of violets, blues, grays, greens, yellows, and whites. (p. 61)
Her analysis of this paragraph is approximately reproduced in Table 13.3. Note
that this analysis tends to organize various facts into more or less major points.
TABLE 13.3
Analysis of the Parakeet Paragraph
1. A explains B.
A. There was careful breeding of color mutants of green-bodied and yellow-faced
parakeets. The historical sequence is
1. Their natural habitat was Australia. Specific detail:
a. Their color here is a light-green body and yellow-face combination.
2. The first living parakeets were brought to Europe from Australia by John
Gould in 1840. Specific detail:
a. John Gould was a naturalist.
3. The first color mutation appeared in 1872 in Belgium. Specific detail:
a. These birds were completely yellow.
4. The sky-blue mutation occurred in 1878 in Europe. Specific details:
a. These birds have sky-blue bodies and white faces.
b. This is the most popular color in America.
B. There is a wide variety in color of parakeets that are on the market today.
Evidence for this is
1. There are over 66 different colors of parakeets listed by the Color and
Technical Committee of the Budgerigar Society.
2. There are many available colors. A collection of these is
a. The original green-bodied and yellow-faced birds
b. Violets
c. Blues
d. Grays
e. Greens
f. Yellows
g. Whites
From Meyer (1974).
The highest-level organizing relation in this paragraph is explanation (see
item 3, Table 13.2). Specifically, the major points in this explanation are that
(point A) there has been careful breeding of color mutants and (point B) there
is a wide variety of parakeet color, and point A is given as an explanation of
point B. Organized under point A are some events from the history of parakeet
breeding. This organization is an example of a sequence relation. Organized
under these events are specific details. So, for instance, organized under A2 is
the fact that John Gould was a naturalist. Organized under point B is evidence
supporting the assertion about the wide color variety and some details about
the available variation in color.
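To make the idea of a propositional hierarchy concrete, the sketch below encodes the top of Meyer's analysis as an explicit tree. The Node class and this particular encoding are a hypothetical illustration, not Meyer's own notation; each node is labeled with the organizing relation from Table 13.2 that binds its children.

# A hypothetical tree encoding of the analysis in Table 13.3. Each node
# carries one of the organizing relations of Table 13.2 plus its content.

from dataclasses import dataclass, field

@dataclass
class Node:
    relation: str                 # "explanation", "sequence", "specific", ...
    content: str                  # the proposition at this level
    children: list["Node"] = field(default_factory=list)

parakeet = Node("explanation", "A explains B", [
    Node("sequence", "A: careful breeding of color mutants", [
        Node("specific", "Natural habitat was Australia",
             [Node("specific", "Light-green body, yellow face")]),
        Node("specific", "Gould brought parakeets to Europe in 1840",
             [Node("specific", "Gould was a naturalist")]),
        Node("specific", "First mutation (all yellow), Belgium, 1872"),
        Node("specific", "Sky-blue mutation, Europe, 1878"),
    ]),
    Node("evidence", "B: wide variety of parakeet colors today", [
        Node("specific", "Over 66 colors listed by the Budgerigar Society"),
        Node("collection", "Violets, blues, grays, greens, yellows, whites"),
    ]),
])

def show(node: Node, depth: int = 0) -> None:
    """Print the hierarchy; higher levels tend to be recalled better."""
    print("  " * depth + f"[{node.relation}] {node.content}")
    for child in node.children:
        show(child, depth + 1)

show(parakeet)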
The propositions in a text can be organized hierarchically according to
various semantic relations.
Text Structure and Memory
A great deal of research has demonstrated the psychological significance of text
structure. A number of hypotheses differ in regard to exactly what system of
relations should be used in the analysis of texts, but they generally agree that
some sort of hierarchical structure organizes the propositions of a text. Memory
experiments have yielded evidence that participants do, to some degree, respond
to that hierarchical structure.
Meyer, Brandt, and Bluth (1978) studied students’ perception of the high-level
structure of a text—that is, the structural relations at the higher levels
of hierarchies like that in Table 13.3. They found considerable variation in
participants’ ability to recognize the high-level structure that organized a text.
Moreover, they found that participants’ ability to identify the top-level structure
of a text was an important predictor of their memory for the text. In
another study, on ninth-graders, Bartlett (1978) found that only 11% of participants
consciously identified and used high-level structure to remember text
material. This select group did twice as well as other students on their recall
scores. Bartlett also showed that training students to identify and use top-level
structure more than doubled recall performance.
In addition to its hierarchical structure, a text tends to be held together
by causal and logical structures. This tendency is clearest in narratives in
which one event in a sequence of events causes the next event. The scripts
discussed in Chapter 5 are one kind of knowledge structure that is designed
to encode such causal relations. Often the causal links are not explicitly
stated but rather have to be inferred. For instance, we might hear on a
newscast
• There is an accident on the Parkway East. Traffic is being rerouted
through Wilkinsburg.
It is left to the listener to infer that the first fact is the cause of the second fact.
Keenan, Baillet, and Brown (1984) studied the effect of the probability of the
causal relation connecting two sentences on the processing of the second sentence.
They asked participants to read pairs of sentences, of which the first might be one
of the following sentences:
1a. Joey’s big brother punched him again and again.
1b. Racing down the hill, Joey fell off his bike.
1c. Joey’s crazy mother became furiously angry with him.
1d. Joey went to a neighbor’s house to play.
Keenan et al. were interested in the effect of the first sentence on time to read a
second sentence such as
2. The next day, his body was covered with bruises.
Sentences 1a through 1d are ordered in decreasing probability of a causal connection
to the second sentence. Correspondingly, Keenan et al. found that participants’
reading times for sentence 2 increased from 2.6 s when preceded by
highly probable causes such as that given in sentence 1a to 3.3 s when preceded
by less probable causes such as that given in sentence 1d. Thus, it takes longer
to understand a more distant causal relation.
There also are effects of causal relatedness on recall. Those parts of a story
that are more central to its causal structure are more likely to be recalled
(Black & Bern, 1981; Trabasso, Secco, & van den Broek, 1984). For instance,
Black and Bern had participants study stories that included pairs of sentences
such as
• The cat leapt up on the kitchen table.
• Fred picked up the cat and put it outside.
which are causally related. They contrasted this pair with pairs of sentences such as
• The cat rubbed against the kitchen table.
• Fred picked up the cat and put it outside.
which are less plausibly connected by a causal relation. Although the second
sentence is identical in both cases, participants displayed better memories for
the first sentence of a causally related pair.
Thorndyke (1977) also showed that memory for a story is poorer if the
organization of the text conflicts with what would be considered its “natural”
structure. Some participants studied an original story, whereas other participants
studied the story with its sentences presented in a scrambled order. Participants
were able to recall 85% of the facts in the original story but only 32%
of the facts in the scrambled story.
Mandler and Johnson (1977) showed that children have much more difficulty
than adults do when recalling the causal structure of a story. Adults
recall events and the outcomes of those events together, whereas children
recall the outcomes but tend to forget how they were achieved. For instance,
children might recall from a particular story that the butter melted but might
forget that it melted because it was in the sun. Adults do not have trouble
with such simple causal structures, but they may have difficulty perceiving
the more complex relations connecting parts of a text. For instance, how easy
is it for you to specify the relation that connects this paragraph to the preceding
one?
Palincsar and Brown (1984) developed a training program that specifically
trains children to identify and formulate questions about such things
as the causal structure of text. They were able to raise poor-performing
seventh-graders from the 20th to the 56th percentile in reading comprehension.
This result is similar to that obtained by Bartlett (1978), who improved
reading performance by training students to identify the hierarchical structure
of text.
Memory for textual material is sensitive to the hierarchical and causal
structure of that text and tends to be better when people attend to that
structure.
Levels of Representation of a Text
Kintsch (1998) has argued that a text is represented at multiple levels. For instance,
consider the following pair of sentences taken from an experimental
story entitled “Nick Goes to the Movies.”
• Nick decided to go to the movies. He looked at a newspaper to see what
was playing.
Kintsch argues that this material is represented at three levels:
1. There is the surface level of representation of the exact sentences. This
can be tested by comparing people’s ability to remember the exact sentences
versus paraphrases like “Nick studied the newspaper to see what
was playing.”
2. There is also a propositional level (see Chapter 5) and this can be tested
by seeing whether people remember that Nick read the newspaper at all.
3. There is a situation model that consists of the major points of the story.
Thus, we can see whether people remember that “Nick wanted to see a
film”—something not said in the story but strongly implied.
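One schematic way to picture the three levels is as three parallel records of the same two sentences. The sketch below is a hypothetical illustration (with a simplified propositional notation), not Kintsch's formalism.

# Three levels of representation for the "Nick" sentences, after Kintsch
# (1998). The structure is an illustration; the content paraphrases the text.

from dataclasses import dataclass

@dataclass
class TextRepresentation:
    surface: list[str]       # exact wording; forgotten fastest
    propositions: list[str]  # gist predicates; retained longer
    situation: list[str]     # model of what happened; most durable

nick = TextRepresentation(
    surface=[
        "Nick decided to go to the movies.",
        "He looked at a newspaper to see what was playing.",
    ],
    propositions=[
        "decide(Nick, go(Nick, movies))",
        "look-at(Nick, newspaper)",
        "purpose(look-at, learn(what-is-playing))",
    ],
    situation=[
        "Nick wanted to see a film.",  # implied but never stated
    ],
)

# Per Figure 13.10, forgetting runs surface -> propositions -> situation.
print(nick.situation)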
In one study, Kintsch, Welsch, Schmalhofer, and Zimny (1990) looked at participants’
ability to remember these different sorts of information over periods of
time ranging up to 4 days. The results are shown in Figure 13.10. As we saw in
Chapter 5, surface information is forgotten quite rapidly, whereas the propositional
information is better retained. However, the most striking retention
function involves the situation information. After 4 days, participants have forgotten
half the propositions but still remember perfectly what the story was
about. This fits with many people’s experience in reading novels or seeing movies.
They will quickly forget many of the details but will still remember what the
novel or movie was about months later.
When people follow a story, they construct a high-level situation model of the
story that is more durable than the memory for the surface sentences or the
propositions that made up the story.
•Conclusions
The number and diversity of topics covered in this chapter testify to the impressive
cumulative progress in understanding language comprehension. It is fair to
say that we knew almost nothing about language processing when cognitive
psychology emerged from the collapse of behaviorism 50 years ago. Now, we
have a rather articulate picture of what is happening at scales that range from
100 ms after a word is heard to minutes later when large stretches of complex
text must be integrated. Research on language processing turns out to harbor a
number of theoretical controversies, some of which have been discussed in this
review of the field (e.g., whether early syntactic processing is separate from the
rest of cognition). However, such controversies should not blind us to the impressive
progress that has been made. The heat in the field has also generated
much light.
FIGURE 13.10 Memory for a story as a function of time: strengths of the traces for the surface
form of sentences, the propositions that make up the story, and the high-level situation representation.
(From Kintsch et al., 1990.)
Questions for Thought

1. Answer the following question: “How many animals of each kind did Moses
take on the ark?” If you are like most people, you answered “two” and did not
even notice that it was Noah, not Moses, who took the animals on the ark
(Erickson & Mattson, 1981). People do this even when they are warned to look
out for such sentences and not to answer them (Reder & Kusbit, 1991). This
phenomenon has been called the Moses illusion even though it has been
demonstrated with a wide range of words besides Moses. What does the Moses
illusion say about how people incorporate the meaning of individual words
into sentences?
2. Christianson, Hollingworth, Halliwell, and Ferreira
(2001) found that when people read the sentence
“While Mary bathed the baby played in the crib” most
people actually interpret the sentence as implying that
Mary bathed the baby. Ferreira and Patson (2007)
argue that this implies that people do not carefully
parse sentences but settle on “good enough” interpretations.
If people don’t carefully process sentences, what
does that imply about the debate between proponents
of interactive processing and of the modularity
position about how people understand sentences like
“The woman painted by the artist was very attractive
to look at”?
3. Palincsar and Brown (1984) found that teaching
seventh-graders to make elaborative inferences while
reading dramatically improved their reading skills. Do
you think they would have been as successful if they
had focused on teaching children to make bridging
inferences?
4. Beilock, Lyons, Mattarella-Micke, Nusbaum, and Small
(2008) looked at brain activation while participants
listened to sentences about hockey versus other action
sentences. They found greater activation in the premotor
cortex for hockey sentences only for those participants
who were hockey fans. What does this say about the
role of expertise in making elaborative inferences and
developing situation models?
Key Terms

bridging (or backward) inferences
center-embedded sentences
constituent
elaborative (or forward) inferences
garden-path sentence
immediacy of interpretation
interactive processing
N400
P600
parsing
principle of minimal attachment
situation model
transient ambiguity
utilization
14
Individual Differences in Cognition
Clearly, all people do not think alike. There are many aspects of cognition, but
humans, naturally being an evaluative species, tend to focus on ways in which
some people perform “better” than other people. This performance is often identified
with the word intelligence—some people are perceived to be more intelligent
than others. Chapter 1 identified intelligence as the defining feature of the human
species. So, to call some members of our species more intelligent than others
can be a potent claim. As we will see, the complexity of human cognition makes
the placement of people on a unidimensional evaluative scale of intelligence
impossible.
This chapter will explore individual differences in cognition both because of its
inherent interest and because it sheds some light on the general nature of human
cognition. The big debate that will be with us throughout this chapter is the
nurture-versus-nature debate. Are some people better at some cognitive tasks because
they are innately endowed with more capacity for those kinds of tasks or because
they have learned more knowledge relevant to these tasks? The answer, not surprisingly,
is that it is some of both, and we will consider and examine some of
the ways in which both basic capacities and experiences contribute to human
intelligence.
More specifically, this chapter will answer the following questions:
• How does the thinking of children develop as they mature?
• What are the relative contributions of neural growth versus experience
to children’s intellectual development?
• What happens to our intellectual capacity through the adult years?
• What do intelligence tests measure?
• What are the different subcomponents of intelligence?
•Cognitive Development
Part of the uniqueness of the human species concerns the way in which children
are brought into the world and develop to become adults. Humans have
very large brains in relation to their body size, which created a major evolutionary
problem: How would the birth of such large-brained babies be physically
possible? One way was through progressive enlargement of the birth canal,
which is now as large as is considered possible given the constraints of mammalian
skeletons (Geschwind, 1980). In addition, a child is born with a skull that
is sufficiently pliable for it to be compressed into a cone shape to fit through the
birth canal. Still, the human birth process is particularly difficult compared with
that of most other mammals.
Figure 14.1 illustrates the growth of the human brain during gestation. At
birth, a child’s brain has more neurons than an adult brain has, but the state of
development of these neurons is particularly immature. Compared with those
of many other species, the brains of human infants will develop much more
after birth. At birth, a human brain occupies a volume of about 350 cubic
centimeters (cm³). In the first year of life, it doubles to 700 cm³, and before a
human being reaches puberty, the size of its brain doubles again. Most other
mammals do not have as much growth in brain size after birth (S. J. Gould,
1977). Because the human birth canal has been expanded to its limits, much
of our neural development has been postponed until after birth.
FIGURE 14.1 Changes in structure in the developing brain, shown at 25 days,
50 days, 100 days, 20 weeks, 28 weeks, and 36 weeks (full term). (Adapted from
Cowan, 1997, p. 116.)
Even though they spend 9 months developing in the womb, human infants
are quite helpless at birth and spend an extraordinarily long time growing to
adult stature—about 15 years, which is about a fifth of the human life span. In
contrast, a puppy, after a gestation period of just 9 weeks, is more capable at
birth than a human newborn. In less than a year, less than a tenth of its life
span, a dog has reached full size and reproductive capability.
Childhood is prolonged more than would be needed to develop large brains.
Indeed, the majority of neural development is complete by age 5. Humans are
kept children by the slowness of their physical development. It has been speculated
that the function of this slow physical development is to keep children in a
dependency relation to adults (de Beer, 1959). Much has to be learned to become
a competent adult, and staying a child for so long gives the human enough time
to acquire that knowledge. Childhood is an apprenticeship for adulthood.
Modern society is so complex that we cannot learn all that is needed by simply
associating with our parents for 15 years. To provide the needed training,
society has created social institutions such as high schools, colleges, and postcollege
professional schools. It is not unusual for people to spend more than
25 years, almost as long as their professional lives, preparing for their roles in
society.
Human development to adulthood is longer than that of other mammals to
allow time for growth of a large brain and acquisition of a large amount of
knowledge.
Piaget’s Stages of Development
Developmental psychologists have tried to understand the intellectual changes
that take place as we grow from infancy through adulthood. Many have been
particularly influenced by the Swiss psychologist Jean Piaget, who studied and
theorized about child development for more than half a century. Much of the
recent information-processing work in cognitive development has been concerned
with correcting and restructuring Piaget’s theory of cognitive development.
Despite these revisions, his research has organized a large set of qualitative
observations about cognitive development spanning the period from birth to
adulthood. Therefore, it is worthwhile to review these observations to get a
picture of the general nature of cognitive development during childhood.
According to Piaget, a child enters the world lacking virtually all the basic
cognitive competencies of an adult but gradually develops these competencies
by passing through a series of stages of development. Piaget distinguishes four
major stages. The sensory-motor stage is in the first 2 years of life. In this stage,
children develop schemes for thinking about the physical world—for instance,
they develop the notion of an object as a permanent thing in the world. The
second stage is the preoperational stage, which is characterized as spanning the
period from 2 to 7 years of age. Unlike the younger child, a child in this period
can engage in internal thought about the world, but these mental processes are
intuitive and lack systematicity. For instance, a 4-year-old who was asked to
describe his painting of a farm and some animals said, “First, over here is a
house where the animals live. I live in a house. So do my mommy and daddy.
This is a horse. I saw horses on TV. Do you have a TV?”
The third stage is the concrete-operational stage, which spans the period
from age 7 to age 11. In this period, children develop a set of mental operations
that allow them to treat the physical world in a systematic way. However, children
still have major limitations on their capacity to reason formally about the
world. The capacity for formal reasoning emerges in Piaget’s fourth period, the
formal-operational stage, spanning the years from 11 to adulthood. Upon entering
this period, although there is still much to learn, a child has become an
adult conceptually and is capable of scientific reasoning—which Piaget takes as
the paradigm case of mature intellectual functioning.
Piaget’s concept of a stage has always been a sore point in developmental psychology.
Obviously, a child does not suddenly change on an 11th birthday from
the stage of concrete operations to the stage of formal operations. There are large
differences among children and cultures, and the ages given are just approximations.
However, careful analysis of the development within a single child also
fails to find abrupt changes at any age. One response to this gradualness has been
to break down the stages into smaller substages. Another response has been to
interpret stages as simply ways of characterizing what is inherently a gradual and
continuous process. Siegler (1996) argued that, on careful analysis, all cognitive
development is continuous and gradual. He characterized the belief that children
progress through discrete stages as “the myth of the immaculate transition.”
Just as important as Piaget’s stage analysis is his analysis of children’s performance
in specific tasks within these stages. These task analyses provide the
empirical substance to back up his broad and abstract characterization of the
stages. Probably his most well-known task analysis is his research on conservation,
considered next.
Piaget proposed that children progress through four stages of increasing intellectual
sophistication: sensory-motor, preoperational, concrete-operational,
and formal-operational.
Conservation
The term conservation most generally refers to knowledge of the properties of
the world that are preserved under various transformations. A child’s understanding
of conservation develops as the child progresses through the Piagetian stages.
Conservation in the sensory-motor stage. A child must come to understand
that objects continue to exist over transformations in time and space. If a cloth is
placed over a toy that a 6-month-old is reaching for, the infant stops reaching
and appears to lose interest in the toy (Figure 14.2). It is as if the object ceases to
exist for the child when it is no longer in view. Piaget concluded from his experiments
that children do not come into the world with this knowledge but rather
develop a concept of object permanence during the first year.
According to Piaget, the concept of object permanence develops slowly and is
one of the major intellectual developments in the sensory-motor stage. An older
infant will search for an object that has been hidden, but more demanding tests
reveal failings in the older infant’s understanding of a permanent object. In one
experiment, an object is put under cover A, and then, in front of the child, it is
removed and put under cover B. The child will often look for the object under
cover A. Piaget argues that the child does not understand that the object will still
be in location B. Only after the age of 12 months can the child succeed consistently
at this task.
Conservation in the preoperational and concrete-operational stages. A
number of important advances in conservation occur at about 6 years of age,
which, according to Piaget, is the transition between the preoperational and the
concrete-operational stages. Before this age, children can be shown to have
some glaring errors in their reasoning. These errors start to correct themselves
at this point. The cause of this change has been controversial, with different
theorists pointing to language (Bruner, 1964) and the advent of schooling
(Cole & D’Andrade, 1982), among other possible causes. Here, we will content
ourselves with a description of the changes leading to a child’s understanding
of conservation of quantity.
As adults, we can almost instantaneously recognize that there are four apples
in a bowl and can confidently know that these apples will remain four when
dumped into a bag. Piaget was interested in how a child develops the concept of
quantity and learns that quantity is something that is preserved under various
transformations, such as moving the objects from a bowl to a bag. Figure 14.3
illustrates a typical conservation problem that has been posed by psychologists
in many variations to preschool children in countless experiments. A child is
presented with two rows of objects, such as checkers. The two rows contain the
same number of objects and have been lined up so as to correspond. The child is
asked whether the two rows have the same amount and responds that they do.
The child can be asked to count the objects in the two rows to confirm that
conclusion. Now, before the child’s eyes, one row is compressed so that it is
shorter than the other row, but no checkers are added or removed. Again asked
FIGURE 14.2 An illustration of a child’s apparent inability to understand the permanence of an
object. (Monkmeyer Press Photo Service, Inc. From Santrock and Yussen, 1989. Reprinted by permission of the publisher.
Copyright © 1989 by Wm. C. Brown.)
FIGURE 14.3 A typical experimental situation to test for conservation of number. (Monkmeyer
Press Photo Service, Inc. From Santrock and Yussen, 1989. Reprinted by permission of the publisher. Copyright © 1989
Wm. C. Brown.)
which row has more objects, the child now says that the longer row has more.
The child appears not to know that quantity is something that is preserved
under transformations such as the compression of space. If asked to count the
two rows, the child expresses great surprise that they have the same number.
A general feature in demonstrations of lack of conservation is that the
irrelevant physical features of a display distract children. Another example is the
liquid-conservation task, which is illustrated in Figure 14.4. A child is shown two
identical beakers containing identical amounts of water and an empty, tall,
thin beaker. When asked whether the two identical beakers hold the same
amount of water, the child answers “Yes.” The water from one beaker is then
poured into the tall, thin beaker. When asked whether the amount of water in
the two containers is the same, the child now says that the tall beaker holds
more. Young children are distracted by physical appearance and do not relate
their having seen the water poured from one beaker into the other to the
unchanging quantity of liquid. Bruner (1964) demonstrated that a child is less
likely to fail to conserve if the tall beaker is hidden from sight while it is being
filled; then the child does not see the high column of water and so is not distracted
by physical appearance. Thus, it is a case of being overwhelmed by
physical appearance. The child does understand that water preserves its quantity
after being poured.
Failure of conservation has also been shown with weight and volume of solid
objects (for a discussion of studies of conservation, see Brainerd, 1978; Flavell,
1985; Ginsburg & Opper, 1980). It was once thought that the ability to perform
successfully on all these tasks depended on acquiring a single abstract concept of
conservation. Now, however, it is clear that successful conservation appears earlier
on some tasks than on others. For instance, conservation of number usually
appears before conservation of liquid. Additionally, children in transition will
show conservation of number in one experimental situation but not in another.
Conservation in the formal-operational period. When children reach the
formal-operational period, their understanding of conservation reaches new
levels of abstraction. They are able to understand the idealized conservations that
are part of modern science, including concepts such as the conservation of
energy and the conservation of motion. In a frictionless world, an object once set
in motion continues, an abstraction that the child never experiences. However,
the child comes to understand this abstraction and the way in which it relates
to experiences in the real world.
As children develop, they gain increasingly sophisticated understanding
about what properties of objects are conserved under which transformations.
What Develops?
Clearly, as Piaget and others have documented, major intellectual changes take
place in childhood. However, there are serious questions concerning what underlies
these changes. There are two ways of explaining why children perform
better on various intellectual tasks as they get older: One is that they “think
better,” and the other is that they “know better.” The think-better option holds
that children’s basic cognitive processes become better. Perhaps they can hold
FIGURE 14.4 A typical experimental situation to test for conservation of liquid. (Monkmeyer Press Photo
Service, Inc. From Santrock and Yussen, 1989. Reprinted by permission of the publisher. Copyright © 1989 Wm. C. Brown Publishers.)
more information in working memory or process information faster. The know-better
option holds that children have learned more facts and better methods as
they get older. I refer to this as “know better,” not “know more,” because it is not
just a matter of adding knowledge but also a matter of eliminating erroneous
facts and inappropriate methods (such as relying on appearance in the conservation
tasks). Perhaps this superior knowledge enables them to perform the tasks
more efficiently. A computer metaphor is apt here: A computer application can
be made to perform better by running the same program on a faster machine
that has more memory or by running a better program on the same machine.
Which is it in the case of child development—better machine or better program?
Rather than the reason being one or the other, the child’s improvement
is due to both factors, but what are their relative contributions? Siegler (1998)
argued that many of the developmental changes that take place in the first 2 years
are to be understood in relation to neural changes. Such changes in the first
2 years are considerable. As we already noted, an infant is born with more
neurons than the child will have at a later age. Although the number of neurons
decreases, the number of synaptic connections increases 10-fold in the first
2 years, as illustrated in Figure 14.5. The number of synapses reaches a peak at
about age 2, after which it declines. The earlier pruning of neurons and the later
FIGURE 14.5 Postnatal development of human cerebral cortex around Broca’s area: (a) newborn;
(b) 3 months; (c) 24 months. (From Lenneberg, 1967.)
pruning of synaptic connections can be thought of as a process by which the
brain can fine-tune itself. The initial overproduction guarantees that there will
be enough neurons and synapses to process the required information. When
some neurons or synapses are not used, and so are proved unnecessary, they
wither away (Huttenlocher, 1994). After age 2, there is not much further growth
of neurons or their synaptic connections, but the brain continues to grow
because of the proliferation of other cells. In particular, the glial cells increase,
including those that provide the myelinated sheaths around the axons of
neurons. As discussed in Chapter 1, myelination enables the axon to conduct
brain signals rapidly. The process of myelination continues into the late teens
but at an increasingly gradual pace. The effects of this gradual myelination can
be considerable. For instance, the time for a nerve impulse to cross the hemispheres
in an adult is about 5 milliseconds (ms), which is four to five times as
fast as in a 4-year-old (Salamy, 1978).
It is tempting to emphasize the improvement in processing capacity as the
basis for improvement after age 2. After all, consider the physical difference
between a 2-year-old and an adult. When my son was 2 years old, he had difficulty
mastering the undoing of his pajama buttons. If his muscles and coordination
had so much maturing to do, why not his brain? This analogy, however, does
not hold: A 2-year-old has reached only 20% of his adult body weight, whereas
the brain has already reached 80% of its final size. Cognitive development after
age 2 may depend more on the knowledge that a person puts into his or her
brain rather than on any improvement in the physical capacities of the brain.
Neural development is a more important contributor to cognitive development
before the age of 2 than after.
The Empiricist-Nativist Debate
There is relatively little controversy either about the role that physical development
of the brain plays in the growth of human intellect or about the incredible
importance of knowledge to human intellectual processes. However, there is an
age-old nature-versus-nurture controversy that is related to, but different from, the
issue of physical growth versus knowledge accumulation. This debate is between
the nativists and the empiricists (see Chapter 1) about the origins of that knowledge.
The nativists argue that the most important aspects of our knowledge about
the world appear as part of our genetically programmed development, whereas
the empiricists argue that virtually all knowledge comes from experience with the
environment. One reason that this issue is emotionally charged is that it would
seem tied to conceptions about what makes humans special and what their potential
for change is. The nativist view is that we sell ourselves short if we believe
that our minds are just a simple reflection of our experiences, and empiricists
believe that we undersell the human potential if we think that we are not capable
of fundamental change and improvement. The issue is not this simple, but it
nonetheless fuels great passion on both sides of the debate.
We have already visited this issue in the discussions of language acquisition
and of whether important aspects of human language are innately specified,
such as language universals. However, similar arguments have been made for our
knowledge of human faces or our knowledge of biological categories. A particularly
interesting case concerns our knowledge of number. Piaget used experiments
such as those on number conservation to argue that we do not have an
innate sense of numbers, but others have used experiments to argue otherwise.
For instance, in studies of infant attention, young children have been shown to
discriminate one object from two and two from three (Antell & Keating, 1983;
Starkey, Spelke, & Gelman, 1990; van Loosbroek & Smitsman, 1992). In these
studies, young children become bored looking at a certain number of objects but
show renewed interest when the number of objects changes. There is even evidence
for a rudimentary ability to add and subtract (Simon, Hespos, & Rochat,
1995; Wynn, 1992). For instance, if a 5-month-old child sees one object appear on
stage and then disappear behind a screen, and then sees a second object appear
on stage and disappear behind the screen, the child is surprised if there are not
two objects when the screen is raised (Figure 14.6—note this contradicts Piaget’s
claims about failure of conservation in the sensory-motor stage). This reaction is
taken as evidence that the child calculates 1 + 1 = 2. Dehaene (2000) argued that
a special structure in the parietal cortex is responsible for representing number
and showed that it is especially active in certain numerical judgment tasks.
On the other hand, others argue that most of human knowledge could not
be coded into our genes. This argument was strengthened with the realization
in 2001 that a human has only 30,000 genes—only about one-third the number
originally estimated. Moreover, more than 97% of these genes are generally believed
to be shared with chimpanzees, which does not leave much for encoding
FIGURE 14.6 In Karen Wynn’s experiment, she showed 5-month-old infants one or two dolls
on a stage. Then she hid the dolls behind a screen and visibly removed or added one. When
she lifted the screen out of the way, the infants would often stare longer when shown a wrong
number of dolls.
the rich knowledge that is uniquely human. Elman et al. (1996), among other
researchers, have argued that, although the basic mechanisms of human information
processing are innately specified and these mechanisms enable human
thought, most of the knowledge and structure of the mind must be acquired
through experience.
There is considerable debate in cognitive science about the degree to which
our basic knowledge is innate or acquired from experience.
Increased Mental Capacity
A number of developmental theories have proposed that there are basic cognitive
capacities that increase from birth through the teenage years (Case, 1985;
Fischer, 1980; Halford, 1982; Pascual-Leone, 1980). These theories are often called
neo-Piagetian theories of development. Consider Case’s memory-space proposal,
which is that a growing working-memory capacity is the key to the developmental
sequence. The basic idea is that more-advanced cognitive performance requires
that more information be held in working memory.
An example of this analysis is Case’s (1978) description of how children solve
Noelting’s (1975) juice problems. A child is given two empty pitchers, A and B,
and is told that several tumblers of orange juice and tumblers of water will be
poured into each pitcher. The child’s task is to predict which pitcher will taste
most strongly of orange juice. Figure 14.7 illustrates four stages of juice problems
that children can solve at various ages. At the youngest age, children can
reliably solve only problems where all orange juice goes into one pitcher and all
water into another. At ages 4 to 5, they can count the number of tumblers of
orange juice going into a pitcher and choose the pitcher that holds the larger
number—not considering the number of tumblers of water. At ages 7 to 8, they
notice whether there is more orange juice or more water going into a pitcher. If
pitcher A has more orange juice than water and pitcher B has more water than
orange juice, they will choose pitcher A even if the absolute number of glasses of
orange juice is fewer. Finally, at age 9 or 10, children compute the difference between
the amount of orange juice and the amount
of water (still not a perfect solution).
Case argued that the working-memory requirements
differ for the various types of problems represented
in Figure 14.7. For the simplest problems,
a child has to keep only one fact in memory—
which set of tumblers has the orange juice. Children
at ages 3 to 4 can keep only one such fact in
mind. If both sets of tumblers have orange juice,
the child cannot solve the problem. For the second
type of problem, a child needs to keep two things
in memory—the number of orange juice tumblers
in each array. In the third type of problem, a child
needs to keep additional partial products in mind
to determine which side has more orange juice
FIGURE 14.7 The Noelting juice
problem solved by children at
various ages. The problem is
to tell which pitcher will taste
more strongly of orange juice
after participants observe
the tumblers of water and
tumblers of juice that will be
poured into each pitcher.
than water. To solve the fourth type of problem, a child needs four facts to make
a judgment:
1. The absolute difference in tumblers going into pitcher A
2. The sign of the difference for pitcher A (i.e., whether there is more water
or more orange juice going into the pitcher)
3. The absolute difference in tumblers going into pitcher B
4. The sign of the difference for pitcher B
Case argued that children’s developmental sequences are controlled by their
working-memory capacity for the problem. Only when they can keep four facts
in memory will they achieve the fourth stage in the developmental sequence.
Case’s theory has been criticized (e.g., Flavell, 1978) because it is hard to decide
how to count the working-memory requirements.
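Setting the counting problem aside, the four stage strategies themselves can be stated explicitly. The sketch below is a reconstruction of the decision rules Case describes, not code from Case or Noelting; each successive rule must hold more intermediate quantities in mind.

# Schematic decision rules for Noelting's juice problem at Case's four
# stages. oj_a and w_a are the tumblers of orange juice and water poured
# into pitcher A; oj_b and w_b are those for pitcher B.

def stage1(oj_a, w_a, oj_b, w_b):
    # Ages 3-4: one fact -- which pitcher gets orange juice at all.
    if oj_a and not oj_b:
        return "A"
    if oj_b and not oj_a:
        return "B"
    return None  # both pitchers get juice: the child cannot solve it

def stage2(oj_a, w_a, oj_b, w_b):
    # Ages 4-5: two facts -- the juice counts; water is ignored.
    return "A" if oj_a > oj_b else "B"

def stage3(oj_a, w_a, oj_b, w_b):
    # Ages 7-8: the sign only -- which pitcher has more juice than water.
    more_a, more_b = oj_a > w_a, oj_b > w_b
    if more_a and not more_b:
        return "A"
    if more_b and not more_a:
        return "B"
    return stage2(oj_a, w_a, oj_b, w_b)  # otherwise fall back on counting

def stage4(oj_a, w_a, oj_b, w_b):
    # Ages 9-10: four facts -- the difference and its sign for each pitcher.
    # Still imperfect: the normative rule would compare concentrations.
    return "A" if (oj_a - w_a) > (oj_b - w_b) else "B"

# Example: A gets 2 juice + 1 water, B gets 3 juice + 3 water. By
# concentration A is correct (2/3 vs. 1/2), but stage-2 counting picks B.
for rule in (stage1, stage2, stage3, stage4):
    print(rule.__name__, "->", rule(2, 1, 3, 3))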
Another question concerns what controls the growth in working memory.
Case argued that a major factor in the increase of working memory is increased
speed of neural function. He cited the evidence that the degree of myelination
increases with age, with spurts approximately at those points where he postulated
major changes in working memory. On the other hand, he also argued
that practice plays a significant role as well: With practice, we learn to perform
our mental operations more efficiently, and so they do not require as much
working-memory capacity.
The research of Kail (1988) can be viewed as consistent with the proposal
that speed of mental operation is critical. This investigator looked at a number
of cognitive tasks, including the mental rotation task examined in Chapter 4.
He presented participants with pairs of letters in different orientations and
asked them to judge whether the letters were the same or were mirror images of
one another. As discussed in Chapter 4, participants tend to mentally rotate an
image of one object into congruence with the other to make this judgment. Kail
observed people, who ranged in age from 8 to 22, performing this task and
found that they became systematically faster with age. He was interested in
rotation rate, which he measured as the number of milliseconds to rotate one
degree of angle. Figure 14.8 shows these data, plotting rate of rotation as a function
of age. The time to rotate a degree of angle decreases
as a function of age.
In some of his writings, Kail argued that this result is
evidence of an increase in basic mental speed as a function
of age. However, an alternative hypothesis is that it reflects
accumulating experience over the years at mental rotation.
Kail and Park (1990) put this hypothesis to the test by giving
11-year-old children and adults more than 3,000 trials
of practice at mental rotation. They found that both groups
sped up but that adults started out faster. However, Kail
and Park showed that all their data could be fit by a single
power function that assumed that the adults came into the
experiment with what amounted to an extra 1,800 trials
of practice (Chapters 6 and 9 showed that learning curves
FIGURE 14.8 Rates of mental
rotation, estimated from the
slope of the function relating
response time to the orientation
of the stimulus. (Kail, 1988.)
tended to be fit by power functions). Figure 14.9 shows the resulting data,
with the children’s learning function superimposed on the adult’s learning
function. The practice curve for the children assumes that they start with about
150 trials of prior practice, and the practice curve for the adults assumes that they
start with 1,950 trials of prior practice. However, after 3,000 trials of practice,
children are a good bit faster than beginning adults. Thus, although the rate of
information processing increases with development, this increase may have a
practice-related rather than a biological explanation.
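The single-curve account is easy to express as a power function of total (prior plus experimental) practice, as the sketch below does. The 150- and 1,950-trial head starts are the figures given above; the scale and exponent of the power function are illustrative assumptions, since the fitted coefficients are not reported here.

# One power-law learning curve for both groups, after Kail and Park (1990).
# Only the prior-practice offsets come from the text; scale and exponent
# are made-up values chosen to give plausible rates.

def rotation_rate(total_trials: float) -> float:
    """Hypothetical rotation rate (ms/degree) after total_trials of practice."""
    scale, exponent = 25.0, 0.3  # assumed power-function coefficients
    return scale * total_trials ** -exponent

CHILD_PRIOR = 150    # imputed prior practice for 11-year-olds
ADULT_PRIOR = 1950   # imputed prior practice for adults (150 + 1,800)

for trials in (0, 1000, 2000, 3000):
    child = rotation_rate(CHILD_PRIOR + trials)
    adult = rotation_rate(ADULT_PRIOR + trials)
    print(f"after {trials:4d} experimental trials:"
          f" child {child:.2f}, adult {adult:.2f} ms/degree")

On any such single curve, children who complete 3,000 experimental trials overtake the rate that adults showed at the start of the experiment, which is the pattern described above.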
Qualitative and quantitative developmental changes take place in cognitive
development because of increases both in working-memory capacity and in
rate of information processing.
Increased Knowledge
Chi (1978) demonstrated that developmental differences may be knowledge
related. Her domain of demonstration was memory. Not surprisingly, children do
worse than adults on almost every memory task. Is their poorer performance because
FIGURE 14.9 Data from Kail and Park (1990): Children and adults are on the same learning
curve, but adults are advanced 1,800 trials.
their memories have less capacity or is it because
they know less about what they are being asked to remember?
To address this question, Chi compared the
memory performance of 10-year-olds with that of
adults on two tasks—a standard digit-span task and
a chess memory task (see the discussion of these
tasks in Chapters 6 and 9). The 10-year-olds were
skilled chess players, whereas the adults were novices
at chess. The chess task was the one discussed in
Chapter 9 on page 258—a chessboard was presented
for 10 s and then withdrawn, and participants were
then asked to reproduce the chess pattern.
Figure 14.10 illustrates the number of chess
pieces recalled by children and adults. It also contrasts
these results with the number of digits recalled
in the digit-span task. As Chi predicted, the adults
were better on the digit-span task, but the children were better on the chess
task. The children’s superior chess performance was attributed to their greater
knowledge of chess. The adults’ superior digit performance was due to their
greater familiarity with digits—the dramatic digit-span performance of participant
S. F. in Chapter 9 shows just how much digit knowledge can lead to
improved memory performance.
The novice-expert contrasts in Chapter 9 are often used to explain developmental
phenomena. We saw that a great deal of experience in a domain is
required if a person is to become an expert. Chi’s argument is that children,
because of their lack of knowledge, are near universal novices, but they can become
more expert than adults through concentrated experience in one domain,
such as chess.
The Chi experiment contrasted child experts with adult novices. Schneider,
Körkel, and Weinert (1988) looked at the effect of expertise at various age
levels. They categorized German schoolchildren as either experts or novices
with respect to soccer. They did so separately for grade levels 3, 5, and 7. The
students at each grade level were asked to recall a story about soccer. Table 14.1
illustrates the amount of recall displayed as a function of grade level and
expertise. The effect of expertise was much greater than that of grade level. On
a recognition test, there was no effect of grade level, only an effect of expertise.
They also classified each group of participants into high-ability
and low-ability participants on the basis of their performance
on intelligence tests. Although such tests generally predict
memory for stories, Schneider et al. found no effect of general
ability level, only of knowledge for soccer. They argue that highability
students are just those who know a lot about a lot of
domains and consequently generally do well on memory tests.
However, when tested on a story about a specific domain such
as soccer, a high-ability student who knows nothing about that
domain will do worse than a low-ability student who knows a
lot about the domain.
FIGURE 14.10 Number of chess
pieces and number of digits
recalled by children versus
adults. (From Chi, 1978.)
TABLE 14.1
Mean Percentages of Idea Units Recalled as a Function of Grade and Expertise

Grade   Soccer Experts   Soccer Novices
3       54               32
5       52               33
7       61               42

From Körkel (1987).
In addition to lack of relevant knowledge, children have difficulty on memory
tasks because they do not know the strategies that lead to improved memory. The
clearest case concerns rehearsal. If you were asked to dial a novel seven-digit telephone
number, I would hope that you would rehearse it until you were confident
that you had it memorized or until you had dialed the number. It would not
occur to young children that they should rehearse the number. In one study comparing
5-year-olds with 10-year-olds, Keeney, Cannizzo, and Flavell (1967) found
that 10-year-olds almost always verbally rehearsed a set of objects to be remembered,
whereas 5-year-olds seldom did. Young children’s performance often improves
if they are instructed to follow a verbal rehearsal strategy, although very
young children are simply unable to execute such a rehearsal strategy.
Chapter 6 emphasized the importance of elaborative strategies for good
memory performance. Particularly for long-term retention, elaboration appears
to be much more effective than rote rehearsal. There also appear to be sharp
developmental trends with respect to the use of elaborative encoding strategies.
For instance, Paris and Lindauer (1976) looked at the elaborations that children
use to relate two paired-associates nouns such as lady and broom. Older children
are more likely to generate interactive sentences such as The lady flew
on the broom than static sentences such as The lady had a broom. Such interactive
sentences will lead to better memory performance. Young children are also
poorer at drawing the inferences that improve memory for a story (Stein &
Trabasso, 1981).
Younger children often do worse on tasks than do older children, because they
have less relevant knowledge and poorer strategies.
Cognition and Aging
Changes in cognition do not cease when we reach adulthood. As we get older,
we continue to learn more things, but human cognitive ability does not uniformly
increase with added years, as we might expect if intelligence were only a
matter of what one knows. Figure 14.11 shows data compiled by Salthouse
(1992) on two components of the Wechsler Adult Intelligence Scale-Revised
(WAIS-R). One component deals with verbal intelligence, which includes elements
such as vocabulary and language comprehension. As you can see, this
component maintains itself quite constantly through the years. In contrast, the
performance component, which includes abilities such as reasoning and problem
solving, decreases dramatically.
The importance of these declines in basic measures of cognitive ability can
be easily exaggerated. Such tests are typically given rapidly, and older adults do
better on slower tests. Additionally, such tests tend to be like school tests, and
young adults have had more recent experience with such tests. When it comes
to relevant job-related behavior, older adults often do better than younger
adults (e.g., Perlmutter, Kaplan, & Nyquist, 1990), owing both to their greater
accumulation of knowledge and to a more mature approach to job demands.
There is also evidence that previous generations did not do as well on tests even
when they were young. This is the so-called “Flynn effect”—IQ scores appear to
have risen about 3 points per decade (Flynn, 1987). The comparisons in Figure
14.11 are not only of people of different ages but also of people who grew up in
different periods. Some of the apparent decline in the figure might be due to
differences among generations (education, nutrition, etc.) and not age-related
factors.
Although non-age-related factors may explain some of the decline shown in
Figure 14.11, there are substantial age-related declines in brain function. Brain
cells gradually die. Some areas are particularly susceptible to cell death. The
hippocampus, which is particularly important to memory (see Chapter 7),
loses about 5% of its cells every decade (Selkoe, 1992). Other cells, though they
might not die, have been observed to shrink and atrophy. On the other hand,
there is some evidence for compensatory growth: Cells remaining in the
hippocampus will grow to compensate for the age-related deaths of their neighbors.
There is also increasing evidence for the birth of new neurons, particularly
in the region of the hippocampus (E. Gould & Gross, 2002). Moreover, the
number of new neurons seems to be very much related to the richness of a person’s
experience. Although these new neurons are few in number compared
with the number lost, they may be very valuable because new neurons are more
plastic and may be critical to encoding new experiences.
Although there are age-related neural losses, they may be relatively minor
in most intellectually active adults. The real problem concerns the intellectual
deficits associated with various brain-related disorders.

FIGURE 14.11 Mean verbal and performance IQs from the WAIS-R standardization sample as a function of age. (From Salthouse, 1992. Reprinted by permission from LEA, Inc.)

The most common of these disorders is Alzheimer’s disease, which is associated with substantial
impairment of brain function, particularly in the
temporal region including the hippocampus. Many
of these diseases progress slowly, and some of the
reason for age-related deficits in tests such as that of
Figure 14.11 may be due to the fact that some of the
older participants are suffering from the early stages
of such diseases. However, even when health factors
are taken into account and when the performance
of the same participants is tracked in longitudinal
studies (so there is not a generational confound),
there is evidence for age-related intellectual decline,
although it may not become significant until after
age 60 (Schaie, 1996).
As we get older, there is a race going on between
growth in knowledge and loss of neural function.
People in many professions (artists, scientists, philosophers)
tend to produce their best work in their mid-thirties.
Figure 14.12 shows some interesting data
from Lehman (1953). He examined the works of 182
famous deceased philosophers who collectively wrote
some 1,785 books. Figure 14.12 plots the probability that a book was considered
that philosopher’s best book as a function of the age at which it was
written. These philosophers remained prolific, publishing many books in their
seventies. However, as Figure 14.12 displays, a book written in this decade is
unlikely to be considered a philosopher’s best.1 Lehman reviewed data from
a number of fields consistent with the hypothesis that the thirties tend to be
the time of peak intellectual performance. However, as Figure 14.12 shows,
people often maintain relatively high intellectual performance into their forties
and fifties.
The evidence for an age-related correlation between brain function and
cognition makes it clear that there is a contribution of biology to intelligence
that knowledge cannot always overcome. Salthouse (1992) argued that, in
information-processing terms, people lose their ability to hold information in
working memory with age. He contrasted participants of different ages on the
reasoning problems presented in Figure 14.13. These problems differ in the
number of premises that need to be combined to come to a particular solution.
Figure 14.13 shows how people at various ages perform in these tasks. As can be
seen, people’s ability to solve these problems generally declines with the number
of premises that need to be combined. However, this drop-off is much
steeper for older adults. Salthouse argued that older adults are slower than
younger adults in information processing, which inhibits their ability to maintain
information in working memory.
FIGURE 14.12 Probability that a particular book will become a philosopher’s best as a function of the age at which the philosopher wrote the book. (Adapted from Lehman, 1953.)

1 It is important to note that this graph denotes the probability of a specific book being the best, and so the outcome is not an artifact of the number of books written during a decade.
FIGURE 14.13 Illustration of integrative reasoning trials hypothesized to vary in working-memory demands (top), and mean performance of adults in their twenties, forties, and sixties with each trial type (bottom). Example trials:

• Q and R do the OPPOSITE. If Q INCREASES, what will happen to R?
• D and E do the OPPOSITE. C and D do the SAME. If C INCREASES, what will happen to E?
• R and S do the SAME. Q and R do the OPPOSITE. S and T do the OPPOSITE. If Q INCREASES, what will happen to T?
• U and V do the OPPOSITE. W and X do the SAME. T and U do the SAME. V and W do the OPPOSITE. If T INCREASES, what will happen to X?
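The premise-combination demand of these trials is easy to make concrete. Below is a minimal sketch (not from Salthouse’s study; the function and representation are my own) that chains SAME/OPPOSITE premises by multiplying signs along the path from the queried variable to the target, which is essentially what a participant must do while holding all the premises in working memory.

def propagate(premises, start, target):
    """Return 'INCREASES' or 'DECREASES' for target when start increases."""
    graph = {}
    for a, rel, b in premises:
        sign = 1 if rel == "SAME" else -1  # OPPOSITE flips the direction
        graph.setdefault(a, []).append((b, sign))
        graph.setdefault(b, []).append((a, sign))
    stack, seen = [(start, 1)], set()
    while stack:
        node, sign = stack.pop()
        if node == target:
            return "INCREASES" if sign == 1 else "DECREASES"
        if node in seen:
            continue
        seen.add(node)
        for nxt, edge_sign in graph.get(node, []):
            stack.append((nxt, sign * edge_sign))
    return "UNKNOWN"

# The four-premise trial from Figure 14.13:
premises = [("U", "OPPOSITE", "V"), ("W", "SAME", "X"),
            ("T", "SAME", "U"), ("V", "OPPOSITE", "W")]
print(propagate(premises, "T", "X"))  # -> INCREASES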
Increased knowledge and maturity sometimes compensate for age-related
declines in rates of information processing.
Summary for Cognitive Development
With respect to the nature-versus-nurture issue, the developmental data paint a
mixed picture. A person’s brain is probably at its best physically in the early twenties,
and intellectual capacity tends to follow brain function. The relation seems
particularly strong in the early years of childhood. However, we saw evidence that
practice could overcome age-related differences in speed (Figure 14.9), and knowledge
could be a more dominant factor than age (Figure 14.10 and Table 14.1).
Additionally, peak intellectual output appears to occur later than a person’s
twenties (Figure 14.12), indicating the need for accumulated
knowledge. As discussed in Chapter 9, truly exceptional performance in a field
tends to require at least 10 years of experience in that field.
•Psychometric Studies of Cognition
We now turn from considering how cognition varies as a function of age to
considering how cognition varies within a population of a fixed age. All this
research has basically the same character. It entails measuring the performances
of various people on a number of tasks and then looking at the way in which
these performance measures correlate across different tests. Such tests are
referred to as psychometric tests. This research has established that there is not
a single dimension of “intelligence” on which people vary but rather that individual
differences in cognition are much more complex. We will first examine
research on intelligence tests.
Intelligence Tests
Research on intelligence testing has had a much longer sustained intellectual
history than cognitive psychology. In 1904, the Minister of Public Instruction in
Paris named a commission charged with identifying children in need of remedial
education. Alfred Binet set about developing a test that would objectively
identify students having intellectual difficulty. In 1916, Lewis Terman adapted
Binet’s test for use with American students. His efforts led to the development
of the Stanford-Binet, a major general intelligence test in use in America today
(Terman & Merrill, 1973). The other major intelligence test used in America is
the Wechsler, which has separate scales for children and adults. These tests include
measures of digit span, vocabulary, analogical reasoning, spatial judgment,
and arithmetic. A typical question for adults on the Stanford-Binet is,
“Which direction would you have to face so your right hand would be to the
north?” A great deal of effort goes into selecting test items that will predict
scholastic performance.
Both of these tests produce measures that are called intelligence quotients
(IQs). The original definition of IQ relates mental age to chronological age. The
test establishes one’s mental age. If a child can solve problems on the test that
the average 8-year-old can solve, then the child has a mental age of 8 independent
of chronological age. IQ is defined as the ratio of mental age to chronological
age multiplied by 100 or
IQ = 100 × MA/CA
where MA is mental age and CA is chronological age. Thus, if a child’s mental
age were 6 and chronological age were 5, the IQ would be 100 × 6/5 = 120.
This definition of IQ proved unsuitable for a number of reasons. It cannot
extend to measurement of adult intelligence, because performance on intelligence
tests starts to level off in the late teens and declines in later years. To deal
with such difficulties, the common way of defining IQ now is in terms of deviation
scores. The mean score for a person’s age group is subtracted from that person’s
raw score, and then this difference is transformed into a measure that will
vary around 100, roughly as the earlier IQ scores would. The precise definition
is expressed as

IQ = 100 + 15 × (score − mean) / standard deviation
where standard deviation is a measure of the variability of the scores. IQs so
measured tend to be distributed according to a normal distribution. Figure 14.14
shows such a normal distribution of intelligence scores and the percentage of
people who have scores in various ranges.
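For concreteness, here is a small sketch of the two definitions just given. The numbers in the example calls are illustrative, not values from any published norming sample.

def ratio_iq(mental_age, chronological_age):
    """Original Binet-style ratio IQ: 100 * MA / CA."""
    return 100 * mental_age / chronological_age

def deviation_iq(raw_score, group_mean, group_sd):
    """Modern deviation IQ: centered at 100, one standard deviation = 15 points."""
    return 100 + 15 * (raw_score - group_mean) / group_sd

print(ratio_iq(6, 5))            # -> 120.0, the example from the text
print(deviation_iq(60, 50, 10))  # -> 115.0, one SD above the age-group mean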
Whereas the Stanford-Binet and the Wechsler are general intelligence
tests, many others were developed to test specialized abilities, such as spatial
ability. These tests partly owe their continued use in the United States to the
fact that they do predict performance in school with some accuracy, which
was one of Binet’s original goals. However, their use for this purpose is
controversial. In particular, because such tests can be used to determine
who can have access to what educational opportunities, there is a great deal
of concern that they should be constructed so as to prevent biases against
certain cultural groups. Immigrants often do poorly on tests of intelligence
because of cultural biases on the tests. For instance, immigrant Italians of
less than a century ago scored an average of 87 on IQ tests (Sarason &
Doris, 1979), whereas today their descendants have slightly above average IQs
(Ceci, 1991).
FIGURE 14.14 A normal distribution of IQ measures. (The marked ranges: about 33% of scores fall in each of the intervals 85–100 and 100–115, about 11% in each of 70–85 and 115–130, and about 2% below 70 and above 130.)
The very concept of intelligence is culturally relative. What one culture
values as intelligent another culture will not. For instance, the Kpelle, an
African culture, think that the way in which Westerners sort instances into
categories (for instance, sorting apples and oranges into the same category—a
basis for some items in intelligence tests) is foolish (Cole, Gay, Glick, & Sharp,
1971). Robert Sternberg (personal communication) notes that some cultures
do not even have a word for intelligence. Still, the fact remains that intelligence
tests do predict performance in our (Western) schools. Whether they
are doing a valuable service in assessing students for schools or are simply
enforcing arbitrary cultural beliefs about what is to be valued is a difficult
question.
Related to the issue of the fairness of intelligence tests is whether they
measure innate endowment or acquired ability (the nature-versus-nurture
issue again). Potentially definitive data would seem to come from studies of
identical twins reared apart. Sometimes such twins have been adopted into different
families—they have identical genetic endowment but different environmental
experiences. The research on this topic is controversial (Kamin, 1974),
but analyses (Bouchard, 1983; Bouchard & McGue, 1981) indicate that identical
twins raised apart have IQs much more similar to each other than do
nonidentical fraternal twins raised in the same family. This evidence seems
to indicate the existence of a strong innate component of IQ. Yet drawing the
conclusion that intelligence is largely innate would be a mistake. Intelligence and
IQ are by no means the same thing. Because of the goals of intelligence tests,
they must predict success across a range of environments, particularly academic.
Thus they must discount the contributions of specific experiences to
intelligence. As noted in Chapter 9, for instance, chess masters tend not to have
particularly high IQs. This tendency is more a comment on the IQ test than on
chess masters. If an IQ test focused on chess experience, it would have little
success in predicting academic success generally. Thus, intelligence tests try
to measure raw abilities and general knowledge that are reasonably expected
of everyone in a culture. However, as we saw in Chapter 9, excellence in any
specific domain depends on knowledge and experience that are not general in
the culture.
Another interesting demonstration of this lack of correlation between
expertise and IQ was performed by Ceci and Liker (1986). These researchers
looked at the ability of avid horse-racing fans to handicap races. They
found that handicapping skill is related to the development of a complex
interactive model of horse racing but that there was no relation between this
skill and IQ.
Although specific experience is clearly important to success in any field, the
remarkable fact is that these intelligence tests are able to predict success in certain
endeavors. They predict with modest accuracy both performance in school
and general success in life (or at least in Western societies). What is it about the
mind that they are measuring? Much of the theoretical work in the field has
been concerned with trying to answer this question. To understand how this
question has been pursued, one must understand a little about a major method
of the field, factor analysis.
Implications
Does IQ determine success in life?

IQ appears to have a strong predictive relationship to many socially relevant factors besides academic performance. The American Psychological Association report Intelligence: Knowns and Unknowns (Neisser et al., 1996) states that IQ accounts for about one-fifth of the variance (positive correlations in the range of .3 to .5) in factors like job performance and income. It has an even stronger relationship to socioeconomic status. There are weaker negative correlations with antisocial measures like criminal activity.

There is a natural tendency to infer from this that IQ is directly related to being a successful member of our society, but there are reasons to question a direct relationship. Access to various educational opportunities and to some jobs depends on test scores. Access to other professions depends on completing various educational programs, the access to which is partly determined by test scores. Given the strong relationship between IQ and these test scores, we would expect that higher-IQ members of our society would get better training and professional opportunities. Lower-scoring members of our society have more limited opportunities and often are sorted by their test scores into environments where there is more antisocial behavior.

Another confounding factor is that success in society is at every point determined by judgments of other members of the society. For instance, most studies of job performance use measures like ratings of supervisors rather than actual measures of job performance. Promotions are often largely dependent on judgments of superiors. Also, legal resolutions such as sentencing decisions in criminal cases have strong judgmental aspects to them. It could be that IQ more strongly affects these social judgments than the actual performances being judged, such as how well one does one’s job or how bad a particular activity was. Individuals in positions of power, such as judges and supervisors, tend to have high IQs. Thus, there is the possibility that some of the success associated with high IQ is an in-group effect where high-IQ people favor people who are like them.

Standard intelligence tests measure general factors that predict success
in school.
Factor Analysis
The general intelligence tests contain a number of subtests that measure individual
abilities. As already noted, many specialized tests also are available for
measuring particular abilities. The basic observation is that people who do well
on one test or subtest tend to do well on another test or subtest. The degree to
which people perform comparably on two subtests is measured by a correlation
coefficient. If all the same people who did well on one test did just as well on
another, the correlation between the two tests would be 1. If all the people who
did well on one test did proportionately badly on another, the correlation coefficient
would be −1. If there were no relation between how people did on one
test and how they did on another test, the correlation coefficient would be zero.
Typical correlations between tests are positive, but not 1, indicating a less than
perfect relation between performance on one test and on another.
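As a concrete illustration, the sketch below computes a correlation coefficient for two hypothetical sets of test scores (the scores are invented; statistics.correlation requires Python 3.10 or later).

from statistics import correlation  # requires Python 3.10+

test_a = [52, 61, 47, 70, 58, 66]  # six hypothetical people on test A
test_b = [55, 64, 50, 68, 57, 71]  # the same six people on test B

# People who score high on A tend to score high on B, so r is close to +1.
print(round(correlation(test_a, test_b), 2))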
For example, Hunt (1985) looked at the relations among the seven tests described
in Table 14.2. Table 14.3 shows the intercorrelations among these test
scores. As can be seen, some pairs of tests are more correlated than others. For
instance, there is a relatively high (.67) correlation between reading comprehension
and vocabulary but a relatively low (.14) correlation between reading
comprehension and spatial reasoning. Factor analysis is a way of trying to
make sense of these correlational patterns. The basic idea is to try to arrange
these tests in a multidimensional space such that the distances among the tests
correspond to their correlation. Tests close together will have high correlations
and so measure the same thing. Figure 14.15 shows an attempt to organize the
tests in Table 14.2 into a two-dimensional area. The reader can confirm that
TABLE 14.2 Description of Some of the Tests on the Washington Pre-College Test Battery

1. Reading comprehension: Answer questions about a paragraph
2. Vocabulary: Choose synonyms for a word
3. Grammar: Identify correct and poor usage
4. Quantitative skills: Read word problems and decide whether the problem can be solved
5. Mechanical reasoning: Examine a diagram and answer questions about it; requires knowledge of physical and mechanical principles
6. Spatial reasoning: Indicate how two-dimensional figures will appear if they are folded through a third dimension
7. Mathematics achievement: A test of high school algebra

From Hunt (1985).
TABLE 14.3 Intercorrelations Between Results of the Tests Listed in Table 14.2

Test No.    1     2     3     4     5     6     7
1         1.00   .67   .63   .40   .33   .14   .34
2               1.00   .59   .29   .46   .19   .31
3                     1.00   .41   .34   .20   .46
4                           1.00   .39   .46   .62
5                                 1.00   .47   .39
6                                       1.00   .46
7                                             1.00

From Hunt (1985).
the closer the tests are in this space, the higher
their correlation in Table 14.3.
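The arrangement in Figure 14.15 can be approximated computationally. The following sketch (assuming, on my part, that multidimensional scaling is an adequate stand-in for whatever method produced the figure) embeds the seven tests of Table 14.3 in two dimensions so that more highly correlated tests lie closer together, using 1 − r as the distance between tests.

import numpy as np
from sklearn.manifold import MDS

# Intercorrelations from Table 14.3 (symmetric, 1s on the diagonal).
r = np.array([
    [1.00, 0.67, 0.63, 0.40, 0.33, 0.14, 0.34],
    [0.67, 1.00, 0.59, 0.29, 0.46, 0.19, 0.31],
    [0.63, 0.59, 1.00, 0.41, 0.34, 0.20, 0.46],
    [0.40, 0.29, 0.41, 1.00, 0.39, 0.46, 0.62],
    [0.33, 0.46, 0.34, 0.39, 1.00, 0.47, 0.39],
    [0.14, 0.19, 0.20, 0.46, 0.47, 1.00, 0.46],
    [0.34, 0.31, 0.46, 0.62, 0.39, 0.46, 1.00],
])
names = ["Reading", "Vocabulary", "Grammar", "Quantitative",
         "Mechanical", "Spatial", "Math achievement"]

# Treat 1 - r as a distance: highly correlated tests end up close together.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(1 - r)
for name, (x, y) in zip(names, coords):
    print(f"{name:18s} x = {x:+.2f}, y = {y:+.2f}")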
An interesting question is how to make sense
of this space. As we go from the bottom to the
top, the tests become increasingly symbolic and
linguistic. We might refer to this dimension as a
linguistic factor. Second, we might argue that, as
we go from the left to the right, the tests become
more computational in character. We might consider
this dimension a reasoning factor. High
correlations can be explained in terms of students
having similar values of these factors.
Thus, there is a high correlation between quantitative
skills and mathematics achievement because
they both have an intermediate degree of
linguistic involvement and require substantial
reasoning. People who have strong reasoning
ability and average or better verbal ability will
tend to do well on these tests.
Factor analysis is basically an effort to go from a set of intercorrelations like
those in Table 14.3 to a small set of factors or dimensions that explain those
intercorrelations. There has been considerable debate about what the underlying
factors are. Perhaps you can see other ways to explain the correlations in
Table 14.3. For instance, you might argue that a linguistic factor links tests 1
through 3, a reasoning factor links tests 4, 5, and 7, and there is a separate spatial
factor for test 6. Indeed, we will see that there have been many proposals
for separate linguistic, reasoning, and spatial factors, although, as shown by the
data in Table 14.3, it is a little difficult to separate the spatial and reasoning
factors.
The difficulty in interpreting such data is manifested in the wide variety of
positions that have been taken about what the underlying factors of human
intelligence are. Spearman (1904) argued that only one general factor underlies
performance across tests. He called his factor g. In contrast, Thurstone (1938)
argued that there are a number of separate factors, including verbal, spatial, and
reasoning. Guilford (1982) proposed no less than 120 distinct intellectual abilities.
Cattell (1963) proposed a distinction between fluid and crystallized intelligence;
crystallized intelligence refers to acquired knowledge, whereas fluid
intelligence refers to the ability to reason or to solve problems in novel domains.
(In Figure 14.11, fluid intelligence, not crystallized intelligence, shows the age-related
decay.) Horn (1968), elaborating on Cattell’s theory, argued that there is
a spatial intelligence that can be separated from fluid intelligence. Table 14.3 can
be interpreted in terms of the Horn-Cattell theory, where crystallized intelligence
maps into the linguistic factor (tests 1 to 3), fluid intelligence into the reasoning
factor (tests 4, 5, and 7), and spatial intelligence into the spatial factor
(test 6). Fluid intelligence tends to be tapped strongly in mathematical tests, but
it is probably better referred to as a reasoning ability rather than a mathematical ability. It is a bit difficult to separate the fluid and spatial intelligences in factor-analytic studies, but it appears possible (Horn & Stankov, 1982).

FIGURE 14.15 A two-dimensional representation of the tests in Table 14.2. The distance between points decreases with increases in the intercorrelations in Table 14.3.
Although it is hard to draw any firm conclusions about what the real factors
are, it seems clear that there is some differentiation in human intelligence as measured
by intelligence tests. Probably, the Horn-Cattell theory or the Thurstone
theory offers the best analysis, producing what we will call a verbal factor, a spatial
factor, and a reasoning factor. The rest of this chapter will provide further
evidence for the division of the human intellect into these three abilities. This
conclusion is significant because it indicates that there is some specialization in
achieving human cognitive function.
In a survey of virtually all data sets, Carroll (1993) proposed a theory of
intelligence that combines the Horn-Cattell and Thurstone perspectives. He
proposed what he called a three-strata theory. At the lowest stratum are specific
abilities such as the ability to be a physicist. Carroll regarded such specific abilities
as largely not inheritable. At the next stratum are broader abilities such as the
verbal factor (crystallized intelligence), the reasoning factor (fluid intelligence),
and the spatial factor. Finally, Carroll noted that these factors tend to correlate
together to define something like Spearman’s g at the highest stratum.
In the past few decades, there has been considerable interest in the way in
which these measures of individual differences relate to the kinds of theories of
information processing that are found in cognitive psychology. For instance,
how do participants with high spatial abilities differ from those with low spatial
abilities in the processes entailed in the spatial imagery tasks discussed in
Chapter 4? Makers of intelligence tests have tended to ignore such questions
because their major goal is to predict scholastic performance. We will look at
some information-processing studies that try to understand the reasoning factor,
the verbal factor, and the spatial factor.
Factor-analytic methods indicate that a reasoning ability, a verbal ability,
and a spatial ability underlie performance on various intelligence tests.
Reasoning Ability
Typical tests used to measure reasoning include mathematical problems, analogy problems, series extrapolation problems, deductive syllogisms, and problem-solving tasks. These tasks are the kinds analyzed in great detail in Chapters 8 through 10. In the context of this book, such abilities might better be called problem-solving abilities. Most of the research with psychometric tests has focused only on whether a person gets a question right or not. In contrast, information-processing analyses try to examine the steps by which a person decides on an answer to such a question and the time necessary to perform each step.
The research of Sternberg (1977; Sternberg & Gardner, 1983) is an attempt to
connect the psychometric research tradition with the information-processing
tradition. He analyzed how people process a wide variety of reasoning problems.
Figure 14.16 illustrates one of his analogy problems. Participants were
asked to solve the analogy “A is to B as C is to D1 or D2?” Sternberg analyzed the
process of making such analogies into a number of stages. Two critical stages in
his analysis are called reasoning and comparison. Reasoning requires finding
each feature that changes between A and B and applying it to C. In Figure 14.16,
A and B differ by a change in costume from spotted to striped. Thus, one predicts
that C will change from spotted to striped to yield D. Comparison requires
comparing the two choices, D1 and D2; D1 and D2 are compared feature by
feature until a feature is found that enables a choice. Thus, a participant may
first check that both D1 and D2 have an umbrella (which they do), then that
both wear a striped suit (which they do), and then that both have a dark hat
(which only D1 has). The dark hat feature will allow the participant to reject D2
and accept D1.
Sternberg was interested in the time that participants needed to make these
judgments. He theorized that participants would take a certain amount of time longer for each
feature in which A differed from B because this feature would have to be
changed to derive D from C. Sternberg and Gardner (1983) estimated a time of
0.28 s for each such feature. This length of time is the reasoning parameter. They
also estimated 0.60 s to compare a feature predicted of D with the features
of D1 and D2. This length of time is the comparison parameter. The values
0.28 and 0.60 are just averages; the actual values of these reasoning and comparison
times varied across participants. Sternberg and Gardner looked at the
correlations between the values of these parameters for individual participants
and the psychometric measures of participants’ reasoning abilities. They found
a correlation of .79 between the reasoning parameter and a psychometric measure
of reasoning and a correlation of .75 between the comparison parameter
and the psychometric measure. These correlations mean that participants who
are slow in reasoning or comparison do poorly in psychometric tests of reasoning.
Thus, Sternberg and Gardner were able to show that measures of speed identified in an information-processing analysis are critical to psychometric measures of intelligence.

FIGURE 14.16 An example of an analogy problem used by Sternberg and Gardner (1983). (Copyright © 1983 by the APA. Adapted by permission.)
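The additive structure of Sternberg and Gardner’s account can be expressed directly. In the sketch below, the two parameter averages (0.28 s and 0.60 s) come from the text, whereas the base time is a hypothetical constant added for illustration.

def predicted_analogy_time(features_changed, features_compared, base=1.0):
    """Predicted response time in seconds for one analogy problem."""
    REASONING_S = 0.28   # per feature that changes from A to B
    COMPARISON_S = 0.60  # per feature compared between D1 and D2
    return base + REASONING_S * features_changed + COMPARISON_S * features_compared

# Figure 14.16 example: one feature changes (spotted to striped), and three
# features (umbrella, suit, hat) are checked before D2 can be rejected.
print(predicted_analogy_time(1, 3))  # -> 3.08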
Participants who score high on reasoning ability are able to perform individual
steps of reasoning rapidly.
Verbal Ability
Probably the most robust factor to emerge from intelligence tests is the verbal
factor. There has been considerable interest in determining what processes distinguish
people with strong verbal abilities. Goldberg, Schwartz, and Stewart
(1977) compared people with high verbal ability with those with low verbal ability
with respect to the way in which they make various kinds of word judgments.
One kind of word judgment concerned simply whether pairs of words were
identical. Thus, participants would say yes to a pair such as
• bear, bear
Other participants were asked to judge whether pairs of words sounded alike.
Thus, they would say yes to a pair such as
• bare, bear
A third group of participants were asked to judge whether pairs of words were in the same category. Thus, they would say yes to a pair such as
• lion, bear
Figure 14.17 shows the difference in time taken to make these three judgments
between participants with high verbal abilities and those with low verbal abilities.
As can be seen, participants with high verbal ability enjoy only a small
advantage on the identity judgments but show much larger advantages on the
sound and meaning matches. This study and others (e.g., Hunt, Davidson, &
Lansman, 1981) have convinced researchers that a major advantage of participants
with high verbal ability is the speed with which they can go from a linguistic
stimulus to information about it—in the study depicted in Figure 14.17
participants were going from the visual word to information about its sound
and meaning. Thus, as in the Sternberg studies in the preceding subsection,
speed of processing is related to intellectual ability.
There is also evidence for a fairly strong relation between working-memory
capacity for linguistic material and verbal ability. Daneman and Carpenter
(1980) developed the following test of individual differences in working-memory capacity. Participants would read or hear a number of unrelated sentences such as
• When at last his eyes opened, there was no gleam of triumph, no shade of anger.
• The taxi turned up Michigan Avenue where they had a clear view of the lake.
After reading or hearing these sentences, participants had to recall the last
word of each sentence. They were tested on groups ranging from two to seven
such sentences. The largest group of sentences for which they could recall the
last words was defined as the reading span or listening span. College students
had spans from 2 to 5.5 sentences. It turns out that these spans are very strongly
related to comprehension scores and to tests of verbal ability. These reading
and listening spans are much more strongly related to comprehension than are measures of
simple digit span. Daneman and Carpenter argued that a larger reading and
listening span indicates the ability to store a larger part of the text during
comprehension.
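Scoring the span measure is straightforward. The sketch below (hypothetical data; the scoring rule is a simplification of the actual procedure) returns the largest set size at which a participant recalled every sentence-final word.

def reading_span(results):
    """results maps set size -> True if all sentence-final words were recalled."""
    span = 0
    for size in sorted(results):
        if results[size]:
            span = size
        else:
            break  # stop at the first set size the participant failed
    return span

print(reading_span({2: True, 3: True, 4: True, 5: False, 6: False}))  # -> 4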
People of high verbal ability are able to rapidly retrieve meanings of words
and have large working memories for verbal information.
FIGURE 14.17 Response time of participants having high verbal abilities compared with those having low verbal abilities in judging the similarity of pairs of words as a function of three types of similarity. (From Goldberg, Schwartz, & Stewart, 1977.)

Spatial Ability
Efforts have been made to relate measures of spatial ability to research on mental rotation, such as that discussed in Chapter 4. Just and Carpenter (1985)
compared participants with low spatial ability and those with high spatial ability
performing the Shepard and Metzler mental rotation tasks (see Figure 4.4).
Figure 14.18 plots the speed with which these two types of participants can rotate
figures of differing angular disparity. As can be seen, participants with low spatial
ability not only performed the task more slowly but were also more affected
by angle of disparity. Thus the rate of mental rotation is lower for participants
with low spatial ability.
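The rate of rotation can be estimated as the slope of response time against angular disparity. The sketch below fits a line to made-up data in the spirit of Figure 14.18; a lower slope corresponds to a faster rate of mental rotation.

import numpy as np

angles = np.array([0, 30, 60, 90, 120, 150, 180])             # degrees
rt_ms = np.array([1200, 1900, 2700, 3300, 4100, 4800, 5600])  # hypothetical times

slope, intercept = np.polyfit(angles, rt_ms, 1)
print(f"about {slope:.1f} ms per degree of rotation")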
Spatial ability has often been set in contrast with verbal ability. Although
some people rate high on both abilities or low on both, interest often focuses
on people who display a relative imbalance of the abilities. MacLeod, Hunt,
and Mathews (1978) found evidence that these different types of people will
solve a cognitive task differently. They looked at performance on the Clark and
Chase sentence-verification task considered in Chapter 13. Recall that, in this
task, participants are presented with sentences such as The plus is above the star
or The star is not above the plus and asked to determine whether the sentence accurately
describes the picture. Typically, participants are slower when there is a
negative such as not in the sentence and when the supposition of the sentences
mismatches the picture.
MacLeod et al. speculated, however, that there were really two groups of
participants—those who took a representation of the sentence and matched it
against a picture and those who first converted the sentence into an image of a
picture and then matched that image against the picture.

FIGURE 14.18 Mean time taken to determine that two objects have the same three-dimensional shape as a function of the angular difference in their portrayed orientations. Separate functions are plotted for participants with high spatial ability and those with low spatial ability. (From Just & Carpenter, 1985. Copyright © 1985 by the APA. Adapted by permission.)

They speculated that
the first group would be high in verbal ability, whereas
the second group would be high in spatial ability. In
fact, they did find two groups of participants. Figure 14.19
shows the judgment times of these two groups as a function
of whether the sentence was true and whether it
contained a negative. As can be seen, one group of participants
showed no effect of whether the sentence contained
a negative, whereas the other group showed a very
substantial effect. The group of participants not showing
the effect of a negative had higher scores on tests of
spatial ability than those of the other group. The group
not showing the effect was the group of participants who
compared an image formed from the sentence against
the picture. Such an image would not have a negative
in it.
Reichle, Carpenter, and Just (2000) performed an fMRI
brain-imaging study of the regions activated in these two
strategies. They explicitly instructed participants to use
either an imagery strategy or a verbal strategy to solve these
problems. The participants instructed to use the imagery
strategy were told:
Carefully read each sentence and form a mental picture of
the objects in the sentence and their arrangement. . . . After
the picture appears, compare the picture to your mental
image. (p. 268)
On the other hand, participants told to use the verbal
strategy were told:
Don’t try to form a mental image of the objects in the sentence, but instead
look at the sentence only long enough to remember it until the picture is
presented. . . . After the picture appears, decide whether or not the sentence
that you are remembering describes the picture. (p. 268)
They found that parietal regions associated with mental imagery tended to be
activated in participants who were told to use the imagery strategy (see Figure
4.1), whereas regions associated with verbal processing tended to be activated
in participants given the verbal strategy (see Figure 11.1). Interestingly,
when told to use the imagery strategy, participants who had lower imagery
ability showed greater activation in their imagery areas. Conversely, when told
to use the verbal strategy, participants with lower verbal ability tended to show
greater activation in their verbal regions. Thus, participants apparently have to
engage in more neural effort when they are required to use their less favored
strategy.
People with high spatial ability can perform elementary spatial operations
quite rapidly and often choose to solve a task spatially rather than verbally.
FIGURE 14.19 Mean time taken to judge a sentence as a function of sentence type for participants with high verbal ability compared with those with high spatial ability. (From MacLeod, Hunt, & Mathews, 1978.)
Conclusions from Psychometric Studies
A major outcome of the research relating psychometric measures to cognitive
tasks is to reinforce the distinction between verbal and spatial ability. A second
conclusion of this research is that differences in an ability (reasoning, linguistic,
or spatial) may result from differences in rates of processing and working-memory
capacities. A number of researchers (e.g., Salthouse, 1992; Just &
Carpenter, 1992) have argued that the working-memory differences may result
from differences in processing speed, in that people can maintain more information
in working memory when they can process it more rapidly.
As already mentioned, Reichle et al. (2000) suggested that more-able participants
can solve problems with less expenditure of effort. An early study
confirming this general relation was performed by Haier et al. (1988). These
researchers looked at PET recordings taken during an abstract-reasoning task.
They found that the better-performing participants showed less PET activity,
again indicating that poorer-performing participants have to work harder at the
same task. Like the information-processing work pointing to processing speed,
this finding suggests that differences in intelligence may correspond to very
basic processes. There is a tendency to see such results as favoring a nativist
view, but in fact they are neutral to the nature-versus-nurture controversy.
Some people may take longer and may need to expend more effort to solve a
problem, either because they have practiced less or because they have inherently
less efficient neural structures. We saw earlier in the chapter that, with practice,
children could become faster than adults at processes such as mental rotation.
Figure 9.1 illustrated how the activity of the brain decreases as participants
become more practiced and faster at a task.
Individual differences in general factors such as verbal, reasoning, and
spatial abilities appear to correspond to the speed and ease with which
basic cognitive processes are performed.
•Conclusions
This concludes our consideration of human intelligence (this chapter) and
human cognition (this book). A recurring theme throughout the book has been
the diversity of the components of the mind. The first chapter reviewed evidence
for different specializations in the nervous system. The early chapters reviewed
the evidence for different levels of processing as the information entered the system.
The different types of knowledge representation and the distinction between
procedural and declarative knowledge were presented. Then, we considered the
distinct status of language. Many of these distinctions have been reinforced in this
chapter on individual differences. Throughout this book, different brain regions
have been shown to be specialized to perform different functions.
A second dimension of discussion has been rate of processing. Latency data
have been the most frequently used measure of cognitive functioning in this
book. Often, error measures (the second most common dependent measure) were
shown to be merely indications of slow processing. We have seen evidence in this
chapter that individuals vary in their rate of processing, and this book has stressed that this rate can be increased with practice. Interestingly, the neuroscience evidence tends to associate faster processing with lower metabolic expenditure: the more efficient mind seems to perform its tasks faster and at less cost.

In addition to the quantitative component of speed, individual differences have a qualitative component. People can differ in where their strengths lie. They can also differ in their selection of strategies for solving problems. We saw evidence in Chapter 9 that one dimension of growing expertise is the development of more-effective strategies.

One might view the human mind as being analogous to a large corporation that consists of many interacting components. The differences among corporations are often due to the relative strengths of their components. With practice, different components tend to become more efficient at doing their tasks. Another way to achieve improvement is by strategic reorganization of parts of the corporation. However, there is more to a successful company than just the sum of its parts. These pieces have to interact together smoothly to achieve the overall goals of the organization. Some researchers (e.g., Anderson et al., 2004; Newell, 1990) have complained about the rather fragmented picture of the human mind that emerges from current research in cognitive psychology. One agenda for future research will be to understand how all the pieces fit together to achieve a human mind.

Questions for Thought

1. Chapter 12 discussed data on child language acquisition. In learning a second language, younger children initially learn less rapidly, but there is evidence that they achieve higher levels of mastery. Discuss this phenomenon from the point of view of this chapter. Consider in particular Figure 12.8.

2. Most American presidents are between the ages of 50 and 59 when they are first elected as president. The youngest elected president was Kennedy (43 when he was first elected) and the oldest was Reagan (69 when he was first elected). The 2008 presidential election featured a contest between a 47-year-old Obama and a 72-year-old McCain. What are the implications of this chapter for an ideal age for an American president?

3. J. E. Hunter and R. F. Hunter (1984) report that ability measures like IQ are better predictors of job performance than academic grades. Why might this be so? A potentially relevant fact is that the most commonly used measure of job performance is supervisor ratings.

4. The chapter reviewed a series of results indicating that higher-ability people tended to perform basic information-processing steps in less time. There is also a relationship between ability and the perceived time it takes to perform a demanding task (Fink & Neubauer, 2005). Generally, the more difficult an intellectual task we perform, the more we tend to underestimate how long it took. Higher-ability people tend to have more realistic estimates of the passage of time (i.e., they underestimate less). Why might they underestimate time less? How could this be related to the fact that they perform the task more rapidly?

Key Terms

concrete-operational stage
conservation
crystallized intelligence
factor analysis
fluid intelligence
formal-operational stage
intelligence quotient (IQ)
preoperational stage
psychometric test
sensory-motor stage
Glossary

2½-D sketch: Marr’s proposal for a visual representation that identifies
where surfaces are located in space relative to the viewer. (p. 40)
3-D model: Marr’s proposal for an object-centered representation
of a visual scene. (p. 40)
abstraction theory: A theory holding that concepts are represented
as abstract descriptions of their central tendencies. Contrast with
instance theory. (p. 140)
ACT (Adaptive Control of Thought): Anderson’s theory of how
declarative knowledge and procedural knowledge interact in complex
cognitive processes. (p. 156)
action potential: The sudden change in electric potential that
travels down the axon of a neuron. (p. 15)
activation: A state of memory traces that determines both the
speed and the probability of access to a memory trace. (p. 156)
affirmation of the consequent: The logical fallacy that one can
reason from the affirmation of the consequent of a conditional
statement to the affirmation of its antecedent: If A, then B and
B is true together can be thought (falsely) to imply A is true.
(p. 276)
AI: See artificial intelligence.
allocentric representation: A representation of the environment
according to a fixed coordinate system. Contrast with egocentric
representation. (p. 107)
amnesia: A memory deficit due to brain damage. See also anterograde
amnesia; retrograde amnesia; Korsakoff syndrome. (p. 200)
amodal hypothesis: The proposal that meaning is not represented in
a particular modality. Contrast with multimodal hypothesis. (p. 130)
amodal symbol system: The proposal that information is
represented by symbols that are not associated with a particular
modality. Contrast with perceptual symbol system. (p. 127)
analogy: The process by which a problem solver maps the solution
for one problem into a solution for another problem. (p. 218)
antecedent: The condition of a conditional statement; that is, the A
in If A, then B. (p. 275)
anterior cingulate cortex (ACC): Medial portion of the prefrontal
cortex important in control and dealing with conflict. (p. 89)
anterograde amnesia: Loss of the ability to learn new things after
an injury. Contrast with retrograde amnesia. (pp. 146, 201)
aphasia: An impairment of speech that results from a brain
injury. (p. 22)
apperceptive agnosia: A form of visual agnosia marked by the
inability to recognize simple shapes such as circles and
triangles. (p. 33)
arguments: An element of a propositional representation that
corresponds to a time, place, person, or object. (p. 123)
articulatory loop: Baddeley’s proposed system for rehearsing
verbal information. (p. 153)
artificial intelligence (AI): A field of computer science that
attempts to develop programs that will enable machines to display
intelligent behavior. (p. 2)
associative agnosia: A form of visual agnosia marked by the inability
to recognize complex objects such as an anchor, even though
the patient can recognize simple shapes and can copy drawings of
complex objects. (p. 33)
associative spreading: Facilitation in access to information when
closely related items are presented. (p. 159)
associative stage: The second of Fitts’s stages of skill acquisition,
in which the declarative representation of a skill is converted into
a procedural representation. (p. 241)
atmosphere hypothesis: The proposal by Woodworth and Sells
that, when faced with a categorical syllogism, people tend to
accept conclusions having the same quantifiers as those of the
premises. (p. 283)
attention: The allocation of cognitive resources among ongoing
processes. (p. 64)
attenuation theory: Treisman’s theory of attention, which proposes
that we weaken some incoming sensory signals on the basis of their
physical characteristics. (p. 67)
attribute identification: The problem of determining what attributes
are relevant to the formation of a hypothesis. See also rule learning.
(p. 301)
auditory sensory store: A memory system that effectively holds all
the information heard for a brief period of time. Also called echoic
memory. (p. 149)
automaticity: The ability to perform a task with little or no central
cognitive control. (p. 86)
autonomous stage: The third of Fitts’s stages of skill acquisition,
in which the performance of a skill becomes automated. (p. 241)
axon: The part of a neuron that carries information from one
region of the brain to another. (p. 14)
backup avoidance: The tendency in problem solving to avoid
operators that take one back to a state already visited. (p. 221)
backward inference: See bridging inference.
bar detector: A cell in the visual cortex that responds most to bars
in the visual field. Compare edge detector. (p. 38)
basal ganglia: Subcortical nuclei that play a critical role in the
control of motor movement and cognition. (p. 20)
Bayes’s theorem: A theorem that prescribes how to combine the
prior probability of a hypothesis with the conditional probability
of the evidence to yield the probability that the hypothesis is true.
concrete-operational stage: The third of Piaget’s four stages of
development, during which a child has systematic schemes for
thinking about the physical world. (p. 393)
conditional probability: In the context of Bayes’s theorem, the
probability that a particular piece of evidence will be found if
a hypothesis is true. (p. 301)
conditional statement: An assertion that, if an antecedent is true, then a consequent must be true.