Vol. 63, No. 2 MARCH, 1956
THE PSYCHOLOGICAL REVIEW
THE MAGICAL NUMBER SEVEN, PLUS OR MINUS TWO:
SOME LIMITS ON OUR CAPACITY FOR
PROCESSING INFORMATION¹
GEORGE A. MILLER
Harvard University
¹ This paper was first read as an Invited Address before the Eastern Psychological Association in Philadelphia on April 15, 1955. Preparation of the paper was supported by the Harvard Psycho-Acoustic Laboratory under Contract N5ori-76 between Harvard University and the Office of Naval Research, U. S. Navy (Project NR142-201, Report PNR-174). Reproduction for any purpose of the U. S. Government is permitted.

My problem is that I have been persecuted by an integer. For seven years this number has followed me around, has intruded in my most private data, and has assaulted me from the pages of our most public journals. This number assumes a variety of disguises, being sometimes a little larger and sometimes a little smaller than usual, but never changing so much as to be unrecognizable. The persistence with which this number plagues me is far more than a random accident. There is, to quote a famous senator, a design behind it, some pattern governing its appearances. Either there really is something unusual about the number or else I am suffering from delusions of persecution.

I shall begin my case history by telling you about some experiments that tested how accurately people can assign numbers to the magnitudes of various aspects of a stimulus. In the traditional language of psychology these would be called experiments in absolute judgment. Historical accident, however, has decreed that they should have another name. We now call them experiments on the capacity of people to transmit information. Since these experiments would not have been done without the appearance of information theory on the psychological scene, and since the results are analyzed in terms of the concepts of information theory, I shall have to preface my discussion with a few remarks about this theory.

INFORMATION MEASUREMENT

The "amount of information" is exactly the same concept that we have talked about for years under the name of "variance." The equations are different, but if we hold tight to the idea that anything that increases the variance also increases the amount of information we cannot go far astray.

The advantages of this new way of talking about variance are simple enough. Variance is always stated in terms of the unit of measurement—inches, pounds, volts, etc.—whereas the amount of information is a dimensionless quantity. Since the information in a discrete statistical distribution does not depend upon the unit of measurement, we can extend the concept to situations where we have no metric and we would not ordinarily think of using
the variance. And it also enables us to compare results obtained in quite different experimental situations where it would be meaningless to compare variances based on different metrics. So there are some good reasons for adopting the newer concept.

The similarity of variance and amount of information might be explained this way: When we have a large variance, we are very ignorant about what is going to happen. If we are very ignorant, then when we make the observation it gives us a lot of information. On the other hand, if the variance is very small, we know in advance how our observation must come out, so we get little information from making the observation.

If you will now imagine a communication system, you will realize that there is a great deal of variability about what goes into the system and also a great deal of variability about what comes out. The input and the output can therefore be described in terms of their variance (or their information). If it is a good communication system, however, there must be some systematic relation between what goes in and what comes out. That is to say, the output will depend upon the input, or will be correlated with the input. If we measure this correlation, then we can say how much of the output variance is attributable to the input and how much is due to random fluctuations or "noise" introduced by the system during transmission. So we see that the measure of transmitted information is simply a measure of the input-output correlation.

There are two simple rules to follow. Whenever I refer to "amount of information," you will understand "variance." And whenever I refer to "amount of transmitted information," you will understand "covariance" or "correlation."

The situation can be described graphically by two partially overlapping circles. Then the left circle can be taken to represent the variance of the input, the right circle the variance of the output, and the overlap the covariance of input and output. I shall speak of the left circle as the amount of input information, the right circle as the amount of output information, and the overlap as the amount of transmitted information.

In the experiments on absolute judgment, the observer is considered to be a communication channel. Then the left circle would represent the amount of information in the stimuli, the right circle the amount of information in his responses, and the overlap the stimulus-response correlation as measured by the amount of transmitted information. The experimental problem is to increase the amount of input information and to measure the amount of transmitted information. If the observer's absolute judgments are quite accurate, then nearly all of the input information will be transmitted and will be recoverable from his responses. If he makes errors, then the transmitted information may be considerably less than the input. We expect that, as we increase the amount of input information, the observer will begin to make more and more errors; we can test the limits of accuracy of his absolute judgments. If the human observer is a reasonable kind of communication system, then when we increase the amount of input information the transmitted information will increase at first and will eventually level off at some asymptotic value. This asymptotic value we take to be the channel capacity of the observer: it represents the greatest amount of information that he can give us about the stimulus on the basis of an absolute judgment. The channel capacity is the upper limit on the extent to which the observer can match his responses to the stimuli we give him.

Now just a brief word about the bit
and we can begin to look at some data. One bit of information is the amount of information that we need to make a decision between two equally likely alternatives. If we must decide whether a man is less than six feet tall or more than six feet tall and if we know that the chances are 50-50, then we need one bit of information. Notice that this unit of information does not refer in any way to the unit of length that we use—feet, inches, centimeters, etc. However you measure the man's height, we still need just one bit of information.

Two bits of information enable us to decide among four equally likely alternatives. Three bits of information enable us to decide among eight equally likely alternatives. Four bits of information decide among 16 alternatives, five among 32, and so on. That is to say, if there are 32 equally likely alternatives, we must make five successive binary decisions, worth one bit each, before we know which alternative is correct. So the general rule is simple: every time the number of alternatives is increased by a factor of two, one bit of information is added.

There are two ways we might increase the amount of input information. We could increase the rate at which we give information to the observer, so that the amount of information per unit time would increase. Or we could ignore the time variable completely and increase the amount of input information by increasing the number of alternative stimuli. In the absolute judgment experiment we are interested in the second alternative. We give the observer as much time as he wants to make his response; we simply increase the number of alternative stimuli among which he must discriminate and look to see where confusions begin to occur. Confusions will appear near the point that we are calling his "channel capacity."

ABSOLUTE JUDGMENTS OF UNIDIMENSIONAL STIMULI

Now let us consider what happens when we make absolute judgments of tones. Pollack (17) asked listeners to identify tones by assigning numerals to them. The tones were different with respect to frequency, and covered the range from 100 to 8000 cps in equal logarithmic steps. A tone was sounded and the listener responded by giving a numeral. After the listener had made his response he was told the correct identification of the tone.

When only two or three tones were used the listeners never confused them. With four different tones confusions were quite rare, but with five or more tones confusions were frequent. With fourteen different tones the listeners made many mistakes.

[Figure 1 plot: panel label PITCHES, 100-8000 CPS; abscissa INPUT INFORMATION; ordinate in BITS.]

FIG. 1. Data from Pollack (17, 18) on the amount of information that is transmitted by listeners who make absolute judgments of auditory pitch. As the amount of input information is increased by increasing from 2 to 14 the number of different pitches to be judged, the amount of transmitted information approaches as its upper limit a channel capacity of about 2.5 bits per judgment.

These data are plotted in Fig. 1. Along the bottom is the amount of input information in bits per stimulus. As the number of alternative tones was increased from 2 to 14, the input information increased from 1 to 3.8 bits. On the ordinate is plotted the amount of
transmitted information. The amount of transmitted information behaves in much the way we would expect a communication channel to behave; the transmitted information increases linearly up to about 2 bits and then bends off toward an asymptote at about 2.5 bits. This value, 2.5 bits, therefore, is what we are calling the channel capacity of the listener for absolute judgments of pitch.

So now we have the number 2.5 bits. What does it mean? First, note that 2.5 bits corresponds to about six equally likely alternatives. The result means that we cannot pick more than six different pitches that the listener will never confuse. Or, stated slightly differently, no matter how many alternative tones we ask him to judge, the best we can expect him to do is to assign them to about six different classes without error. Or, again, if we know that there were N alternative stimuli, then his judgment enables us to narrow down the particular stimulus to one out of N/6.

Most people are surprised that the number is as small as six. Of course, there is evidence that a musically sophisticated person with absolute pitch can identify accurately any one of 50 or 60 different pitches. Fortunately, I do not have time to discuss these remarkable exceptions. I say it is fortunate because I do not know how to explain their superior performance. So I shall stick to the more pedestrian fact that most of us can identify about one out of only five or six pitches before we begin to get confused.

It is interesting to consider that psychologists have been using seven-point rating scales for a long time, on the intuitive basis that trying to rate into finer categories does not really add much to the usefulness of the ratings. Pollack's results indicate that, at least for pitches, this intuition is fairly sound.

[Figure 2 plot: abscissa INPUT INFORMATION; level marked 2.3 BITS.]

FIG. 2. Data from Garner (7) on the channel capacity for absolute judgments of auditory loudness.

Next you can ask how reproducible this result is. Does it depend on the spacing of the tones or the various conditions of judgment? Pollack varied these conditions in a number of ways. The range of frequencies can be changed by a factor of about 20 without changing the amount of information transmitted more than a small percentage. Different groupings of the pitches decreased the transmission, but the loss was small. For example, if you can discriminate five high-pitched tones in one series and five low-pitched tones in another series, it is reasonable to expect that you could combine all ten into a single series and still tell them all apart without error. When you try it, however, it does not work. The channel capacity for pitch seems to be about six and that is the best you can do.

While we are on tones, let us look next at Garner's (7) work on loudness. Garner's data for loudness are summarized in Fig. 2. Garner went to some trouble to get the best possible spacing of his tones over the intensity range from 15 to 110 db. He used 4, 5, 6, 7, 10, and 20 different stimulus intensities. The results shown in Fig. 2 take into account the differences among subjects and the sequential influence of the immediately preceding judgment. Again we find that there seems to be a limit.
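
The bit arithmetic used so far can be checked directly. What follows is a minimal sketch in Python and is not part of the original paper; the function names `bits_for_alternatives` and `alternatives_for_bits` are hypothetical. It assumes equally likely alternatives, so that N alternatives carry log2(N) bits and T bits correspond to 2**T alternatives.

```python
import math


def bits_for_alternatives(n: int) -> float:
    """Input information, in bits, for n equally likely alternatives."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return math.log2(n)


def alternatives_for_bits(bits: float) -> float:
    """Number of equally likely alternatives equivalent to `bits` bits."""
    return 2.0 ** bits


# Alternatives mentioned in the text: doubling the count adds one bit.
for n in (2, 4, 8, 16, 32):
    print(f"{n:2d} alternatives -> {bits_for_alternatives(n):.0f} bits")

# Pollack's largest set of tones, and the pitch capacity quoted in the text.
print(f"14 tones -> {bits_for_alternatives(14):.1f} bits of input information")
print(f"2.5 bits -> about {alternatives_for_bits(2.5):.1f} alternatives")
```

Rounded, the last two lines agree with the figures in the text: fourteen tones carry about 3.8 bits, and 2.5 bits corresponds to roughly six alternatives.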
The channel capacity for absolute judgments of loudness is 2.3 bits, or about five perfectly discriminable alternatives.

Since these two studies were done in different laboratories with slightly different techniques and methods of analysis, we are not in a good position to argue whether five loudnesses is significantly different from six pitches. Probably the difference is in the right direction, and absolute judgments of pitch are slightly more accurate than absolute judgments of loudness. The important point, however, is that the two answers are of the same order of magnitude.

The experiment has also been done for taste intensities. In Fig. 3 are the results obtained by Beebe-Center, Rogers, and O'Connell (1) for absolute judgments of the concentration of salt solutions. The concentrations ranged from 0.3 to 34.7 gm. NaCl per 100 cc. tap water in equal subjective steps. They used 3, 5, 9, and 17 different concentrations. The channel capacity is 1.9 bits, which is about four distinct concentrations. Thus taste intensities seem a little less distinctive than auditory stimuli, but again the order of magnitude is not far off.

[Figure 3 plot: panel label TASTES ... CONCENTRATION; abscissa INPUT INFORMATION; level marked 1.9 BITS.]

FIG. 3. Data from Beebe-Center, Rogers, and O'Connell (1) on the channel capacity for absolute judgments of saltiness.

On the other hand, the channel capacity for judgments of visual position seems to be significantly larger. Hake and Garner (8) asked observers to interpolate visually between two scale markers. Their results are shown in Fig. 4. They did the experiment in two ways. In one version they let the observer use any number between zero and 100 to describe the position, although they presented stimuli at only 5, 10, 20, or 50 different positions. The results with this unlimited response technique are shown by the filled circles on the graph. In the other version the observers were limited in their responses to reporting just those stimulus values that were possible. That is to say, in the second version the number of different responses that the observer could make was exactly the same as the number of different stimuli that the experimenter might present. The results with this limited response technique are shown by the open circles on the graph. The two functions are so similar that it seems fair to conclude that the number of responses available to the observer had nothing to do with the channel capacity of 3.25 bits.

[Figure 4 plot: panel label POINTS ON A LINE; abscissa INPUT INFORMATION.]

FIG. 4. Data from Hake and Garner (8) on the channel capacity for absolute judgments of the position of a pointer in a linear interval.

The Hake-Garner experiment has been repeated by Coonan and Klemmer. Although they have not yet published their results, they have given me permission to say that they obtained channel capacities ranging from 3.2 bits for
very short exposures of the pointer position to 3.9 bits for longer exposures. These values are slightly higher than Hake and Garner's, so we must conclude that there are between 10 and 15 distinct positions along a linear interval. This is the largest channel capacity that has been measured for any unidimensional variable.

At the present time these four experiments on absolute judgments of simple, unidimensional stimuli are all that have appeared in the psychological journals. However, a great deal of work on other stimulus variables has not yet appeared in the journals. For example, Eriksen and Hake (6) have found that the channel capacity for judging the sizes of squares is 2.2 bits, or about five categories, under a wide range of experimental conditions. In a separate experiment Eriksen (5) found 2.8 bits for size, 3.1 bits for hue, and 2.3 bits for brightness. Geldard has measured the channel capacity for the skin by placing vibrators on the chest region. A good observer can identify about four intensities, about five durations, and about seven locations.

One of the most active groups in this area has been the Air Force Operational Applications Laboratory. Pollack has been kind enough to furnish me with the results of their measurements for several aspects of visual displays. They made measurements for area and for the curvature, length, and direction of lines. In one set of experiments they used a very short exposure of the stimulus—1/40 second—and then they repeated the measurements with a 5-second exposure. For area they got 2.6 bits with the short exposure and 2.7 bits with the long exposure. For the length of a line they got about 2.6 bits with the short exposure and about 3.0 bits with the long exposure. Direction, or angle of inclination, gave 2.8 bits for the short exposure and 3.3 bits for the long exposure. Curvature was apparently harder to judge. When the length of the arc was constant, the result at the short exposure duration was 2.2 bits, but when the length of the chord was constant, the result was only 1.6 bits. This last value is the lowest that anyone has measured to date. I should add, however, that these values are apt to be slightly too low because the data from all subjects were pooled before the transmitted information was computed.

Now let us see where we are. First, the channel capacity does seem to be a valid notion for describing human observers. Second, the channel capacities measured for these unidimensional variables range from 1.6 bits for curvature to 3.9 bits for positions in an interval. Although there is no question that the differences among the variables are real and meaningful, the more impressive fact to me is their considerable similarity. If I take the best estimates I can get of the channel capacities for all the stimulus variables I have mentioned, the mean is 2.6 bits and the standard deviation is only 0.6 bit. In terms of distinguishable alternatives, this mean corresponds to about 6.5 categories, one standard deviation includes from 4 to 10 categories, and the total range is from 3 to 15 categories. Considering the wide variety of different variables that have been studied, I find this to be a remarkably narrow range.

There seems to be some limitation built into us either by learning or by the design of our nervous systems, a limit that keeps our channel capacities in this general range. On the basis of the present evidence it seems safe to say that we possess a finite and rather small capacity for making such unidimensional judgments and that this capacity does not vary a great deal from one simple sensory attribute to another.
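
The channel capacities quoted in this section are estimates of transmitted information. As a hedged illustration that is not taken from the paper, the sketch below computes transmitted information from a table of stimulus-response counts using the standard identity T = H(stimulus) + H(response) - H(stimulus, response). The function names `entropy` and `transmitted_information` and both count tables are hypothetical, invented for the example; no data from the cited experiments are reproduced.

```python
import math
from typing import Iterable, Sequence


def entropy(probs: Iterable[float]) -> float:
    """Shannon entropy in bits; zero-probability cells contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def transmitted_information(counts: Sequence[Sequence[float]]) -> float:
    """Transmitted information in bits for a stimulus-by-response count table.

    Rows are stimuli, columns are responses; the table must be rectangular.
    """
    total = sum(sum(row) for row in counts)
    if total <= 0:
        raise ValueError("the table must contain at least one judgment")
    joint = [cell / total for row in counts for cell in row]
    stimulus = [sum(row) / total for row in counts]
    response = [sum(col) / total for col in zip(*counts)]
    return entropy(stimulus) + entropy(response) - entropy(joint)


# Hypothetical observer who never confuses four equally likely stimuli.
perfect = [
    [25, 0, 0, 0],
    [0, 25, 0, 0],
    [0, 0, 25, 0],
    [0, 0, 0, 25],
]

# Hypothetical observer who sometimes confuses neighbouring stimuli.
confused = [
    [20, 5, 0, 0],
    [5, 15, 5, 0],
    [0, 5, 15, 5],
    [0, 0, 5, 20],
]

t_perfect = transmitted_information(perfect)
t_confused = transmitted_information(confused)
print(f"perfect observer:  {t_perfect:.2f} bits")
print(f"confused observer: {t_confused:.2f} bits")
```

With no confusions all of the 2 bits of input information are transmitted; once neighbouring stimuli are confused the transmitted information falls below the input, which is the behaviour the text describes as the observer approaches his channel capacity.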
ABSOLUTE JUDGMENTS OF MULTIDIMENSIONAL STIMULI

You may have noticed that I have been careful to say that this magical number seven applies to one-dimensional judgments. Everyday experience teaches us that we can identify accurately any one of several hundred faces, any one of several thousand words, any one of several thousand objects, etc. The story certainly would not be complete if we stopped at this point. We must have some understanding of why the one-dimensional variables we judge in the laboratory give results so far out of line with what we do constantly in our behavior outside the laboratory. A possible explanation lies in the number of independently variable attributes of the stimuli that are being judged. Objects, faces, words, and the like differ from one another in many ways, whereas the simple stimuli we have considered thus far differ from one another in only one respect.

Fortunately, there are a few data on what happens when we make absolute judgments of stimuli that differ from one another in several ways. Let us look first at the results Klemmer and Frick (13) have reported for the absolute judgment of the position of a dot in a square. In Fig. 5 we see their results. Now the channel capacity seems to have increased to 4.6 bits, which means that people can identify accurately any one of 24 positions in the square.

[Figure 5 plot: panel label POINTS IN A SQUARE, NO GRID, .03 SEC. EXPOSURE; abscissa INPUT INFORMATION; level marked 4.6 BITS.]

FIG. 5. Data from Klemmer and Frick (13) on the channel capacity for absolute judgments of the position of a dot in a square.

The position of a dot in a square is clearly a two-dimensional proposition. Both its horizontal and its vertical position must be identified. Thus it seems natural to compare the 4.6-bit capacity for a square with the 3.25-bit capacity for the position of a point in an interval. The point in the square requires two judgments of the interval type. If we have a capacity of 3.25 bits for estimating intervals and we do this twice, we should get 6.5 bits as our capacity for locating points in a square. Adding the second independent dimension gives us an increase from 3.25 to 4.6, but it falls short of the perfect addition that would give 6.5 bits.

Another example is provided by Beebe-Center, Rogers, and O'Connell. When they asked people to identify both the saltiness and the sweetness of solutions containing various concentrations of salt and sucrose, they found that the channel capacity was 2.3 bits. Since the capacity for salt alone was 1.9, we might expect about 3.8 bits if the two aspects of the compound stimuli were judged independently. As with spatial locations, the second dimension adds a little to the capacity but not as much as it conceivably might.

A third example is provided by Pollack (18), who asked listeners to judge both the loudness and the pitch of pure tones. Since pitch gives 2.5 bits and loudness gives 2.3 bits, we might hope to get as much as 4.8 bits for pitch and loudness together. Pollack obtained 3.1 bits, which again indicates that the second dimension augments the channel capacity but not so much as it might.

A fourth example can be drawn from the work of Halsey and Chapanis (9) on confusions among colors of equal
luminance. Although they did not analyze their results in informational terms, they estimate that there are about 11 to 15 identifiable colors, or, in our terms, about 3.6 bits. Since these colors varied in both hue and saturation, it is probably correct to regard this as a two-dimensional judgment. If we compare this with Eriksen's 3.1 bits for hue (which is a questionable comparison to draw), we again have something less than perfect addition when a second dimension is added.

It is still a long way, however, from these two-dimensional examples to the multidimensional stimuli provided by faces, words, etc. To fill this gap we have only one experiment, an auditory study done by Pollack and Ficks (19). They managed to get six different acoustic variables that they could change: frequency, intensity, rate of interruption, on-time fraction, total duration, and spatial location. Each one of these six variables could assume any one of five different values, so altogether there were 5^6, or 15,625 different tones that they could present. The listeners made a separate rating for each one of these six dimensions. Under these conditions the transmitted information was 7.2 bits, which corresponds to about 150 different categories that could be absolutely identified without error. Now we are beginning to get up into the range that ordinary experience would lead us to expect.

Suppose that we plot these data, fragmentary as they are, and make a guess about how the channel capacity changes with the dimensionality of the stimuli. The result is given in Fig. 6. In a moment of considerable daring I sketched the dotted line to indicate roughly the trend that the data seemed to be taking.

[Figure 6 plot: abscissa NUMBER OF VARIABLE ASPECTS.]

FIG. 6. The general form of the relation between channel capacity and the number of independently variable attributes of the stimuli.

Clearly, the addition of independently variable attributes to the stimulus increases the channel capacity, but at a decreasing rate. It is interesting to note that the channel capacity is increased even when the several variables are not independent. Eriksen (5) reports that, when size, brightness, and hue all vary together in perfect correlation, the transmitted information is 4.1 bits as compared with an average of about 2.7 bits when these attributes are varied one at a time. By confounding three attributes, Eriksen increased the dimensionality of the input without increasing the amount of input information; the result was an increase in channel capacity of about the amount that the dotted function in Fig. 6 would lead us to expect.

The point seems to be that, as we add more variables to the display, we increase the total capacity, but we decrease the accuracy for any particular variable. In other words, we can make relatively crude judgments of several things simultaneously.

We might argue that in the course of evolution those organisms were most successful that were responsive to the widest range of stimulus energies in their environment. In order to survive in a constantly fluctuating world, it was better to have a little information about a lot of things than to have a lot of information about a small segment of the
environment. If a compromise was necessary, the one we seem to have made is clearly the more adaptive.

Pollack and Ficks's results are very strongly suggestive of an argument that linguists and phoneticians have been making for some time (11). According to the linguistic analysis of the sounds of human speech, there are about eight or ten dimensions—the linguists call them distinctive features—that distinguish one phoneme from another. These distinctive features are usually binary, or at most ternary, in nature. For example, a binary distinction is made between vowels and consonants, a binary decision is made between oral and nasal consonants, a ternary decision is made among front, middle, and back phonemes, etc. This approach gives us quite a different picture of speech perception than we might otherwise obtain from our studies of the speech spectrum and of the ear's ability to discriminate relative differences among pure tones. I am personally much interested in this new approach (15), and I regret that there is not time to discuss it here.

It was probably with this linguistic theory in mind that Pollack and Ficks conducted a test on a set of tonal stimuli that varied in eight dimensions, but required only a binary decision on each dimension. With these tones they measured the transmitted information at 6.9 bits, or about 120 recognizable kinds of sounds. It is an intriguing question, as yet unexplored, whether one can go on adding dimensions indefinitely in this way.

In human speech there is clearly a limit to the number of dimensions that we use. In this instance, however, it is not known whether the limit is imposed by the nature of the perceptual machinery that must recognize the sounds or by the nature of the speech machinery that must produce them. Somebody will have to do the experiment to find out. There is a limit, however, at about eight or nine distinctive features in every language that has been studied, and so when we talk we must resort to still another trick for increasing our channel capacity. Language uses sequences of phonemes, so we make several judgments successively when we listen to words and sentences. That is to say, we use both simultaneous and successive discriminations in order to expand the rather rigid limits imposed by the inaccuracy of our absolute judgments of simple magnitudes.

These multidimensional judgments are strongly reminiscent of the abstraction experiment of Külpe (14). As you may remember, Külpe showed that observers report more accurately on an attribute for which they are set than on attributes for which they are not set. For example, Chapman (4) used three different attributes and compared the results obtained when the observers were instructed before the tachistoscopic presentation with the results obtained when they were not told until after the presentation which one of the three attributes was to be reported. When the instruction was given in advance, the judgments were more accurate. When the instruction was given afterwards, the subjects presumably had to judge all three attributes in order to report on any one of them and the accuracy was correspondingly lower. This is in complete accord with the results we have just been considering, where the accuracy of judgment on each attribute decreased as more dimensions were added. The point is probably obvious, but I shall make it anyhow, that the abstraction experiments did not demonstrate that people can judge only one attribute at a time. They merely showed what seems quite reasonable, that people are less accurate if they must judge more than one attribute simultaneously.
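
The comparisons drawn in this section between "perfect addition" and the measured capacities can be laid side by side. The sketch below is illustrative only. The bit values are those quoted in the text, while the names `Study`, `fraction_of_additive` and `categories`, and the ratio itself, are assumptions introduced here and are not measures used in the paper. For the saltiness and sweetness study the additive figure follows the text in doubling the salt-alone capacity.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    label: str
    additive_bits: float  # sum of single-dimension capacities quoted in the text
    observed_bits: float  # capacity measured for the compound judgment


TWO_DIMENSIONAL = (
    Study("dot in a square (Klemmer and Frick)", 3.25 + 3.25, 4.6),
    Study("saltiness and sweetness (Beebe-Center et al.)", 1.9 + 1.9, 2.3),
    Study("loudness and pitch (Pollack)", 2.5 + 2.3, 3.1),
)


def fraction_of_additive(study: Study) -> float:
    """Observed capacity as a fraction of the 'perfect addition' value."""
    return study.observed_bits / study.additive_bits


def categories(bits: float) -> float:
    """Equally likely categories equivalent to a capacity of `bits` bits."""
    return 2.0 ** bits


for study in TWO_DIMENSIONAL:
    print(f"{study.label}: {study.observed_bits} of {study.additive_bits:.1f} bits "
          f"({fraction_of_additive(study):.0%} of perfect addition)")

# Pollack and Ficks: input information versus transmitted information.
six_dimensions = 5 ** 6   # six variables, five values each
eight_binary = 2 ** 8     # eight variables, two values each
print(f"{six_dimensions} tones: {math.log2(six_dimensions):.1f} bits in, "
      f"7.2 bits out, about {categories(7.2):.0f} categories")
print(f"{eight_binary} tones: {math.log2(eight_binary):.1f} bits in, "
      f"6.9 bits out, about {categories(6.9):.0f} categories")
```

In each two-dimensional case the observed capacity is well short of the additive value, and in both Pollack and Ficks experiments the transmitted information is below the input information, which is the pattern the text summarizes as a gain in total capacity at the cost of accuracy on each dimension.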
SUBITIZING

I cannot leave this general area without mentioning, however briefly, the experiments conducted at Mount Holyoke College on the discrimination of number (12). In experiments by Kaufman, Lord, Reese, and Volkmann random patterns of dots were flashed on a screen for 1/5 of a second. Anywhere from 1 to more than 200 dots could appear in the pattern. The subject's task was to report how many dots there were.

The first point to note is that on patterns containing up to five or six dots the subjects simply did not make errors. The performance on these small numbers of dots was so different from the performance with more dots that it was given a special name. Below seven the subjects were said to subitize; above seven they were said to estimate. This is, as you will recognize, what we once optimistically called "the span of attention."

This discontinuity at seven is, of course, suggestive. Is this the same basic process that limits our unidimensional judgments to about seven categories? The generalization is tempting, but not sound in my opinion. The data on number estimates have not been analyzed in informational terms; but on the basis of the published data I would guess that the subjects transmitted something more than four bits of information about the number of dots. Using the same arguments as before, we would conclude that there are about 20 or 30 distinguishable categories of numerousness. This is considerably more information than we would expect to get from a unidimensional display. It is, as a matter of fact, very much like a two-dimensional display. Although the dimensionality of the random dot patterns is not entirely clear, these results are in the same range as Klemmer and Frick's for their two-dimensional display of dots in a square. Perhaps the two dimensions of numerousness are area and density. When the subject can subitize, area and density may not be the significant variables, but when the subject must estimate perhaps they are significant. In any event, the comparison is not so simple as it might seem at first thought.

This is one of the ways in which the magical number seven has persecuted me. Here we have two closely related kinds of experiments, both of which point to the significance of the number seven as a limit on our capacities. And yet when we examine the matter more closely, there seems to be a reasonable suspicion that it is nothing more than a coincidence.

THE SPAN OF IMMEDIATE MEMORY

Let me summarize the situation in this way. There is a clear and definite limit to the accuracy with which we can identify absolutely the magnitude of a unidimensional stimulus variable. I would propose to call this limit the span of absolute judgment, and I maintain that for unidimensional judgments this span is usually somewhere in the neighborhood of seven. We are not completely at the mercy of this limited span, however, because we have a variety of techniques for getting around it and increasing the accuracy of our judgments. The three most important of these devices are (a) to make relative rather than absolute judgments; or, if that is not possible, (b) to increase the number of dimensions along which the stimuli can differ; or (c) to arrange the task in such a way that we make a sequence of several absolute judgments in a row.

The study of relative judgments is one of the oldest topics in experimental psychology, and I will not pause to review it now. The second device, increasing the dimensionality, we have just considered. It seems that by adding
more dimensions and requiring crude, binary, yes-no judgments on each attribute we can extend the span of absolute judgment from seven to at least 150. Judging from our everyday behavior, the limit is probably in the thousands, if indeed there is a limit. In my opinion, we cannot go on compounding dimensions indefinitely. I suspect that there is also a span of perceptual dimensionality and that this span is somewhere in the neighborhood of ten, but I must add at once that there is no objective evidence to support this suspicion. This is a question sadly needing experimental exploration.

Concerning the third device, the use of successive judgments, I have quite a bit to say because this device introduces memory as the handmaiden of discrimination. And, since mnemonic processes are at least as complex as are perceptual processes, we can anticipate that their interactions will not be easily disentangled.

Suppose that we start by simply extending slightly the experimental procedure that we have been using. Up to this point we have presented a single stimulus and asked the observer to name it immediately thereafter. We can extend this procedure by requiring the observer to withhold his response until we have given him several stimuli in succession. At the end of the sequence of stimuli he then makes his response. We still have the same sort of input-output situation that is required for the measurement of transmitted information. But now we have passed from an experiment on absolute judgment to what is traditionally called an experiment on immediate memory.

Before we look at any data on this topic I feel I must give you a word of warning to help you avoid some obvious associations that can be confusing. Everybody knows that there is a finite span of immediate memory and that for a lot of different kinds of test materials this span is about seven items in length. I have just shown you that there is a span of absolute judgment that can distinguish about seven categories and that there is a span of attention that will encompass about six objects at a glance. What is more natural than to think that all three of these spans are different aspects of a single underlying process? And that is a fundamental mistake, as I shall be at some pains to demonstrate. This mistake is one of the malicious persecutions that the magical number seven has subjected me to.

My mistake went something like this. We have seen that the invariant feature in the span of absolute judgment is the amount of information that the observer can transmit. There is a real operational similarity between the absolute judgment experiment and the immediate memory experiment. If immediate memory is like absolute judgment, then it should follow that the invariant feature in the span of immediate memory is also the amount of information that an observer can retain. If the amount of information in the span of immediate memory is a constant, then the span should be short when the individual items contain a lot of information and the span should be long when the items contain little information. For example, decimal digits are worth 3.3 bits apiece. We can recall about seven of them, for a total of 23 bits of information. Isolated English words are worth about 10 bits apiece. If the total amount of information is to remain constant at 23 bits, then we should be able to remember only two or three words chosen at random. In this way I generated a theory about how the span of immediate memory should vary as a function of the amount of information per item in the test materials.

The measurements of memory span in the literature are suggestive on this
question, but not definitive. And so it was necessary to do the experiment to see. Hayes (10) tried it out with five different kinds of test materials: binary digits, decimal digits, letters of the alphabet, letters plus decimal digits, and with 1,000 monosyllabic words. The lists were read aloud at the rate of one item per second and the subjects had as much time as they needed to give their responses. A procedure described by Woodworth (20) was used to score the responses.

The results are shown by the filled circles in Fig. 7. Here the dotted line indicates what the span should have been if the amount of information in the span were constant. The solid curves represent the data. Hayes repeated the experiment using test vocabularies of different sizes but all containing only English monosyllables (open circles in Fig. 7). This more homogeneous test material did not change the picture significantly. With binary items the span is about nine and, although it drops to about five with monosyllabic English words, the difference is far less than the hypothesis of constant information would require.

Fig. 7. Data from Hayes (10) on the span of immediate memory plotted as a function of the amount of information per item in the test materials. [Plot not reproduced; horizontal axis: information per item in bits.]

There is nothing wrong with Hayes's experiment, because Pollack (16) repeated it much more elaborately and got essentially the same result. Pollack took pains to measure the amount of information transmitted and did not rely on the traditional procedure for scoring the responses. His results are plotted in Fig. 8. Here it is clear that the amount of information transmitted is not a constant, but increases almost linearly as the amount of information per item in the input is increased.

Fig. 8. Data from Pollack (16) on the amount of information retained after one presentation plotted as a function of the amount of information per item in the test materials. [Plot not reproduced; horizontal axis: information per item in bits.]

And so the outcome is perfectly clear. In spite of the coincidence that the magical number seven appears in both places, the span of absolute judgment and the span of immediate memory are quite different kinds of limitations that are imposed on our ability to process information. Absolute judgment is limited by the amount of information. Immediate memory is limited by the number of items. In order to capture this distinction in somewhat picturesque terms, I have fallen into the custom of distinguishing between bits of information and chunks of information. Then I can say that the number of bits of information is constant for absolute judgment and the number of chunks of information is constant for immediate memory. The span of immediate memory seems to be almost independent of the number of bits per chunk, at least over the range that has been examined to date.

The contrast of the terms bit and chunk also serves to highlight the fact that we are not very definite about what constitutes a chunk of information. For example, the memory span of five words

achieved at different rates and overlap each other during the learning process. I am simply pointing to the obvious fact that the dits and dahs are organized by learning into patterns and that as these larger chunks emerge the amount of message that the operator can remember increases correspondingly. In the terms I am proposing to use, the operator l