Vol. 63, No. 2    MARCH, 1956

THE PSYCHOLOGICAL REVIEW

THE MAGICAL NUMBER SEVEN, PLUS OR MINUS TWO: SOME LIMITS ON OUR CAPACITY FOR PROCESSING INFORMATION ¹

GEORGE A. MILLER
Harvard University

¹ This paper was first read as an Invited Address before the Eastern Psychological Association in Philadelphia on April 15, 1955. Preparation of the paper was supported by the Harvard Psycho-Acoustic Laboratory under Contract N5ori-76 between Harvard University and the Office of Naval Research, U. S. Navy (Project NR142-201, Report PNR-174). Reproduction for any purpose of the U. S. Government is permitted.

My problem is that I have been persecuted by an integer. For seven years this number has followed me around, has intruded in my most private data, and has assaulted me from the pages of our most public journals. This number assumes a variety of disguises, being sometimes a little larger and sometimes a little smaller than usual, but never changing so much as to be unrecognizable. The persistence with which this number plagues me is far more than a random accident. There is, to quote a famous senator, a design behind it, some pattern governing its appearances. Either there really is something unusual about the number or else I am suffering from delusions of persecution.

I shall begin my case history by telling you about some experiments that tested how accurately people can assign numbers to the magnitudes of various aspects of a stimulus. In the traditional language of psychology these would be called experiments in absolute judgment. Historical accident, however, has decreed that they should have another name. We now call them experiments on the capacity of people to transmit information. Since these experiments would not have been done without the appearance of information theory on the psychological scene, and since the results are analyzed in terms of the concepts of information theory, I shall have to preface my discussion with a few remarks about this theory.

INFORMATION MEASUREMENT

The "amount of information" is exactly the same concept that we have talked about for years under the name of "variance." The equations are different, but if we hold tight to the idea that anything that increases the variance also increases the amount of information we cannot go far astray.

The advantages of this new way of talking about variance are simple enough. Variance is always stated in terms of the unit of measurement (inches, pounds, volts, etc.), whereas the amount of information is a dimensionless quantity. Since the information in a discrete statistical distribution does not depend upon the unit of measurement, we can extend the concept to situations where we have no metric and we would not ordinarily think of using the variance. And it also enables us to compare results obtained in quite different experimental situations where it would be meaningless to compare variances based on different metrics. So there are some good reasons for adopting the newer concept.

The similarity of variance and amount of information might be explained this way: When we have a large variance, we are very ignorant about what is going to happen. If we are very ignorant, then when we make the observation it gives us a lot of information. On the other hand, if the variance is very small, we know in advance how our observation must come out, so we get little information from making the observation.

If you will now imagine a communication system, you will realize that there is a great deal of variability about what goes into the system and also a great deal of variability about what comes out. The input and the output can therefore be described in terms of their variance (or their information). If it is a good communication system, however, there must be some systematic relation between what goes in and what comes out. That is to say, the output will depend upon the input, or will be correlated with the input. If we measure this correlation, then we can say how much of the output variance is attributable to the input and how much is due to random fluctuations or "noise" introduced by the system during transmission. So we see that the measure of transmitted information is simply a measure of the input-output correlation.

There are two simple rules to follow. Whenever I refer to "amount of information," you will understand "variance." And whenever I refer to "amount of transmitted information," you will understand "covariance" or "correlation."

The situation can be described graphically by two partially overlapping circles. Then the left circle can be taken to represent the variance of the input, the right circle the variance of the output, and the overlap the covariance of input and output. I shall speak of the left circle as the amount of input information, the right circle as the amount of output information, and the overlap as the amount of transmitted information.

In the experiments on absolute judgment, the observer is considered to be a communication channel. Then the left circle would represent the amount of information in the stimuli, the right circle the amount of information in his responses, and the overlap the stimulus-response correlation as measured by the amount of transmitted information. The experimental problem is to increase the amount of input information and to measure the amount of transmitted information. If the observer's absolute judgments are quite accurate, then nearly all of the input information will be transmitted and will be recoverable from his responses. If he makes errors, then the transmitted information may be considerably less than the input. We expect that, as we increase the amount of input information, the observer will begin to make more and more errors; we can test the limits of accuracy of his absolute judgments. If the human observer is a reasonable kind of communication system, then when we increase the amount of input information the transmitted information will increase at first and will eventually level off at some asymptotic value. This asymptotic value we take to be the channel capacity of the observer: it represents the greatest amount of information that he can give us about the stimulus on the basis of an absolute judgment. The channel capacity is the upper limit on the extent to which the observer can match his responses to the stimuli we give him.

Now just a brief word about the bit and we can begin to look at some data. One bit of information is the amount of information that we need to make a decision between two equally likely alternatives. If we must decide whether a man is less than six feet tall or more than six feet tall and if we know that the chances are 50-50, then we need one bit of information. Notice that this unit of information does not refer in any way to the unit of length that we use (feet, inches, centimeters, etc.). However you measure the man's height, we still need just one bit of information.

Two bits of information enable us to decide among four equally likely alternatives. Three bits of information enable us to decide among eight equally likely alternatives. Four bits of information decide among 16 alternatives, five among 32, and so on. That is to say, if there are 32 equally likely alternatives, we must make five successive binary decisions, worth one bit each, before we know which alternative is correct. So the general rule is simple: every time the number of alternatives is increased by a factor of two, one bit of information is added.

There are two ways we might increase the amount of input information. We could increase the rate at which we give information to the observer, so that the amount of information per unit time would increase. Or we could ignore the time variable completely and increase the amount of input information by increasing the number of alternative stimuli. In the absolute judgment experiment we are interested in the second alternative. We give the observer as much time as he wants to make his response; we simply increase the number of alternative stimuli among which he must discriminate and look to see where confusions begin to occur. Confusions will appear near the point that we are calling his "channel capacity."

ABSOLUTE JUDGMENTS OF UNIDIMENSIONAL STIMULI

Now let us consider what happens when we make absolute judgments of tones. Pollack (17) asked listeners to identify tones by assigning numerals to them. The tones were different with respect to frequency, and covered the range from 100 to 8000 cps in equal logarithmic steps. A tone was sounded and the listener responded by giving a numeral. After the listener had made his response he was told the correct identification of the tone.

When only two or three tones were used the listeners never confused them. With four different tones confusions were quite rare, but with five or more tones confusions were frequent. With fourteen different tones the listeners made many mistakes.

These data are plotted in Fig. 1. Along the bottom is the amount of input information in bits per stimulus. As the number of alternative tones was increased from 2 to 14, the input information increased from 1 to 3.8 bits. On the ordinate is plotted the amount of transmitted information. The amount of transmitted information behaves in much the way we would expect a communication channel to behave; the transmitted information increases linearly up to about 2 bits and then bends off toward an asymptote at about 2.5 bits. This value, 2.5 bits, therefore, is what we are calling the channel capacity of the listener for absolute judgments of pitch.

FIG. 1. Data from Pollack (17, 18) on the amount of information that is transmitted by listeners who make absolute judgments of auditory pitch. As the amount of input information is increased by increasing from 2 to 14 the number of different pitches to be judged, the amount of transmitted information approaches as its upper limit a channel capacity of about 2.5 bits per judgment.

So now we have the number 2.5 bits. What does it mean? First, note that 2.5 bits corresponds to about six equally likely alternatives. The result means that we cannot pick more than six different pitches that the listener will never confuse. Or, stated slightly differently, no matter how many alternative tones we ask him to judge, the best we can expect him to do is to assign them to about six different classes without error. Or, again, if we know that there were N alternative stimuli, then his judgment enables us to narrow down the particular stimulus to one out of N/6.

Most people are surprised that the number is as small as six. Of course, there is evidence that a musically sophisticated person with absolute pitch can identify accurately any one of 50 or 60 different pitches. Fortunately, I do not have time to discuss these remarkable exceptions. I say it is fortunate because I do not know how to explain their superior performance. So I shall stick to the more pedestrian fact that most of us can identify about one out of only five or six pitches before we begin to get confused.

It is interesting to consider that psychologists have been using seven-point rating scales for a long time, on the intuitive basis that trying to rate into finer categories does not really add much to the usefulness of the ratings. Pollack's results indicate that, at least for pitches, this intuition is fairly sound.

Next you can ask how reproducible this result is. Does it depend on the spacing of the tones or the various conditions of judgment? Pollack varied these conditions in a number of ways. The range of frequencies can be changed by a factor of about 20 without changing the amount of information transmitted more than a small percentage. Different groupings of the pitches decreased the transmission, but the loss was small. For example, if you can discriminate five high-pitched tones in one series and five low-pitched tones in another series, it is reasonable to expect that you could combine all ten into a single series and still tell them all apart without error. When you try it, however, it does not work. The channel capacity for pitch seems to be about six and that is the best you can do.

While we are on tones, let us look next at Garner's (7) work on loudness. Garner's data for loudness are summarized in Fig. 2. Garner went to some trouble to get the best possible spacing of his tones over the intensity range from 15 to 110 db. He used 4, 5, 6, 7, 10, and 20 different stimulus intensities. The results shown in Fig. 2 take into account the differences among subjects and the sequential influence of the immediately preceding judgment. Again we find that there seems to be a limit. The channel capacity for absolute judgments of loudness is 2.3 bits, or about five perfectly discriminable alternatives.

FIG. 2. Data from Garner (7) on the channel capacity for absolute judgments of auditory loudness.

Since these two studies were done in different laboratories with slightly different techniques and methods of analysis, we are not in a good position to argue whether five loudnesses is significantly different from six pitches. Probably the difference is in the right direction, and absolute judgments of pitch are slightly more accurate than absolute judgments of loudness. The important point, however, is that the two answers are of the same order of magnitude.

The experiment has also been done for taste intensities. In Fig. 3 are the results obtained by Beebe-Center, Rogers, and O'Connell (1) for absolute judgments of the concentration of salt solutions. The concentrations ranged from 0.3 to 34.7 gm. NaCl per 100 cc. tap water in equal subjective steps. They used 3, 5, 9, and 17 different concentrations. The channel capacity is 1.9 bits, which is about four distinct concentrations. Thus taste intensities seem a little less distinctive than auditory stimuli, but again the order of magnitude is not far off.

FIG. 3. Data from Beebe-Center, Rogers, and O'Connell (1) on the channel capacity for absolute judgments of saltiness.

On the other hand, the channel capacity for judgments of visual position seems to be significantly larger. Hake and Garner (8) asked observers to interpolate visually between two scale markers. Their results are shown in Fig. 4. They did the experiment in two ways. In one version they let the observer use any number between zero and 100 to describe the position, although they presented stimuli at only 5, 10, 20, or 50 different positions. The results with this unlimited response technique are shown by the filled circles on the graph. In the other version the observers were limited in their responses to reporting just those stimulus values that were possible. That is to say, in the second version the number of different responses that the observer could make was exactly the same as the number of different stimuli that the experimenter might present. The results with this limited response technique are shown by the open circles on the graph. The two functions are so similar that it seems fair to conclude that the number of responses available to the observer had nothing to do with the channel capacity of 3.25 bits.

FIG. 4. Data from Hake and Garner (8) on the channel capacity for absolute judgments of the position of a pointer in a linear interval.

The Hake-Garner experiment has been repeated by Coonan and Klemmer. Although they have not yet published their results, they have given me permission to say that they obtained channel capacities ranging from 3.2 bits for very short exposures of the pointer position to 3.9 bits for longer exposures. These values are slightly higher than Hake and Garner's, so we must conclude that there are between 10 and 15 distinct positions along a linear interval. This is the largest channel capacity that has been measured for any unidimensional variable.

At the present time these four experiments on absolute judgments of simple, unidimensional stimuli are all that have appeared in the psychological journals. However, a great deal of work on other stimulus variables has not yet appeared in the journals. For example, Eriksen and Hake (6) have found that the channel capacity for judging the sizes of squares is 2.2 bits, or about five categories, under a wide range of experimental conditions. In a separate experiment Eriksen (5) found 2.8 bits for size, 3.1 bits for hue, and 2.3 bits for brightness. Geldard has measured the channel capacity for the skin by placing vibrators on the chest region. A good observer can identify about four intensities, about five durations, and about seven locations.

One of the most active groups in this area has been the Air Force Operational Applications Laboratory. Pollack has been kind enough to furnish me with the results of their measurements for several aspects of visual displays. They made measurements for area and for the curvature, length, and direction of lines. In one set of experiments they used a very short exposure of the stimulus (1/40 second) and then they repeated the measurements with a 5-second exposure. For area they got 2.6 bits with the short exposure and 2.7 bits with the long exposure. For the length of a line they got about 2.6 bits with the short exposure and about 3.0 bits with the long exposure. Direction, or angle of inclination, gave 2.8 bits for the short exposure and 3.3 bits for the long exposure. Curvature was apparently harder to judge. When the length of the arc was constant, the result at the short exposure duration was 2.2 bits, but when the length of the chord was constant, the result was only 1.6 bits. This last value is the lowest that anyone has measured to date. I should add, however, that these values are apt to be slightly too low because the data from all subjects were pooled before the transmitted information was computed.

Now let us see where we are. First, the channel capacity does seem to be a valid notion for describing human observers. Second, the channel capacities measured for these unidimensional variables range from 1.6 bits for curvature to 3.9 bits for positions in an interval. Although there is no question that the differences among the variables are real and meaningful, the more impressive fact to me is their considerable similarity. If I take the best estimates I can get of the channel capacities for all the stimulus variables I have mentioned, the mean is 2.6 bits and the standard deviation is only 0.6 bit. In terms of distinguishable alternatives, this mean corresponds to about 6.5 categories, one standard deviation includes from 4 to 10 categories, and the total range is from 3 to 15 categories. Considering the wide variety of different variables that have been studied, I find this to be a remarkably narrow range.

There seems to be some limitation built into us either by learning or by the design of our nervous systems, a limit that keeps our channel capacities in this general range. On the basis of the present evidence it seems safe to say that we possess a finite and rather small capacity for making such unidimensional judgments and that this capacity does not vary a great deal from one simple sensory attribute to another.

ABSOLUTE JUDGMENTS OF MULTIDIMENSIONAL STIMULI

You may have noticed that I have been careful to say that this magical number seven applies to one-dimensional judgments. Everyday experience teaches us that we can identify accurately any one of several hundred faces, any one of several thousand words, any one of several thousand objects, etc. The story certainly would not be complete if we stopped at this point. We must have some understanding of why the one-dimensional variables we judge in the laboratory give results so far out of line with what we do constantly in our behavior outside the laboratory. A possible explanation lies in the number of independently variable attributes of the stimuli that are being judged. Objects, faces, words, and the like differ from one another in many ways, whereas the simple stimuli we have considered thus far differ from one another in only one respect.

Fortunately, there are a few data on what happens when we make absolute judgments of stimuli that differ from one another in several ways. Let us look first at the results Klemmer and Frick (13) have reported for the absolute judgment of the position of a dot in a square. In Fig. 5 we see their results. Now the channel capacity seems to have increased to 4.6 bits, which means that people can identify accurately any one of 24 positions in the square.

FIG. 5. Data from Klemmer and Frick (13) on the channel capacity for absolute judgments of the position of a dot in a square.

The position of a dot in a square is clearly a two-dimensional proposition. Both its horizontal and its vertical position must be identified. Thus it seems natural to compare the 4.6-bit capacity for a square with the 3.25-bit capacity for the position of a point in an interval. The point in the square requires two judgments of the interval type. If we have a capacity of 3.25 bits for estimating intervals and we do this twice, we should get 6.5 bits as our capacity for locating points in a square. Adding the second independent dimension gives us an increase from 3.25 to 4.6, but it falls short of the perfect addition that would give 6.5 bits.

Another example is provided by Beebe-Center, Rogers, and O'Connell. When they asked people to identify both the saltiness and the sweetness of solutions containing various concentrations of salt and sucrose, they found that the channel capacity was 2.3 bits. Since the capacity for salt alone was 1.9, we might expect about 3.8 bits if the two aspects of the compound stimuli were judged independently. As with spatial locations, the second dimension adds a little to the capacity but not as much as it conceivably might.

A third example is provided by Pollack (18), who asked listeners to judge both the loudness and the pitch of pure tones. Since pitch gives 2.5 bits and loudness gives 2.3 bits, we might hope to get as much as 4.8 bits for pitch and loudness together. Pollack obtained 3.1 bits, which again indicates that the second dimension augments the channel capacity but not so much as it might.

A fourth example can be drawn from the work of Halsey and Chapanis (9) on confusions among colors of equal luminance. Although they did not analyze their results in informational terms, they estimate that there are about 11 to 15 identifiable colors, or, in our terms, about 3.6 bits. Since these colors varied in both hue and saturation, it is probably correct to regard this as a two-dimensional judgment. If we compare this with Eriksen's 3.1 bits for hue (which is a questionable comparison to draw), we again have something less than perfect addition when a second dimension is added.

It is still a long way, however, from these two-dimensional examples to the multidimensional stimuli provided by faces, words, etc. To fill this gap we have only one experiment, an auditory study done by Pollack and Ficks (19). They managed to get six different acoustic variables that they could change: frequency, intensity, rate of interruption, on-time fraction, total duration, and spatial location. Each one of these six variables could assume any one of five different values, so altogether there were 5^6, or 15,625 different tones that they could present. The listeners made a separate rating for each one of these six dimensions. Under these conditions the transmitted information was 7.2 bits, which corresponds to about 150 different categories that could be absolutely identified without error. Now we are beginning to get up into the range that ordinary experience would lead us to expect.

Suppose that we plot these data, fragmentary as they are, and make a guess about how the channel capacity changes with the dimensionality of the stimuli. The result is given in Fig. 6. In a moment of considerable daring I sketched the dotted line to indicate roughly the trend that the data seemed to be taking.

FIG. 6. The general form of the relation between channel capacity and the number of independently variable attributes of the stimuli.

Clearly, the addition of independently variable attributes to the stimulus increases the channel capacity, but at a decreasing rate. It is interesting to note that the channel capacity is increased even when the several variables are not independent. Eriksen (5) reports that, when size, brightness, and hue all vary together in perfect correlation, the transmitted information is 4.1 bits as compared with an average of about 2.7 bits when these attributes are varied one at a time. By confounding three attributes, Eriksen increased the dimensionality of the input without increasing the amount of input information; the result was an increase in channel capacity of about the amount that the dotted function in Fig. 6 would lead us to expect.

The point seems to be that, as we add more variables to the display, we increase the total capacity, but we decrease the accuracy for any particular variable. In other words, we can make relatively crude judgments of several things simultaneously.

We might argue that in the course of evolution those organisms were most successful that were responsive to the widest range of stimulus energies in their environment. In order to survive in a constantly fluctuating world, it was better to have a little information about a lot of things than to have a lot of information about a small segment of the environment. If a compromise was necessary, the one we seem to have made is clearly the more adaptive.

Pollack and Ficks's results are very strongly suggestive of an argument that linguists and phoneticians have been making for some time (11). According to the linguistic analysis of the sounds of human speech, there are about eight or ten dimensions (the linguists call them distinctive features) that distinguish one phoneme from another. These distinctive features are usually binary, or at most ternary, in nature. For example, a binary distinction is made between vowels and consonants, a binary decision is made between oral and nasal consonants, a ternary decision is made among front, middle, and back phonemes, etc. This approach gives us quite a different picture of speech perception than we might otherwise obtain from our studies of the speech spectrum and of the ear's ability to discriminate relative differences among pure tones. I am personally much interested in this new approach (15), and I regret that there is not time to discuss it here.

It was probably with this linguistic theory in mind that Pollack and Ficks conducted a test on a set of tonal stimuli that varied in eight dimensions, but required only a binary decision on each dimension. With these tones they measured the transmitted information at 6.9 bits, or about 120 recognizable kinds of sounds. It is an intriguing question, as yet unexplored, whether one can go on adding dimensions indefinitely in this way.

In human speech there is clearly a limit to the number of dimensions that we use. In this instance, however, it is not known whether the limit is imposed by the nature of the perceptual machinery that must recognize the sounds or by the nature of the speech machinery that must produce them. Somebody will have to do the experiment to find out. There is a limit, however, at about eight or nine distinctive features in every language that has been studied, and so when we talk we must resort to still another trick for increasing our channel capacity. Language uses sequences of phonemes, so we make several judgments successively when we listen to words and sentences. That is to say, we use both simultaneous and successive discriminations in order to expand the rather rigid limits imposed by the inaccuracy of our absolute judgments of simple magnitudes.

These multidimensional judgments are strongly reminiscent of the abstraction experiment of Külpe (14). As you may remember, Külpe showed that observers report more accurately on an attribute for which they are set than on attributes for which they are not set. For example, Chapman (4) used three different attributes and compared the results obtained when the observers were instructed before the tachistoscopic presentation with the results obtained when they were not told until after the presentation which one of the three attributes was to be reported. When the instruction was given in advance, the judgments were more accurate. When the instruction was given afterwards, the subjects presumably had to judge all three attributes in order to report on any one of them and the accuracy was correspondingly lower. This is in complete accord with the results we have just been considering, where the accuracy of judgment on each attribute decreased as more dimensions were added. The point is probably obvious, but I shall make it anyhow, that the abstraction experiments did not demonstrate that people can judge only one attribute at a time. They merely showed what seems quite reasonable, that people are less accurate if they must judge more than one attribute simultaneously.

SUBITIZING

I cannot leave this general area without mentioning, however briefly, the experiments conducted at Mount Holyoke College on the discrimination of number (12). In experiments by Kaufman, Lord, Reese, and Volkmann random patterns of dots were flashed on a screen for 1/5 of a second. Anywhere from 1 to more than 200 dots could appear in the pattern. The subject's task was to report how many dots there were.

The first point to note is that on patterns containing up to five or six dots the subjects simply did not make errors. The performance on these small numbers of dots was so different from the performance with more dots that it was given a special name. Below seven the subjects were said to subitize; above seven they were said to estimate. This is, as you will recognize, what we once optimistically called "the span of attention."

This discontinuity at seven is, of course, suggestive. Is this the same basic process that limits our unidimensional judgments to about seven categories? The generalization is tempting, but not sound in my opinion. The data on number estimates have not been analyzed in informational terms; but on the basis of the published data I would guess that the subjects transmitted something more than four bits of information about the number of dots. Using the same arguments as before, we would conclude that there are about 20 or 30 distinguishable categories of numerousness. This is considerably more information than we would expect to get from a unidimensional display. It is, as a matter of fact, very much like a two-dimensional display. Although the dimensionality of the random dot patterns is not entirely clear, these results are in the same range as Klemmer and Frick's for their two-dimensional display of dots in a square. Perhaps the two dimensions of numerousness are area and density. When the subject can subitize, area and density may not be the significant variables, but when the subject must estimate perhaps they are significant. In any event, the comparison is not so simple as it might seem at first thought.

This is one of the ways in which the magical number seven has persecuted me. Here we have two closely related kinds of experiments, both of which point to the significance of the number seven as a limit on our capacities. And yet when we examine the matter more closely, there seems to be a reasonable suspicion that it is nothing more than a coincidence.

THE SPAN OF IMMEDIATE MEMORY

Let me summarize the situation in this way. There is a clear and definite limit to the accuracy with which we can identify absolutely the magnitude of a unidimensional stimulus variable. I would propose to call this limit the span of absolute judgment, and I maintain that for unidimensional judgments this span is usually somewhere in the neighborhood of seven. We are not completely at the mercy of this limited span, however, because we have a variety of techniques for getting around it and increasing the accuracy of our judgments. The three most important of these devices are (a) to make relative rather than absolute judgments; or, if that is not possible, (b) to increase the number of dimensions along which the stimuli can differ; or (c) to arrange the task in such a way that we make a sequence of several absolute judgments in a row.

The study of relative judgments is one of the oldest topics in experimental psychology, and I will not pause to review it now. The second device, increasing the dimensionality, we have just considered. It seems that by adding more dimensions and requiring crude, binary, yes-no judgments on each attribute we can extend the span of absolute judgment from seven to at least 150. Judging from our everyday behavior, the limit is probably in the thousands, if indeed there is a limit. In my opinion, we cannot go on compounding dimensions indefinitely. I suspect that there is also a span of perceptual dimensionality and that this span is somewhere in the neighborhood of ten, but I must add at once that there is no objective evidence to support this suspicion. This is a question sadly needing experimental exploration.

Concerning the third device, the use of successive judgments, I have quite a bit to say because this device introduces memory as the handmaiden of discrimination. And, since mnemonic processes are at least as complex as are perceptual processes, we can anticipate that their interactions will not be easily disentangled.

Suppose that we start by simply extending slightly the experimental procedure that we have been using. Up to this point we have presented a single stimulus and asked the observer to name it immediately thereafter.

[...]

a lot of different kinds of test materials this span is about seven items in length. I have just shown you that there is a span of absolute judgment that can distinguish about seven categories and that there is a span of attention that will encompass about six objects at a glance. What is more natural than to think that all three of these spans are different aspects of a single underlying process? And that is a fundamental mistake, as I shall be at some pains to demonstrate. This mistake is one of the malicious persecutions that the magical number seven has subjected me to.

My mistake went something like this. We have seen that the invariant feature in the span of absolute judgment is the amount of information that the observer can transmit. There is a real operational similarity between the absolute judgment experiment and the immediate memory experiment. If immediate memory is like absolute judgment, then it should follow that the invariant feature in the span of immediate memory is also the amount of information that an observer can retain. If the amount of information in the span of immediate memory is a constant, then
We can ex- the span should be short when the indi- tend this procedure by requiring the ob- vidual items contain a lot of informa- server to withhold his response until we tion and the span should be long when have given him several stimuli in suc- the items contain little information. For cession. At the end of the sequence of example, decimal digits are worth 3.3 stimuli he then makes his response. We bits apiece. We can recall about seven still have the same sort of input-out- of them, for a total of 23 bits of in- put situation that is required for the formation. Isolated English words are measurement of transmitted informa- worth about 10 bits apiece. If the total tion. But now we have passed from amount of information is to remain an experiment on absolute judgment to constant at 23 bits, then we should be what is traditionally called an experi- able to remember only two or three ment on immediate memory. words chosen at random. In this way Before we look at any data on this I generated a theory about how the span topic I feel I must give you a word of of immediate memory should vary as a warning to help you avoid some obvi- function of the amount of information ous associations that can be confusing. per item in the test materials. Everybody knows that there is a finite The measurements of memory span in span of immediate memory and that for the literature are suggestive on this EFTA00307522 92 GEORGE A. Main 50 - question, but not definitive. And so it was necessary to do the experiment to CONSIANT see. Hayes (10) tried it out with five O f 40 NOOS OF ! /. z different kinds of test materials: binary 5 digits, decimal digits, letters of the al- ffi so phabet, letters plus decimal digits, and with 1,000 monosyllabic words. The it 20 lists were read aloud at the rate of one Ct. I.ETTEF4 item per second and the subjects bad as AtED much time as they needed to give their piGn‘ responses. 
A procedure described by 1 4 S Woodworth (20) was used to score the mrFORmATiON PER ITEM IN NITS responses. The results are shown by the filled Fm. 8. Data from Pollack (16) on the circles in Fig. 7. Here the dotted line amount of information retained after one indicates what the span should have amount presentation plotted as a function of the of information per item in the test been if the amount of information in the materials. span were constant. The solid curves represent the data. Hayes repeated the There is nothing wrong with Hayes's experiment using test vocabularies of experiment, because Pollack (16) re- different sizes but all containing only peated it much more elaborately and English monosyllables (open circles in got essentially the same result. Pol- Fig. 7). This more homogeneous test lack took pains to measure the amount material did not change the picture sig- of information transmitted and did not nificantly. With binary items the span rely on the traditional procedure for is about nine and, although it drops to scoring the responses. His results are about five with monosyllabic English plotted in Fig. 8. Here it is clear that words, the difference is far less than the amount of information transmitted the hypothesis of constant information is not a constant, but increases almost would require. linearly as the amount of information per item in the input is increased. C44maN. 15(1103 'PA APIS DTI I SOIOITS WORDS And so the outcome is perfectly clear. 80 In spite of the coincidence that the magical number seven appears in both 40 places, the span of absolute judgment %CONSTANT and the span of immediate memory are t INFORMATION quite different kinds of limitations that E 50 a 1 are imposed on our ability to process a information. Absolute judgment is lim- 20 ited by the amount of information. Im- mediate memory is limited by the num- ber of items. 
In order to capture this dis- tinction in somewhat picturesque terms, 00 2 1 6 8 10 12 I have fallen into the custom of distin- INFORMATION PER ITEM IN BITS guishing between bits of information and chunks of information. Then I can Pro. 7. Data from Hayes (10) on the span of immediate memory plotted as a function say that the number of bits of informa- of the amount of Information per item in the tion is constant for absolute judgment test materials. and the number of chunks of informa- EFTA00307523 THE MAGICAL NUMBER $EVEN 93 tion is constant for immediate memory. achieved at different rates and overlap The span of immediate memory seems each other during the learning process. to be almost independent of the number I ant simply pointing to the obvious of bits per chunk, at least over the fact that the dits and dahs are organ- range that has been examined to date. ized by learning into patterns and that The contrast of the terms bit and as these larger chunks emerge the chunk also serves to highlight the fact amount of message that the operator that we are not very definite about what can remember increases correspondingly. constitutes a chunk of information. For In the terms I am proposing to use, the example, the memory span of five words operator l
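
The constant-information arithmetic used in the memory-span argument above (seven decimal digits at about 3.3 bits apiece, hence about 23 bits, hence only two or three ten-bit words) can be checked with a short Python sketch. The function and variable names are illustrative assumptions, not anything from the paper, and the sketch assumes equally likely alternatives so that an item drawn from n alternatives carries log2(n) bits. The only figures taken from the text are the vocabulary sizes, the span of seven digits, and the 6.9-bit result of Pollack and Ficks.

```python
import math


def bits_per_item(alternatives: int) -> float:
    """Information in one item drawn from equally likely alternatives."""
    return math.log2(alternatives)


def categories_from_bits(bits: float) -> float:
    """Number of equally likely categories that a given number of bits can distinguish."""
    return 2.0 ** bits


def predicted_span(total_bits: float, item_bits: float) -> float:
    """Span predicted if the TOTAL information held in memory were constant."""
    return total_bits / item_bits


# Decimal digits: log2(10) is about 3.32 bits apiece.
DIGIT_BITS = bits_per_item(10)

# Seven digits recalled gives the hypothesized constant, about 23 bits.
TOTAL_BITS = 7 * DIGIT_BITS

if __name__ == "__main__":
    print(f"bits per decimal digit: {DIGIT_BITS:.2f}")
    print(f"hypothesized constant:  {TOTAL_BITS:.1f} bits")

    # Under the constant-information hypothesis, words worth about 10 bits
    # each should give a span of only two or three words.
    print(f"predicted span, 10-bit words:  {predicted_span(TOTAL_BITS, 10.0):.1f}")

    # The same hypothesis predicts a very long span for binary digits (1 bit each).
    print(f"predicted span, binary digits: {predicted_span(TOTAL_BITS, bits_per_item(2)):.1f}")

    # A vocabulary of 1,000 equally likely words carries just under 10 bits per word.
    print(f"bits per word, 1,000-word vocabulary: {bits_per_item(1000):.2f}")

    # 6.9 bits of transmitted information corresponds to roughly 120 categories.
    print(f"categories for 6.9 bits: {categories_from_bits(6.9):.0f}")
```

Set beside the spans reported from Hayes (about nine binary digits and about five monosyllabic words), these predicted values show how far the observed spans depart from what a constant number of bits would require.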

Document Metadata

Document ID
18edfb73-98dc-430b-bd6a-e60eed79362f
Storage Key
dataset_9/EFTA00307512.pdf
Content Hash
0c44030001b7280d7354dcf3eeb19886
Created
Feb 3, 2026