Entropy, Power Laws,
and Economics
Tom Carter
Complex Systems Summer School
SFI, 2007
http://astarte.csustan.edu/~tom/
Santa Fe
June, 2007
1
Contents
Mathematics of Information 6
Some entropy theory 13
A Maximum Entropy Principle 17
Application: Economics I 20
Fit to Real World™ 26
A bit about Power Laws 30
Application: Economics II 40
References 47
2
The quotes
• Science, wisdom, and counting
• Surprise, information, and miracles
• Information (and hope)
• H (or S) for Entropy
3
Science, wisdom, and
counting
"Science is organized knowledge. Wisdom is
organized life."
- Immanuel Kant
"My own suspicion is that the universe is not
only stranger than we suppose, but stranger
than we can suppose."
- John Haldane
"Not everything that can be counted counts,
and not everything that counts can be
counted."
- Albert Einstein (1879-1955)
"The laws of probability, so true in general,
so fallacious in particular."
- Edward Gibbon
4
Surprise, information, and
miracles
"The opposite of a correct statement is a
false statement. The opposite of a profound
truth may well be another profound truth."
- Niels Bohr (1885-1962)
"I heard someone tried the
monkeys-on-typewriters bit trying for the
plays of W. Shakespeare, but all they got was
the collected works of Francis Bacon."
- Bill Hirst
"There are only two ways to live your life.
One is as though nothing is a miracle. The
other is as though everything is a miracle."
- Albert Einstein (1879-1955)
5
Mathematics of Information
• We would like to develop a usable
measure of the information we get from
observing the occurrence of an event
having probability p. Our first reduction
will be to ignore any particular features of
the event, and only observe whether or
not it happened. Thus we will think of an
event as the observance of a symbol
whose probability of occurring is p. We
will thus be defining the information in
terms of the probability p.
The approach we will be taking here is
axiomatic: on the next page is a list of
the four fundamental axioms we will use.
Note that we can apply this axiomatic
system in any context in which we have
available a set of non-negative real
numbers. A specific special case of
interest is probabilities (i.e., real numbers
between 0 and 1), which motivated the
selection of axioms . . .
6
• We will want our information measure
I(p) to have several properties:
1. Information is a non-negative quantity:
I(p) ≥ 0.
2. If an event has probability 1, we get no
information from the occurrence of the
event: I(1) = 0.
3. If two independent events occur
(whose joint probability is the product
of their individual probabilities), then
the information we get from observing
the events is the sum of the two
informations: I(p1 * p2) = I(p1) + I(p2).
(This is the critical property . . . )
4. We will want our information measure
to be a continuous (and, in fact,
monotonic) function of the probability
(slight changes in probability should
result in slight changes in information).
7
• We can therefore derive the following:
1. I(p^2) = I(p * p) = I(p) + I(p) = 2 * I(p)
2. Thus, further, I(p^n) = n * I(p)
(by induction . . . )
3. I(p) = I((p^(1/m))^m) = m * I(p^(1/m)), so
I(p^(1/m)) = (1/m) * I(p), and thus in general
I(p^(n/m)) = (n/m) * I(p)
4. And thus, by continuity, we get, for
0 < p < 1, and a > 0 a real number:
I(p^a) = a * I(p)
• From this, we can derive the nice
property:
I(p) = -log_b(p) = log_b(1/p)
for some base b.
8
• Summarizing: from the four properties,
1. I(p) ≥ 0
2. I(p1 * p2) = I(p1) + I(p2)
3. I(p) is monotonic and continuous in p
4. I(1) = 0
we can derive that
I(p) = log_b(1/p) = -log_b(p),
for some positive constant b. The base b
determines the units we are using.
We can change the units by changing the
base, using the formulas, for b1, b2, x > 0,
x = b1^(log_b1(x))
and therefore
log_b2(x) = log_b2(b1^(log_b1(x)))
          = (log_b2(b1)) * (log_b1(x)).
9
• Thus, using different bases for the
logarithm results in information measures
which are just constant multiples of each
other, corresponding with measurements
in different units:
1. log2 units are bits (from 'binary')
2. log3 units are trits (from 'trinary')
3. log_e units are nats (from 'natural
logarithm') (We'll use ln(x) for log_e(x))
4. log10 units are Hartleys, after an early
worker in the field.
• Unless we want to emphasize the units,
we need not bother to specify the base
for the logarithm, and will write log(p).
Typically, we will think in terms of log2(p).
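As a small numerical aside (my addition, not part of the original notes),
here is I(p) = -log_b(p) computed in each of these units; the probability
value 0.25 below is just an illustrative choice.

    import math

    def information(p, base=2):
        # I(p) = -log_base(p), the information from an event of probability p
        return -math.log(p) / math.log(base)

    p = 0.25                         # illustrative probability (my choice)
    print(information(p, 2))         # 2.0    bits
    print(information(p, 3))         # ~1.262 trits
    print(information(p, math.e))    # ~1.386 nats
    print(information(p, 10))        # ~0.602 Hartleys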
10
• For example, flipping a fair coin once will
give us events h and t each with
probability 1/2, and thus a single flip of a
coin gives us -log2(1/2) = 1 bit of
information (whether it comes up h or t).
Flipping a fair coin n times (or,
equivalently, flipping n fair coins) gives us
-log2((1/2)^n) = log2(2^n) = n * log2(2) =
n bits of information.
We could enumerate a sequence of 25
flips as, for example:
hthhtththhhthttththhhthtt
or, using 1 for h and 0 for t, the 25 bits
1011001011101000101110100.
We thus get the nice fact that n flips of a
fair coin gives us n bits of information,
and takes n binary digits to specify. That
these two are the same reassures us that
we have done a good job in our definition
of our information measure . . .
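A small sketch along the same lines (my own illustration, not part of the
original text): generate n fair coin flips, encode them as binary digits, and
check that the information in that particular sequence is n bits, one per
binary digit.

    import math
    import random

    n = 25
    flips = [random.choice("ht") for _ in range(n)]           # n fair coin flips
    bits = "".join("1" if f == "h" else "0" for f in flips)   # h -> 1, t -> 0

    # The particular sequence has probability (1/2)^n, so its information is
    # -log2((1/2)^n) = n bits, the same as the number of binary digits used.
    info = -math.log2(0.5 ** n)
    print("".join(flips), bits, info)                         # info == 25.0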
11
Information (and hope)
"In Cyberspace, the First Amendment is a
local ordinance."
- John Perry Barlow
"Groundless hope, like unconditional love, is
the only kind worth having."
- John Perry Barlow
"The most interesting facts are those which
can be used several times, those which have a
chance of recurring. . . .Which, then, are the
facts that have a chance of recurring? In the
first place, simple facts."
- H. Poincaré, 1908
12
Some entropy theory
• One question we might ask here is, what
is the average amount of information we
will get (per observation) from observing
events from a probability distribution P?
In particular, what is the expected value
of the information?
• Suppose we have a discrete probability
distribution P = {p1, p2, . . . , pn}, with
pi > 0 and Σi pi = 1, or a continuous
distribution p(x) with p(x) > 0 and
∫ p(x)dx = 1. We can then define the
expected value of an associated discrete
set F = {f1, f2, . . . , fn} or function F(x) by:
<F> = Σ_{i=1}^{n} fi * pi
or
<F(x)> = ∫ F(x) p(x) dx.
13
With these ideas in mind, we can define
the entropy of a distribution by:
H(P) = <I(p)>.
In other words, we can define the entropy
of a probability distribution as the
expected value of the information of the
distribution.
In particular, for a discrete distribution
P = {p1, p2, . . . , pn}, we have the entropy:
H(P) = Σ_{i=1}^{n} pi * log(1/pi).
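A minimal sketch (mine, not from the notes) of the discrete entropy
H(P) = Σ pi log2(1/pi); the example distributions are arbitrary choices.

    import math

    def entropy(P, base=2):
        # H(P) = sum_i p_i * log_base(1/p_i), with the convention 0 * log(1/0) = 0
        return sum(p * math.log(1.0 / p, base) for p in P if p > 0)

    print(entropy([0.5, 0.5]))                  # 1.0 bit   (fair coin)
    print(entropy([0.25] * 4))                  # 2.0 bits  (fair four-sided die)
    print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits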
14
Several questions probably come to mind at
this point:
• What properties does the function H(P)
have? For example, does it have a
maximum, and if so where?
• Is entropy a reasonable name for this? In
particular, the name entropy is already in
use in thermodynamics. How are these
uses of the term related to each other?
• What can we do with this new tool?
• Let me start with an easy one. Why use
the letter H for entropy? What follows is
a slight variation of a footnote, p. 105, in
the book Spikes by Rieke, et al. :-)
15
H (or S) for Entropy
"The enthalpy is [often] written U. V is the
volume, and Z is the partition function. P
and Q are the position and momentum of a
particle. R is the gas constant, and of course
T is temperature. W is the number of ways
of configuring our system (the number of
states), and we have to keep X and Y in case
we need more variables. Going back to the
first half of the alphabet, A, F, and G are all
different kinds of free energies (the last
named for Gibbs). B is a virial coefficient or a
magnetic field. I will be used as a symbol for
information; J and L are angular momenta. K
is Kelvin, which is the proper unit of T. M is
magnetization, and N is a number, possibly
Avogadro's, and O is too easily confused with
0. This leaves S . . ." and H. In Spikes they
also eliminate H (e.g., as the Hamiltonian). I,
on the other hand, along with Shannon and
others, prefer to honor Hartley. Thus, H for
entropy . . .
16
A Maximum Entropy
Principle
• Suppose we have a system for which we
can measure certain macroscopic
characteristics. Suppose further that the
system is made up of many microscopic
elements, and that the system is free to
vary among various states. Then (a
generic version of) the Second Law of
Thermodynamics says that with
probability essentially equal to 1, the
system will be observed in states with
maximum entropy.
We will then sometimes be able to gain
understanding of the system by applying a
maximum information entropy principle
(MEP), and, using Lagrange multipliers,
derive formulae for aspects of the system.
17
• Suppose we have a set of macroscopic
measurable characteristics fk,
k = 1, 2, . . . , M (which we can think of as
constraints on the system), which we
assume are related to microscopic
characteristics via:
Σi pi * fi^(k) = fk.
Of course, we also have the constraints:
pi > 0, and
Σi pi = 1.
We want to maximize the entropy,
Σi pi log(1/pi), subject to these
constraints. Using Lagrange multipliers λk
(one for each constraint), we have the
general solution:
pi = exp(-λ - Σk λk fi^(k)).
18
If we define Z, called the partition
function, by
Z(λ1, . . . , λM) = Σi exp(-Σk λk fi^(k)),
then we have e^λ = Z, or λ = ln(Z).
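As a sketch of how this machinery can be used numerically (my own
illustration; the values fi and the target F are made up): for a single
constraint Σi pi fi = F the solution is pi = exp(-λ1 fi)/Z(λ1), and λ1 can be
found by bisection, since the constrained average decreases as λ1 grows.

    import math

    f = [0, 1, 2, 3, 4, 5]      # microscopic values f_i (illustrative)
    F_target = 1.5              # required macroscopic average (illustrative)

    def p_of(lam):
        w = [math.exp(-lam * fi) for fi in f]
        Z = sum(w)                          # partition function Z(lambda)
        return [wi / Z for wi in w]

    def mean_f(lam):
        return sum(pi * fi for pi, fi in zip(p_of(lam), f))

    lo, hi = -10.0, 10.0                    # bracket for the multiplier
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if mean_f(mid) > F_target else (lo, mid)

    lam = 0.5 * (lo + hi)
    print(lam, mean_f(lam))                 # constraint met
    print(p_of(lam))                        # maximum-entropy distribution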
19
Application: Economics I (a
Boltzmann Economy)
• Our first example here is a very simple
economy. Suppose there is a fixed
amount of money (M dollars), and a fixed
number of agents (N) in the economy.
Suppose that during each time step, each
agent randomly selects another agent and
transfers one dollar to the selected agent.
An agent having no money doesn't go into
debt. What will the long term (stable)
distribution of money be?
This is not a very realistic economy —
there is no growth, only a redistribution
of money (by a random process). For the
sake of argument, we can imagine that
every agent starts with approximately the
same amount of money, although in the
long run, the starting distribution
shouldn't matter.
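Here is a small simulation sketch of this toy economy (my own code; the
parameter values are arbitrary): each time step, every agent in turn hands
one dollar to a randomly selected other agent, and agents with no money
simply skip their turn.

    import random
    from collections import Counter

    N, M, steps = 1000, 5000, 500      # agents, total dollars, time steps (illustrative)
    money = [M // N] * N               # everyone starts with the same amount

    for _ in range(steps):
        for giver in range(N):
            if money[giver] == 0:      # an agent with no money doesn't go into debt
                continue
            receiver = random.randrange(N)
            if receiver == giver:
                continue
            money[giver] -= 1
            money[receiver] += 1

    # The long-run histogram should be close to the Boltzmann-Gibbs form
    # p_i ~ (1/T) exp(-i/T), with "temperature" T = M/N = 5 here.
    counts = Counter(money)
    for i in sorted(counts)[:10]:
        print(i, counts[i] / N)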
20
• For this example, we are interested in
looking at the distribution of money in
the economy, so we are looking at the
probabilities {pi} that an agent has the
amount of money i. We are hoping to
develop a model for the collection {pi}.
If we let ni be the number of agents who
have i dollars, we have two constraints:
Σi ni * i = M
and
Σi ni = N.
Phrased differently (using pi = ni/N), this
says
Σi pi * i = M/N
and
Σi pi = 1.
21
• We now apply Lagrange multipliers:
L = Σi pi ln(1/pi) - λ [Σi pi * i - M/N]
    - μ [Σi pi - 1]
from which we get
∂L/∂pi = -[1 + ln(pi)] - λi - μ = 0.
We can solve this for pi:
ln(pi) = -λi - (1 + μ)
and so
pi = e^(-λ0) e^(-λi)
(where we have set 1 + μ = λ0).
22
• Putting in constraints, we have
1 = Σi pi
  = Σi e^(-λ0) e^(-λi)
  = e^(-λ0) Σ_{i=0} e^(-λi),
and
M/N = e^(-λ0) Σ_{i=0} e^(-λi) * i.
We can approximate (for large M)
Σ_{i=0}^{M} e^(-λi) ≈ ∫_0^M e^(-λx) dx ≈ 1/λ,
and
Σ_{i=0}^{M} e^(-λi) * i ≈ ∫_0^M x e^(-λx) dx ≈ 1/λ^2.
23
From these we have (approximately)
e^(λ0) = 1/λ
and
M/N = e^(-λ0) * (1/λ^2).
From this, we get
λ = N/M
and thus (letting T = M/N) we have:
pi = e^(-λ0) e^(-λi)
   = (1/T) e^(-i/T).
This is a Boltzmann-Gibbs distribution,
where we can think of T (the average
amount of money per agent) as the
"temperature," and thus we have a
"Boltzmann economy" . . .
Note: this distribution also solves the
functional equation
p(m1) p(m2) = p(m1 + m2).
24
• This example, and related topics, are
discussed in
Statistical mechanics of money
by Adrian Dragulescu and Victor M.
Yakovenko,
http://arxiv.org/abs/cond-mat/0001432
and
Statistical mechanics of money: How
saving propensity affects its distribution
by Anirban Chakraborti and Bikas K.
Chakrabarti
http://arxiv.org/abs/cond-mat/0004256
25
Fit of this model to the Real
World™
• How well does this model seem to fit to
the Real World?
For a fairly large range of individuals, it
actually does a decent job. Here is a
graphical representation of U.S. census
data for 1996:
[Figure: histogram of U.S. individual annual income for 1996 (census data);
vertical axis probability (percent), horizontal axis individual annual income
in k$.]
The black line is the exponential (Boltzmann-Gibbs) fit, p(x) ∝ e^(-x/T).
26
• However, for the wealthy it doesn't do
such a good job. Here are some graphical
representations of U.K. and U.S. data for
1996-2001:
[Figure 1: (a) Cumulative wealth (1996) and income (1998-9) distributions in
the United Kingdom; for the upper class (almost 1% of the population) the
distribution is well fitted with a Pareto law. (b) Income distribution (2001)
in the USA; for the low-middle classes (almost 99% of the population), the
distribution is well fitted with a gamma distribution.]
As can be seen in the left graph, the
wealth distribution for the U.K. wealthy in
1996 is close to a linear fit in log-log
coordinates.
Can we modify the model somewhat to
capture other characteristics of the data?
27
• There are a wide variety of important
distributions that are observed in data
sets. For example:
— Normal (gaussian) distribution:
p(x) ~ exp(-x^2/(2σ^2))
Natural explanation: Central limit
theorem; sum of random variables
(with finite second moment):
X_n = Σ_{i=1}^{n} x_i
Many applications:
* Maxwell: distribution of velocities of
gas particles
* IQ
* heights of individuals
Distribution is thin tailed — no one is
20 feet tall . . .
28
— Exponential distribution:
p(x) ~ exp(-x/x0)
Natural explanation 1: Survival time
for constant probability decay.
Natural explanation 2: Equilibrium
statistical mechanics (see above:
maximum entropy subject to constraint
on mean).
Many applications:
* Radioactive decay.
* Equilibrium statistical mechanics
(Boltzmann-Gibbs distribution)
Characteristic scale is x0; distribution
is thin tailed.
— Power law (see below):
p(x) ~ x^(-α)
29
A bit about Power Laws
• Various researchers in various fields at
various times have observed that many
datasets seem to reflect a relationship of
the form
p(x) ~ x^(-α)
for a fairly broad range of values of x.
These sorts of data relations are often
called power laws, and have been the
subject of fairly intensive interest and
study.
An early researcher, Vilfredo Pareto,
observed in the late 1800s that pretty
uniformly across geographical locations,
wealth was distributed through the
population according to a power law, and
hence such distributions are often called
Pareto distributions.
30
A variety of other names have been
applied to these distributions:
— Power law distribution
— Pareto's law
— Zipf's law
— Lotka's law
— Bradford's law
— Zeta distribution
— Scale free distribution
— Rank-size rule
My general rule of thumb is that if
something has lots of names, it is likely to
be important . . .
31
• These distributions have been observed
many places (as noted, for example, in
Wikipedia):
— Frequencies of words in longer texts
— The size of human settlements (few
cities, many hamlets/villages)
— File size distribution of Internet traffic
which uses the TCP protocol (many
smaller files, few larger ones)
— Clusters of Bose-Einstein condensate
near absolute zero
— The value of oil reserves in oil fields (a
few large fields, many small fields)
— The length distribution of jobs
assigned to supercomputers (a few large
ones, many small ones)
— The standardized price returns on
individual stocks
32
— Size of sand particles
— Number of species per genus (please
note the subjectivity involved: The
tendency to divide a genus into two or
more increases with the number of
species in it)
— Areas burnt in forest fires
• There are a variety of important
properties of power laws:
— Distribution has fat / heavy tails
(extreme events are more likely than
one might expect . . .). Stock market
volatility; sizes of storms / floods, etc.
— A power law is a linear relation
between logarithms:
p(x) = K x^(-α)
log(p(x)) = -α log(x) + log(K)
33
— Power laws are scale invariant:
Sufficient:
p(x) = K x^(-α)
x → cx
p(x) → K c^(-α) x^(-α) = c^(-α) p(x)
Necessary: Scale invariant is defined as
p(cx) = K(c) p(x)
Power law is the only solution (0 and 1
are trivial solutions).
• Power laws are actually asymptotic
relations. We can't define a power law on
[0, ∞]:
If α > 1, not integrable at 0.
If α ≤ 1, not integrable at ∞.
Thus, when we say something is a power
law, we mean either within a range, or as
x → 0 or as x → ∞.
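A short sketch (mine; the exponent and sample size are arbitrary) of the
straight-line log-log signature noted above: draw samples with survival
function P(X > x) = x^(-α) for x ≥ 1 by inverse-transform sampling, and check
that log P(X > x) against log x has slope about -α.

    import bisect
    import math
    import random

    alpha = 2.5                     # assumed tail exponent (illustrative)
    n = 200_000
    # If U ~ Uniform(0,1], then X = U**(-1/alpha) has P(X > x) = x**(-alpha), x >= 1.
    xs = sorted((1.0 - random.random()) ** (-1.0 / alpha) for _ in range(n))

    def ccdf(x):
        # empirical P(X > x)
        return (n - bisect.bisect_right(xs, x)) / n

    # A power law is a straight line in log-log coordinates, so the local slope
    # of log P(X > x) versus log x should be roughly -alpha throughout the tail.
    for x in [1.0, 2.0, 4.0, 8.0, 16.0]:
        slope = (math.log(ccdf(2 * x)) - math.log(ccdf(x))) / math.log(2.0)
        print(x, ccdf(x), round(slope, 2))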
34
• Moments: power laws have a threshold
above which moments don't exist. For
p(x) ~ x^(-(α+1)), when m ≥ α,
γ(m) = ∫_a^∞ x^m p(x) dx
     = ∫_a^∞ x^m x^(-(α+1)) dx
     = ∞
• The lack of moments is conserved under
aggregation . . . If α(x) is the tail exponent
of the random variable x (the value above
which moments don't exist), then
α(x + y) = min(α(x), α(y))
α(x * y) = min(α(x), α(y))
α(x^k) = α(x)/k.
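A sketch of the moment threshold (my own illustration; α and the sample
sizes are arbitrary): for samples with tail exponent α = 1.5, the sample
mean (m = 1 < α) settles down, if slowly, while the sample second moment
(m = 2 > α) keeps growing with sample size rather than converging.

    import random

    alpha = 1.5                     # tail exponent (illustrative)
    random.seed(0)

    def sample(n):
        # P(X > x) = x**(-alpha) for x >= 1 (inverse-transform sampling)
        return [(1.0 - random.random()) ** (-1.0 / alpha) for _ in range(n)]

    for n in [10_000, 100_000, 1_000_000]:
        xs = sample(n)
        m1 = sum(xs) / n                    # first moment: finite since 1 < alpha
        m2 = sum(x * x for x in xs) / n     # second moment: infinite since 2 > alpha
        print(n, round(m1, 3), round(m2, 1))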
35
• Power laws are generic for heavy / fat
tailed distributions. In other words, any
"reasonable" distribution with fat tails
(i.e., with moments that don't exist) is a
power law:
P(X > x) = 1 - Φ_α(x)
         = 1 - exp(-x^(-α))
         ≈ 1 - (1 - x^(-α))
         = x^(-α)
(there is some discussion of extreme value
distributions that goes here, with
discussion of Frechet, Weibull, and
Gumbel distributions - specifically
Frechet distributions (with fat tails)
. . . perhaps another place or time).
36
• Some mechanisms for generating power
laws:
— Critical points and deterministic
dynamics
— Non-equilibrium statistical mechanics
— Random processes
— Mixtures
— Maximization principles
— Preferential attachment
— Dimensional constraints
37
• Multiplicative (random) processes
generate log-normal distributions, which
can look like power law distributions
across various ranges of the variable. If
a(t) is a random variable:
x(t + 1) = a(t) x(t)
x(t) = (∏_{i=0}^{t-1} a(i)) x(0)
log(x(t)) = Σ_{i=0}^{t-1} log(a(i)) + log(x(0))
f(x) = (1/(√(2π) σ x)) e^(-(log x - μ)^2/(2σ^2))
log(f(x)) = -(log(x) - μ)^2/(2σ^2) - log(x) + const
In particular, if σ is large in comparison
with log(x), then it will look like
log(f(x)) ≈ log(x^(-1)),
which is a one-over-x power law
distribution . . .
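A quick sketch of this (my own code; the drift, volatility, horizon, and
number of runs are arbitrary): multiply i.i.d. random factors a(i), so that
log x(t) is a sum of i.i.d. terms and x(t) comes out approximately log-normal.

    import math
    import random
    import statistics

    random.seed(1)
    T, runs = 100, 20_000            # steps per run and number of runs (illustrative)

    def run():
        x = 1.0
        for _ in range(T):
            a = math.exp(random.gauss(0.0, 0.3))   # random multiplicative factor a(i)
            x *= a
        return x

    logs = [math.log(run()) for _ in range(runs)]
    # log x(T) is a sum of T i.i.d. terms, so it is approximately normal;
    # x(T) itself is therefore approximately log-normal, and over a middle
    # range of x its density can masquerade as a power law.
    print(statistics.mean(logs), statistics.stdev(logs))   # ~0 and ~0.3*sqrt(100) = 3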
38
• Other distributions that have power-law
appearing regions:
— Mixed multiplicative / additive
processes (Kesten processes):
x(t + 1) = a(t)x(t) + b(t)
— Stable multiplicative random walk with
reflecting barrier.
Both of these will look log-normal in their
bodies, and like power laws in their tails.
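A sketch of a Kesten process (my own code; all parameters are arbitrary):
x(t+1) = a(t) x(t) + b(t) with E[log a] < 0 but a > 1 occasionally, which gives
a log-normal-looking body and a power-law tail. A crude Hill estimate of the
tail exponent is printed at the end.

    import math
    import random

    random.seed(2)
    N, T = 5_000, 200        # independent trajectories and time steps (illustrative)

    def kesten():
        # x(t+1) = a(t) * x(t) + b(t): contracting on average, expanding now and then
        x = 1.0
        for _ in range(T):
            a = math.exp(random.gauss(-0.2, 0.5))   # E[log a] < 0, but a > 1 sometimes
            b = random.random()                     # additive term keeps x away from 0
            x = a * x + b
        return x

    xs = sorted(kesten() for _ in range(N))

    # Crude tail-exponent estimate (Hill estimator) from the top 2% of values:
    k = N // 50
    tail = xs[-k:]
    hill = k / sum(math.log(t / xs[-k - 1]) for t in tail)
    print(hill)    # roughly the Kesten tail exponent (about 2*0.2/0.5**2 = 1.6 here)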
(Various pieces of this section draw from
lectures / notes by Doyne Farmer on
power laws in financial markets — my
thanks to him . . . )
39
Application: Economics II (a
power law)
• Suppose that a (simple) economy is made
up of many agents a, each with wealth at
time t in the amount of w(a, t). (I'll leave
it to you to come up with a reasonable
definition of "wealth" — of course we will
want to make sure that the definition of
"wealth" is applied consistently across all
the agents.) We can also look at the total
wealth in the economy W(t) = Σ_a w(a, t).
For this example, we are interested in
looking at the distribution of wealth in
the economy, so we will assume there is
some collection {wi} of possible values for
the wealth an agent can have, and
associated probabilities {pi } that an agent
has wealth wi . We are hoping to develop
a model for the collection {pi}.
40
• In order to apply the maximum entropy
principle, we want to look at global
(aggregate/macro) observables of the
system that reflect (or are made up of)
characteristics of (micro) elements of the
system.
For this example, we can look at the
growth rate of the economy. A reasonable
way to think about this is to let
Ri = wi(t1)/wi(t0) and R = W(t1)/W(t0)
(where t0 and t1 represent time steps of
the economy). The growth rate will then
be ln(R). We then have the two
constraints on the pi:
Σi pi * ln(Ri) = ln(R)
and
Σi pi = 1.
41
• We now apply Lagrange multipliers:
L = Σi pi ln(1/pi) - λ [Σi pi ln(Ri) - ln(R)]
    - μ [Σi pi - 1]
from which we get
∂L/∂pi = -[1 + ln(pi)] - λ ln(Ri) - μ = 0.
We can solve this for pi:
pi = e^(-λ0) e^(-λ ln(Ri)) = e^(-λ0) Ri^(-λ)
(where we have set 1 + μ = λ0).
Solving, we get λ0 = ln(Z(λ)), where
Z(λ) = Σi Ri^(-λ) (the partition function)
normalizes the probability distribution to
sum to 1. From this we see the power law
(for λ > 1):
pi = Ri^(-λ) / Z(λ).
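A tiny numerical sketch (mine; the Ri values and λ are arbitrary choices) of
the resulting power law: with pi = Ri^(-λ)/Z(λ), the quantity
ln(pi) + λ ln(Ri) is the same constant -ln Z(λ) for every i, which is exactly
the straight-line (power-law) relation between pi and Ri.

    import math

    R = [1.0, 2.0, 4.0, 8.0, 16.0]     # possible growth factors R_i (illustrative)
    lam = 1.5                          # Lagrange multiplier, assumed > 1 (illustrative)

    Z = sum(r ** (-lam) for r in R)    # partition function Z(lam) = sum_i R_i^(-lam)
    p = [r ** (-lam) / Z for r in R]

    print(sum(p))                      # 1.0: Z normalizes the weights
    for r, pi in zip(R, p):
        # ln p_i = -lam * ln R_i - ln Z, so the last column is constant (= -ln Z)
        print(r, round(pi, 4), round(math.log(pi) + lam * math.log(r), 4))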
42
• We might actually like to calculate
specific values of A, so we will do the
process again in a continuous version. In
this version, we will let R = w(T) /w(0) be
the relative wealth at time T. We want to
find the probability density function f(R),
that is:
max_f H(f) = -∫_1^∞ f(R) ln(f(R)) dR,
subject to
∫_1^∞ f(R) dR = 1,
∫_1^∞ f(R) ln(R) dR = C ln(R̄),
where C is the average number of
transactions per time step and R̄ denotes
the average (aggregate) growth factor.
We need to apply the calculus of
variations to maximize over a class of
functions.
43
When we are solving an extremal problem
of the form
∫ F[x, f(x), f'(x)] dx,
we work to solve
∂F/∂f(x) - (d/dx)(∂F/∂f'(x)) = 0.
Our Lagrangian is of the form
L = -∫_1^∞ f(R) ln(f(R)) dR - μ (∫_1^∞ f(R) dR - 1)
    - λ (∫_1^∞ f(R) ln(R) dR - C ln(R̄)).
Since this does not depend on f'(x), we
look at:
∂[-f(R) ln(f(R)) - μ(f(R) - 1)
   - λ(f(R) ln(R) - C ln(R̄))] / ∂f(R) = 0
from which we get
f(R) = e^(-(λ0 + λ ln(R))) = R^(-λ) e^(-λ0),
where again λ0 ≡ 1 + μ.
44
We can use the first constraint to solve
for e^(λ0):
e^(λ0) = ∫_1^∞ R^(-λ) dR = [R^(1-λ)/(1-λ)]_1^∞ = 1/(λ-1),
assuming λ > 1. We therefore have a
power law distribution for wealth of the
form:
f(R) = (λ - 1) R^(-λ).
To solve for λ, we use:
C ln(R̄) = (λ - 1) ∫_1^∞ R^(-λ) ln(R) dR.
Using integration by parts, we get
C ln(R̄) = (λ - 1) [ln(R) R^(1-λ)/(1-λ)]_1^∞
          - (λ - 1) ∫_1^∞ R^(-λ)/(1-λ) dR
        = (λ - 1) [ln(R) R^(1-λ)/(1-λ)]_1^∞
          + [R^(1-λ)/(1-λ)]_1^∞.
45
By L'Hôpital's rule, the first term goes to
zero as R → ∞, so we are left with
C ln(R̄) = [R^(1-λ)/(1-λ)]_1^∞ = 1/(λ - 1),
or, in other terms,
λ - 1 = [C ln(R̄)]^(-1).
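A quick Monte Carlo check of the closed form (my own code; λ is an arbitrary
test value): sampling R from f(R) = (λ-1) R^(-λ) on [1, ∞) by inverse
transform, the sample mean of ln(R) comes out at 1/(λ-1), which is the
relation C ln(R̄) = 1/(λ-1) used above.

    import math
    import random

    lam = 2.5                  # illustrative Pareto exponent, must be > 1
    n = 1_000_000

    # Inverse-transform sampling from f(R) = (lam-1) * R**(-lam) on [1, inf):
    # P(R > r) = r**(-(lam-1)), so R = U**(-1/(lam-1)) for U ~ Uniform(0,1].
    samples = [(1.0 - random.random()) ** (-1.0 / (lam - 1.0)) for _ in range(n)]

    mean_lnR = sum(math.log(r) for r in samples) / n
    print(mean_lnR, 1.0 / (lam - 1.0))   # both close to 1/(lam-1) = 0.667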
For much more discussion of this
example, see the paper A Statistical
Equilibrium Model of Wealth Distribution
by Mishael Milakovic, February, 2001,
available on the web at:
http://astarte.csustan.edu/~tom/SFI-CSSS/Wealth/wealth-Milakovic.pdf
46
References
[1] Bar-Yam, Yaneer, Dynamics of Complex Systems
(Studies in Nonlinearity), Westview Press,
Boulder, 1997.
[2] Brillouin, L., Science and information theory
Academic Press, New York, 1956.
[3] Brooks, Daniel R., and Wiley, E. 0., Evolution as
Entropy, Toward a Unified Theory of Biology,
Second Edition, University of Chicago Press,
Chicago, 1988.
[4] Campbell, Jeremy, Grammatical Man,
Information, Entropy, Language, and Life, Simon
and Schuster, New York, 1982.
[5] Cover, T. M., and Thomas J. A., Elements of
Information Theory, John Wiley and Sons, New
York, 1991.
[6] DeLillo, Don, White Noise, Viking/Penguin, New
York, 1984.
[7] Feller, W., An Introduction to Probability Theory
and Its Applications, Wiley, New York, 1957.
47
[8] Feynman, Richard, Feynman lectures on
computation, Addison-Wesley, Reading, 1996.
[9] Gatlin, L. L., Information Theory and the Living
System, Columbia University Press, New York,
1972.
[10] Greven, A., Keller, G., Warnecke, G., Entropy,
Princeton Univ. Press, Princeton, 2003.
[11] Haken, Hermann, Information and
Self-Organization, a Macroscopic Approach to
Complex Systems, Springer-Verlag, Berlin/New
York, 1988.
[12] Hamming, R. W., Error detecting and error
correcting codes, Bell Syst. Tech. J. 29 147,
1950.
[13] Hamming, R. W., Coding and information theory,
2nd ed, Prentice-Hall, Englewood Cliffs, 1986.
[14] Hill, R., A first course in coding theory Clarendon
Press, Oxford, 1986.
[15] Hodges, A., Alan Turing: the enigma Vintage,
London, 1983.
[16] Hofstadter, Douglas R., Metamagical Themas:
Questing for the Essence of Mind and Pattern,
Basic Books, New York, 1985
48
[17] Jones, D. S., Elementary information theory
Clarendon Press, Oxford, 1979.
[18] Knuth, Eldon L., Introduction to Statistical
Thermodynamics, McGraw-Hill, New York, 1966.
[19] Landauer, R., Information is physical, Phys.
Today, May 1991 23-29.
[20] Landauer, R., The physical nature of information,
Phys. Lett. A, 217 188, 1996.
[21] van Lint, J. H., Coding Theory, Springer-Verlag,
New York/Berlin, 1982.
[22] Lipton, R. J., Using DNA to solve NP-complete
problems, Science, 268 542-545, Apr. 28, 1995.
[23] MacWilliams, F. J., and Sloane, N. J. A., The
theory of error correcting codes, Elsevier Science,
Amsterdam, 1977.
[24] Martin, N. F. G., and England, J. W.,
Mathematical Theory of Entropy,
Addison-Wesley, Reading, 1981.
[25] Maxwell, J. C., Theory of heat Longmans, Green
and Co, London, 1871.
49
[26] von Neumann, John, Probabilistic logic and the
synthesis of reliable organisms from unreliable
components, in Automata Studies (Shannon,
McCarthy, eds.), 1956.
[27] Papadimitriou, C. H., Computational Complexity,
Addison-Wesley, Reading, 1994.
[28] Pierce, John R., An Introduction to Information
Theory — Symbols, Signals and Noise, (second
revised edition), Dover Publications, New York,
1980.
[29] Roman, Steven, Introduction to Coding and
Information Theory, Springer-Verlag, Berlin/New
York, 1997.
[30] Sampson, Jeffrey R., Adaptive Information
Processing, an Introductory Survey,
Springer-Verlag, Berlin/New York, 1976.
[31] Schroeder, Manfred, Fractals, Chaos, Power
Laws, Minutes from an Infinite Paradise, W. H.
Freeman, New York, 1991.
[32] Shannon, C. E., A mathematical theory of
communication Bell Syst. Tech. J. 27 379; also
p. 623, 1948.
[33] Slepian, D., ed., Key papers in the development of
information theory IEEE Press, New York, 1974.
50
[34] Turing, A. M., On computable numbers, with an
application to the Entscheidungsproblem, Proc.
Lond. Math. Soc. Ser. 2 42, 230; see also Proc.
Lond. Math. Soc. Ser. 2 43, 544, 1936.
[35] Zurek, W. H., Thermodynamic cost of
computation, algorithmic complexity and the
information metric, Nature 341 119-124, 1989.
51