Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American
Mathematical Monthly.
http://www.jstor.org
This content downloaded from 213.233.188.210 on Sat, 07 Nov 2015 11:14:07 UTC
All use subject to JSTOR Terms and Conditions
THE EVOLUTION OF...
Edited by: Abe Shenitzer
Mathematics, York University, North York, Ontario M3J 1P3, Canada
Planck:
Reprinted with kind permission of Birkhäuser Verlag AG, from Development of Mathematics 1900-1950, edited by Jean-Paul Pier, Basel, 1994, ISBN 0-8176-2821-5
In view of the above quotations it is not surprising that acceptance of this mathematization was slow and faced resistance. In fact even now some probabilists fear that mathematization has removed the intrinsic charm from their subject. And they are right in the sense that the charm of the old, vague probability-mathematics, based on nonmathematical definitions, has split into two quite different charms: those of real world probability and of mathematical precision. But it must be stressed that many of the most essential results of mathematical probability have been suggested by the nonmathematical context of real world probability, which has never even had a universally acceptable definition. In fact the relation between real world probability and mathematical probability has been simultaneously the bane of and inspiration for the development of mathematical probability.
2 What is the real world (nonmathematical) problem? What is usually called (real world) probability arises in many contexts. Besides the obvious contexts of gambling games, of insurance, of statistical physics, there are such simple contexts as the following. Suppose an individual rides his bicycle to work. The rider would be surprised if, when the bicycle is parked, the valve on the front tire appeared in the upper half of the tire circle 10 successive days, just as surprised as if 10 successive tosses of a coin all gave heads. However, it is clear that (tire context) if the ride is very short, or (coin context) if the coin starts close to the coin landing place and the initial rotational velocity of the coin is low, the surprise would decrease and the probability context would become suspect. The moral is that the specific context must be examined closely before any probabilistic statement is made. If philosophy is relevant, an arguable question, it must be augmented by an examination of the physical context.
3 The law of large numbers. In a repetitive scheme of independent trials, such as coin tossing, what strikes one at once is what has been christened the law of large numbers. In the simple context of coin tossing it states that in some sense the number of heads in n tosses divided by n has limit 1/2 as the number of tosses increases. The key words here are in some sense. If the law of large numbers is a mathematical theorem, that is, if there is a mathematical model for coin tossing, in which the law of large numbers is formulated as a mathematical theorem, either the theorem is true in one of the various mathematical limit concepts or it is not. On the other hand, if the law of large numbers is to be stated in a real world nonmathematical context, it is not at all clear that the limit concept can be formulated in a reasonable way. The most obvious difficulty is that in the real world only finitely many experiments can be performed in finite time. Anyone who tries to explain to students what happens when a coin is tossed mumbles words like in the long run, tends, seems to cluster near, and so on, in a desperate attempt to give form to a cloudy concept. Yet the fact is that anyone tossing a coin observes that for a modest number of coin tosses the number of heads in n tosses divided by n seems to be getting closer to 1/2 as n increases. The simplest solution, adopted by a prominent Bayesian statistician, is the vacuous one: never discuss what happens when a coin is tossed. A more common equally satisfactory solution is to leave fuzzy the question of whether the context under discussion is or is not mathematics. Perhaps the fact that the assertion is called a law is an example of this fuzziness. The following statements have been made about this law (my emphasis):
Laplace: (1814) This theorem, implied by common sense, was difficult to prove by analysis.
Ville:
Bauer:
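The observation in Section 3, that for a modest number of tosses the fraction of heads seems to drift toward 1/2, can be imitated inside a mathematical model. The following is a minimal sketch, assuming a pseudo-random generator as a stand-in for the coin; the function name `head_fractions` and its parameters are hypothetical, chosen for this illustration only. It says nothing about real coins, which is exactly the distinction the section draws.

```python
import random

def head_fractions(n_tosses, checkpoints, seed=0):
    """Simulate n_tosses of a model fair coin.

    Returns a dict mapping each checkpoint n to (number of heads in the
    first n tosses) / n. The coin is a pseudo-random model, not a real coin.
    """
    rng = random.Random(seed)      # fixed seed so that a run is reproducible
    wanted = set(checkpoints)
    heads = 0
    fractions = {}
    for n in range(1, n_tosses + 1):
        if rng.random() < 0.5:     # model assumption: heads with probability 1/2
            heads += 1
        if n in wanted:
            fractions[n] = heads / n
    return fractions

if __name__ == "__main__":
    checkpoints = [10, 100, 1_000, 10_000, 100_000]
    for n, frac in head_fractions(100_000, checkpoints).items():
        print(f"n = {n:>6}: heads / n = {frac:.4f}")
```

A finite run of this kind can only suggest the clustering near 1/2; it cannot exhibit a limit, which is the difficulty the text describes for the real world context.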
the
$$\varphi(A) + \varphi(B) - \varphi(A \cup B) \ge \varphi(A \cap B), \qquad (5.1)$$
whereas additivity of $\varphi$ would imply equality in (5.1). The point is that the left side of (5.1) is the probability that the sequence x hits both A and B, a probability at least equal to, and in general greater than, $\varphi(A \cap B)$, the probability that the sequence hits $A \cap B$. The inequality (5.1), the strong subadditivity inequality, is satisfied also by the electrostatic capacity of a body in $\mathbf{R}^3$, and this fact hints at the close connection between potential theory and probability, developed in great detail in the second half of the century with the help of Choquet's theory of mathematical capacity.
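The strong subadditivity inequality can be checked exhaustively in a toy case. The sketch below assumes a hypothetical model that is not taken from the text: the space has three points, and the random sequence x consists of two independent, uniformly distributed points. The names `POINTS`, `LENGTH`, `hit_probability`, and `check_strong_subadditivity` are illustrative. Exact rational arithmetic is used so that equality and strict inequality can be told apart.

```python
from fractions import Fraction
from itertools import combinations, product

POINTS = (0, 1, 2)   # a small finite space (assumption made for illustration)
LENGTH = 2           # the random sequence x has two i.i.d. uniform terms

def hit_probability(target):
    """Probability that the random sequence x hits the set `target`."""
    sequences = list(product(POINTS, repeat=LENGTH))
    hits = sum(1 for seq in sequences if any(p in target for p in seq))
    return Fraction(hits, len(sequences))

def all_subsets():
    """All subsets of POINTS, ordered by size."""
    return [frozenset(c)
            for r in range(len(POINTS) + 1)
            for c in combinations(POINTS, r)]

def check_strong_subadditivity():
    """Check the inequality for every pair (A, B).

    Returns (holds_everywhere, first pair with strict inequality or None).
    """
    strict = None
    for a in all_subsets():
        for b in all_subsets():
            # left side: probability that x hits both A and B
            left = hit_probability(a) + hit_probability(b) - hit_probability(a | b)
            # right side: probability that x hits the intersection
            right = hit_probability(a & b)
            if left < right:
                return False, (a, b)
            if left > right and strict is None:
                strict = (a, b)
    return True, strict
```

In this model the sets {0} and {1} are disjoint, so the right side is 0, while the sequence hits both sets with probability 2/9; the hitting probability is therefore not additive.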
6 The development of measure theory. Recall that a Borel field (= $\sigma$ algebra) of subsets of a space is a collection of subsets which is closed under the operations of complementation and the formation of countable unions and intersections. The class of Borel sets of a metric space is the smallest set $\sigma$ algebra containing the open sets of the space. A measurable space is a pair, $(S, \mathcal{S})$, where $S$ is a space and $\mathcal{S}$ is a $\sigma$ algebra of subsets of $S$. The sets of $\mathcal{S}$ are the measurable sets of the space. In the following, if $S$ is metric, the coupled $\sigma$ algebra making it into a measurable space will always be the $\sigma$ algebra of its Borel sets. In particular $(\mathbf{R}^N, \mathcal{R}^N)$ denotes N dimensional Euclidean space coupled with its Borel sets. The superscript will be omitted when N = 1. A measurable function from a measurable space $(S_1, \mathcal{S}_1)$ into a measurable space $(S_2, \mathcal{S}_2)$ is a function from $S_1$ into $S_2$ with the property that the inverse image of a set in $\mathcal{S}_2$ is a set in $\mathcal{S}_1$.
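On a finite space these definitions can be carried out mechanically, since countable unions reduce to finite ones. The following sketch is an illustration under that finiteness assumption only; the helper names `generate_sigma_algebra` and `is_measurable` are hypothetical. It builds the smallest family containing given sets that is closed under complements and unions (closure under intersections then follows from De Morgan's laws) and tests the inverse image property for a function.

```python
from itertools import combinations

def generate_sigma_algebra(space, generators):
    """Smallest family of subsets of a FINITE space that contains the
    generators and is closed under complementation and (finite) unions."""
    space = frozenset(space)
    family = {frozenset(), space} | {frozenset(g) for g in generators}
    while True:
        new_sets = {space - a for a in family}                    # complements
        new_sets |= {a | b for a, b in combinations(family, 2)}   # pairwise unions
        if new_sets <= family:       # nothing new: the family is closed
            return family
        family |= new_sets

def is_measurable(f, space1, sigma1, sigma2):
    """True if the inverse image under f of every set in sigma2 lies in sigma1."""
    return all(
        frozenset(s for s in space1 if f(s) in target) in sigma1
        for target in sigma2
    )

if __name__ == "__main__":
    s1 = {1, 2, 3, 4}
    sigma1 = generate_sigma_algebra(s1, [{1, 2}])       # cannot separate 1 from 2
    sigma2 = generate_sigma_algebra({"a", "b"}, [{"a"}])
    coarse = lambda s: "a" if s <= 2 else "b"           # constant on {1, 2} and on {3, 4}
    fine = lambda s: "a" if s == 1 else "b"             # separates 1 from 2
    print(is_measurable(coarse, s1, sigma1, sigma2))    # measurable
    print(is_measurable(fine, s1, sigma1, sigma2))      # not measurable
```

The second function fails because the inverse image of {"a"} is {1}, which is not one of the four measurable sets of the first space.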
Measure theory started with Lebesgue's thesis (1902), which extended the definition of volume in $\mathbf{R}^N$ to the Borel sets. Radon (1913) made the further step
The joint distribution of finitely many random variables defined on the same probability space is obtained by making x into a vector and specifying $S'$ and $\mathcal{S}'$ correspondingly. A stochastic process is a family of random variables $\{x(t,\cdot), t \in I\}$ from some probability space $(S, \mathcal{S}, P)$, into a state space $(S', \mathcal{S}')$. The set $I$ is the index set of the process. Thus a stochastic process defines a function of two variables, $(t, s) \mapsto x(t, s)$, from $I \times S$ into the state space. The function $x(t,\cdot)$ from $S$ into $S'$ is the tth random variable of the process; the function $x(\cdot, s)$ from $I$ into $S'$ is the sth sample function, or sample path, or sample sequence if $I$ is a sequence.
Borel (1909) pointed out that in the dyadic representation $x = .x_1 x_2 \ldots$ of a number x between 0 and 1, in which each digit $x_j$ is either 0 or 1, these digits are functions of x, and if the interval [0,1] is provided with Lebesgue measure, a probability measure on this interval, these functions miraculously become random
little immediate influence because it was published in a journal which was not widely distributed. It was an aspect of his genius that he carried out his Brownian motion research then and later without knowledge of the slang and some of the useful elementary mathematical techniques of probability theory.
Steinhaus (1930) demonstrated that classical arguments to derive standard probability theorems could be placed in a rigorous context by taking Lebesgue measure on a linear interval of length 1 as the basic probability measure, interpreting random variables as Lebesgue measurable functions on this interval, and expectations of random variables as their integrals. No new proofs were required; all that was required was a proper translation of the classical terminology into his context. If this were all mathematization of probability by measure theory had to offer, the scorn of rigorous mathematics expressed by some nonmathematicians would be justified.
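The translation Steinhaus described can be tried on the dyadic digits that Borel considered: each digit is a function on the unit interval, and its expectation is its Lebesgue integral. The sketch below is an illustration under stated assumptions, not a reconstruction of either author's argument. The names `digit` and `integrate_dyadic` are hypothetical, and the integrator is exact only for functions that are constant on each dyadic interval of the chosen depth.

```python
from fractions import Fraction

def digit(j, x):
    """The j-th dyadic digit (j = 1, 2, ...) of x in [0, 1), as a function of x."""
    return int(x * 2**j) % 2

def integrate_dyadic(f, depth):
    """Exact Lebesgue integral over [0, 1) of a function f that is constant on
    every dyadic interval [k / 2**depth, (k + 1) / 2**depth)."""
    width = Fraction(1, 2**depth)
    # evaluate f at the left endpoint of each interval and weight by its length
    return sum(f(k * width) * width for k in range(2**depth))

if __name__ == "__main__":
    # expectation of the first digit: the model counterpart of one fair toss
    print(integrate_dyadic(lambda x: digit(1, x), 3))
    # expectation of a product of two distinct digits
    print(integrate_dyadic(lambda x: digit(1, x) * digit(3, x), 3))
    # expected number of ones among the first three digits
    print(integrate_dyadic(lambda x: sum(digit(j, x) for j in (1, 2, 3)), 3))
```

The product of the first and third digits integrates to 1/4, the product of their separate integrals, which is the factorization expected of independent fair tosses in this model.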
8 Kolmogorov's 1933 monograph. Kolmogorov (1933) constructed the following mathematical basis for probability theory.
(a) The context of mathematical probability is a probability space $(S, \mathcal{S}, P)$. The sets in $\mathcal{S}$ are the mathematical counterparts of real world events; the points of $S$ are counterparts of elementary events, that is of individual (possible) real world observations.
(b) Random variables on $(S, \mathcal{S}, P)$ are the counterparts of functions of real world observations. Suppose $\{x(t,\cdot), t \in I\}$ is a stochastic process on a probability space $(S, \mathcal{S}, P)$, with state space $S'$. A set of n of the process random variables has a probability distribution on $S'^n$. Such finite dimensional distributions are mutually compatible in the sense that if $1 \le m < n$, the joint distribution of $x(t_1,\cdot), \ldots, x(t_m,\cdot)$ on $S'^m$ is the m-dimensional distribution induced by the n-dimensional distribution of $x(t_1,\cdot), \ldots, x(t_n,\cdot)$ on $S'^n$.
(c) Conversely, Kolmogorov proved that given an arbitrary index set $I$, and a suitably restricted measurable space $(S', \mathcal{S}')$ (for example, the measurable space can be a complete separable metric space together with the $\sigma$ algebra of its Borel sets) and a mutually compatible set of distributions on $S'^n$, for integers $n \ge 1$, indexed by the finite subsets of $I$, there is a probability space and a stochastic process $\{x(t,\cdot), t \in I\}$ defined on it, with state space $S'$, with the assigned joint random variable distributions. To prove this result he constructed a probability measure on a $\sigma$ algebra of subsets of the product space $S'^I$, the space of all functions from $I$ into $S'$, and obtained the required random variables as the coordinate functions of $S'^I$.
(e) The expectation of a numerically valued integrable random variable is its integral with respect to the given probability measure.
(f) The classical definition of the conditional probability of an event (measurable set) A, given an event B of strictly positive probability, is $P(A \cap B)/P(B)$. In this way, for fixed B, new probabilities are obtained, and expectations of random variables for given B are computed in terms of these new conditional probabilities. More generally, given an arbitrary collection of random variables, conditional probabilities and expectations relative to given values of those random variables are needed, functions of the values assigned to the conditioning random variables. If $(S, \mathcal{S}, P)$ is a probability space, and if a collection of random variables is given, let $\mathbb{F}$ be the smallest sub $\sigma$ algebra of $\mathcal{S}$ relative to which all the given random variables are measurable.
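The compatibility condition in (b) and the classical conditional probability in (f) can be checked by hand in a finite case. The sketch below uses a hypothetical family of distributions, those of a symmetric two-state chain started uniformly, which is an assumption made for illustration and does not come from Kolmogorov's monograph. The names `markov_joint`, `induced`, and `conditional` are likewise illustrative.

```python
from fractions import Fraction
from itertools import product

STATES = (0, 1)   # hypothetical two-point state space

def markov_joint(n, stay=Fraction(3, 4)):
    """Joint distribution of n successive coordinates of a toy symmetric
    two-state chain with uniform initial distribution (illustrative only)."""
    joint = {}
    for path in product(STATES, repeat=n):
        p = Fraction(1, 2)                              # uniform start
        for prev, cur in zip(path, path[1:]):
            p *= stay if prev == cur else 1 - stay      # one-step transition
        joint[path] = p
    return joint

def induced(joint, keep):
    """Distribution induced on the coordinates whose positions are in `keep`."""
    out = {}
    for path, p in joint.items():
        key = tuple(path[i] for i in keep)
        out[key] = out.get(key, 0) + p
    return out

def conditional(joint, event_a, event_b):
    """Classical conditional probability P(A and B) / P(B).
    Requires P(B) > 0; otherwise the division fails."""
    p_b = sum(p for path, p in joint.items() if event_b(path))
    p_ab = sum(p for path, p in joint.items() if event_a(path) and event_b(path))
    return p_ab / p_b

if __name__ == "__main__":
    joint3 = markov_joint(3)
    # compatibility: the 3-dimensional distribution induces the 2-dimensional one
    print(induced(joint3, (0, 1)) == markov_joint(2))
    # probability that the second coordinate repeats the first, given a start at 0
    print(conditional(joint3, lambda w: w[1] == w[0], lambda w: w[0] == 0))
```

Only the finite case is touched here; the substance of (c) is the passage from such compatible finite dimensional distributions to a measure on the whole product space.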
Protter: By developing his integral in 1944 with stochastic processes as integrands, Ito was able to study multidimensional diffusions with purely probabilistic techniques, an improvement over the analytic methods of Feller.
The following remark on the convergence of a sum of orthogonal functions illustrates the difficulty in separating (mathematical) probability from the rest of analysis. The measure space is a probability space, but with trivial changes the discussion is valid for any finite measure space.
If x is an orthogonal sequence of functions, on a probability measure space, and if $x_n^2$ has integral $\sigma_n^2$, then (Riesz-Fischer) $\sum x_n$ converges in the mean if
$$\sum \sigma_n^2 < +\infty, \qquad (13.1)$$
and converges almost everywhere if
$$\sum \sigma_n^2 \log^2 n < +\infty, \qquad (13.2)$$
or (Lévy, 1937) the condition (13.1) is kept but the orthogonality condition is strengthened to the condition in Section 12.
The reader should judge which of these results is measure theoretic and which is probabilistic, whether there is any point in evicting mathematical probability from analysis, and if so whether measure theory should also be evicted.
101 West Windsor Road, Apt. 1104
Urbana, Illinois 61801-6663
doob@symcom.math.uiuc.edu