
SPRINGER TEXTS IN STATISTICS

JIM PITMAN

PROBABILITY
Instructor's Manual

Springer-Verlag
New York Berlin Heidelberg London Paris
Tokyo Hong Kong Barcelona Budapest
© 1993 Springer-Verlag New York, Inc.
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher: Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

Section 1.1

1. a) 2/3 b) 66.67% c) 0.6667 d) 4/7 e) 57.14% f) 0.5714

2. a) 7/10 = 0.7 b) 4/10 = 0.4 c) 4/10 = 0.4


3. a) If the tickets are drawn with replacement, then, as in Example 3, there are n^2 equally likely
outcomes. There is just one pair in which the first number is 1 and the second number is 2, so
P(first ticket is 1 and second ticket is 2) = 1/n^2.
b) The event (the numbers on the two tickets are consecutive integers) consists of n − 1 outcomes:
(1, 2), (2, 3), ..., (n − 1, n). So its probability is (n − 1)/n^2.
c) Same as Problem 3 of Example 3. Answer: (1 − 1/n)/2
d) If the draws are made without replacement, then there are only n^2 − n equally likely possible
outcomes, since we have to exclude the outcomes (1, 1), (2, 2), ..., (n, n). So replace the
denominators in a) through c) by n(n − 1).
4. a) 2/38
b) 1 - (2/38) = 36/38
c) 1, since P(both win) = 0.
5. a) #(total) = 52 x 51 = 2652 possibilities.
b) #(first card ace)= 4 x 51 = 204. Thus P(first card ace) = 4/52 = 1/13.
c) This will be exactly the same calculation as in part b) -just substitute "second" for "first".
Thus, P(second card ace) = 1/13. Because you have all possible ordered pairs of cards, any
probability statement concerning the first card by itself must also be true for the second card
by itself.
d) P(both aces) = #(both aces)/#(total) = (4 × 3)/2652 = 1/221.
e) P(at least one ace) = P(first card ace) + P(second card ace) − P(both cards aces)
= 1/13 + 1/13 − 1/221 = 33/221.
6. a) 52 × 52 = 2704
b) (52 × 4)/(52 × 52) = 1/13
c) Same as b)
d) (4 × 4)/(52 × 52) = 1/169
e) P(at least one ace) = P(first card ace) + P(second card ace) − P(both cards aces)
= 1/13 + 1/13 − 1/169 = 25/169.

7. a) P(maximum ≤ 2) = P(both dice ≤ 2) = 4/36 = 1/9
b) P(maximum ≤ 3) = P(both dice ≤ 3) = 9/36 = 1/4
c), d) As in b), P(maximum ≤ x) = x^2/36, so P(maximum = x) = P(maximum ≤ x) − P(maximum ≤ x − 1) = (2x − 1)/36
for x = 1, ..., 6: the probabilities of the values 1 through 6 are 1/36, 3/36, 5/36, 7/36, 9/36, 11/36.
e) Since this covers all the possible outcomes of the experiment, and these events are mutually
exclusive, you should expect P(1) + P(2) + P(3) + P(4) + P(5) + P(6) = 1.
8. a)–d) As in Example 3, the outcome space consists of n^2 equally likely pairs of numbers, each number
between 1 and n. The event (the maximum of the two numbers is less than or equal to x) is
represented by the set of pairs having both entries less than or equal to x. There are x^2 possible
pairs of this type, so for x = 0 to n: P(maximum ≤ x) = x^2/n^2, and for x = 1 to n
P(x) = P(maximum is exactly x) = P(maximum ≤ x) − P(maximum ≤ x − 1) = (2x − 1)/n^2.
e) As in the previous exercise, Σ_{x=1}^n P(x) = 1.

Remark. It follows that Σ_{x=1}^n (2x − 1) = n^2. In other words, the sum of the first n odd numbers
is n^2, a fact which you can check in other ways.

9. 1/11. 1/6

10. Use the formula: House percentage = (1 − P(win) × (r_pay + 1)) × 100%, where the payoff odds are r_pay to 1.

Play   Chance of win   Payoff odds   House percentage
A      18/38           1 to 1        5.26%
B      12/38           2 to 1        5.26%
C      12/38           2 to 1        5.26%
D      6/38            5 to 1        5.26%
E      5/38            6 to 1        7.89%
F      4/38            8 to 1        5.26%
G      3/38            11 to 1       5.26%
H      2/38            17 to 1       5.26%
I      1/38            35 to 1       5.26%

11. Call the event A. By definition of fair odds, we have r_fair = (1 − P(A))/P(A). Solve for P(A) =
1/(r_fair + 1) and substitute in the formula for the house percentage:
House percentage = (1 − P(A)(r_pay + 1)) × 100% = (1 − (r_pay + 1)/(r_fair + 1)) × 100%.
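
As a quick numerical check of this formula, here is a minimal Python sketch (the helper name house_percentage is ours, not from the text) that reproduces the table in Problem 10:

    def house_percentage(p_win, r_pay):
        # House percentage for a bet won with probability p_win and
        # paid at r_pay to 1, using the formula from Problem 10.
        return (1 - p_win * (r_pay + 1)) * 100

    # (chance of win, payoff odds) pairs from the table above
    bets = [(18/38, 1), (12/38, 2), (6/38, 5), (5/38, 6), (4/38, 8), (1/38, 35)]
    for chance, odds in bets:
        print(round(house_percentage(chance, odds), 2))
    # prints 5.26 for every bet except (5/38, 6 to 1), which prints 7.89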


Section 1.2

1. The relative frequency interpretation makes no sense here. Presumably, what is meant is "more likely
than not in the subjective opinion of the judge".

2. Less than 1/100. Since r_pay < r_fair (we assume the bookmaker wants to make a profit), it follows
that
P(win) = 1/(r_fair + 1) < 1/(r_pay + 1) = 1/100.
3. a) Let r_i to 1 be the fair odds against horse i winning, say in the opinion of the bookmaker, and
let p_i = 1/(r_i + 1) be the probability of horse i winning. If the payoff odds on horse i are r_i^pay
to 1, then (presuming the bookmaker wants a profit) r_i^pay < r_i, so
E = Σ_{i=1}^{10} 1/(r_i^pay + 1) > Σ_{i=1}^{10} 1/(r_i + 1) = Σ_{i=1}^{10} p_i = 1.

b) Yes, if the corresponding sum E is less than 1. Bet a total of $B on the horses, with proportion
1/[(r_i^pay + 1)E] of this total on horse i. If horse i wins, you get back an amount
$(r_i^pay + 1) × B/[(r_i^pay + 1)E] = $B/E > $B. So, no matter which
horse wins, you get back more than you bet. A golden
opportunity rarely found at the races!

4. a) Overall gain = ($8 × 10) − ($1 × 90) = −$10.
b) Average gain per bet = (overall gain)/(# bets) = −$10/100 = −10 cents.
c) Let r# = (# losses)/(# wins) be the observed odds against winning. Then
Gambler's average gain per bet = (overall gain)/(# bets)
= [r_pay × (# wins) − 1 × (# losses)]/(# wins + # losses) = (r_pay − r#)/(1 + r#).
The house's average financial gain per bet is therefore (r# − r_pay)/(r# + 1) over this sequence of
bets. Recall that if the fair (chance) odds against the gambler winning are r_fair to 1, then the
house percentage is (r_fair − r_pay)/(r_fair + 1) × 100%. As the number of bets made increases, we
should expect (according to the frequency interpretation of probability) r# to approach r_fair,
and therefore the house's average financial gain per bet should approach the house percentage.
So we can interpret the house percentage as the long run average financial gain on a one dollar
bet.


Section 1.3

1. The cake is (2 x 2) +2+1= 7 times as large as the piece your neighbor gets, so you get 4/7 of the
cake.
2. a) (AB^c) ∪ (A^cB)
b) A^cB^cC^c
c) Exactly one: (AB^cC^c) ∪ (A^cBC^c) ∪ (A^cB^cC)
Exactly two: (ABC^c) ∪ (AB^cC) ∪ (A^cBC)
Three: ABC
3. Take Ω = {1, 2, ..., 500}.

a) {17, 93, 202}
b) {17, 93, 202, 4, 101, 102, 398}^c
c) {16, 18, 92, 94, 201, 203}
4. a) {0,1}
b) Yes: {1}
c) No. This is a subset of the event {1}, but it is not identical to {1}, because the event (first toss
tails, second toss heads) is also a subset of {1}.
d) Yes: {1, 2}
5. a) first coin lands heads
b) second coin lands tails
c) same as a)
d) at least two heads
e) exactly two tails
f) first two coins land the same way.
6. a)
outcome       1     2     4     6     7     8
probability   1/10  1/5   3/10  1/5   1/10  1/10
b)
outcome       1     2     3
probability   3/5   1/5   1/5
7. a) As in Example 3, using the addition rule and the symmetry assumption, P(1) = P(6) = p/2
and P(2) = P(3) = P(4) = P(5) = (1 − p)/4.
b) Use the additivity property: P(3 or 4 or 5 or 6) = 3(1 − p)/4 + p/2 = (3 − p)/4.

8. a) P(A ∪ B) = P(A) + P(B) − P(AB) = 0.6 + 0.4 − 0.2 = 0.8
b) P(A^c) = 1 − P(A) = 0.4
c) Similarly P(B^c) = 0.6
d) P(A^cB) = P(B) − P(AB) = 0.4 − 0.2 = 0.2
e) P(A ∪ B^c) = P[(A^cB)^c] = 1 − 0.2 = 0.8
f) P(A^cB^c) = P[(A ∪ B)^c] = 1 − 0.8 = 0.2
9. a) 0.9 b) 1 c) 0.1
10. a) P(exactly 2 of A, B, C) = P(ABC^c) + P(AB^cC) + P(A^cBC)
= (P(AB) − P(ABC)) + (P(AC) − P(ABC)) + (P(BC) − P(ABC))
= P(AB) + P(AC) + P(BC) − 3P(ABC)

b) P(exactly 1 of A, B, C) = P(AB^cC^c) + P(A^cBC^c) + P(A^cB^cC)
= P(A) − P(A ∩ (B ∪ C)) + P(B) − P(B ∩ (A ∪ C)) + P(C) − P(C ∩ (A ∪ B))
= P(A) + P(B) + P(C) − 2(P(AB) + P(AC) + P(BC)) + 3P(ABC)

c) P(exactly none) = 1 − P(A ∪ B ∪ C)
= 1 − (P(A) + P(B) + P(C)) + (P(AB) + P(AC) + P(BC)) − P(ABC)
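
For readers who like to check such identities numerically, here is a minimal Monte Carlo sketch of the "exactly 2 of A, B, C" formula; the three events are arbitrary illustrative choices, not from the exercise:

    import random

    def trial():
        u = random.random()
        # three overlapping events of a uniform draw on [0, 1)
        return u < 0.5, 0.3 < u < 0.8, 0.6 < u < 0.9

    n = 10**6
    freq = sum((a + b + c) == 2 for a, b, c in (trial() for _ in range(n))) / n
    # P(AB) + P(AC) + P(BC) - 3 P(ABC): AB = (.3,.5), AC = empty, BC = (.6,.8)
    formula = 0.2 + 0.0 + 0.2 - 3 * 0.0
    print(freq, formula)   # both close to 0.4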
11. By inclusion-exclusion for n = 2,
P(A ∪ B ∪ C) = P[(A ∪ B) ∪ C] = P(A ∪ B) + P(C) − P[(A ∪ B)C]
= P(A) + P(B) − P(AB) + P(C) − P[(A ∪ B)C].
Now (A ∪ B)C = (AC) ∪ (BC), which is clear from a Venn diagram. So by another application of
inclusion-exclusion for n = 2,
P[(A ∪ B)C] = P[(AC) ∪ (BC)] = P(AC) + P(BC) − P[(AC)(BC)]
= P(AC) + P(BC) − P(ABC).
Substitute this into the previous expression.
12. The formula has been proved for n = 1, 2, 3. Proceed by mathematical induction. We should first prove
(∪_{i=1}^n A_i) ∩ A_{n+1} = ∪_{i=1}^n (A_i A_{n+1}),
which is easily checked, e.g. by another mathematical induction using the n = 2 case: (A ∪ B)C =
(AC) ∪ (BC). Next, assume that our hypothesis is true for n. Then
P(∪_{i=1}^{n+1} A_i) = P[(∪_{i=1}^n A_i) ∪ A_{n+1}] = P(∪_{i=1}^n A_i) + P(A_{n+1}) − P[∪_{i=1}^n (A_i A_{n+1})]
(by inclusion-exclusion for n = 2)
= [Σ_{i≤n} P(A_i) − Σ_{i<j≤n} P(A_iA_j) + Σ_{i<j<k≤n} P(A_iA_jA_k) − ···] + P(A_{n+1})
− [Σ_{i≤n} P(A_iA_{n+1}) − Σ_{i<j≤n} P(A_iA_jA_{n+1}) + ···]
(by applying the induction hypothesis to the first and third terms)
= Σ_{i=1}^{n+1} P(A_i) − Σ_{i<j≤n+1} P(A_iA_j) + Σ_{i<j<k≤n+1} P(A_iA_jA_k) − ···
(by regrouping terms). So the claim holds for n + 1.


13. For n = 2, P(A_1 ∪ A_2) = P(A_1) + P(A_2) − P(A_1 ∩ A_2) ≤ P(A_1) + P(A_2). Now use induction by
assuming the claim true for n:
P(∪_{i=1}^{n+1} A_i) = P[(∪_{i=1}^n A_i) ∪ A_{n+1}] ≤ P(∪_{i=1}^n A_i) + P(A_{n+1}) ≤ Σ_{i=1}^n P(A_i) + P(A_{n+1}).
14. Use the identity P(A ∪ B) = P(A) + P(B) − P(AB) and the fact that P(A ∪ B) ≤ 1.
15. Let A_i = B_i^c. Then
P(B_1B_2···B_n) = P(∩_{i=1}^n A_i^c) = P[(∪_{i=1}^n A_i)^c] = 1 − P(∪_{i=1}^n A_i)
≥ 1 − Σ_{i=1}^n P(A_i) = 1 − Σ_{i=1}^n (1 − P(B_i)) = Σ_{i=1}^n P(B_i) − (n − 1).


16. a) Use mathematical induction. The claim holds for n = 1 and 2. Suppose the claim holds for
unions of n sets. To show the claim for n + 1 sets, use inclusion-exclusion for 2 sets to write
P(∪_{i=1}^{n+1} A_i) = P(∪_{i=1}^n A_i) + P(A_{n+1}) − P[(∪_{i=1}^n A_i)A_{n+1}].
Use the induction hypothesis to bound the first term:
P(∪_{i=1}^n A_i) ≥ Σ_{i≤n} P(A_i) − Σ_{i<j≤n} P(A_iA_j),
and use Boole's inequality to bound the last term:
P[(∪_{i=1}^n A_i)A_{n+1}] = P[∪_{i=1}^n (A_iA_{n+1})] ≤ Σ_{i≤n} P(A_iA_{n+1}).
Regroup to see the claim holds for n + 1.
b) Define for m ≥ 1 and sets A_1, ..., A_n
S(m; A_1, ..., A_n) = Σ_i P(A_i) − Σ_{i<j} P(A_iA_j) + ··· + (−1)^{m+1} Σ_{i_1<···<i_m} P(A_{i_1}···A_{i_m}).
Claim: For each m ≥ 1 and each n ≥ 1: If A_1, ..., A_n are n sets then
(−1)^m (P(∪_{i=1}^n A_i) − S(m; A_1, ..., A_n)) ≥ 0.    (*)
Proof: Use mathematical induction. The claim holds for m = 1 (Boole's inequality, Exercise 13)
and for m = 2 (by a)). Say the claim holds for m. Now show that this implies the claim holds for m + 1, i.e.,
that:
Subclaim: (m is fixed) For each n ≥ 1: If A_1, ..., A_n are n sets then
(−1)^{m+1} (P(∪_{i=1}^n A_i) − S(m + 1; A_1, ..., A_n)) ≥ 0.
Proof of subclaim: Again use mathematical induction. This subclaim holds for n = 1, ..., m
because the inclusion-exclusion formula (Exercise 12) shows that the left side equals zero. Suppose
the subclaim holds for n. Now show that this implies the subclaim holds for n + 1, i.e., that
(−1)^{m+1} (P(∪_{i=1}^{n+1} A_i) − S(m + 1; A_1, ..., A_{n+1})) ≥ 0.
To do this, first observe that
S(m + 1; A_1, ..., A_n) + P(A_{n+1}) − S(m; A_1A_{n+1}, ..., A_nA_{n+1}) = S(m + 1; A_1, ..., A_{n+1}).    (**)
Therefore
(−1)^{m+1} (P(∪_1^{n+1} A_i) − S(m + 1; A_1, ..., A_{n+1}))
= (−1)^{m+1} (P(∪_1^n A_i) + P(A_{n+1}) − P(∪_1^n A_iA_{n+1}) − S(m + 1; A_1, ..., A_{n+1}))
(by inclusion-exclusion for 2 sets)
= (−1)^{m+1} {P(∪_1^n A_i) − P(∪_1^n A_iA_{n+1}) + S(m; A_1A_{n+1}, ..., A_nA_{n+1}) − S(m + 1; A_1, ..., A_n)}
(by identity (**))
= (−1)^{m+1} {P(∪_1^n A_i) − S(m + 1; A_1, ..., A_n)}
+ (−1)^m {P(∪_1^n A_iA_{n+1}) − S(m; A_1A_{n+1}, ..., A_nA_{n+1})}.
The first term is nonnegative by the induction hypothesis on n; the second term is nonnegative
by the induction hypothesis on m. So (*) holds, and we're done!
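
A small numerical illustration of these alternating (Bonferroni) bounds, with events taken to be random subsets of a finite uniform space (an illustrative setup, not part of the exercise):

    import random
    from itertools import combinations

    random.seed(0)
    events = [set(random.sample(range(1000), 300)) for _ in range(4)]
    P = lambda s: len(s) / 1000.0

    union = P(set().union(*events))
    S = 0.0
    for m in range(1, 5):
        S += (-1) ** (m + 1) * sum(P(set.intersection(*c))
                                   for c in combinations(events, m))
        # (-1)^m (P(union) - S(m)) >= 0: S(1), S(3) over-shoot the union
        # probability, S(2) under-shoots, and S(4) is exact for 4 events.
        print(m, round(S, 3), round(union, 3), (-1) ** m * (union - S) >= -1e-9)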


Section 1.4

1. Let w denote the proportion of women in the population. Then the proportion r of righthanders is
given by r = .92 × w + .88 × (1 − w) = .88 + .04w.

a) (iii); since w is unknown.
b) (i); since 0 ≤ w ≤ 1 implies .88 ≤ r ≤ .92.
c) (i); .88 + .04 × .5 = .90.
d) (i); solve .88 + .04 × w = .90; you'll get w = .5.
e) (i); if w ≥ .75, then r ≥ .91.
2. Pick a light bulb at random. Let D be the event (the light bulb is defective), and B be the event (the
light bulb is made in city B). Then P(B) = 1/3, P(D|B) = .01, and
P(D^cB) = P(D^c|B)P(B) = [1 − P(D|B)]P(B) = 0.99 × 1/3 = 0.33.


3. P(rain tomorrow | rain today) = P(rain today and tomorrow)/P(rain today) = 30%/40% = 3/4.

4. Call the events A (having probability .1) and B (having probability .3).
a) P(A^cB^c) = P(A^c)P(B^c) = 0.9 × 0.7 = 0.63
b) 1 − P(A^cB^c) = 0.37
c) P(AB^c ∪ A^cB) = P(AB^c) + P(A^cB) = 0.1 × 0.7 + 0.9 × 0.3 = 0.34
5. a) Let U_1 = (urn 1 chosen), U_2 = (urn 2 chosen), B = (black ball chosen), W = (white ball chosen).
The tree diagram has first-level branches U_1 and U_2, each with probability 1/2, and second-level
branches B and W with probabilities 2/5 and 3/5 from U_1, and 4/7 and 3/7 from U_2.
b) P(U_1) = 1/2 = P(U_2); P(W|U_1) = 3/5; P(B|U_1) = 2/5; P(W|U_2) = 3/7; P(B|U_2) = 4/7.
c) P(B) = P(BU_1) + P(BU_2) = P(B|U_1)P(U_1) + P(B|U_2)P(U_2) = 2/5 × 1/2 + 4/7 × 1/2 = 17/35.
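
Here is the same computation in a short Python sketch, assuming the branch probabilities read off the tree above (P(B|U1) = 2/5 and P(B|U2) = 4/7):

    from fractions import Fraction as F

    P_U1, P_U2 = F(1, 2), F(1, 2)
    P_B_U1, P_B_U2 = F(2, 5), F(4, 7)

    P_B = P_B_U1 * P_U1 + P_B_U2 * P_U2   # rule of average conditional probabilities
    print(P_B)                            # 17/35
    print(P_B_U1 * P_U1 / P_B)            # Bayes' rule: P(U1 | B) = 7/17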
6. P(second spade | first black)
= P(first black and second spade)/P(first black)
= [P(first spade and second spade) + P(first club and second spade)]/P(first black)
= [(13/52)(12/51) + (13/52)(13/51)]/(26/52) = 25/102.
Or you may use a symmetry argument as follows:
P(second black | first black) = 25/51,
and P(second spade | first black) = P(second club | first black) by symmetry.
Therefore P(second spade | first black) = 25/102.
Discussion. The frequency interpretation is that over the long run, out of every 102 deals yielding
a black card first, about 25 will yield a spade second.

7. a) P(B) = 0.3
b) P(A ∪ B) = P(A) + P(B) − P(A)·P(B)
⟹ P(B) = [P(A ∪ B) − P(A)]/[1 − P(A)] = 0.3/0.5 = 0.6

8. Assume n cards and all 2n faces are equally likely to show on top; 50% of the cards are
black on one side and white on the other, and 20% are black on both sides.
P(white on bottom | black on top)
= P(white on bottom and black on top)/P(black on top)
= (50% × 1/2)/(50% × 1/2 + 20%) = 5/9.

9. By scheme A, P(student selected is from School 1) = 100/1000 = 1/10, while by scheme B, that
chance is just 1/3, which is the chance that School 1 is selected. So these two schemes are not
probabilistically equivalent.
Consider a particular student x, and suppose she is from school i. The chance that x will be selected
by scheme A is 1/1000. The chance that x will be selected by scheme C is
P(x is selected) = P(School i is selected, and then x) = p_i × 1/(size of class i).
For scheme C to be equivalent to scheme A, this chance should be 1/1000. Therefore,
p_i = (size of class i)/1000. So p_1 = .1, p_2 = .4, p_3 = .5.

10. a) Let A_i be the event that the ith source works.
P(zero work) = P(A_1^cA_2^c) = 0.6 × 0.5 = 0.3
P(exactly one works) = P(A_1A_2^c) + P(A_1^cA_2) = 0.4 × 0.5 + 0.6 × 0.5 = 0.5
P(both work) = P(A_1A_2) = 0.4 × 0.5 = 0.2
b) P(enough power) = 0.6 × P(exactly one works) + 1 × P(both work) = 0.6 × 0.5 + 1 × 0.2 = 0.5

11. Let B_1 = (firstborn is a boy), G_1 = (firstborn is a girl), similarly for B_2 and G_2.

a) P(B_1B_2) = P(B_1B_2 | identical)P(identical) + P(B_1B_2 | fraternal)P(fraternal)
= (1/2)p + (1/4)(1 − p) = (1 + p)/4.
b) Similarly, P(B_1G_2) = 0 × p + (1/4)(1 − p) = (1 − p)/4, and P(G_1G_2) = (1/2)p + (1/4)(1 − p) = (1 + p)/4.
c) Note that the chance that the firstborn is a boy is 1/2 whether the twins are identical or fraternal,
so P(B_1) = 1/2. Similarly P(B_2) = P(G_1) = P(G_2) = 1/2. So
P(G_2 | B_1) = P(B_1G_2)/P(B_1) = [(1/4)(1 − p)]/(1/2) = (1/2)(1 − p).
d) P(G_2 | G_1) = P(G_1G_2)/P(G_1) = [(1/4)(1 + p)]/(1/2) = (1/2)(1 + p).

12. [P(F) − P(FG)]/[1 − P(G)]


Section 1.5

1. a) P(black) = P(black | odd)P(odd) + P(black | even)P(even) = 1/4 × 1/2 + 1/3 × 1/2 = 7/24.
b) P(even | white) = P(white | even)P(even)/P(white) = (2/3 × 1/2)/(17/24) = 8/17.

2. Suppose the box contains w white and b black balls, n = w + b in all.
a) P(W_2) = P(W_1W_2) + P(B_1W_2) = [w/n × (w − 1)/(n − 1)] + [b/n × w/(n − 1)] = w/n.
b) P(B_1 | W_2) = P(B_1W_2)/P(W_2) = [b/n × w/(n − 1)]/(w/n) = b/(n − 1).
c) Repeat a), with symbols: P(W_2) = w/n, the same as P(W_1).
3. Pick a chip at random. Let B = (chip is bad), let A = (chip passes the cheap test). Then P(B) = .2,
P(A|B) = .1, and P(A|B^c) = 1, which imply P(B^c) = .8 and P(A^c|B) = .9.
The tree probabilities are: P(B, pass) = .02, P(B, fail) = .18, P(B^c, pass) = .80.

a) P(B^c|A) = P(A|B^c)P(B^c)/[P(A|B^c)P(B^c) + P(A|B)P(B)] = 0.80/(0.02 + 0.80) = 40/41.
Or use Bayes' rule for odds.
b) P(B|A) = 1 − P(B^c|A) = 1/41.
4. a) P(T_0|R_1) = P(R_1|T_0)P(T_0)/[P(R_1|T_0)P(T_0) + P(R_1|T_1)P(T_1)]
= (.01)(.5)/[(.01)(.5) + (.98)(.5)] = 1/99
b) (.01)(.2)/[(.01)(.2) + (.98)(.8)] = 1/393
c) P(error in transmission) = P(R_0T_1 ∪ R_1T_0)
= P(R_0T_1) + P(R_1T_0)
= P(R_0|T_1)P(T_1) + P(R_1|T_0)P(T_0)
= (.02)(.5) + (.01)(.5) = 3/200
d) (.02)(.8) + (.01)(.2) = 9/500

5. Let D denote the event (person has the disease), + denote the event (person is diagnosed as having the
disease), and − denote the event (person is diagnosed as healthy). Then P(D) = .01, P(+|D^c) = .05,
P(−|D) = .2, which in turn imply that P(D^c) = .99, P(−|D^c) = .95, P(+|D) = .8.
The tree probabilities are: P(D, +) = .008, P(D, −) = .002, P(D^c, +) = .0495, P(D^c, −) = .9405.

a) P(+) = P(+|D)P(D) + P(+|D^c)P(D^c) = .0575.
b) P(− ∩ D) = P(−|D)P(D) = .002
c) P(− ∩ D^c) = P(−|D^c)P(D^c) = .9405
d) P(D|+) = P(+|D)P(D)/P(+) = .008/.0575 = 16/115.
Or we may argue using odds ratios, as follows: the prior odds of D to D^c are 1 to 99; the
likelihood ratio is P(+|D)/P(+|D^c) = 0.8/0.05 = 16; so the posterior odds are 16 to 99, and hence
P(D|+) = 16/(99 + 16) = 16/115.
e) Yes. See the explanation after Example 3.
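
The odds-form calculation in d) is easy to script; a minimal sketch (the variable names are ours, not from the text):

    p_D = 0.01
    sens = 0.8          # P(+|D)
    false_pos = 0.05    # P(+|Dc)

    prior_odds = p_D / (1 - p_D)            # 1 to 99 against the disease
    likelihood_ratio = sens / false_pos     # 16
    post_odds = prior_odds * likelihood_ratio
    print(post_odds / (1 + post_odds))      # P(D|+) = 16/115 = 0.139...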

6. a) The experimenter is assuming that before the experiment, H_1, H_2, and H_3 are equally likely.
That is, prior probabilities are given by P(H_1) = P(H_2) = P(H_3) = 1/3.
b) No, because the above assumption is being made. Since the prior probabilities are subjective,
so are the posterior probabilities.
c) Prior probabilities: P(H_1) = .5, P(H_2) = .45, P(H_3) = .05.
Likelihoods of A: P(A|H_1) = .1, P(A|H_2) = .01, P(A|H_3) = .39.
[Now all these probabilities have a long run frequency interpretation, so the posterior probabilities
will as well.]
Posterior probabilities are proportional to: .05, .0045, .0195. So H_3 is no longer the most likely
hypothesis; H_1 is, and P(H_3|A) = .0195/(.05 + .0045 + .0195) = .263.
7. a) As in Example 1:

i                      1      2      3
P(Box i and white)   1/6    2/9    1/4
P(Box i and black)   1/6    1/9    1/12
P(Box i | white)     6/23   8/23   9/23
P(Box i | black)     6/13   4/13   3/13

Since P(Box i | white) is largest when i = 3 and P(Box i | black) is largest when i = 1, the
strategy is to guess Box 3 when a white ball is drawn and guess Box 1 when a black ball is
drawn. The probability of guessing correctly is
P(Box 3 and white) + P(Box 1 and black) = 1/4 + 1/6 = 5/12.

b) Suppose your strategy is to guess Box i with probability p_i (i = 1, 2, 3; p_1 + p_2 + p_3 = 1)
whenever a white ball is drawn, and suppose I am picking each box with probability 1/3. Then
in cases where a white ball is drawn, the probability that you guess correctly is
p_1 × 6/23 + p_2 × 8/23 + p_3 × 9/23 = 9/23 − p_1 × 3/23 − p_2 × 1/23.
Clearly the probability of your guessing correctly is greatest when p_3 = 1; that is, when you
guess Box 3 every time that you see a white ball. A similar argument for the case where a black
ball is drawn shows that the probability of your guessing correctly is greatest when you guess
Box 1 every time that you see a black ball.
c) Here P(Box 1) = 1/2, P(Box 2) = 1/4, P(Box 3) = 1/4, so that

i                      1      2      3
P(Box i and white)   1/4    1/6    3/16
P(Box i and black)   1/4    1/12   1/16
P(Box i | white)     12/29  8/29   9/29
P(Box i | black)     12/19  4/19   3/19

If you continue to use the strategy found in a), i.e. guess Box 3 if a white ball is seen, and guess
Box 1 if a black ball is seen, then the probability of guessing correctly is
P(Box 3 and white) + P(Box 1 and black) = 3/16 + 4/16 = 7/16.


d) If you are convinced that I am using either the (1/3, 1/3, 1/3) strategy or the (1/2, 1/4,
1/4) strategy, then you can decide which one it is by observing the long-run proportion of
trials that your guesses are correct. If I am using the (1/3, 1/3, 1/3) strategy then, by the
frequency interpretation of probability, the proportion of trials that you guess correctly should
be approximately 5/12, while if I am using the (1/2, 1/4, 1/4) strategy then this proportion
should be closer to 7/16. If I am in fact using the (1/2, 1/4, 1/4) picking strategy, you can
improve upon the guessing strategy found in a) as follows: Since P(Box i I white) is largest
when i = 1 and P(Box i I black) is largest when i = 1, the strategy is to always guess Box 1.
The probability of guessing correctly is then
• 1
P(Box 1) = .
2
Remark: If you're not sure at all what strategy I am using, you can determine it approximately,
as follows: Suppose I am using a (p1, P2, PJ) strategy to pick the boxes, and suppose you decide
that you will always pick (say) Box 1 when you see a white ball, Box 2 when you see a black ball.
It is possible for you to determine my strategy, provided you can keep track of the proportion
of trials where a white ball was chosen from Box l and the proportion of trials where a black
ball was chosen from Box 2. (This you can do, because you can see the color of the ball drawn,
and you are told whether your guess is correct.) By the frequency interpretation of probability,
these two proportions should in the long run approximate the probabilities

P(Box 1 and white) = (1/2)p_1 and P(Box 2 and black) = (1/3)p_2,
respectively. Thus you should be able to determine (approximately) the values p_1 and p_2 (and
hence p_3).

8. a) The prior probabilities (of the boxes that I pick) are as follows:

i        1      2      3
prior   6/23   9/23   8/23

Use Bayes' rule, P(Box i | black) proportional to P(Box i) × P(black | Box i), to determine the
posterior probabilities:

i                    1     2     3
P(Box i | black)    3/8   3/8   1/4

So you should guess Box 1 or 2. For either choice, P(you guess correctly | black) = 3/8.
b) Use Bayes' rule, P(Box i | white) = P(Box i and white)/P(white), to determine the
posterior probabilities:

i                    1     2     3
P(Box i | white)    1/5   2/5   2/5

[You can use P(white) = 1 − P(black) = 15/23.] So you should guess Box 2 or 3. For either choice,
P(you guess correctly | white) = 2/5.
c) P(guess correctly)
= P(guess correctly | black)P(black) + P(guess correctly | white)P(white)
= (3/8)(8/23) + (2/5)(15/23) = 9/23.
This is your chance of winning, no matter which of the two best choices you make in each case.
For instance, you could simply always guess Box 2, in which case the event of a win for you is
the same as the event that I pick Box 2.
To see why the probability of your guessing correctly is at most 9/23 whatever your strategy may
be, suppose that your strategy whenever a black ball is drawn is to guess Box i with probability
p_i. Then
P(guess correctly | black)
= p_1 P(Box 1 | black) + p_2 P(Box 2 | black) + p_3 P(Box 3 | black)
= (3p_1 + 3p_2 + 2p_3)/8 ≤ 3/8.
Similarly, whatever your strategy whenever a white ball is drawn,
P(guess correctly | white) ≤ 2/5. Therefore
P(guess correctly)
= P(guess correctly | black)P(black) + P(guess correctly | white)P(white)
≤ (3/8)(8/23) + (2/5)(15/23) = 9/23.
d) Suppose that on a black ball you guess Box 1 or Box 2 with probability 1/2 each, and on a
white ball you guess Box 2 or Box 3 with probability 1/2 each. Then
P(guess correctly | Box 1)
= P(guess 1 | black & Box 1)P(black | Box 1) + P(guess 1 | white & Box 1)P(white | Box 1)
= (1/2)(1/2) + 0 × (1/2) = 1/4.
Similarly P(guess correctly | Box 2) = (1/2)(1/3) + (1/2)(2/3) = 1/2, and
P(guess correctly | Box 3) = 0 × (1/4) + (1/2)(3/4) = 3/8.


Section 1.6

1. This is just like the birthday problem:
P(2 or more under same sign) = 1 − P(all different signs).
n = 2: chance = 1 − 11/12 = 1/12 < 1/2
n = 3: chance = 1 − (11/12)(10/12) = 17/72 < 1/2
n = 4: chance = 1 − (11/12)(10/12)(9/12) = 123/288 < 1/2
n = 5: chance = 1 − (11/12)(10/12)(9/12)(8/12) = 178/288 > 1/2

So, the answer is 5.
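
The table is quickly reproduced in Python; a minimal sketch:

    def p_shared_sign(n, signs=12):
        # chance that at least two of n people fall under the same sign
        p_all_different = 1.0
        for k in range(n):
            p_all_different *= (signs - k) / signs
        return 1 - p_all_different

    for n in range(2, 6):
        print(n, round(p_shared_sign(n), 4))
    # n = 5 is the first value exceeding 1/2 (about 0.6181)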

2. Assume the successive hits are independent, each occurring with chance 0.3.

a) 1- (0.7) 2 = 0.51
b) 1- (0.7) 3 = 0.657
c) 1 − (0.7)^n

3. a) P(at least two H | at least one H) = P(at least two H)/P(at least one H)
= [3(2/3)^2(1/3) + (2/3)^3]/[1 − (1/3)^3] = (20/27)/(26/27) = .7692.
b) P(exactly one H | at least one H) + P(at least two H | at least one H) = 1,
because (at least one H) = (exactly one H) ∪ (at least two H), and the two events are disjoint.
So P(exactly one H | at least one H) = 1 − .7692 = .2308.

4. a) 1/20 × 9/20 × 1/20 = 9/8000
b) (1/20 × 9/20 × 19/20) + (1/20 × 11/20 × 1/20) + (19/20 × 9/20 × 1/20) = 353/8000
c) P(jackpot) = 3/20 × 3/20 × 1/20 = 9/8000, same as before;
P(two bells) = (3/20 × 3/20 × 19/20) + (3/20 × 17/20 × 1/20) + (17/20 × 3/20 × 1/20) = 273/8000
The chance of the jackpot is the same on both machines, but the 1-9-1 machine encourages you
to play, because you have a better chance of two bells. It will seem that you are "close" to a
jackpot more frequently.

5. a) P(at least one student has the same birthday as mine)
= 1 − P(all n − 1 other students have different birthdays from mine)
= 1 − (364/365)^(n−1)
[My birthday is one particular day: in order for the event to occur, each of the other students
must have been born on one of the remaining 364 days.]
b) Use the argument in Example 3. The desired event has probability at least 1/2 if and only if its
complement has probability at most 1/2. This occurs if and only if (364/365)^(n−1) ≤ 1/2, i.e.
n − 1 ≥ log(1/2)/log(364/365) = 252.65. So n = 254 will do.
c) The difference between this and the standard birthday problem is that a particular birthday
(say January 1st) must occur twice in the class. This is a much stricter requirement than in the
usual birthday problem.

6. a) p_8 = p_9 = p_10 = 0. (You never need to roll more than seven times.)
p_1 = 0
p_2 = 1/6
p_3 = (5/6)·(2/6) (2nd different from first, third the same as first or second)
p_4 = (5/6)·(4/6)·(3/6)
p_5 = (5/6)·(4/6)·(3/6)·(4/6)
p_6 = (5/6)·(4/6)·(3/6)·(2/6)·(5/6)
p_7 = (5/6)·(4/6)·(3/6)·(2/6)·(1/6)·1
b) p_1 + p_2 + ··· + p_10 = 1: you must stop by the seventh roll, and the events determining p_1, p_2, etc.,
are mutually exclusive.
c) Of course you can compute them and add them up. Here's another way. In general, let A_i be
the event that the first i rolls are different; then p_i = P(A_{i−1}) − P(A_i) for i = 2, ..., 7, with
P(A_1) = 1 and P(A_7) = 0. Adding them up, you can easily check that the sum is 1.

7. a) p_3(p_1 + p_2 − p_1p_2) = p_3p_1q_2 + p_3q_1p_2 + p_3p_1p_2.
b) p_4 + P(flows along top) − p_4·P(flows along top), where P(flows along top) was calculated in a).

8. a) P(B_12 and B_23) = P(all three have the same birthday) = (1/365)·(1/365) = (1/365)^2.
P(B_12) = 1/365 = P(B_23).
So P(B_12 and B_23) = (1/365)·(1/365) = P(B_12)P(B_23).
Therefore, B_12 and B_23 are independent!
b) No. If you tell me B_12 and B_23 have occurred, then all three have the same birthday, so B_13
also has occurred. That is, P(B_12 B_23 B_31) = (1/365)^2 ≠ (1/365)^3 = P(B_12)P(B_23)P(B_31).
c) Yes, each pair is independent by the same reason as a).


Chapter 1: Review

1. P(both defective | item picked at random is defective)
= P(both defective)/P(item picked at random is defective) = .3/(.3 + .5 × 1/2) = 6/11

2. From the tree diagram for the two draws,
P(black) = P(white) = 1/2,
P(black, black) = P(white, white) = 1/3,
P(black, white) = P(white, black) = 1/6.
So P(white | at least one of two balls drawn was white) = (1/2)/(1/3 + 1/6 + 1/6) = 3/4.

3. False. Of course P(HHH or TTT) = 1/4. The problem with the reasoning is that while at least two of the
coins must land the same way, which two is not known in advance. Thus given, say, two or more
H's, i.e. HHH, HHT, HTH, or THH, these 4 outcomes are equally likely, so P(HHH or TTT |
at least two H's) = 1/4, not 1/2. Similarly given at least two T's.

4. a) P(black | Box 1) = 3/5 = P(black | Box 2)
and P(red | Box 1) = 2/5 = P(red | Box 2).
Hence P(black) = P(black | Box i) and P(red) = P(red | Box i) for i = 1, 2, and so the color
of the ball is independent of which box is chosen. Or you can check that P(black, Box 1) =
P(black)P(Box 1), etc., from the following table.

         Box 1   Box 2
black    3/10    3/10    3/5
red      1/5     1/5     2/5
         1/2     1/2
b)
         Box 1   Box 2
black    3/10    5/18    26/45
red      1/5     2/9     19/45
         1/2     1/2
Observe that P(black, Box 1) ≠ P(black)P(Box 1), so the color of the ball is not independent of which
box is chosen.

15
Chapter 1: Review

5. For either of the permissible orders in which to attempt the tasks, we have
P(pass test) = P(S_1S_2) + P(S_1^c S_2 S_3),
where S_i denotes the event (task i performed successfully); you don't have to worry about the third
task if you already have performed the first two tasks successfully. For the first order (easy, hard,
easy) the probability of passing the test is therefore
P(pass test) = P(S_1S_2) + P(S_1^c S_2 S_3) = zh + (1 − z)hz = zh(2 − z).
By symmetry, the probability for the second order (hard, easy, hard) is hz(2 − h). Since h < z,
we have 2 − h > 2 − z, so the second order maximizes the probability of passing the test.

6. a) Since P(B) = P(AB) + P(A^cB),
P(A^cB) = P(B) − P(AB) = P(B) − P(A)P(B) = P(B)(1 − P(A)) = P(B)P(A^c)
b) Just reverse the roles of A and B in part a).
c) P(A^cB^c) = P((A ∪ B)^c) = 1 − P(A ∪ B)
= 1 − P(A) − P(B) + P(AB)
= 1 − P(A) − P(B) + P(A)P(B) (by independence of A and B)
= (1 − P(A))(1 − P(B)) = P(A^c)P(B^c)

7. a) The probability that there will be no one in favor of 134 is the probability that the 1st person
doesn't favor 134 and the 2nd person doesn't favor 134 and the 3rd person doesn't favor 134
and the 4th person doesn't favor 134. This is just 20/50 × 19/49 × 18/48 × 17/47 = .021.
b) The probability that at least one person favors 134 is 1 minus the probability that no one favors 134;
but the probability that no one favors 134 was done in part a). Thus the answer is 1 − .021 = .979.
c) The probability that exactly one person favors 134 is (4 choose 1) times the probability that the 1st person
favors and the 2nd person doesn't and the 3rd person doesn't and the 4th person doesn't. This
is 4 × 30/50 × 20/49 × 19/48 × 18/47 = 0.1485.
d) The probability that a majority favor 134 is the probability that three people favor 134 plus the
probability that four people favor 134. This probability is
(4 choose 3) × 30/50 × 29/49 × 28/48 × 20/47 + 30/50 × 29/49 × 28/48 × 27/47 = 0.472.
8. a) 0.08228 b) 0.7785 c) 0.1866

9. a) By independence, P(A and B and C) = P(A)·P(B)·P(C) = 1/3 × 1/4 × 1/5 = 1/60 = .0167.
b) P(A or B or C) = 1 − P(A^cB^cC^c)
= 1 − P(A^c)·P(B^c)·P(C^c) = 1 − 2/3 × 3/4 × 4/5 = 1 − 2/5 = .6.
Or use inclusion-exclusion.
c) P(exactly one of the three events occurs) = P(AB^cC^c) + P(A^cBC^c) + P(A^cB^cC)
= 1/3 × 3/4 × 4/5 + 2/3 × 1/4 × 4/5 + 2/3 × 3/4 × 1/5 = 1/5 + 2/15 + 1/10 = 13/30 = .433.
10. a)
P(same type) = P(both A or both B or both AB or both O)
= P(both A) + P(both B) + P(both AB) + P(both O)
= (.42)^2 + (.10)^2 + (.04)^2 + (.44)^2 = .3816;
P(different types) = 1 − P(same type) = 1 − .3816 = .6184

b) Since P(1) + P(2) + P(3) + P(4) = 1, we need only evaluate three of these; the fourth can be
obtained by subtraction.
P(1) = P(all A or all B or all AB or all O)
= P(all A) + P(all B) + P(all AB) + P(all O)
= (.42)^4 + (.10)^4 + (.04)^4 + (.44)^4 = .0687
P(4) = (# arrangements) × (.42)(.10)(.04)(.44) = (4!)(.0007392) = .01774
P(2) = [(.42)^2(.10)^2 + (.42)^2(.04)^2 + (.42)^2(.44)^2
+ (.10)^2(.04)^2 + (.10)^2(.44)^2 + (.04)^2(.44)^2] × 6
+ [(.42)^3(1 − .42) + (.10)^3(1 − .10)
+ (.04)^3(1 − .04) + (.44)^3(1 − .44)] × 4
= .5973

So P(3) = 1 − .5973 − .0687 − .01774 = .3163.

11. Suppose f of the n coins are fair and b are biased.
P(fair | HT) = P(fair and HT)/P(HT)
= P(HT | fair)P(fair)/[P(HT | fair)P(fair) + P(HT | biased)P(biased)]
= [(1/4)(f/n)]/[(1/4)(f/n) + (2/9)(b/n)] = 9f/(9f + 8b)

12. a) If n ≥ 7, the chance is 0: the die has only six different faces! If n = 1, the chance is 1.
n = 2: chance = 5/6
n = 3: chance = (5/6)·(4/6)
n = 4: chance = (5/6)·(4/6)·(3/6)
n = 5: chance = (5/6)·(4/6)·(3/6)·(2/6)
n = 6: chance = (5/6)·(4/6)·(3/6)·(2/6)·(1/6)

b) 1 − the above: the events in a) and b) are complementary.

13. P(A|B) = P(AB|B) = Σ_{i=1}^n P(AB_i|B) = Σ_{i=1}^n P(AB_i)/P(B)
= Σ_{i=1}^n P(A|B_i)P(B_i)/P(B) = Σ_{i=1}^n P(A|B_i)P(B_i|B), since B_iB = B_i.

14. a) Since the prior probabilities of the various boxes are the same, choose the one with the greatest
likelihood: Box 100.
b) Here are the prior probabilities and likelihoods given a gold coin is drawn:

              Box 1   Box 2   Box 3   ...   Box 98   Box 99   Box 100
prior:        2/150   1/150   2/150   ...   1/150    2/150    1/150
likelihood:   1/100   2/100   3/100   ...   98/100   99/100   100/100

Clearly, the product (prior × likelihood) is greatest for Box 99. So Box 99 is the choice
with the greatest posterior probability given a gold coin is drawn.

15. P(Box 1 | gold) = 2/3


16. a) Student 1 can choose any of the remaining n − 1, student 2 can choose any of the eligible n − 2,
student 3 can choose any of the eligible n − 2 except student 1, student 4 can choose any of the
eligible n − 2 other than students 1 and 2, and so on. So
p_r = [(n − 1)/(n − 1)] × [(n − 2)/(n − 2)] × [(n − 3)/(n − 2)] × [(n − 4)/(n − 2)] × ··· × [(n − r)/(n − 2)].

b) Use the exponential approximation:
log(p_r) = log(1 − 1/(n − 2)) + log(1 − 2/(n − 2)) + ··· + log(1 − (r − 2)/(n − 2))
≈ −1/(n − 2) − 2/(n − 2) − ··· − (r − 2)/(n − 2)    (since log(1 + z) ≈ z for small z)
= −(1/(n − 2)) × (r − 2)(r − 1)/2.

So for n = 300 and r = 30,
log(p_30) ≈ −(28 × 29)/(298 × 2) = −1.362416,
so p_30 ≈ e^(−1.362416) = .256041.

17. False. They presumably send the same letter to 10,000 people, including the name of the winner, the
name of the person to whom the letter is sent, and the name of one other person. In that case, I learn
nothing from the information given, so my chances of winning the Datsun are still 1 in 10,000.

Section 2.1

1. a) (5 choose 2); b) (5 choose 2)(1/6)^2(5/6)^3

2. P(2 boys and 2 girls) = (4 choose 2)(1/2)^4 = 6/2^4 = 0.375 < 0.5. So families with different numbers of
boys and girls are more likely than those having an equal number of boys and girls, and the relative
frequencies are (respectively): 0.625, 0.375.

3. a) P(2 sixes in 5 rolls) = (5 choose 2)(1/6)^2(5/6)^3 = 0.160751
b) P(at least 2 sixes in 5 rolls) = 1 − P(at most 1 six)
= 1 − [P(0 sixes) + P(1 six)]
= 1 − (5/6)^5 − 5(1/6)(5/6)^4 = 0.196245
(This is shorter than adding the chances of 2, 3, 4, 5 sixes.)
c) P(at most 2 sixes) = P(0 sixes) + P(1 six) + P(2 sixes)
= (5/6)^5 + 5(1/6)(5/6)^4 + (5 choose 2)(1/6)^2(5/6)^3 = 0.964506
d) The probability that a single die shows 4 or greater is 3/6 = 1/2.
P(exactly 3 show 4 or greater) = (5 choose 3)(1/2)^5 = 0.3125
e) P(at least 3 show 4 or greater) = [(5 choose 3) + (5 choose 4) + (5 choose 5)](1/2)^5 = 0.5
(The binomial (5, 1/2) distribution is symmetric about 2.5.)
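
These binomial computations are easy to check with scipy's binomial distribution; a minimal sketch:

    from scipy.stats import binom

    n, p = 5, 1/6
    print(binom.pmf(2, n, p))         # a) 0.160751
    print(1 - binom.cdf(1, n, p))     # b) 0.196245
    print(binom.cdf(2, n, p))         # c) 0.964506
    print(binom.pmf(3, 5, 0.5))       # d) 0.3125
    print(1 - binom.cdf(2, 5, 0.5))   # e) 0.5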

4. P(2 sixes in first five rolls | 3 sixes in all eight rolls)
= P(2 sixes in first 5, and 3 sixes in all eight)/P(3 sixes in all eight)
= P(2 sixes in first five, and 1 six in next three)/P(3 sixes in all eight)
= [(5 choose 2)(1/6)^2(5/6)^3 × (3 choose 1)(1/6)(5/6)^2] / [(8 choose 3)(1/6)^3(5/6)^5]
= (5 choose 2)(3 choose 1)/(8 choose 3) = (10 × 3)/56 = 0.535714

5. a)
b) ) 1- { +5 X }
c

6. a) P(exactly 4 hits) = (8 choose 4)(0.7)^4(0.3)^4 = 0.1361367
b) P(exactly 4 hits | at least 2 hits) = P(exactly 4 hits & at least 2 hits)/P(at least 2 hits)
= P(exactly 4 hits)/[1 − P(exactly 0 hits) − P(exactly 1 hit)] = 0.1363126
c) P(exactly 4 hits | first 2 shots hit)
= P(exactly 4 hits & first 2 shots hit)/P(first 2 shots hit)
= P(first 2 shots hit & exactly 2 hits in last 6 shots)/P(first 2 shots hit)
= P(exactly 2 hits in last 6 shots) (since shots are independent)
= (6 choose 2)(0.7)^2(0.3)^4 = 0.059535
7. The chance that you win in this game is 15/36 = 5/12 (list all 36 outcomes!), so the chance you win
at least four times in five plays is
(5 choose 4)(5/12)^4(7/12) + (5/12)^5 = 0.1005.
8. Let n be a positive integer. Using the formula for the mode of the binomial (n, p) distribution:
If np + p < 1 then the most likely number of successes is int(np + p) = 0.
If np + p = 1 then the most likely number of successes is 0 or 1 (both equally likely).
If np + p > 1 then the most likely number of successes is at least 1.
Hence the most likely number of successes is zero if and only if np + p ≤ 1, and the largest p for which
zero is the most likely number of successes is p = 1/(n + 1).

9. a) Most likely number = int(326 × 1/38) = 8.
Using the consecutive-odds ratio R(k) = P(k)/P(k − 1) = [(325 − k + 1)/k] × (1/37) for the
binomial (325, 1/38) distribution:
P(8) = P(6) × [P(7)/P(6)] × [P(8)/P(7)]
= .104840 × [(325 − 7 + 1)/7] × (1/37) × [(325 − 8 + 1)/8] × (1/37) = .138724
b) P(10) = P(8) × [(325 − 9 + 1)/9] × (1/37) × [(325 − 10 + 1)/10] × (1/37)
= .138724 × .951952 × .854054 = .1127847
c) P(10 wins in 326 bets)
= P(9 wins in 325 bets) × (1/38) + P(10 wins in 325 bets) × (37/38)
= .132058 × (1/38) + .1127847 × (37/38)
= .1132919

10. a) k/(n + 1) b) (n − k + 1)/(n + 1).

11. a) The tallest bar in the binomial (15, 0.7) histogram is at int((n + 1)p) = int(16 × 0.7) = int(11.2) = 11.
b) P(11 adults in sample of 15) = (15 choose 11)(0.7)^11(0.3)^4
= [15 × 14 × 13 × 12/(4 × 3 × 2 × 1)](0.7)^11(0.3)^4 = 0.218623

12. a) P(makes exactly 8 bets before stopping)
= P(wins the 8th game, and has won 4 out of the previous seven)
= (7 choose 4)(18/38)^4(20/38)^3 × (18/38) = 0.1216891
b) P(plays at least 9 times)
= P(wins at most four bets out of the first eight)
= (20/38)^8 + 8(18/38)(20/38)^7 + (8 choose 2)(18/38)^2(20/38)^6 + (8 choose 3)(18/38)^3(20/38)^5 + (8 choose 4)(18/38)^4(20/38)^4
= 0.6926167

13. a) No! Since the child must get one allele from his/her mother, which will be the dominant B,
he/she must have brown eyes.
b) Taking one allele from each parent, we see that there are four (equally likely) allele pairs: Bb,
Bb, bb, bb. Thus the probability of the child having brown eyes is 50% = 0.5.
c) Once again there are four possibilities: Bb, Bb, Bb, bb. Thus there is a 75% = 0.75 chance that
the child will have brown eyes.
d) Given that the mother is brown eyed and her parents were both Bb, the probability she is Bb
is 2/3 and BB 1/3. Thus
P(child brown-eyed) = P(child brown-eyed | mother Bb)·P(mother Bb)
+ P(child brown-eyed | mother BB)·P(mother BB)
= 1/2 × 2/3 + 1 × 1/3 = 2/3.
So
P(woman Bb | child Bb) = P(child Bb | woman Bb)P(woman Bb)/P(child Bb) = (1/2)(2/3)/(2/3) = 1/2.

14. a) (Ts, Pw); tall and purple.
b)
Genetic combination   Probability   Phenotype
(TT, PP)              1/16          tall and purple
(TT, Pw)              1/8           tall and purple
(TT, ww)              1/16          tall and white
(Ts, PP)              1/8           tall and purple
(Ts, Pw)              1/4           tall and purple
(Ts, ww)              1/8           tall and white
(ss, PP)              1/16          short and purple
(ss, Pw)              1/8           short and purple
(ss, ww)              1/16          short and white

              tall and purple   tall and white   short and purple   short and white
Probability   9/16              3/16             3/16               1/16

c) 1 − (7/16)^10 − 10(9/16)(7/16)^9.

15. a) If 0 < p < 1, then int(np + p) = np, since np is an integer.
b) Note that np = int(np) + [np − int(np)].
If [np − int(np)] + p ≥ 1, then int(np + p) = int(np) + 1, which is the integer above np.
If 0 < [np − int(np)] + p < 1, then int(np + p) = int(np), which is the integer below np.
c) Consider the case n = 2, p = 1/3, where 1 is the closest integer to np and the integer above np,
but 0 is also a mode. Also 0 is the integer below np, but 1 is also a mode.


Section 2.2

1. The number of heads in 400 tosses has binomial (400, 0.5) distribution. Use the normal approximation
with μ = 400 × (1/2) = 200 and σ = √(400 × 1/2 × 1/2) = 10.
a) P(190 ≤ H ≤ 210) ≈ Φ(1.05) − Φ(−1.05) = 0.7062
b) P(210 ≤ H ≤ 220) ≈ Φ(2.05) − Φ(0.95) = 0.1509
c) P(H = 200) ≈ Φ(0.05) − Φ(−0.05) = 0.0398
d) P(H = 210) ≈ Φ(1.05) − Φ(0.95) = 0.0242
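
A minimal sketch of this normal approximation with the continuity correction, for checking the answers above:

    from math import sqrt
    from scipy.stats import norm

    n, p = 400, 0.5
    mu, sigma = n * p, sqrt(n * p * (1 - p))      # 200 and 10

    def P_between(a, b):
        # approximate P(a <= H <= b), with continuity correction
        return norm.cdf((b + 0.5 - mu) / sigma) - norm.cdf((a - 0.5 - mu) / sigma)

    print(P_between(190, 210))   # a) about 0.706
    print(P_between(210, 220))   # b) about 0.151
    print(P_between(200, 200))   # c) about 0.040
    print(P_between(210, 210))   # d) about 0.024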
2. Now μ = 204 and σ = 9.998.
a) P(190 ≤ H ≤ 210) ≈ Φ(0.65) − Φ(−1.45) = 0.6686
b) P(210 ≤ H ≤ 220) ≈ Φ(1.65) − Φ(−0.55) = 0.2417
c) P(H = 200) ≈ Φ(−0.35) − Φ(−0.45) = 0.0368
d) P(H = 210) ≈ Φ(0.65) − Φ(0.55) = 0.0333
3. a) Law of large numbers: the first one.
b) Binomial (100, .5), mean 50, SD 5:
1 − Φ((54.5 − 50)/5) = 1 − Φ(.9) = 1 − .8159 = .1841
Binomial (400, .5), mean 200, SD 10:
1 − Φ((219.5 − 200)/10) = 1 − Φ(1.95) = 1 − .9744 = .0256
4. Let X be the number of patients helped by the treatment. Then E(X) = 100, SD(X) = 8.16 and
P(X > 120) = P(X ≥ 120.5) ≈ 1 − Φ(2.51) = .006.

5. Want the chance that you win at least 13 times. The number of times that you win has binomial (25,
18/38) distribution. Use the normal approximation with μ = 11.84, σ = 2.50:
P(13 or more wins) = 1 − P(12 or fewer wins) ≈ 1 − Φ((12.5 − 11.84)/2.50) = 1 − Φ(0.26) = 0.3974
6. The number of opposing voters in the sample has the binomial (200, .45) distribution. This gives
μ = 90 and σ = √(200 × .45 × .55) = 7.035. Use the normal approximation:

a) The required chance is approximately
Φ((90.5 − 90)/7.035) − Φ((89.5 − 90)/7.035) = .5283 − .4717 = .0567 (about 6%)

b) Now the required chance is approximately
1 − Φ((100.5 − 90)/7.035) = 1 − Φ(1.49) = 1 − .9319 = 0.0681 (about 7%)

7. a) The city A sample has 400 people, the city B sample has 600 people, so the city B accuracy is
√(600/400) = 1.22 times greater.
b) Both have the same accuracy, since the absolute sizes of the two samples are equal.
c) The city A sample has 4000 people, the city B sample has 4500 people, so the city B sample
is √(4500/4000) = 1.06 times more accurate, even though the percent of population sampled in
city B is smaller than the percent sampled in city A.

8. Use the normal approximation: σ = √(npq) = 9.1287. Want the relative area between 99.5 and 100.5
under the normal curve with mean 600 × (1/6) = 100 and σ = 9.1287. That is, want the area between
−.055 and .055 under the standard normal curve. Required area = Φ(.055) − [1 − Φ(.055)] = .0438.


9. a) Think of 324 independent trials, each a "success" (the person shows up) with probability 0.9. So
the number of arrivals has binomial (324, 0.9) distribution. We want P(more than 300 successes in 324 trials).
Use the normal approximation with μ = 324 × 0.9 = 291.6 and σ = √(324 × 0.9 × 0.1) = 5.4 to
compute the required probability:
1 − Φ((300.5 − 291.6)/5.4) = 1 − Φ(1.65) ≈ 0.0495.
b) Increase: in the long run, each group must show up with probability .9. So effectively, traveling
in groups reduces the number of trials, keeping p the same. So the histogram for the proportion
of successes has more mass in the tails, since n is smaller.
c) Repeat a), with the 300 seats replaced by 150 pairs, and the 324 people replaced by 162
pairs. So the number of pairs that arrive has binomial (162, 0.9) distribution. Use the normal
approximation with μ = 162 × 0.9 = 145.8 and σ = √(162 × 0.9 × 0.1) = 3.82. We want
1 − Φ((150.5 − 145.8)/3.82) = 1 − Φ(1.23) = 0.1093.
10. We have 30 independent repetitions of tossing a fair coin 200 times. For each of these repetitions, the
probability of getting exactly 100 heads can be gotten by normal approximation with μ = 200 × 1/2 = 100
and σ = √(200 × 1/2 × 1/2) = 7.07.
The chance of exactly 100 heads can be approximated by
P(100 heads) = Φ((100.5 − 100)/7.07) − Φ((99.5 − 100)/7.07) ≈ .5282 − .4718 = .0564
Since the students are independent, the probability that all 30 students do not get exactly 100 heads
is (1 − P(100 heads))^30, and the normal approximation gives us
(1 − P(100 heads))^30 ≈ (1 − .0564)^30 = 0.175
11. The number of hits in the next 100 times at bat has binomial (100, 0.3) distribution. Use the normal
approximation with μ = 30 and σ = √(100 × 0.3 × 0.7) = 4.58.

a) P(≥ 31 hits) ≈ 1 − Φ((30.5 − 30)/4.58) = 1 − Φ(0.11) = 0.4562
b) P(≥ 33 hits) ≈ 1 − Φ((32.5 − 30)/4.58) = 1 − Φ(0.545) = 0.2929
c) P(≤ 27 hits) ≈ Φ((27.5 − 30)/4.58) = Φ(−0.545) = 0.2929
d) No, independence would be lost, because if the player has been doing well the last few times at
bat, he's more likely to do well the next time. Similarly, if he's been doing badly, then he's more
likely to continue doing badly. This will increase all the chances above.
e) From part b), we see that the player has about a 29% chance of such a performance, if his form
stayed the same (long run average .300) and if the hits were independent. So it is reasonable to
conclude that this performance is "due to chance," since 29% is quite a large probability.
12. For n = 10,000 independent trials with success probability p = 1/2,
μ = np = 5000 and σ = √(npq) = 50.
Hence, by the normal approximation, the probability of between 5000 − m and 5000 + m successes is
approximately
Φ((m + 1/2)/50) − Φ(−(m + 1/2)/50) = 2Φ((m + 1/2)/50) − 1.
This equals 2/3 if and only if Φ((m + 1/2)/50) = 5/6, which implies that (m + 1/2)/50 = 0.97 and m
= 48. In other words, there is about a 2/3 chance that the number of heads in 10,000 tosses of a fair
coin is within about one standard deviation of the mean.
13. First find z such that Φ(−z, z) = 95%. This means Φ(z) = 0.9750, and by the normal table, z = 1.96.
95% = P(p̂ in p ± 1.96√(pq/n)) ≤ P(p̂ in p ± 1.96 × .5/√n)
We want 1.96(.5/√n) ≤ 1%, so
n ≥ (1.96 × .5/.01)^2 = 9604

14. a) The number of working devices in the box has binomial (400, .95) distribution. This is approximately
normal with μ = 380, σ ≈ 4.36. Required percent ≈ 1 − Φ((389.5 − 380)/4.36) = 1 − Φ(2.18) = 0.0146.
(This normal approximation is pretty rough due to the skewness of the distribution. The exact probability
is 0.0094 correct to 4 decimal places. The skew-normal approximation is 0.0099, which is much
better.)
b) Using the normal approximation, want the largest k so that 1 − Φ((k − 0.5 − 380)/4.36) ≥ 0.95,
so (k − 0.5 − 380)/4.36 ≤ −1.65, so k = 373.

15. a) φ′(z) = d/dz [(1/√(2π)) e^(−z²/2)] = −zφ(z)
b) φ″(z) = −φ(z) − z(−zφ(z)) = (z² − 1)φ(z)
c) Sketch: outside (−4, 4), both are close to zero.
d) Let f(x) = (1/σ)φ((x − μ)/σ). Then
f″(x) = (1/σ³)φ″((x − μ)/σ) = (1/σ³)[((x − μ)/σ)² − 1]φ((x − μ)/σ).
e) For μ − σ < x < μ + σ, the second derivative in d) is negative, so the curve is concave in this
region. For x > μ + σ or x < μ − σ, the second derivative is positive, so the curve is convex.

16. a) Using the results of Exercise 15,
φ‴(z) = 2zφ(z) + (z² − 1)φ′(z) = 2zφ(z) − (z² − 1)zφ(z) = (−z³ + 3z)φ(z)
b) By the fundamental theorem of calculus (φ″ vanishes at −∞),
∫_{−∞}^{√3} φ‴(z)dz = φ″(√3) = ((√3)² − 1)φ(√3) = 2φ(√3),
and ∫_{−∞}^{0} φ‴(z)dz = φ″(0) = −φ(0). Hence
∫_{0}^{√3} φ‴(z)dz = ∫_{−∞}^{√3} φ‴(z)dz − ∫_{−∞}^{0} φ‴(z)dz = 2φ(√3) − (−φ(0)) = 2φ(√3) + φ(0).
c) φ‴ switches sign at −√3, 0, and √3. The area φ(0) + 2φ(√3) over the interval (0, √3) is the biggest
possible over an interval, because areas are negative where the curve is negative.

17. a) 1 − Φ(z) = 1 − ∫_{−∞}^{z} φ(x)dx = ∫_{−∞}^{∞} φ(x)dx − ∫_{−∞}^{z} φ(x)dx = ∫_{z}^{∞} φ(x)dx.
b) Over the range of the integral z < x, so x/z is greater than 1. Multiplying a positive integrand
by a number greater than 1 gives an integral with a larger value.
c)
1 − Φ(z) < ∫_{z}^{∞} (x/z)(1/√(2π))e^(−x²/2) dx = (1/z)φ(z)
[using the substitution u = x²/2, du = x dx].

Section 2.3

1. Let p = P(O), then P(k) = R(k) · R(k -1) · ·· R(1)p.


Use =
P(k) 1- p to get the value of p.
2. Put n = 10,000, p = 0.5, k = 5000 in Formula (3). Then μ = 5000, σ = 50, and the desired
probability is approximately
1/(√(2π) × 50) = 0.008 ≈ 0.01
3. a) Use P(m + 1 in 2m) = P(m in 2m) · R(m + 1), where R(m + 1) = (2m − m)/(m + 1) = 1 − 1/(m + 1).
Similarly, P(m − 1 in 2m) = P(m in 2m)/R(m), where R(m) = (2m − m + 1)/m. (This could also have
been deduced by symmetry.)
b)
P(m + 1 in 2m + 2)
= P(m − 1 in 2m, then HH) + P(m in 2m, then HT or TH) + P(m + 1 in 2m, then TT)
= (1/4)P(m − 1 in 2m) + (1/2)P(m in 2m) + (1/4)P(m + 1 in 2m)

c) Substitute P(m − 1 in 2m) = P(m in 2m)(1 − 1/(m + 1)) and P(m + 1 in 2m) = P(m in 2m)(1 − 1/(m + 1))
into (b), and simplify:
P(m + 1 in 2m + 2) = P(m in 2m)(1 − 1/(2m + 2)).
This can also be checked by cancelling factorials.
d) Write
P(m in 2m) = [P(m in 2m)/P(m − 1 in 2m − 2)] · [P(m − 1 in 2m − 2)/P(m − 2 in 2m − 4)] · ··· · [P(2 in 4)/P(1 in 2)] · P(1 in 2)
= (1 − 1/(2m)) · (1 − 1/(2(m − 1))) · ··· · (1 − 1/(2·2)) · (1 − 1/2).

e) Use the fact that 1 − x ≤ e^(−x), with equality iff x = 0 (compare the graphs of y = 1 − x and
y = e^(−x)). Then the inequalities follow easily:
0 < P(m in 2m) < e^(−(1/2)(1/1 + 1/2 + ··· + 1/m)) < e^(−(1/2) log m) = 1/√m
(For the last inequality, draw a graph of 1/x and remember that the area under this graph from
1 to m is log(m)).
f) By part (c), with a_m = P(m in 2m),
a_m/a_{m−1} = (2m − 1)/(2m).
Square this, substitute, and simplify:
[(m + 1/2) a_m²] / [(m − 1 + 1/2) a_{m−1}²] = [(2m + 1)/(2m − 1)] × [(2m − 1)/(2m)]²
= (2m − 1)(2m + 1)/(2m)² = 1 − 1/(4m²)

g) As in (d), write
2(m + 1/2) a_m² = [(m + 1/2) a_m² / ((m − 1 + 1/2) a_{m−1}²)] · [(m − 1 + 1/2) a_{m−1}² / ((m − 2 + 1/2) a_{m−2}²)] · ··· · [(1 + 1/2) a_1² / ((0 + 1/2) a_0²)] · 2(0 + 1/2) a_0²
= (1 − 1/(4m²)) (1 − 1/(4(m − 1)²)) ··· (1 − 1/4)
(since 2(0 + 1/2) a_0² = 1). The result follows from factoring each term as
1 − 1/(4j²) = (1 − 1/(2j))(1 + 1/(2j)).
h)
a_m = P(m in 2m) = P(mode) ~ 1/(√(2π) σ)
with σ = (1/2)√(2m). So a_m ~ K/√m, where K = 1/√π. But this means
2K² = 2/π = 2 lim_{m→∞} m a_m² = 2 lim_{m→∞} (m + 1/2) a_m²
= lim_{m→∞} [(2m + 1)/(2m)] · [(2m − 1)/(2m)] · [(2m − 1)/(2(m − 1))] · [(2m − 3)/(2(m − 1))] · ··· · (3/2) · (1/2),
which is Wallis' product.


Section 2.4

1. a) Approximately Poisson(1).
b) Approximately Poisson(2).
c) Approximately Poisson(0.3284).
d) This is the distribution of the number of successes in 1000 independent trials, where the success
probability is p = .998. The distribution of the number of failures is binomial (1000, .002),
which is approximately Poisson(2). Since #successes + #failures = 1000, it follows
that the histogram for the number of successes (i.e., the desired histogram) looks like the left-to-right
mirror image of a Poisson(2) histogram, with the block at 0 being sent to 1000, the block
at 1 being sent to 999, etc.

2. a) The number of successes in 500 independent trials with success probability .02 has binomial
(500, .02) distribution with mean μ = 10. By the Poisson approximation,
P(1 success) ≈ e^(−μ) μ¹/1! = μe^(−μ) = .000454.
b) P(2 or fewer successes) = P(0 or 1 or 2 successes)
≈ e^(−μ)(1 + μ + μ²/2) = .002769
c) P(4 or more successes) = 1 − P(3 or fewer successes) = .989664.

3. The number of times you see 25 or more sixes has binomial distribution with μ = 365 × .022 = 8.03.

a) P(at least once) = 1 − P(exactly 0 times) ≈ 1 − e^(−μ) = .999674.
b) P(at least twice) = 1 − P(exactly 0 times) − P(exactly 1 time) ≈ 1 − e^(−μ) − μe^(−μ) = .997060.

4. Here μ = 365 × 0.00068 = 0.2482, and

a) 1 − e^(−μ) = .219796;
b) 1 − e^(−μ) − μe^(−μ) = 0.026150.

5. The number of wins has binomial (52, 1/100) distribution with mean μ = 52/100, and by the Poisson
approximation P(k wins) ≈ e^(−μ) μ^k/k!.
k = 0: P(0 wins) ≈ e^(−μ) = .594521;
k = 1: P(1 win) ≈ μe^(−μ) = .309151;
k = 2: P(2 wins) ≈ (μ²/2)e^(−μ) = .080379.

6. a) The number of black balls seen in a series of 1000 draws with replacement has binomial (1000,
2/1000) distribution with mean μ = 2. By the Poisson approximation,
P(fewer than 2 black balls) ≈ e^(−μ)(μ⁰/0! + μ¹/1!) = e^(−μ)(1 + μ) = .406006.
P(exactly 2 black balls) ≈ e^(−μ) μ²/2! = .270671.
Calculate the probability of more than 2 black balls by subtraction; conclude that getting fewer
than 2 black balls is most likely.
b) P(both series see same number of black balls) = Σ_{k=0}^∞ P(both series see k black balls)
= e^(−4) Σ_{k=0}^∞ 4^k/(k!)² = 0.207002

7. a) 2 b) .2659 c) .2475 d) .2565 e) m = 250. Normal approximation: .0266 f) m = 2.
Poisson approximation: .2565

8. P(k) = e^(−μ) μ^k/k! and P(k − 1) = e^(−μ) μ^(k−1)/(k − 1)!,
so R(k) = P(k)/P(k − 1) = μ/k,
which is decreasing as k increases. The maximum probability will occur at the largest value of k
for which R(k) ≥ 1; after this P(k) will decrease. Thus the maximum occurs at int(μ). There is a
double maximum if and only if R(k) = 1 for some k. This can only occur if μ is an integer. The two
values of k that maximize are then μ and μ − 1. There can never be a triple maximum, since this
would imply that R(k) = 1 and R(k − 1) = 1 for some k.

9. Assume that each box of cereal has a prize with chance .95, independently of all others. The number
of prizes collected by the family in 52 weeks has the binomial distribution with parameters 52 and
.95. This gives μ = 52 × .95 = 49.4, and σ = √(52 × .95 × .05) = 1.57. Since σ is very small (less than
3), the normal approximation is not good. Use the Poisson instead. The number of "dud" boxes has
the Poisson distribution with parameter 52 × .05 = 2.6. We want the chance of 46 or more prizes, that
is, 6 or fewer duds. This is
e^(−2.6) {1 + 2.6 + 2.6²/2 + 2.6³/3! + 2.6⁴/4! + 2.6⁵/5! + 2.6⁶/6!} = .982830.
[For comparison, the exact binomial probability is .985515. The normal approximation gives .993459.]
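
The three candidate approximations are easy to compare numerically; a minimal sketch:

    from math import sqrt
    from scipy.stats import binom, poisson, norm

    n = 52
    exact = binom.cdf(6, n, 0.05)          # 6 or fewer duds: 0.9855
    pois = poisson.cdf(6, n * 0.05)        # Poisson(2.6): 0.9828
    mu, sigma = n * 0.95, sqrt(n * 0.95 * 0.05)
    normal = 1 - norm.cdf((45.5 - mu) / sigma)   # 46 or more prizes: 0.9935
    print(exact, pois, normal)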

10. The distribution of the number of successes is
binomial (n, 1/N) ≈ Poisson(n/N) ≈ Poisson(5/3).
P(at least two) = 1 − P(0) − P(1) ≈ 1 − e^(−5/3)(1 + 5/3) = 1 − (8/3)e^(−5/3) = 0.49633 ≈ 0.5

Section 2.5

2. a) · M- · b) 3x a) c) 1 - · ·

3. a) G(fif d) 0

4. The exact chance is Σ_{k=45}^{100} (20,000 choose k)(80,000 choose 100 − k)/(100,000 choose 100).
For the approximation, use the normal curve with μ = 40, σ = 4.9. The chance is approximately
1 − Φ((44.5 − 40)/4.9) = 0.1788

5. Solve (0.50 − 0.55)/√(.55 × .45/n) ≤ −2.326; then n ≥ 537.

6. a)$ b)
tffifffi
- 3
l3
c) (a) + 4(b).

7. Denote by B_i the event (ith ball is black), similarly for R_i.

a) P(B_1B_2B_3B_4) = P(B_1)P(B_2|B_1)P(B_3|B_1B_2)P(B_4|B_1B_2B_3) = .1456.
b) This is four times P(B_1B_2B_3R_4), so equals 4 × .0929 = .3716.
c) P(B_1B_2B_3R_4) = .0929.
8. Let the outcome space be the set of all 3-element subsets of the set {1, ..., 100}, so that, e.g., the
3-set {23, 21, 1} means that the winning tickets were tickets #1, #21, and #23. Each of the
(100 choose 3) such 3-sets is equally likely.

a) The event (one person gets all three winning tickets) is the disjoint union of the 10 equally likely
events (person i gets all three tickets). The event (person i gets all three tickets) corresponds
to all 3-sets consisting entirely of tickets bought by person i. There are (10 choose 3) such 3-sets, so the
desired probability is
10 × (10 choose 3)/(100 choose 3) = .007421.

b) The event (there are three different winners) is the disjoint union of the (10 choose 3) equally likely events
(the winning tickets were bought by persons i, j, and k) (i, j, k all different). The event (the
winning tickets were bought by persons i, j, and k) consists of 10 × 10 × 10 3-sets, so the
desired probability is
(10 choose 3) × 10³/(100 choose 3) = .742115.

c) By subtraction, 1 − .007421 − .742115 = .250464. Just to be sure, use the above technique to obtain the
desired probability:
10 × 9 × (10 choose 2) × 10/(100 choose 3) = .250464.

9. a) Let A_i be the event that the 1st sample contains exactly i bad items, i = 0, 1, 2, 3, 4, 5. And let
B_j be the event that the 2nd sample contains exactly j bad items, j = 0, 1, ..., 10. Then
P(2nd sample drawn and contains more than one bad item)
= P[A_1 ∩ (B_0 ∪ B_1)^c] = P[(B_0 ∪ B_1)^c | A_1] P(A_1)
= P(A_1) · {1 − P[B_0 ∪ B_1 | A_1]}
= P(A_1) · {1 − P(B_0|A_1) − P(B_1|A_1)}
= [(10 choose 1)(40 choose 4)/(50 choose 5)] × {1 − (36 choose 10)/(45 choose 10) − (9 choose 1)(36 choose 9)/(45 choose 10)}
= 0.431337 × (1 − 0.079678 − 0.265592) = 0.282409

b) P(lot accepted)
= P[A_0 ∪ {A_1 ∩ (B_0 ∪ B_1)}]
= P(A_0) + P(A_1) · P[B_0 ∪ B_1 | A_1]
= (40 choose 5)/(50 choose 5) + 0.431337 × (0.079678 + 0.265592)
= 0.310563 + 0.431337 × 0.345270 = 0.459491.
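
These hypergeometric probabilities can be verified with scipy, under the composition reconstructed above (a lot of 50 containing 10 bad; a first sample of 5, then a second sample of 10 from the remaining 45, which contain 9 bad):

    from scipy.stats import hypergeom

    # hypergeom.pmf(k, population size, number of bad items, sample size)
    P_A0 = hypergeom.pmf(0, 50, 10, 5)     # 0.310563
    P_A1 = hypergeom.pmf(1, 50, 10, 5)     # 0.431337
    P_B01 = hypergeom.cdf(1, 45, 9, 10)    # P(B0 or B1 | A1) = 0.345270
    print(P_A1 * (1 - P_B01))              # a) 0.282409
    print(P_A0 + P_A1 * P_B01)             # b) 0.459491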

10. Every sequence of k_1 good, k_2 bad and k_3 indifferent elements has the same probability:
(G/N)^(k_1) (B/N)^(k_2) (I/N)^(k_3).
This is clear if you think of the sample as n independent trials. The probability of each pattern of k_1
good, k_2 bad and k_3 indifferent elements is a product of k_1 factors of G/N, k_2 factors of B/N, and
k_3 factors of I/N. The number of different patterns of k_1 good, k_2 bad and k_3 indifferent elements is
n!/(k_1! k_2! k_3!), which proves the formula.
11. The set is {g : max{0, n − N + G} ≤ g ≤ min{n, G}}. This is because the maximum possible number
of good elements in the sample is min{n, G}, and the minimum possible number of good elements in the
sample is n minus the maximum number of bad elements in the sample, i.e.
n − min{n, N − G} = max{0, n − N + G}.
The formula is correct for all g because (a choose b) = 0 for b < 0 or b > a.

12. There are (52 choose 5) = 2,598,960 possible distinct poker hands.
a) There are 52 cards in a pack: A, 2, 3, ..., 10, J, Q, K in each of the four suits: spades, clubs,
diamonds and hearts. A straight is five cards in sequence. An A can be at the beginning or the
end of a sequence, but never in the middle; i.e. A, 2, 3, 4, 5 and 10, J, Q, K, A are legitimate
straights, but K, A, 2, 3, 4 (a "round-the-corner" straight) is not. So there are 10 possible
starting points for the straight flush (A through 10) and 4 suits for it to be in, giving a total of
40 hands.
Thus P(straight flush) = 40/2598960 = 0.0000154.
b) P(four of a kind) = (13 × 48)/2598960 = 624/2598960 = 0.000240
c) P(full house) = [13 × (4 choose 3) × 12 × (4 choose 2)]/2598960 = 3744/2598960 = 0.00144
d) P(flush) = P(all same suit) − P(straight flush)
= [4 × (13 choose 5) − 40]/2598960 = 5108/2598960 = 0.00197
e) P(straight) = P(5 consecutive ranks) − P(straight flush)
= [10 × 4^5 − 40]/2598960 = 10200/2598960 = 0.00392
f) P(three of a kind) = [13 × (4 choose 3) × (12 choose 2) × 4 × 4]/2598960 = 54912/2598960 = 0.0211
g) P(two pairs) = [(13 choose 2) × (4 choose 2) × (4 choose 2) × 11 × (4 choose 1)]/2598960 = 123552/2598960 = 0.0475
h) P(one pair) = [13 × (4 choose 2) × (12 choose 3) × 4 × 4 × 4]/2598960 = 1098240/2598960 = 0.423
i) Since the events are mutually exclusive, the probability of none of the above is
1 − sum of (a) through (h) = 0.501.
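
The counts above can be verified with exact binomial coefficients; a minimal sketch:

    from math import comb

    total = comb(52, 5)                    # 2,598,960
    counts = {
        "straight flush": 10 * 4,
        "four of a kind": 13 * 48,
        "full house": 13 * comb(4, 3) * 12 * comb(4, 2),
        "flush": 4 * comb(13, 5) - 40,
        "straight": 10 * 4**5 - 40,
        "three of a kind": 13 * comb(4, 3) * comb(12, 2) * 4 * 4,
        "two pairs": comb(13, 2) * comb(4, 2)**2 * 11 * 4,
        "one pair": 13 * comb(4, 2) * comb(12, 3) * 4**3,
    }
    for name, c in counts.items():
        print(name, round(c / total, 6))
    print("none of the above:", round(1 - sum(counts.values()) / total, 3))  # 0.501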


13.
P(pass) = 1 − P(fail) = 1 − [P(> 10 defectives)]²
P(> 10 defectives) ≈ 1 − Φ((10.5 − 500 × .05)/√(500 × .05 × .95)) = 1 − Φ(−2.98) = .9986
P(pass) = 1 − (.9986)² = .0028


Chapter 2: Review

1. a) (1/6)^4 (5/6)^6
b) (10 choose 4)(1/5)^4 (4/5)^6
c) [10!/(5!3!2!)]/6^10

d)
Ph=Ph
2. 0.007

3. a) P(3H) = P(3H | 3 spots)P(3 spots) + ··· + P(3H | 6 spots)P(6 spots)
= {(3 choose 3)(1/2)^3 + (4 choose 3)(1/2)^4 + (5 choose 3)(1/2)^5 + (6 choose 3)(1/2)^6} × 1/6
= (1/2^3 + 4/2^4 + 10/2^5 + 20/2^6) × 1/6 = 1/6.

b) P(4 spots | 3H) = P(3H | 4 spots)P(4 spots)/P(3H) = [(4 choose 3)(1/2)^4 × (1/6)]/(1/6) = 1/4

4. P(exactly 9 tails | at least 9 tails)
= P(exactly 9 tails)/P(at least 9 tails)
= 10(1/2)^10/[10(1/2)^10 + (1/2)^10] = 10/11

6. a) (7/10)^4 − (6/10)^4
b) The six must be one of the four numbers drawn. The remaining three numbers must be selected
from {0, ..., 5}. The desired probability is therefore (6 choose 3)/(10 choose 4) = 20/210 = 2/21.
7. k ≈ 1025
8. (1/6)^k (5/6)^(10−k)
9. a) 80%
b) P(# of kids = x | at least 2 girls) = π(x) × P(at least 2 girls | x kids)/P(at least 2 girls):
proportional to 0.4 × 1/4 = .1 for x = 2,
to 0.3 × 1/2 = .15 for x = 3,
to 0.1 × 11/16 = .069 for x = 4.
So x = 3 is most likely.
c) P(1G, 3B & pick G) + P(2G, 2B & pick G) + P(3G, 1B & pick G)
= (4/16)(1/4) + (6/16)(2/4) + (4/16)(3/4) = 0.4375


10. a) We can use the binomial approximation to the hypergeometric distribution with n = 10, p = .15.
So 1 − (1 − .15)^10 = .8031. And the histogram will be that of the binomial (10, 0.15) distribution.
b) No. Presumably some machines are more reliable than others. Then results of successive tests
on a machine picked at random are not independent. So the independence assumption required
for the binomial distribution is not satisfied.

11.
P(bad) = (2/3) × 1/100 + (1/3) × 1/50 = 0.0133
P(2 bad out of 12) = (12 choose 2)(.0133)^2 (1 − .0133)^10 = 0.0102
Assume items are bad independently of each other. Reasonable since both A and B produce a large
number of items each day.

12. a) 0.423 (See the solution to Exercise 2.5.12).
b) Want the chance of at most 149 "one pair"s in the first 399 deals.
c) Use the normal approximation: μ = 168.78, σ = 9.87.
Want Φ((149.5 − 168.78)/9.87) = Φ(−1.95) = .0256

13. The number of "dud" seeds in each packet has the binomial (50, .01) distribution, which is very well
approximated by the Poisson (.5) distribution. The chance that a single packet has to be replaced is
therefore
1 − e^(−0.5){1 + .5 + .5²/2} = .0144.
Assuming that packets are independent of each other, the number of replaceable packets out of the
next 4000 has the binomial distribution with parameters 4000 and .0144. This gives μ = 57.6 and
σ = 7.535, which is much bigger than 3. So the normal approximation will work well. The chance
that more than 40 out of the next 4000 packets have to be replaced is very close to
1 − Φ((40.5 − 57.6)/7.535) = 1 − Φ(−2.27) = Φ(2.27) = .9884.
14. a) 1/5 b) 2/9
c) In general: (n − k + 1)! k!/n! for a), and (n − k)! k!/(n − 1)! for b).

15. a) (20 choose 5)(0.4)^5 (0.6)^15
b) (0.1)^2 (0.2)^4 (0.3)^6 (0.4)^8
c) P(25th ball is red, and there are 2 red balls in first 24 draws)
= 0.1 × (24 choose 2)(0.1)^2 (0.9)^22

16. a) ffi b) r}r c) 13 X f#- (1 3


2) X r}r
17. a) g). x G) x g) X W
18. a) G){l/6) 3 (5/6) 4 b) 6 x 5 x if). c) GWl d) 6 x Gl:s!
e) P(sum ;:::: 9) = 1 - P(sum < 9)
=1- P(seven 1's)- P(six 1'sand a 2)
=1- (1/6) 7 - 7(1/6) 7 •

19. a) (2/3)'; b) G){2/3) 4 (1/3) + (2/3) 4


20. a) (;)q.rpn-.r

b) (0.99) 8 + 8. 0.01(0.99f +
2 1. For n o dd , "(n-1)/2
L....r=O
(") .r n-.r
.r q p .

33
Chapter 2: Review

22. a) Assume each person who buys a ticket shows up independently of all others with probability 0.97.
If n tickets are sold, then the number of people who show up has binomial ( n, 0.97) distribution.
Find n so that

.95 :5 P ( 0 to 400) :::::: cl>


(
400.5 - .97n )
(.97){.03)n
J .
Therefore > 1.645.· which
400 · 5 -·97 n
yf(.97)(.03)n -
implies n <
-
407.

23.
. I bo ) P( at least one girl and at least one boy)
P(at least one gul at 1east one y = P( t t b )
a 1eas one oy
P(not all girls or all boys)
= (.2 X 1/2) + (.4 X 3/4) + (.2 X 7/8) + (.1 X 15/16)
(.4 1/2) + (.2 X 6/8) + (.1 X 14/16)
X
= (.2 X 1/2) + (.4 X 3/4) + (.2 X 7/8) + (.1 X 15/16}.
24. Denote by 1r(n), n = 0 to 5, the proportion of families having n children.

a) We assume that in every family, every child is equally likely to be a boy or a girl. In this case,
if a family is chosen at random then

P(family has n children and exactly 2 girls)

= P(n children)P(exactly 2 girls I n children)

= 1r(n) x (;) :
2
(interpret as 0 if n < k). Hence

P( family has exactly 2 girls) = t


n=O
lf( n) (;)
2
1
n = .203125.

b) Since each child is equally likely to be chosen, we have

.
P( c hild comes f rom a family h avmg . ls) #such children
exact1y 2 gn = # hild . ·
c ren m populatiOn
Suppose the population has N families, N assumed large. Denominator: There a.re N r( n)
having exactly n children, so there a.re nN 1r( n) children belonging to n-child families;
hence there are l:!=o nN r( n) children in the population.
Numerator: In part (a) we showed that proportion of families having n children and exactly
2 girls is r(n){;) 21ft. Thus there are nNr(n){;) 21.. children who come from families having n
children and exactly 2 girls, and nN1r(n>(;) 21.. children who come from families having
exactly 2 girls.
Therefore the chance that a child chosen at random from the children in this population comes
from a family having exactly 2 girls is
5
n1r(n)(n).L
"'
L.m=O 2 2ft = .294207.
nlf(n)
25. a) P(win in 3 sets)= P(win all three)= p 3
P(win in exactly 4 sets)= P(win 2 out of first three, then win fourth)= e)p2 q. p = 3p3 q
P(win in exactly 5 sets)= P(win 2 out of first four, then win fifth)= G)p2 q2 • p = 6p3 l
b) P(player A wins the match)
= P(wins in exactly 3 sets)+ P(wins in exactly 4 sets)+ P(wins in exactly 5)
= PJ + 3pJq + 6pJq2

34
Chapter 2: Review

c)
P(A won in only 3 sets)
P( rna tch 1as ts on1y 3 sets lA won ) = P( )
A won
1
1 + 3q + 6q 2
d) If p = 2/3, the answer in c) is 0.375.
e) No: if player A has lost the first two sets, he may be nervous about saving the match, and
that could affect his performance. In general, the assumption of independence of a player's
performance in successive games is rather suspect.

26. a) sf e3o)
b) Using the inclusion-exclusion formula, the probability in question is

P(A&B u B&Cu ··· u J&A)

= P(A&B)+P(B&C)+·· ·+P(J&A)- LP(n's of two pairs)+ L P(n's of three pairs)- ...

=loxs;C;) -1o;C;) =7/12

because the probability of each intersection of 3 or more pairs is 0.

27. Assume that every quadruplet of 4 exam groups appears equally often in the population of students,
so that the proportion of students having a specific set of 4 groups is 1/ e,8 ).

By equally likely outcomes: There are (!)3' quadruplets which correspond to different exam days
(count the ways to pick the 4 days, then the ways to pick an exam time from each of those 4 days).
Hence the desired proportion is (!)3'/e,8 ) = .3971.
By conditioning: Pick a student at random. Let D; denote the day of the ith exam. Then

P(Dt,D2,D3,Dt different) =P(D1,D2 different)x


P(Dt, D2, D3 differentiD,, D2 different)x
P(D,,D2,D3,Dt differentiDt,D2,D3 different)
= -15
17
12
X -
16
X -
9
15
= .3971.
1
I 1
28. a) 1 - 2! + 3f - • • · + (-1)"-
n!

b) 1 - e- 1

29. What kinds of throws are wimpouts? Firstly, they cannot have any numbers in them. Of the the
throws with only letters the only ones that contain a W either have four different other letters or 2
or more letters of the same type. Both of these are scoring throws. Thus the only throws that do not
score consist entirely of the letters A, B, C, D with no more than two of any letter. Condition on the
dice with the W because it has only A, B, C. Suppose it is A. Then the outcomes of the other dice
that lead to a wimpout are: ABCD, ABCC, ABBC, ABDD, ABBD, ACDD, ACCD, BDCC, BBCD,
BBDD, BCDD, BCCD, CCDD. Taking into account the different possible orderings and multiplying
by 3 we get a total of 450 outcomes. Since there are 6s = 7776 equally likely outcomes possible when
you roll 5 dice,
. 450 25
P(wrmpout) = = 432 = 0.05787
7776
30. Draw the curve y =log x from I ton. Between x and x + 1, draw a box with height log x. (The first
"box" has 0 height.) Connect the upper left corners of the boxes by straight lines. The area under
the curve is the sum of three pa.rts; the area of the boxes, the area of the triangles above the boxes,
and the area of the slivers above the triangles and below the curve. The area under the curve is

j" log xdx = n log n - n +1= log [ ( "] +1


35
Chapter 2: Review

The area of the boxes is n-1

L log :r; = log ( n - 1)!


z=1
The area of the triangles is
n-1

L.... 1 (log(x
"'"' + 1) -log :r:) = 21 1og n
2
By moving the slivers so that they all have their right ends on the point (2,log 2), you can see that
none of the slivers overlap and all of the slivers fit in a box between 1 and 2 with height log 2. So the
area of the slivers is some number c,. < log 2 for all n. We see that the area under the curve is also
equal to
1 1
log ( n - 1)! + -log n + Cn = log n! - -log n + Cn
2 2
By setting the two expressions ·for the area under the curve to be equal, and solving for log n!, we get

log n! = log [ ( "] +1+ n - Cn = log [ ( y'nel-c,.]

where en increases to a limit c with c < log 2. So

where C = e 1 -".
The exact probability of getting m heads in 2m coin tosses is

2m)
( m
(!.)2m = (2m)! (!.)2m
2 (m!)2 2
C(2m/e)2mV2fn (.!.)2m
C2(mfe)2=m 2
.../2
=crm
By the normal approximation, this is approximately
1 1
J2m(1/2)2.../2i =

Setting these expressions equal gives C = ../'Fi.


31. No Solution

32. Those who have computed the probabilities of all the 5 card poker hands and know that a straight
is about twice as likely as a flush may be surprised to learn that the answer to this question depends
on the size of the hand, h. There are e;)
distinct h card hands possible. Of these, 4 (I;) are jlu&hes:
4 suits (hearts, diamonds, clubs, spades) and we want to choose h cards from any one of these suits.
There are (13- h+ 1) cards that a &troight of length h could start on, and for each card in the sequence
d, d + 1, ... , d + h there are four possible suits to choose from, giving a total of (13- h + 1)4h possible
&traights. For h = 1,2,3,4, 4(1;) > (13- h + 1)4h and for 5 $ h $ 13, 4(';) < (13- h + 1)4h.
The case h = I is trivial because any single card is both a. &traight and a flush. For h > 13 neither a
straight nor a flush is possible, so they both have zero probability.

33. a) Note that P(H HIH H or HT or TH) ;:::: 1/3. So toss the fair coin twice. Report 1 if the outcome
is H H, and 0 if the ou teo me is HT or T H, keep trying otherwise.
b) Note that P(HTIHT or TH) = P(THIHT or TH) = 1/2 independent of p. So toss the biased
coin twice. Report 1 if the outcome is HT, and 0 if the outcome is TH, keep trying otherwise.

36
Chapter 2: Review

34. a) Note that the number of heads and the number of tails have the same distribution, therefore
P(# of heads from my toss=# of heads from your toss)
= P(# of heads from my toss-# of heads from your toss= 0}
= P(# of heads from my toss+# of tails from your toss- m = 0}
= P( # of heads from my toss + # of tails from your toss = m)
And the distribution of# of heads from my toss+ #of tails from your toss is binomial (2m, 1/2)
by symmetry and the independence of tosses.
b) Let Mn be the number of heads I get on n tosses, and Yn be the number of heads you get on n
tosses. Then
P(Ym+l > Mm) = P(Ym+l > Mm I your m +lilt toss is head) X 1/2
+ P(Ym+l > Mm I your m +1st toss is tail) X 1/2
= P(Ym Mm} X 1/2 + P(Ym > Mm) X 1/2
= P(Ym Mm) X 1/2 + P(Ym < Mm) X 1/2 by symmetry
= 1/2.
35. a) (IOOO)(.!.)"(i!!.)lOOO-k
L..,k=20 " 38 38
b) The SD is tT = 5.06. Use normal approximation.

ill(35.5- 26.316)- ill(19.5- 26.316)


5.06 5.06
= ill(1.815)- ill(-1.347)
:::::: .965- (1 - .9115) = 0.8765

36. c) As n-+ ex>, H(k) is asymptotic to e-!{k-np)'/npq. Hence if H(k) <£,then k satisfies approxi-
mately
e- !<k-np)'/npq < £

1 2
-2(k- np) fnpq <loge

(k- np) 2 > 2npqlog.!. {

k < np - J 2npq log or k > np + 2npqlog -.


1
£

So the a, b such that a < m < band both H(a) and H(b) are less than £ are approximately

1
a ::::::np- 2npqlog-
{

1
b::::::np+ 2npqlog -,
(

and
1
b-a::::::2 2npqlog -.
(

The run time to perform the calculation is then approximately 2K x J2npq log .
d) Given p and c, the run time to compute every probability in the binomial (n,p) distribution to
within < is proportional to .fil. Thus it should take .Jlo times as long to compute when n is
multiplied by 10. Hence if it takes 2 seconds to compute the binomial (100, 18/38) distribution
correct to 3 decimal places, it should take 2VIO:::::: 6.3 seconds to compute the binomial (1000,
18/38) distribution correct to 3 decimal places.

37. No Solution

37
Chapter 2: Review

38
@1993 Springer-Verlag New York, Inc.
+
Section 3.1

3
1. X has binomial (3, 1/2) distribution. So P(X = k) = G}(1/2) fork= 0, 1, 2, 3.

a)
:r:
P(X = :r:)
I I I I
0
1/8
1
3/8
2
3/8
3
1/8

• Y
b) Wnte = IX- 11. P(Yy= y) I I I
0
3/8 1
1/2 2
1/8

2. a) Joint distribution table for {X, Y) (with replacement)


possible values :r: /or X distn
1 2 3 4 ofY
possible 1 1/16 1/16 1/16 1/16 1/4
value.s 2 1/16 1/16 1/16 1/16 1/4
y 3 1/16 1/16 1/16 1/16 1/4
forY 4 1/16 1/16 1/16 1/16 1/4
di.stn of X 1 1/4 1/4 1/4 1/4 11 1
P(X $ Y) = 10 X liS =
b) Joint di.stribution table for {X, Y) (without replacement)
pouible values :r: for X distn
1 2 3 4 ofY
pouible 1 0 1/12 1/12 1/12 1/4
values 2 1/12 0 1/12 1/12 1/4
y 3 1/12 1/12 0 1/12 1/4
for Y 4 1/12 1/12 1/12 0 1/4
distn of X 1 1/4 1/4. 1/4. 1/4. II
P(X $ Y) =6 X
1
12 =t
3. a) { 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 }
b) Distribution Table for S:
12
P(S = .s) 1/36
4. a) Joint di.stribution table for (Xt, X2)
possible values :r: 1 for X 1 distn
1 2 3 4 5 6 of xl
possible 1 1/36 1/36 1/36 1/36 1/36 1/36 1/6
values 2 1/36 1/36 1/36 1/36 1/36 1/36 1/6
:r:2 3 1/36 1/36 1/36 1/36 1/36 1/36 1/6
for x2 4 1/36 1/36 1/36 1/36 1/36 1/36 1/6
5 1/36 1/36 1/36 1/36 1/36
- 1/36 1/6
6 1/36 1/36 1/36 1/36 1/36 1/36 I/6
distn of Xt 1 1/6 1/6 1/6 1/6 1/6 1/6 II
b) Joint di.stribution table for (Y1 , Y2)
Section 3.1

possible values y, for Y, distn


l 2 3 4 5 6 of Y2
possible 1 1/36 2/36 2/36 2/36 2/36 2/36 11/36
values 2 0 1/36 2/36 2/36 2/36 2/36 9/36
Y2 3 0 0 1/36 2/36 2/36 2/36 7/36
for Y2 4 0 0 0 1/36 2/36 2/36 5/36
5 0 0 0 0 1/36 2/36 3/36
6 0 0 0 0 0 1/36 1/36
distn of Yi 1 1/36 3/36 5/36 7/36 9/36 11/36 II 1
5. Distribution of X,X2:
z 10
P(X1X2 = z) 2/36
z 36
P(X1X2 = z} 1/36
6. There are 8 equally likely outcomes for three fair coin tosses:
outcome probability X y X+Y
HHH 1/8 2 2 4
HHT 1/8 2 3
HTH 1/8 1 1 2
HTT 1/8 0 1
THH 1/8 1 2 3
THT 1/8 1 2
TTH 1/8 0 1 1
TTT 1/8 0 0 0

a} Joint distribution table for (X, Y}

X
y 0 1 2
0 1/8 1/8 0
1 1/8 2/8 1/8
2 0 1/8 1/8
b) X and Y are not independent, since, for P(X = 2, Y = 0} = 0, while
P(X = 2}P(Y = 0} = (1/4}(1/4}.
c) P(X +Y =
z
z}
I o
1/8
1
2/8
2
2/8
3
2/8
4
l/8
7. a} (N = 2} =(ABC"} U (ABcC} U (Ac BC}
b) P(N = 2) = ab(1 -c)+ a(l - b)c + (1 - a)bc.

8. a) X I I I I
1 2 3 4

b) y 12131415

c) Let X be the number of cards until the first ace when dealing from the bottom of the deck.
Then X has the same distribution as X, and

Y = s- (X- 1) = 6- .t
40
Section 3.1

So
P(Y = y) = P(6- X= y) = P(X = 6- y) = P(X = 6- y)
9• 2:
X;o:--=-z-:-)
--:::::P.,...,( I I I I
2 3
/7::3-::-5 4 5

10. a) binomial (n,p) b) binomial (m,p) c) binomial (n + m,p)


d) yes; functions of disjoint bloclcs of independent variables are independent.

11. a) By the change of variable principle, Un + Vm has the same distribution as Sn + Tm in Exercise
10, and this is binomial (n + m,p)
b) By (a), P(Un + Vm = k) = p)n+m-1:. But also,

1: 1:
P(Un + Vm = k} = L P(Un = j, Vm = k- j) = L P(Un = j)P(Vm = k- j)

= t. (;
i=O

p)n-j (k:
j=O

j)pl:-j(l- p)m-{1:-j)

= (t, (;) (k: J]pl:(l- p)"+m-1:.

Now equate the two expressions for P(Un + V.,. = k). [To get the sum from 0 ton, note (z) = 0
Y

c) Suppose you have n+m objects, n of which are red, and m blue. You want to choose k out of the
n + m objects. Among the k selected objects, some j will be red, where j could be 0, 1, 2, ... , n.
So you could just as well pick j red objects first, then (A:- j) blue objects.
d) This is similar to b).
e) e:)
12. a) If we think of N; as just counting the number of times we get category i in n trials, we don't
care what happens when we don't get this category. We get category i with probability p;, and
don't get it with probability 1-p;, so the distribution of N; is just binomial (n,p;).
b) Similarly, N;+N; counts the number of times we get either category i or j, which happens with
probability Pi +Pi, so the distribution of Ni + N; is binomial ( n, Pi +Pi).
c) Now we just consider the three categories i,j, and everything else, which gives a joint distribution
which is multinomial (n,pi,pj, 1- Pi-p;).

13. a)

2n - 2 2n - 4 2n - 2( k - 1)
P(X > k) = P(first k balls are different color)= 1 · - - · - - · · · · ·
2n 2n 2n

b)
1:-1 1:-1 1:-1
log P(X > k) = "" n-j = ""
L..,..log(-n-) L..,..log(I - jfn)::::: ""
L..,.. -jfn =- k(k-1)
Zn
J=O j=1 j=1

So k(k- I) =(-log( t )]2n. Roughly k 2 = n log 4, so k = v'n log 4. And k = 1177 for n = 10 6 •

14. a) ( 9 ;- 1 )p3 q 9 _ 1 _ 3 • p = ( 9 ;- 1 )ptqg-t for g = 4, 5, 6, 7.


b) "'7
l..Jg=<
(g-1)
3
{ g-t
p q
c) P(A wins) = 1808 / 2187

41
Section 3.1

d) The outcome of the World series would be the same if the teams played all 7 games. Let X be
the number of times that A wins if all 7 games are played. If X 4, then A has won the World
series, since B can have won a.t most 3 games. And if A won the World Series, then A won four
games before B did, so 4. So P(A wins)= P(X 4). Of course X has binomial (7,p)
distribution!
e) G has range {4, 5, 6, 7}.
P(G =g) = P(A wins in g games)+ P(B wins in g games)
= (g;-l)p'qg-4 + (g;-1)q4pg-4.
p = 1/2 makes G and the winner independent.

15. a) P(X = Y) = P(X = 1, Y = 1) + P(X = 2, Y = 2) + ·· · + P(X = n, Y = n)


= P(X = 1)P(Y = 1) + ... + P(X = n)P(Y = n) (by independence)
= =
Notice that by symmetry, P(X > Y) = P(X < Y). Moreover,
1 = P(X > Y) +P(X < Y) +P(X = Y) = 2P(X > Y) + Ifn.
So P(X > Y) = "2-;,1 = P(X < Y).
d) P(max(X, Y) = k)

=
=
!; .
2k-1
----;;>
+ • + *
= P(X = k, Y < k) + P(X < k, Y = k) + P(X = k, Y = k)

e) P(min(X, Y) = k) = P(max(n + 1- X, n + 1- Y) = n + 1 - k)
= P(max(X", Y") = n + 1- k)
_ 2(n+l-k)-1
- n!J •
where x· and y• are copies of X and Y respectively.
f) Note that the distribution of X+ Y is symmetric about n + 1 (recall the sum of two dice). For
2 < k < n+l,
P(X +Y= k) = P(X =j)P(Y= k-j) =
On the other hand, for n + 1 < k < 2n
P(X + Y = k) = P[2(n + 1)- (X+ Y) = 2(n + 1)- k] = = 2n-;J+t .
16. a) P(X +Y = n) = P(X = k, X+ Y = n)
= P(X = k, Y = n- k)
= P(X = k)P(Y = f l - k)
b) P(X +Y = 8) = E!=o P(X = k)P(Y = 8- k) = 2:!,0 P(X = k)P(Y = 8- k)
= ;6 X + 4
;6 X 3 6 + X + 2
X 36 + 5
3 6 X ;6 = = 0.027
17. a) P(Z=k)=P(Y<k,X=k)+P(Y=k,X<k)+P(Y=k,X=k)
= + (1/21)

0 2 3 4 5 6 7 8 9 10 1I 12 13 14 15 16 17 18 19 20

b) Left tail comes from binomial: very thin. Right tail comes from uniform: thick and fiat.

18. a) The number of spots has a distribution symmetric about £(number of spots) =3 x 3.5 = 10.5,
so P{ll or more spots) = P(10 or fewer spots)= 1/2.

42
Section 3.1

b) The number of spots has a distribution symmetric about 5 x3.5 = 17.5, so P{18 or more spots) =
P(17 or fewer spots)= 1/2.

19. a) P(S = k) = Li+J=k p;r; for all k = 2 to 12, so in particular


P.(S = 2) = p1r1;
P(S = 7) =PI T6 + p2rs + P3T{ + PtTJ + psr2 + P6TI;
P(S = 12) = P6T6.
b) P(S = 7) > Jll TG + P6Tl = P(S = 2)?,- + P(S =
c) Suppose the values are equally likely. Then from (b) and calculus, 1 > ;: : : 2: contradiction!
d) Yes; for example, let X take values 1, 2 with probability 1/2 each, and Y take values I, 3 with
probability 1/2 each. Then X+ Y has uniform distribution over 2, 3, 4, 5.

20. No. Counterexample: Let Xt, X2, X3 be the indicators of the events Ht, H 2 , Sin Example 1.6.8.

21. Yes. Proof by mathematical induction.

22. a) P(X = x) = Ly P(X = x, Y = y) = L 11 /(x)g(y) = f(x) Ly g(y)


Similarly P(Y = y) = g(y) L.s: f(x).
b) If P(X = x, Y = y) = f(x)g(y), then
P(X = x)P(Y = y) = f(x) L;g(j) · g(y) L; /(i) = f(x)g(y) (E;g{i)) (LJ(i)) ·
So we just need to show (z=jg(j)) (LJ(i)) = 1:
1 =
1
2: P(X = i, Y = j) = 2:2: f(i)g(j) =
J I }
(2;::/{i))
I }

Note: In this calculation x and y are fixed. The sums are over dummy variables which are not
called x and y, to avoid confusion with the fixed values.

23. If T. then Y T, since Y Hence

T) C (Y T) and P(X T) P(Y T).

This argument still works if T is a random variable.

24. a) Let Pi = P(X = i mod 2), i = 0, 1. Then po +PI = 1,


P(X + Y is even)= .+Pi= 1- 2po(l- Po) :2:: 1-2 ·

b) Let Pi= P(X = i mod 3), i = 0,1, 2. Then Po+ Pt + P2 = 1,

P(X + Y + Z is a multiple of 3) = 6poPIP2·

Now write simply p, q, r for po, PI, P2· So

p+q+r = 1,0 p,q,r 1.


3 3
To show: p + q + r + 6pqr :2:: 3
t- Consider

Notice that p(q + r) = pq + pr 1/4, q(p + r) = qp + qr 1/4, and.r(p + q) = rp + rq 1/4.


The probability in question is thus

1 - J[p[pq +prJ+ q(qp + qr) + r[rp + rq]]

:2:: 1 - J[(p + q + r) · 1/4] :2:: 1 - 3/4 = 1/4.

43
Section 3.2

Section 3.2

1. 15 X 1 + 25 X .2 + 50 X •7 = 41.5
2. Average of list 1 = 1 x .2 + 2 x .8 = 1.8; Average of list 2 = 3 x .5 +5 x .5 =

a) 1.8 + 4 = 5.8 {distributive property of addition)


b) 1.8- 4 = 2.2
c&d) Can't do it: need to know the order of the numbers in the two lists.

3. £(#sixes in 3 rolls)= 3 x t = !·
E(# odd numbers in 3 rolls)= 3 x =

4. If 25 of the numbers are 8, then all the others must be 0, since the average must be 2. If 26 or more
of the numbers are 8 or more, there is no way the average can be 2, since some of the other numbers
would have to be negative.

5. Suppose he bets on 6, and let N be his net gain.


k I P(get k 6's) I N I n x P(N = n)
3 3
3 (1/6) 3 3 X (1 I
--
2 3{5/6){1/6) 2 2 2. 3(5/ !/6)2
3
1 3{5/6) {1/6) 1 3(5/6) ll/6)
0 {5/6) 3 -1 -{5/6) 3
Therefore E(N) = -.078705, and the gambler expects to lose about 8 cents per game in the long run.

6. X= /1 + /2 + · · · + 1r where /; is an indicator random variable indicating whether the ith card is a


spade. Then E(X) = 7 P{first card is a spade) = f

7. E(X) = p;, by linearity of E. No more assumptions required.

8. E [(X+ Y) 2 ] = E(X2 ) + 2E(XY) + E(Y 2 ) = 17.


9. E(X- Y) 2 = E(X 2 ) - 2E(XY) + E(Y 2 ) = p- 2pr + r
10. a) Write Y = (lA + /s) 2 •
If I.-.= 0 and Is= 0 (this occurs with prob (1- P(A)) · (1- P(B)) ), then Y = (0 + 0) 2 •
If /,o. = 1 and Is= 0 (this occurs with prob P(A) · (1- P(B)) ), then Y = (1 +0) 2 •
If I.-.= 0 and Is= 1 (this occurs with prob {1- P(A)) · P(B) ), then Y = (0 + 1) 2 •
If I,o. = 1 and Is= 1 (this occurs with prob P(A) · P(B) ), then Y = {1 + 1) 2 •
So
y P(Y = y)
0 (1- P(A)) · {1- P(B))
P(A) · (1 - P(B)) + (1 - P(A)) · (P(B))
4 P(A) · (P(B))

b) Use properties of expectation described in this section: Note that Y = (l.-.+lu) 2 = IA+2I.-.s+ls
{using x 2 = x for x = 0 and x = 1, so = I A and 11 = I iJ) so

E(Y) = P(A) + 2P(AB) + P(B) = P(A) + 2P(A)P(B) + P(B).


Alternatively (and this is more cumbersome), use part (a) and the basic definition of expectation.

44
Section 3.2

11. Use Markov's inequality:

' ) $ expected# of wms


P ( at least one wm . = 101 + 101 + 101 = .3
Or you can nse Boole's inequality.
Actually, P(at least one win) = 1 - P(no wins) = 1 - • : : : • : : : = .271. The bound is close
becanse the bound pretends the events (ith ticket wins) are mutually exclusive. Well, they almost
are, because P(more than 1 win) is tiny.

12.
E(X) = ExP(x) $ LbP(x) (since P(X $b)= 1)

=b LP(x) =b.

Similarly you can show the other inequality.

13. a) By linearity of expectation, 10 X 3.5 = 35.


b) Let X1,X2,X3 be the three numbers, and let Min denote the minimum of the first three
numbers.
We want E(X1 +X2 +X3- Min)= E(X1 + X2 + X3)- E(Min) = 3 x 3.5- E(Min). Exactly
as in Example 9,

= 1 + ( )3 + ( )3 + ( )3 + ( )3 + ( .!_ )3
6 6 6 6 6
441
= 216
= 2.042
where q,. = P(Min m). So the required expectation is 3 x 3.5- 2.042 = 8.458.
c) Let Max be the maximum of the numbers on the first five rolls. For each m between 1 and 6,

P(Max = m) = P(Max $ m)- P(Max $ m- 1) = ( 6) ms- (m-1)


- 6-
5

So
E( Max) = 1 · ( .!_6 )5 + 2 (< 6 ) 5
- ( .!_ )
6
5
) + ·· · + 6 (< 6 )
5
- (
6
)
5) = 5.43.
Or: bysymmetry, E(Max)=7-E(min(X1, ... ,X5)).
d) The number of multiples of 3 in the first ten rolls has binomial {10, 1/3) distribution, so its
expectation is 10/3 = 3.3333.
.
e) The number of faces which fail to appear in the first ten rolls is is 1 1 + !2 +···+Is. where l; is
the indicator of the event (face i fails to appear in the first 10 rolls). Now for each i,

E(l;) = P(face i fails to appear in the first ten rolls) = (5/6) 10 •


So the required expectation is 6 x E(I!) = 6 x (5/6) 10 = 0.969024.
f) The number of different faces in the first ten rolls equals 6 minus the number oi" faces which fail
to appear. So the required expectation is 6- 6(5/6) 10 = 5.030976.

14. We want E(N), where N is the number of floors at which the elevator makes a stop to let out one or
more of the people. N is a counting variable. It's the sum of the ten indicators

/(at least one person chooses floor i). i = I, ... , 10.

So by linearity,
10

E(N) = L P(at least one person chooses floor i).


1=1

45
Section 3.2

Now for each i

P(a.t least one person chooses ftoor i) = 1 - P(nobody chooses floor i) = 1- (9/10) 12

by the independence of the people's choices. Hence

E(N) = 10 X [1 - (9/10)
12
] ::::::: 7 .18.

15. a.) If y ::5 b, then x-y will be the profit, and ..\(b-y) will be the loss incurred from the items stocked
but unsold. So -.,;y +..\(b-y) will be the loss in this case. And, if y > b, then .,;6 will be the
profit, since there are only b items to be sold.
b) Write p(y) = P(Y = y). Then

r(b) = E[L(Y, b)]= I: (-ry +..\·(b-y)] p(y)- I: rbp(y)


= 2:::: (-.,;y + ..\ · (b- y)]p(y) + 2:::: lfbp(y)- rb

How to minimize over a.ll integers b?


Method 1: Argue that if we buy b items, then the expected loss will be

r{b) = (..\ + r) L (b- y)p(y)- .,;b.


!l<b-1

while if we buy b - 1 items, then the expected loss will be

r(b-1)=(..\+r) L (b-1-y)p(y)-r·(b-1)
ll:$b-1

Therefore

r(b)- r(b- 1) = (..\ + .,;) L p(y)- lf = (..\ + r)P(Y ::5 b- 1)- .,;,
ll<b-1

So buying b items will be better than buying b- 1 items whenever


1r
P(Y<b-1)<-,-.
- A+ lf
Let b• be the largest value of b for which ( •) holds. (Such a one exists, since the left-hand
side tends to 1 as b increases). Note that 'b• is also the sma.llest integer for which P(Y ::5 b)
.,;f(..\ + r). Argue that

r(O) > ... > r(b * -1) > r(h) ::5 r(b *+I) ::5 .. ·

so r(y) is minimized at y = b• (and possibly elsewhere).


Method 2: View r(b) as a function of a real variable b. Argue that r(b) is continuous, and that
if b is not an integer, then r is differentiable at b with derivative

r'(b) = (..\ + r) LP(Y)- = (.>. + r)P(Y ::5 b)-


lf lf.

11:Sb

The right-hand side is a nondecreasing function of b. Let h denote the smallest integer for which
the right-hand side is nonnegative. Argue that if b < b•, then r(b) > r(b•), and if b b•, then
r(b) r(b•). Hence r attains its minimum over real b a.t b• (and possibly elsewhere).

16. a) P(X, = k) = P(X, = k) = P(first k cards are non-aces, next is ace)= (<,•2a)J•·•.
;:, ll+l

b) X,+ X2 + XJ +X• +4 5E(X,) = 48 E(X;) = E(X,) = 9.6.


46
Section 3.2

c) No. For instance, P(X 1 = 30, X2 = 30) = 0, but P(Xt = 30) = P(X2 = 30} > 0.
17. a) P(D $ 9) = P(3 red balls are among the first 9

b) P(D = 9) = P(D $ 9)- P(D $ 8) =¥iff-


c) Label the blue balls and the green balls, say, b1, ... , b1o. Then
D = 3 + L::! 1
I(b; is drawn before the third red ball), so
E(D) =3+ L::: 1
P(b; is drawn before the third red ball) = 3 + 10 x = 10.5.
18. Solve the system of equations
p(a) + p(b) =1
ap(a) + bp(b) = ll
for p(a) =Sand p(b) =
19. Let x = # blues. Then # reds= 2x, #w = x, #g = 3x. So
1 2 3
Pb = Pw = 7, Pn = 7• pg = 7
a) P(X 4) = P(X = 4} = (l!l7i!2!(t)3{t)(t)) x 2 +
b) E(X) = E(Ib) + E(I..,) + E(Ir) + E(/9 ) where e.g. 19 =indicator "green appears"

= {1- + (1- ) 5) + (1 - ) 5) + (1 - (!7 f')


7 7 7
20. Write p(x) = P(X = x). Solve the system of equations
p(O) +p(1) + p(2) = 1
p(1) + 2p(2) =Ill
p(l) + 4p(2) = 1!2
for p(2) = p(l) = 21'1 -1.12, p{O) = 1- ( {2Jlt -1.12).
0
21. a) The indicator of A" must be 1 when A o<:curs and 0 otherwise; clearly this is true for 1 - 1A:
A"= 1 <=> IA =
0 <=> 1 -/A= 1.
b) The indicator of AB must be 1 when A and B both o<:cur and 0 otherwise. Since !AlB = 1 <=>
(I A= 1 and I a= 1) <=> IAIB = 1 we have lAB= /Ala.
c) We wish to show that the indicator of the union can be found by the given formula, which can
be shown by the following application of the rules in a) and b).
fA,uA,U···UAR = J{Af.A5···A,£)c
= 1- /(A\'A5···A,f)
= 1- (JA\'JA5 ···IA.f)
= 1- {1-/A,)(l-/A 2 )···(1-/AR)
d)

i=l

= £(1- (1 -/A,)(1- IA,) ... (1- /AR))


= E(L fA; - (L JAJA;} + · · · + (( -1}"+ 1 /AJA, ···ft.))

= LP(A;)- LP(A;AJ)+···+(-1)"+ 1
P(AI···An)
i i<J.

47
Section 3.2

22. a) Let l:J,i be the indicator of a run of length 3 starting at the ith trial. Let P(3, i) = E(/3 ,;) be
the probability of this event. Then for n > 3
n-2 n-2 n-2

E(R:J,n) = E(L:I3,i) = LE(la,;) = L P(3, i)


i=1 i=l i=l
n-3
= P(3, 1} + L P(3, i) + P(3, n- 2)
= p3 (1- p) + (n- 4)(1- p)(p3 }(1- p) + (1- p)p3
= 2l(l- p) + (n- 4}(p 3 }(1- p) 2

b) Similarly, let I m,i be tlte indicator of a run of length 3 starting at the ith trial. The for m < n
n-m+1

E(Rm,n} = L: P(m, i)
i=l

Form= n, E(Rn,n) = p".


c)
n

Rn= LRm,n
m=l
so
n

E(Rn} = E E(Rm,n)
m=1
n-1

= p" + L 2pm(l - p) + (n- m- l)(pm)(1- p) 2


rn=l
n-1 n-1

= p" +2(p- p") + (n- 1}(1- p)2 L(Pm)- (1- p)2 L mpm
m=l m=l
n-1

= 2p- p" + L:<n- m- l}(pm)(l- p) 2


m=l

n-1

= (n- l}p(l- p) 2 - L pm(l- p) 2 = (n- l)p(l- p) 2 - (p- p")(l- p)


m.=l
so
2: 1 = (n- l)p(l- p)- (p- p")
and finally
E(R.,) = 2p- p" + r;, :c ·• + (n- l)p(l - p)
d) Let ! 1 be the indicator that a run of some lengtl. on the jth trial. Then Rn = L;=l 11
and if we let P(j) be the probability that a run o( some length starts on the jth trial, we have
n n n

E(Rn) = E(LJj) =E E(Ij) =L P(j)


J=l

where P(l) = p and P(j) = (1- p)p for j > I. Thus

E(Rn) = p + (n- l)p(l- p)

48
Section 3.3

Section 3.3

30 31
1.
1/3 7/12
E(X) = 30.42, SD(X) = 0.86.
X 28 30 31
b) --=p,.,..(x=-=-x-:),-+.,..28,..../":":'3-:"65.,....--,4-x-3:::-:0,..-,/,....36""'5--=-7-x-:3-:-1/-;-::3:-::6-::--5 E(X) = 30.44, SD(X) = 0.88
2. E(Y 2 ) = 3, Var(Y 2 ) = 15/2
3. a) E(2X + 3Y) = 2E(X) + 3E(Y) = 5.
b) Var(2X + 3Y) = 4Var(X) + 9Var(Y) = 26.
c) E(XY Z) = [E(X)][E(Y)J[E(Z)] = [E(XW = 1.
d)

Var(XYZ) = E[(XYZ) 2
]-
2
[E(XYZW = E[X Y 2 Z 2 ] - [E(XYZ)] 2
2 2 2
= E[X ]E[Y ]E[Z ]- [E(X)E(Y)E(ZW
= [E(X 2
)]
3
- [E(X)] = [V ar(X) + [E(XWJ 3
6
- [E(X)] 6 = 26.
4. Var(X1X2) = E[(X1X2) 2
]- [E(X,X2}f
= E(xn · E(Xi)- [E(X1X2)] 2
(l'tl'2)2

= + +
5. By the computational formula for variance,

But Var(X -a) = Var(X) = u 2 , so


E[(X-a) 2 ] =u2 +(1'-a) 2 •

Thus the mean square distance between X and the constant a is u 2 + something non-negative. The
mean square distance is minimized if this 'something non-negative' is actually 0, i.e., a JL· And in =
this case, E [(X -1') 2 ] u2 Var(X). = =
6. Intuitively, I and 6 are the two most extreme values, so increasing the probability of these values
should increase the variance. •

I> P(Xp
6
2 2
Var(Xp) = = x) -[E(Xp)]
z=l

= 12!!. + 22 I - p + 32 I - p + 42 I - p + 52 I - p + 62!!. - {3.5 )2


2 4 4 4 4 2
= 2p- 4p- 9p- I6p- 25p + 72p + 4 +9+ 16 + 25 - (3.5)2
4 4
5
= Sp+ 4

7. X= X,+ X2 + XJ, where X, has binomial (n,, p,) distribution.

(a) No! It is binomial only when the p.'s are all the same.
(b) E(X) = n;p;, Var(X) = L=l n;p;q,.

49
Section 3.3

8. a) N = Xt + X2 + X3 b) E(N) = t+ t+ } =
c) Note that N is the indicator of the event At U A2 U A;'s are disjoint. So
Var(N) = (k + t + i )(1- } - t- i} =
d) Var(N) = Var(Xt) + Var(X2) + Var(X3) = t · t + t · t +} · =
e) Var(N)-
-5·
[!..J2+(l-.!.)·22+(l-.!.).
45 34 12]-(!+.!.+.!.)2=
54 3
5291
3600

9. The number Nt of individuals who vote Republican in both elections has binomial (r,1- p 1) distri-
bution, and the number N2 of individuals who vote Democratic in the first election and Republican
in the second has binomial (n- r,P2) distribution. The number of Republican votes in the second
election is then Nt + N2. So

Since Nt and N2 are independent, we have

Var(Nt + N2) = Var(Nl) + Var(N2) = r(1- pt)p1 + (n- r)P2(1- P2)

10. a) E(X") =I:;= I x"P(X = x) = L:;=l x" · = + 2" + · · · + n") = s(k, n)fn.
E [(X+ 1)"] = L:;=t (x +I)" P(X = x) = (2" + 3" + .. · + (n + I)"j = [s(J, n + 1)- I].
b) By the binomial expansion, kXk-I + +···+I= (X+ 1)"- X", so

E [kx"- 1 2
+ G)x"- + ... +I] = E [ex+ 1)"]- E(X")
= 2" + J" + · .. + (n + 1)" 1 + 2" + .. • + n" _ (n +I)"- I
n n n
c) Put k = 2 in b):
E(2X + 1) = = n + 2 => E(X) = !!fl
By a), s(1, n)/n = E(X). So s(1, n) = n(np>.
d) Putk=3inb):
E(3X 2 + 3X + 1) = (n+IJ"-t = n 2 + 3n + 3.
=> E(X 2) = [n 2 + Jn + 3- 3E(X) -1] =} [n 2 +3n +3- J(n;l) -1] = k{n+1)(2n+ 1).
By a), = E(X 2). So s(2, n) = + 1}(2n + 1).
e) Var(X) = E(X [E(XW = Hn + 1}(2n + 1)- [(n + 1)/2] 2 = ":;- 1 •
2
)-

f) For n = 6, E(X) = 7/2 agrees, and Var(X) = 35/12 agrees.


g) Putk=4inb):

Use part a) to conclude

n [
t n
3
+4n 2 +6n+4-(n+1)(2n+1)-2(n+1)-1 ] =n
2
(n + 1) 2 2
=(s(1,n)].
4
11. Y = (a - b) + bX. So

E(Y) =a -b+bE(X) =a- b+b· -n+1


- =a +b (n-1)
2

Var(Y) = b Var(X) =
2 b2 · (n -J)
-
2
- -
12

50
Section 3.3

12. a) According to the Chebychev inequality,


52
P(X 20) P(IX - 101 10)
102
= 1/4.

b) If X has binomial (n,p) distribution, then E(X) = np and SD(X) = y'np(1- p). Try np = 10,
and np(1 - p) = 25, which implies that 1- p = 25/10 = 2.5. This is impossible!
13. Let n denote the number of scores (among the million individuals) exceeding 130, and let X denote
the score of one of the million individuals picked at random. Then

P(X > 130} = 10 .


n = 106 P(X > 130).

a) We have, by Chebychev's inequality,


P(X > 130} = P(X- 100 > 30} P(IX- 1001 > 30) = .
So n 6
10 x < 111112.
b) Since the distribution of scores is symmetric about 100,
P(X - 100 > 30} = P(100 -X > 30). Therefore
1
P(X > 130} = P(X- 100 > 30) = tP(IX- 1001 > 30} 18
6 1
and n 10 x 18 55556.
c) P(X > 130) = P( X > 3) 1 - cf1(3) = 0.0013, hence n = 106 x 0.0013 = 1300.
Note: We can get a sharper bound in a) using Cantelli's inequality, which says: if X is a random
variable with mean m and variance u 2 , then for a > 0
tr2
P(X- m > a) •
tr 2
+a2
Proof of Cantelli's inequality: Without loss of generality, we may assume m = 0, u = 1. Note
that if X a then (1 + aX) 2 (1 + a 2 f 2 • So

where 1cx>a) is the indicator of the event (X> a). Take expectations, and get the result.
Application to part a): P(X > 130) = P(X- 100 > 30} =
14. a) Let X be the income in thousands of dollars of a family chosen at random. Then E(X) = 10
and, by Markov's inequality,
P(X >50) < E(X) = !..
- - 50 5
b) If SD(X) = 8 then by Chebychev's inequality
1 1
P(X 50) P(IX- 101 > 5 · 8} 52
= 25
.

15. b) 10..;8
16. a)

P(X = x)
Thus E(X) = - 2 - 1/ 0
+3 = 0 and Var(X) = (-2 )'+<-!l'+O'+J' - 02 = f = 3.5
b) Let Sn = X; where each X; is one play of this game. Then we wish to know P(S10o 25).
E(S,oo) = JOOE(X) = 0 and Var(Stoo) = lOOVar(X) = 350. the sum of 100
draws should look pretty close to normal, so

P(Stoo > 25)


-
Jaso
1 - cf1 ( 25 - 5-0) = 1 - cp ( -
350
24· 5- ) = .0952
18.71

51
Section 3.3

17. E(X) = 1/4, SD(X) = hili; E(S) =6.25, SD(S) = tffi·


a) P(S < o)::::: ( : : : .05.
b) P(S = o) ::::: q, ( o.s-6.25 ) _ ( -o.5-6.2s) 03 •
(st•>Jii (5/4)J'ii' ·

c) P(S > 0)::::: 1- .92.

18. Let X; be the amount won on the ith bet. Thus X;= 6 if you win on the ith bet, and X; = -1 if
you lose. Let S be the sum of 300 dollar bets, and we wish to find P(S > 0).
5 33·
E(X;) = 6 X 38 + -1 X 38 = -0.07895

SD(X;) = v. I36 X
5
38
+1x 33
38
- {-0.07895)2 = 2.366

E(S) = 300 X -0.07895 = -23.68


SD(S) = v'300 X 2.366 = 40.98
By the normal approximation,

23 68
P(S>O):::::l-(>(0-{- · )) =0.2817
40.98

19. Let X; be the weight of the ith guest in the sample, and let S = X 1 + X 2 + ... + X 30 • Want
P(S > 5000). Use the normal approximation, with E(S) = 4500, and
SD(S) = V3o X 55 = 301.25

Need the area to the right of {5000- 4500)/301.25 = 1.66 under the standard normal curve. This is
1- = 1 - 0.9515 = 0.0485.

20. a) The probability that your profit is $8,000 or more is the probability that your stock gives you a
profit of either $200 per $1000 invested or $100 per $1000 invested; this probability is .5.
b) Let S10o be the sum of the profits for the 100 stocks invested in, and let X be the random
variable representing one of the profits.
E(X) = = 50 and Var(X) = 200,±1oo' -too), - 502 = 12, 500.
E(Stoo) = 100E(X) = 5000, and since the investments are independent, Var(Stoo) = 100Var(X) =
1, 250,000. Using normal approximation,

P(S100 > 8000 ) =


-
1_ (8000- 50- 5000) = 1 _
,;1
.'
250
'
ooo (2950) = _0042
1118

In the numerator of the normal approximation term, note that the 8000-50 is just the familiar
continuity correction since Stoo can only take on values which are multiples of $100.
21. a) Let X; be the error caused by the ith transaction; thus X; is uniformly distributed between -49
and 50. Let S 100 be the total accumulated error for 100 transactions, and then we wish to know
P(Stoo > 500 or Stoo < -500). We have
so
E(X;)= L _J_
100
= .5
i=-i9

so
E
·2
1 2
Var(X;) =
100
-.5 = 833.25
i=-49
So E(Stoo) =50 and Var(Stoo) = 83, 325. Using normal approximation,

P(IStool > 500)::::: 1- (500 + .5-


./83, 325
(-500- .5-
v'83, 325
so))= 0.0876

52
Section 3.3

b) Using the same notation as in a), we now have that


50 •
E(X;) = LJ
] X99•75 = .3788
i=-49

except that the coefficient when j = 0 is wrong, but this does not affect the answer. Similarly,
50 ·2
Var(X;) = "'
L..J ] X .75 -.3788 2 = 631.30
99
i=-49

Thus E(Stoo) = 37.88 and Var(Stoo) = 63, 130, Using normal approximation,
P(ls100 I > 500) 1- ( .... (500 + .5- 37.88) -X. (-500- .5- 37.88)) -
'>! ./63, 130 '>! y'ii3,i30 - 0.0489

22. a) Let X; be the face showing on the ith roll; then E(X;) = 7/2, and SD(X;) = J35/12. Let Xn
be the average of the first n rolls. Then
E(Xn) = E(Xt) = 7/2, and
SD(Xn) = SD(Xt)/.fii = J35/12n, and

5 7
P (3 1 2 :::; Xn :::;3 12 ) = P (1z"1:::;
where
Zn = Xn- E_(Xn) = Xn - (7/2)
SD(Xn) J35/12n
is Xn standardized, which will be approximately standard normal for large n. When n = 105,

P (IZnl:::; P(IZI:::; 0.5) = 0.383

where Z is standard normal.


The same argument gives the numbers in the following table:

105 P(IZI :::; .5) = .383


420 P(IZI :::; 1) = .6826
1680 P(IZI :::; 2) = .9544
6720 P(IZI :::; 4):::::: 1
b)

1.0 ;

0.5 i

I
I

I
0.0
0 1680 3360 5040 6720

53
Section 3.3

c) P (IXn -7/21 1/24) P (IZnl


See that if f: is replaced by f/2, then probabilities that were previously obtained at n are now
obtained at 4n. So the scale on the n-axis is stretched by a factor of 4. In general, dividing e
by a factor off results in a graph with the scale on the n-axis stretched by a factor of f 2 •

Same curve with horizontal axis


stretched by a factor of 4.
1.0 1 l.O
i

1 0.5
E=-
12

0.0 .....,__..,-_ _ _ _ _ __
0.0
105 420 1680 105 420 1680

23. Let X; be the lifetime in weeks of the ith battery. Assume S = X 1 + · · · + X-21 is approximately
normally distributed. Using E(S) = 27 X 4 = 108 and SD(S) = ...(i7 x 1 =...tiT, we have
P(more than 26 replacements in a 2-year period)
= P(Xt + ... + X27 < 2 X 52)
= P(S < 104)

=P
s -108
( ffi <...tiT
- -4)
4>(-.77)

24. a)

P(S2x = x)
b) E(Sso) =50 and Var(Sso) = 50(.5) = 25 .•Thus by normal approximation,

P(Sso =50)::::: 4> co.5 5- 50) - 4> c9.5 5- 50) = 0.0797

c) Sn = X1 + · · · + Xn where the X; are independent, each with binomial (2, 1/2) distribution. It
follows (Exercise 3.1.11) that Sn has binomial (2n, 1/2) distribution:

P(Sn = k) = C:) G/"


25. a) k = 1:
X -10 10
P(X = x) l/2 1/2
k = 2:
X -20 0 20
P(X = x) 1/8 3/4 1/8

54
Section 3.3

k = 3:
X -30 0 30
P(X = x) 1/18 8/9 1/18

-30 -20 -10 0 10 20 30

0 D
-30 -20 -10 0 10 20 30

-30 -20 -10 0 10 20 30

b) E(X) = 1-' by symmetry of the distribution about p..


Var(X) = E[(X -p.) 2 ) = 2 · (kol· 2h +0 (1- tr) = u
2 2

P(IX- 11-l!:: ku) = P(X = 1-1 + ka) + P(X = p.- ka) = 2 · 2b = tr•
which is the Chebychev bound on the above probability.
c) P(IY- 11-l < a) = 0 P(IY- 1-'1 2:: al = 1. So absolute deviations from the mean a.re 2:: tT
with probability 1. On the other ha.nd, the average of squared deviations (variance) = u 2 • So
the deviations must actually equal ±a. Suppose P(Y -p. =a)= p, so P(Y -11 = -a) = 1 - p.
Then
0 = E(Y- p.) = pa + (1 - p)( -a),
sop= 1/2. That is, P(Y = p. + u) = P(Y = p.- u) = which is the above distribution for
k =
1.
2G. a.) 1.5 b) Note that
0 $ Var(IX - p.l) = E(IX- 1-'1 2 ) - [E(IX- ttiW. From this, the result follows immediately. That the
equality holds if and only if IX - 1-'1 is a constant follows from the first inequality.
27. a) Use the fact that expectation preserves inequalities:
(i) o :$ X :$ 1 ====> 0 $ E(X) $ 1.
(ii) Note that X 2 $X, so E(X 2 ) :$ E(X) and
=
o :$ Var(X) E(X 2 ) - [E(XW $ E(X)- [E(XW 1-1 -p. 2 • =
The inequality 1-'- p. 2 :$ 1/4 follows from calculus, or from completing the square:
Jl - Jl2 = t- (Jl - 1 :$ t.
55
Section 3.3

b) The results are t.rivial if a =b. So assume a <b. Let Y = '!.:-:. Then 0 Y 1.

(i) Use part i) of a) to conclude 0 E(Y) 1. Substitute E(Y) = 8 and rearrange.


(ii) Use part ii) of a) to conclude

o$ Var(Y) E(Y) (1- E(Y)) 1/4.

Substitute E(Y) =8 and V ar(Y) = and rearrange.


(iii) Immediate from ii).
c) Consider a random variable X taking values 0, ... , 9 with chances equal to the proportions of
each digit among the list. We have 0 X 9 .and V ar(X) = t{9) 2 • By part b), we must have

E(X
2
)- (E(X))
2
= E(X) (9- E(X)) =
The second equality yields E(X) = 9/2 and the first yields 9E(X) = E(X 2 ). But 9X- X 2 0,
so 9X - X 2 = 0 with probability 1 (see below). Hence X takes values only o or 9. Conclude
from E(X) = 9/2 that P(X = 0) = P(X = 9) = 1/2, that is, that half the list are O's and the
rest are 9's.
Claim: If Y 0 and E(Y) =
0 then Y =
0 with probability 1.
Reason: Suppose not. Then P(Y > a) > 0 for some a > 0 . But by Markov's inequality, we
have P(Y > a) =
E(Y)/a 0, contradiction!

28. Var(S) = L;Pi(1- p;) = np(I- p)- L;(Pi- p) 2 •


29. Dn = (D1 + •· · + Dn)/n, where each D; is uniformly distributed on {0, ... , 9}: P(D; = k) = 1/10
for each k.
So E(D;) = 110 (1 + 2 + ·· · + 9) = 9/2.
To get the variance of D;, note that D; + 1 is uniform on {1, 2, ... , 10}. So by a previous problem
(Moments of the Uniform Distribution), Var(Di) = Var(D; + 1) =
102 12
1; 33/4 and SD(D;) = =
.../33/2.
a) So guess int (9/2) = 4.
b) If n = 1, then the chance of correct guessing is P(D 1 = 4) = 1/10. The chance of being correct
would be the same no matter what we guessed.
If n = 2, the chance of correct guessing is

Note that the distribution of D1 + D2 is triangular with peak at (D1 + D2 = 9), so guessing
D2 = 4 indeed maximizes the chance of guessing correctly.
For large n, the standardized variable Zn = 2 B<5;i 912
l has approximately standard normal
distribution. So the chance of correct guessing is

p (4 Dn < s)::::: p
912
> ffi) = p (IZnl /!i).
The normal approximation now gives:
n = 33: P(4 D" < 5):::::: P(IZI 1) = 0.6826
n = 66: P(4 Dn < 5):::::: P(IZI ./2) = 0.8414

56
Section 3.3

n = 132: P{4 D" < 5)::.:: P(IZI 2) = 0.9544


1.0-

0.8 i
0.61
i

0.4

0.2 J

0.0
0 33 66 132 264

c) P(4 Dn 5) 0.99 {::::::::} P(IZnl 0.99 {::::::::} Vnf33 2.58 {::::::::} n 220.

30. takes the values 0, 1, 4, 9, 16, 25, 36, 49, 64, 81 with equal probability, so the X; are independent
with common distribution

= r) 11.: I l 1.: I
So E(X;) = 9/2 and Var(X;) = 9.05 => SD(X;) = 3.008 .

. a.) By the law of averages, you expect Xn to be close to E(Xn) = 4.5 for large n. So predict 4.5.
b) P(IXn- 4.51 > c)
=p (,fo!X,.-u!
3.008
> 3.008
::.:: p (IZI > 3.008
= 2P (z > 3.008 '

where Z has standard normal distribution. For n = 10000, we need t: such that

P [Z > = = 0.0025 => = 2.81,


therefore t: = 2.81 x 3.008/100 = 0.085.
c) Need n such that P(lXn- 4.51 0.01) 0.99 i.e.,
P [1z1 <
-
vnxo.ot]
3.008
>
-
0.99 => .::IE_
300.8 -
> 2.58
therefore n 602276.
d) We have calculated E(X;) = 9/2 and Var(X;) = 9.05. From the previous problem, we have
E(D;) = 9/2 and Var(D;) = 33/4 = 8.25. Since D; has smaller variance than does X;, the
value of Dn can be predicted more accurately.
e) Since E(Xtoo) = 4.5, you should predict the first digit of X100 to be 4. The chance of being
correct is
P(4 Xtoo < 5)::.:: P (I v'Wo<{Jgru> 2 tJ: 8 ) ::.:: P(IZI 1.66) = 0.903.

31. a) 9/2, ,J33j2


d)

P(ISn- 4.!_nl < bv'n} = p ( ISn- 4tnl <


2 - - VJ3

::::: 1
32. No Solution

33. No Solution

57
Section 3.4

Section 3.4

1. a) Binomial probability: (:)p5 (1- p) 4


b) P(first 6 tosses are tails, seventh is head)= (1 - p) 6 • p
c) P(exactly 4 heads among first 11 tosses and 12th toss is heads)= enp (1- p)4 7
• p

d) P(k heads among first 8 tosses and k among next 5)


= (!)P"(1 - p)B-k. (!)P"(1- p)5-lc
2. a) Dis distributed as 1+ a geometric random variable with parameter t· Whatever the first ball
is, we then wait until we draw a ball of the other color.
b) Since the expected value of a geometric random variable is E(D) = 1 + = 3.

c) The SD of a geometric random variable is :If so in this case we have SD(D) = ../2.
3. Assuming X has geometric distribution on {1, 2, ... } with p = 1/12, which must be approximately
correct, E(X) = 12.

4. a) The probability of some person being the "odd one out" is 1- the probability of having the
three coins be HHH or TTT. Thus the probability is 1- ((t) 3 + =
b) Let the length of play be the random variable X, then for r = 1, 2, ...

P(X
1
= r) = ( 4
)r-1 43
c) Since X has geometric(3/4) distribution, and the geometric (p) distribution has mean 1/p,
E(X) = 4/3.

5. a) Let q; = 1 - p;. The probability that Mary takes more than n tosses to get a head is the same
as the probability that Mary gets tails the first n times, or q2".
b) The probability that the first person to get a head tosses more than n times is the same as the
probability that everyone gets tails for the first n times, or (q1q2q3)"
c) In order for the first person to get a head to toss exactly n times, there must be no heads
in the first n - 1 tosses and then at least one head on the nth toss. Thus the probability is
(q1q2q3)"- 1(l- q1q2qa) = (q1q2qa)n-l- (q1q2q3)".
d) Condition on the first head occuring at time n. Now in order for neither Bill nor Tom to get
a head before Mary, Mary must get a head at time n; it does not matter what Bill and Tom
get. Thus we want the probability that Mary gets a head given at least one head. Let M be the
event that Mary gets a head and let H be the event that at least one head occurs. Then

P(MIH) = P(M and J/) = P2


P(H) (1- q1q2qa)

6. a) P(W = k) = = k + 1) = q<"+ 1 >- 1 p = q"p.


P(T
b) P(W > k) = P(T > k + 1) = q"+'.
c) E(W) = E(T- 1) = E(T)- 1 = i;- 1 = qfp.

d) Var(W) = Var(T -1) = Var(T) = qfp2 •

7. a) Use the craps principle. Imagine the following game: A tosses the biased coin once, then B
tosses it once. If A's toss lands heads, then say that A wins the game; if A's toss does not land
heads but B's does, then say that B wins the game; otherwise the game ends in a draw. The
chance that A wins this game is p; the chance that 8 wins this game is qp; and the chance of a
draw
58
Section 3.4

You can see that A and B are really repeating this game independently, over and over, until
either A wins or B wins, and that the desired probability is P(A wins before B does). By the
craps principle, this is pfqp = 1 !q•
Another way: In terms ofT, the number of rolls required to produce the first head, the desired
probability is
P(T is odd)= P(T =I)+ P(T = 3) + P(T = 5) + ... = p + q2 p + + ... = 1
!q>.
b) Apply the craps principle to the situation in a): the desired probability is P(B wins before A does)=
= rtq. Or compute P(T is even) Or subtract the answer in a) from 1, because
one of the players must see a head somet1me.
c) Apply the craps principle, this time with the game consisting of A tossing the coin once, then B
tossing it twice. The chance that A wins this game is p, while the chance that B wins is q- q3 •
So the chance that A gets the first head is p/(1 -l). and the chance that B gets the first head
is (q- q3 )/{1- q3 ). Note that 1- q3 = (1- q)(1 + q + q2 ), so P(A gets the first head) simplifies
to 1/(1 + q + q2 ). Alternatively, in terms of the random variable T, the chance that A gets the
first head is the probability that Tis of the form 1 + 3k for some k = 0, 1, 2, ... ,similarly for B.

d) Solve 1 +;tq> = .5 to get q = "\- 1 and p = 3 - 2..§ = .381966.


e) If pis very small, then the fact that A gets to toss first doesn't confer much of an advantage to
A. However, since B tosses twice as often as A does, you would expect that the chance that B
gets the first head is close to 2/3. Indeed, as q tends to 1:
P(A gets the first head) = 1 tq\q> - •
P(B gets the first head)-

8. a,b) Suppose that the player's point Xo is x = 4, 5, 6, 8, 9, or 10. Then P(winiXo = x) is the
probability that in repeated throws of a pair of dice, the sum x appears before the sum 7
does. In the notation of the Example 2, imagine that A and B are repeating, independently, a
competition consisting of the throw of a pair of dice. On each throw, if x appears, then A wins;
if 7 appears, then B wins; otherwise a draw results. The desired probability is the chance that
A wms. b e fore B does, wh"ICh , by t h e craps pnnctp
. . Ie, .ts P(z)+P(
P(z) ).
7

X 2 3 4 5 6 7 8 9 10 11 12
P(x) I/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36
P(win I Xo = x) 0 0 3/9 4/10 5/11 1 5/11 4/10 3/9 1 0

c) P(win) = P(winiXo = x)P(Xo = x)


36 9 36 10
. ..!..2+.2..
36 II
. .2...2+I·..!!..+1·_L
36 • 36 36
- 0±220±352±500±660±220 - 1952
- 36xl0xll - 36xl0xll

9. Let X be the payoff. Then


P(X = n 2 ) = Ct)" for n = 1, 2, ... ,.
That is, X = W 2 , where W has geometric (p) distribution on {1, 2, ... } for p = 1/2. Since W has
mean 1/p = 2 and variance qfp 2 = 2,
E(X) = E(W = Var(W) + (E(WW = 6.
2
)

Your net gain is given by X- 10, so E(net gain) = -4. That is, you are going to lose $4 per game
in the long run.

10. a) Condition on the first outcome and use the rule of average conditional probabilities: Let 5 =
(first trial results in success). F = (first trial results in failure). Then
P(X = n) = P(X = niS)P(S) + P(X = niF)P(F)
Note that
P(X = niS) = P(WF = n- l) = p"-•q, n 2,

59
Section 3.4

where W F denotes the number of tosses until the first failure, which has a geometric distribution
on {1, 2, ... } with parameter q. Similarly,
P(X = niF) = P(Ws = n- 1) = q"- 2 p, n;?: 2,
Therefore
P(X = n) = p"- 1
q +q": 1 p for n = 2,3, ....
b) Duplicate the method for finding the moments of the geometric distribution: E(X) = L:;:"= 2 np"- 1 q+
2:::"=2 nq"-1p
= q (L(1,p) -1) +P (L(1,q) -1)
= q (( 1_:p)2 -1) +P -1)
= i+ 1 , where L(1,p) = .L::;:"= 1 np"- 1
c) Similarly E(X 2
) = 2:::"= 2 n p"- 1 q+ 2::: 2 n 2q"- 1 p
2

= q (2:(2,p)- 1) + p (L:(2,q) -1)


=q ( ( - 1) + P ( ( lt.!q)s - 1)
q p

Finally, use Var(X) = E(X 2) - [E(XW .

11. Write qA = 1-p,.., qB = 1-PB·


a) P(A wins)
= P(A tosses H, B tosses T)+P(A tosses TH, B tosses TT)+P(A tosses TTH, B tosses TTT)+

= PAqB + (qAqB)PAqB + (qAqBfpAqB + ...


- 1-qAqB •

b) P(B wins) = (Interchange A and B.)


c) P(draw) = I - P(A wins)- P(B wins) = .
d) Let N be the number of trials (in which A and B both toss) required until at least one H is seen
(by either A or B). N has range {1, 2, 3, ... }. Fork= 1, 2, 3, ...

P(N = k) = P(first k- 1 trials see TT, kth trial does not see TT)
= (qAqB)"- 1(1- qAqB).
Or: Each trial is a Bernoulli trial with success corresponding to the event (at least one H
is seen among the tosses of A and B). The probability of success on each trial is then I -
P(no His seen)= 1- q,..qB. Therefore N, the waiting time until the first success, is geometric
with parameter (1- qAqB).

12. Write q1 = 1- p1,q2 = 1- P2·

00 00

PIP2
PI+P2-PIP2

b) P(W1 < W2) = '[:::, P(W1 = k, ; <


1
1
:: W2)
= '[:;;: 1 P(W1 = k, W2 > k) = .L::::: 1 =
c) By symmetry, it's Check: (a) + (b) + (c) = 1.
d) Put X = min(W 1 , W2 ). For k = 0, 1, 2, ... we have
P(X > k) = P(Wt > k and W2 > k) = P(W 1 > k)P(W2 > k) = = (q 1 q2 )k.
So X is geometric with parameter 1- q1q-J =PI + P2- P•P2·
60
Section 3.4

c) Put Y = max(W1 , W2). Y has range {1, 2, 3, ... }. For n = 0, 1, 2, ... we have
P(Y n) = P(W 1 n and W2 n) = P(Wt n)P(W2 n)

= [1- P(Wt > n}][l- P(W2 > n)] = {1- th)·


For n = 1, 2, 3, ... we then have

P(Y = n) = P(Y n)- P(Y n- 1) = (1- q;)- (1- q;- 1 ).


13. a) Let q = 1 - p. Condition on the first two trials, and use recursion, as follows: Let B = {first
draw is black), and W B = {first draw is white, second is black). Then
P(Black wins) = P(Black wins! B) x P(B) + P(Black wins!W B) x P(W B)
= 1 x p + P(Biack wins) x qp
Therefore
P{Black

P(White wins) = I - P(Black wins) = 1_!:P.


Or use the craps principle: You can imagine that Black and White are repeating over and
over, independently, a competition which consists of each player drawing once at random with
replacement from the box. This competition results in a "win" for Black with probability p, and
a "win" for White with probability q2 • The competition is repeated until one player "wins";
the chance that Black "wins" before White does is pf(p + q2 }, which is equal to the previously
calculated chance.
b) Set P{Biack wins) = 1/2. Solving the quadratic equation gives p = (3- ,fS)/2 = 0.381966
(The other root is greater than I.)
c) No, is irrational.
d) No player has more than a 51% chance of winning if and only if

.49 P(Black .51

This happens if and only if

.375139206 ... p .388805725 ...

Reason: Observe that f(p) = pf[I- (1- p}p] is an increasing function of p E (0, 1). If 0 < c < I,
then the equation f(p*) = c has one solution, namely p* = ( c + I - J(c + 1)2 - 4c2) /2c .
Calculate all possible values of p for each value of b + w . The smallest value of b + w for which
plies in the above range is b + w = 13 (with b = 5, p = 5/13 = .384615).
14. Let n ;?: 1. Vn is a random variable having range { n, ... , 2n- 1}. For k = n, ... , 2n- I we have
(Vn = k) = (Vn = k, kth trial is success) U (Vn =k , kth trial is failure)
The two events on the right are mutually exclusive. The first event is
(Vn = k, kth trial is success)= (exactly n- 1 successes in first k- 1 trials, kth trial is success)
and has probability
P{exactly n- I successes in first k- I rials, klh trial is success)
= P( exactly n - l successes in first k - 1 trials) P( kth trial is success)

= p = e=:)p"q"-" . Similarly the second event has probability .


Hence
P(Vn = k) = + q"p"-") , k = n, ... , 2n- I.
15. a) Let k ;?: 0 and m ;?: 0. Use problem 1:
P(F
P(F - k = m IF ;?: k) = P(F=m+k)
P(F<!:kl = q =q m
p = = m)
61
Section 3.4

b) Let F assume the values 0, 1, 2, •.. with probabilities po, PI, P2, •.•• We claim that
Pk = (1-po)kpo.
To prove this claim, introduce the tail probabilities
tk= P(F 2: k) = Pk + + + ....
Put m = 0 in the Property to see
P(F =kiF 2: k) = P(F = 0) for all k 2: o,
which is equivalent to
7!- = Po for all k ;::: 0.
= = t1
.
Use - tk± l and 1 - po to get
!tilt1 = t1 for all k >
- 0.
This leads to the solution
= (tt )I< for all k ;::: 0
and
Pk = (t!)kpo = (1- Po)kpo, for all k 2:: 0,
as claimed.

16. a) If k ;::: I then


- k± r-1
= ( t:.-;2)p•(1-p)k-l - kq

b,c) If m is a mode, then m ;::: 0 and


1
land P('L': 1) 1.
It follows that
(r- I)!- l m (r -1);.
So if (r -I)! is not an integer, then there is a unique mode, namely int ((r- I)!)· If (r- I)!
is zero, then there is a unique mode, namely zero. If (r - 1 )!
is an integer greater than 0, then
there are two modes, namely (r- I)! and (r- 1. l)!-
17. Let X be the number of boys in a family and Y the number of children in a family. Observe that
given Y = n, X has binomial(n, p = I/2) distribution; therefore

P(X = k) = f
n=k
P(X = kiY = n)P(Y = n) = f
n=k
p).

e; j) e; j)
Set j = n - k:

P(X = k) = (1 f
J=O
(p/2)J = (I f
J=O
(p/2)J

Set s =k +l and multiply and divide by (I - i) •:

where T. denotes the number of failures before the sth success in Bernoulli(p/2) trials. Thus the last
sum equals I, and
P(X=k)= 2(l-p)pk
(2- p)k+l'

n = 2,4, ...
18.
otherwise

62
Section 3.4

b) G = 2X where X has geometric (p2 + q2 ) distribution on {1, 2, ... }. So E(G) = 2E(X) =


c) Var(G) = 4Var(X) =

19. a) r plus expectation of negative binomial.


b) Let Sn = number of heads in n tosses.

P(Tr < 2r)= P(Tr :5 2r- 1)


= P(S2r-1 ;:::: r) = 1/2
by symmetry about r- 1/2 of the binomial (2.r- 1, 1/2) distribution.
c) From b)
1
- = P(Tr
2
< 2r) = P(Tr- r < r) =L..t
"
r-l (i + 1)
r -
r-1
·
(1/2r{l/2)',
•'=0

which gives c) with n = r- 1.


20. a)

l:P(X;:::: n) = EEP(X =x) = EEP(X =x) =ExP(X =x) =E(X)


n=l n=l r=n z=l n=l z=l

b)

c) Var(X) = E(X 2 ) - [E(XW = 2E2- Et- Ei

21. Let Fr be negative binomial (r, p) and N be Poisson(J').

a) E( Fr) = r; --+ rq = I' in the limit.


b) Var(Fr) = "?--+ rq =I' in the limit.
c)

P(F.. =k) = (k+r-1)p"q"'


r- I
(k+r-1)X· .. Xr r k
= k! p q
.p" k
k! (rq)
/!.. l'k
= (1 - q) q k!

k
- e-,..!!_
k!
= P(N k) =

22. No Solution

23. No Solution

63
Section 3.4

24. a) Let N 1 = 1, and 1 let N;+! be the additional number of trials (boxes) required t ·Jbtain
an animal different from all previous. Then Tn = N1 + ... + Nn, and N; is a geometri mdom
variable on { 1, 2, ... } with parameter p; = =
(n-i + 1)/n (i 1, ... , n). The variables ]'I .. ,Nn
are independent, so
n n n

= !!_ (j = n- i + 1)
L...J "2 L...J 0

i=l J j=l J
1 1) l/'2
Hence Un = SD(Tn) = ( n 2 Lj=l r- n
n Lj=l 1
n

b) Note that <


-
n2 -!,- for all n. The series
l..JJ= 1 J LJ,= 1
-!,-
1
converges, so there exists a constant
c (0 < c < oo) such that 2:::': 1 ? < c2 for all n. Hence Un :5 nc for all n.
c) Chebychev's inequality states that Tn is unlikely to be more than a few standard deviations
away from its expected value. Here E(Tn) nlog n, and SD(Tn) :5 en.
d) As n-+ oo,
p (Tn-: log n :5 x) -+ e_,-z
See solution of 3.Rev.41 c) for the proof.

64
Section 3.5

Section 3.5

1. The distribution is approximately binomial (200, .01), which is approximately Poisson (2). So the
probability is approximately

1 - e -2 [ 1 + 2 + 222 + 623] = .1428.

2. Suppose cookies contain >. raisins on average. P(cookie contains at least one raisin) = 1- e-A, as-
suming a Poisson distribution for the number of raisins. We want >. so that 1 - e->. .99, or
>. -log(.Ol) 4.6. So an average of 5 raisins per cookie will do.
3. a) Use Poisson model. Let X be the number of raisins in a cookie. Then X has Poisson (4)
distribution. So P(X = 0) = e_.. Let N be the number of raisin-less cookies per bag. Then
N has binomial (12, e- 4 ) distribution, which is approximately Poisson {12 x e- 4 ). The long run
proportion of complaint bags is
P(N > 0) 1- e- 12"-• 12e_. 0.222.
b) Proceed as in a). Let a be the average required. Want
oX>
12e-"'T6 = 0.05.
Take logs to get a 44.

4. Assume the number of misprints on each page has the Poisson (1) distribution. Then

P(more than 5 misprints per page) = 1- •;!' = .0037.


Assume the number of misprints on any one page is independent of the number on the other pages.
Then in a book of 300 pages, the number of pages having more than 5 misprints has the binomial (300,
.0037) distribution, which can be approximated by the Poisson (300 x .0037) distribution. Therefore
P(at least one page contains more than 5 misprints)= 1- e-.OOJTx30o = 1- .33 = .67.

5. Assume that N, the number of microbes in the viewing field, is Poisson with parameter >.. Since the
average density in an area of 10-• square inches is 5000 x 10- 4 = 0.5 microbes, we have >. 0.5 and =
P(N 1) = 1- P(N = 0) = 1- e->. = 1- e- 0 · 5 = 0.39.

6. Assume the number of drops falling in a given square inch in 1 minute is a Poisson random variable
with mean 30. Then the number of drops falling in a given square inch in 10 seconds will be a Poisson
random variable with mean 5. So
P(O) !!- 5 = 0.00674 =
7. a) The number of fresh raisins per muffin has Poisson (3) distribution. The number of rotten raisins
per muffin has Poisson (2) distribution. The total number of raisins per muffin has Poisson (5)
distribution, assuming the number of fresh raisins per muffin is independent of the number of
rotten ones.
b) The number of raisins in 20% of a muffin has Poisson (1) distribution, so
0
P(no raisins)= e- 1 = e- 1 .3679.

8. Once again assume that the number of pulses received by the Geiger counter in a given one minute
period is a Poisson random variable with mean 10. Then the number received in a given half minute
period will be a Poisson random variable with mean 5.
e-553
P(3) = -3!- = 0.1404
9. a) P(X = 1, Y = 2} = P(X = I)P(Y = 2} = e- 1 = .09959.

65
Section 3.5

b) Note that X+ Y has Poisson (3) distribution. Therefore


P((X + Y)/2 2:: I)= P(X + Y > 2) = 1- P(X + Y I)= I- (e- 3 + e- 3 3) = .8008
c) P(X = IIX + Y = 4) = G) = .3951.
10. a) E(3X + 5) = 3 x ). + 5.
b) Var(3X + 5) = 9 X..\.
c) E I
[ I+X
]
= "'""
wr=O J+r
I
---rr--
1
= X1 "'""
wr=O (r+l)!
.1."'+
= XI [
-e -.I. + wk=O
"'"" .1.•]
-k-!-

11. a) X+ Y is Poisson with mean 2, so P(X +Y =· 4) = e- 2 2• /4!


b) Again, X+ Y is Poisson with mean 2, so

E [(X+ Y) 2 ) = Var(X + Y) + (E(X + Y)) 2 =I'+ 11 2 = 6


c) X+ Y +Z is Poisson with mean 3, so P(X +Y +Z = 4) = e- 3 34 /4!
12. The total number of particles reaching the counter is the sum of two independent Poisson random
variables, one with parameter 3.87, the other with parameter 5.41 (these numbers come from a famous
experiment by Rutherford, Chadwick, and Ellis, in the 1920's). So the total number of particles
reaching the counter follows the Poisson (9.28) distribution. So the required probability is

e- 9 •28 [I+ 9.28 + <


9
·;
812
+ <
9
·!813 + <
9
;:>•] = 0.04622.
13 • - 6.023><10n X x3 2 69 X 1019x3
a) I' ( X ) - 22.4><103 •

u(x) = 5.19 x 10 9 • x 312


b) u(x) = !frtt ::::::> x = 7.19 x 10-6 em.

14. a) Using the random scatter theorem, the number of tumors a single person gets in a week has
approximately Poisson (..\ = 10- 5 ) distribution. Since different people may be regarded as
independent, and different weeks on the same person are also independent, the total number of
tumors observed in the population over a year is a sum of 52 x 2000 independent Poisson(I0- 5 )
random variables, which is a Poisson(1.04} random variable.
b) The number of tumors observed on a given person in a year will be a Poisson random variable
with rate parameter ). =52 x 10- 5 • Thus
P(at least one tumor)= 1- P(none) = 1- e-0 .00052 = 0.00051986
Thus the number of people getting 1 or more tumors has distribution
binomial (2000, 0.00051986) (2000 x 0.00051986) (1.04).

0 2 3 4 5 6 7 . 8

The histograms are essentially the same: Poisson with parameter 1.04. The means and standard
deviations are 1.04 and ..JiJ).i 1.02 respectively.
Why are the answers to a) and b) nearly the same when they are counting different things?
The reason is that the chance of a person getting 2 or more tumors is insignificant, even when
compared with the chance of getting 1 tumor.
66
Section 3.5

15. (a.) The chance that a given pages has no mistakes is = e-.ot = .99005 and thus the
expected number of pages with no mistakes is 200 x .99005 = 198.01. The number of pages with
mistakes is distributed as a binomial(200, .99005), so the variance is 200 x .99005 x .00995 =
1.97.
(b) The mistakes found on a given page is distributed as a Poisson(.009), so the chance that a.t least
• . • . e- · 000 (.009) 0 - 009
one mtstake wtll be found on a gtven pa.ge lS 1 - o! = e · = .00896. The expected
number of pages on which at least one mistake is found is then 200 X .00896 = 1.79.
(c) The number of pages with mistakes can be well approximated as a Poisson(l.99) since 1.99 is
the expected number of pages with mistakes. Let X = number of pages with mistakes and use
the Poisson approximation to get

P(X 2} =1- (P(X = 0) + P(X = 1)} = 1 - (e-1.99 + 1.99e-1 ·99 ) = 0.59


16. a) Assume that chocolate chips are distributed in cookies according to a Poisson scatter. Let X be
the number of chocolate chips in a three cubic inch cookie. Then X has Poisson (6) distribution,
so
e- 6 6•
p (X $ 4 ) = l...Ji=O -i!- = .2851.
b) Let z,, Z2, and Z3 denote the total number of goodies (either chocolate chips or marshmal-
lows) in cookies 1, 2, and 3 respectively. Assume that marshmallows are distributed in cookies
according to a Poisson scatter. By the independence assumption between marshmallows and
chocolate chips, Zt has Poisson (6) distribution, Z2 and Z3 each have Poisson (9) distribution,
and the Zi'S are independent. We have
P(Zt = 0) = e-a, P(Z2 = 0) = P(Z3 = 0) = e- 9

The complement of the desired event has probability


P(Zt = Z2 = Z3 = 0) + P(Zt > 0, Z2 = Z3 = 0) + P(Z2 > o, Zt = Z3 = 0) + P(Z3 > 0, Zt =
z2 = o)
= e--a(e- 9 ) 2 + (1 - e--a)(e- 9 ) 2 + 2(1- e-9 )e-6 e- 9 = 6.27 x 10- 7
and the desired event has probability virtually 1.
17. Assuming the distribution of raindrops over a particular square inch during a given ten second period
is a Poisson random scatter, then the number of drops hitting this square inch during the ten second
period has Poisson(..X = 5) distribution.

a) So P(no hit)= = 0.006738 .


b) Argue that if N 1 denotes the number of big drops and N2 the number of small, then Nt and
i
N2 are independent and Nt has Poisson( x 5) distribution, N2 has Poisson( x 5) distribution. t
Hence

P(Nt = 4, N2 = 5) = P(N1 = 4)P(N2 = 5) = e- 1013 ( 1o2)• e- 513 = 0.003714.

18. a.) Let Sn denote the number of survivors between time n and n + 1, and let In denote the number
of immigrants between time n and n + 1. Then Xn+t = Sn +In where In has Poisson (ll)
distribution, independent of Sn, and

P(Sn = kiXn = x) = (;)(1- p)"pz-1:,0 $ k $I.

Claim: Xn has Poisson (ll 0


L:;=
(1 - p)") distribution, n = 0,1, 2 ....
Proof. Induction. True for n = 0. If the claim holds for n = m, where m 0, then, putting
A= Jl 0
L:;;'=
(1- p)", we have for all k 0

P(S.,. = k) = L P(S.,. = kiXm = I)P(Xm = r)


r=O

-_ Lo:> (I) ( )"


k
1-p pz-1: e -)."
I!
r=k

67
Section 3.5

_ e-.\(.\(1- p)]" (.\p)"'-" _ -.\( 1 -p) (.\(1- p)]"


- k! L....t (x- k)! - e k!
:r=k

so Sm has Poisson distribution with parameter .\(1- p) = 14 ):;;'(1- p)"+l. Since I ; inde-
pendent of Sm and has Poisson (14) distribution, it follows that Xm+1 = Sm + Im h 'oisson
1
distribution with parameter 1.1 + #JI::=o(1- p)"+ = #JI::=+o (1- p)". So clai:
1
,lds for
n=m+l.
b) As n - oo, p)"- .I!,
p
so the distribution tends to the Poisson (.1!)
p
.ion.

19. a) G(z) = I::o p;zi = I::o e-"'1fzi = e-"'e,.z = e-1!( 1 -•)


b) G'(z) = lle_,..+,..•, G"(z) = IJ 2 e-,..+,..•, G"'(z) = 1J3 e-,..+,..•.
So E(X) = #J, E(X(X- 1)) = 1.1 2 , E(X(X- l)(X- 2)) = 1J 3
c) E(X2) = ll'l + #J, E(XJ} = 1'3 + 3(1J2 + IJ)- 21' = #J3 + 31!2 + ll
d) E[(X- 1!) ] = E(X ) - 3E(X )1! + 3E(X)tl
3 3 2 2
- 1! 3
= ll·
Skewness (X) = ,..f1, = .},;

20. No Solution

21. c) 0.58304 d) 0.5628 e) 0.58306

68
Section 3.6

Section 3.6

1. a.) 1/13
b) 4/50

c) 4 x Mt
d) 1 - -
2. a.) 1/13 b) 1/4 c) {13 X 12 X 11 X 10 X 9)/(52 X 51 X 50 X 49 X 48} d) (48 X 47 X 46 X 45 X
4}/(52 X 51 X 50 X 49 X 48)

3. a.) 8/47 b) (12 X 11 X 10 X 9 X 8}/(51 X 50 X 49 X 48 X 47) c) 1/4 d) 1/13 e) 1/13 f) 1/4

4. a.)
P(T1n = n) 10.4
1121314
0.3 0.2 0.1

b) 11121314
P(6- nT, = n) 0.4 0.3 0.2 0.1

c)
n
P(T, = n) I I o\ I
d) Tt
1 2 3 4
2 0.1 0 0 0
3 0.1 0.1 0 0
4 0.1 0.1 0.1 0
5 0.1 0.1 0.1 0.1
e) Tt =I+ number of non-defectives before first defective, T,- Tt = I+ number of non-defectives
between first a.nd second defective, a.nd 6 - T, = 1+ number of non-defectives after second
defective. P(Tt = n1, T2 - Tt = n2, 6- T2 = n3) = .1 where it is not zero, which is a symmetric
function, so the random va.ria.bles a.re excha.ngea.ble.
f) By e), T2 - Tt has the sa.me distribution as T1, so see a.).

5. P(one fixed box empty)= )"so EX= b (bbt )"

So

Var(X) = EX 2 - (EX) 2

G. a) Consider M = 2:::': 1
/, where /, is the indica. tor of the ith ball being in the ith box. Thus
E(M) nE(Ji) = = 1.
b)

E(M
2
) =E ( I.r)
= E ( t 11) + 2£ ( l;/1 )

69
Section 3.6

=1 +2 (n) _!__I-
2 n n -1
=2

Thus SD(M) = v'2'"=!" = 1


c) For large n, the distribution of M is approximately Poisson(!). Intuitively, the distribution is
very much like a binomial( n, except for the dependence between the draws, but as the number
of draws gets large the dependence between draws becomes small, and the Poisson(!) becomes
a good approximation.

7. a) n •
b) (ll=.!!.)
52-1
.n . 26 •
52 52

8. a) k =0,1,2, ... ,26.

b) E(X) = 26 · t = 13.
c) SD(X) = .jiiijf J26 x t x t = 1.82.
d) P(X:::: 15) = P(X:::: 14.5) 1- = .2061.
The normal approximation gives a fairly good answer. You can check that

P(X = 13) = .21812, P(X = 14) = .18807;

therefore the exact answer is, by symmetry,


I- (.21812 +2 X .18807) = _
2029
.
2
9. Let b1 , b2, ••• , bs denote the B bad elements in the population, and define

if bad element b; appears before the first good element


I;= { otherwise.

Then X- 1 =#of bad elements before the first good element= It + I2 +···+Is.
a) E(X- I)= E(It) + E(I2) + · · · + E(IB)· By symmetry, all the expectations on the right hand
side are equal, and equal to P(b 1 appears before first good element). This probability equals
1/(G + 1):
Let g1,92.····9G denote the G good elements. Consider the G + 1 elements 91, 92, ... , ga,
bt. We are interested in the position of bt relative to the g's. We can choose this position
in G + 1 ways, all equally likely. Exactly one of these choices puts b1 before all the g's. So
P(b1 appears before first good element) = 1/(G + 1).
Conclude:
E(X) = E(X- 1) + 1 = G! 1 + 1 = B 7= I

b) We have Var(X) = Var(X -1) = E [(X -1) 2 ] - [E(X- IW. Now

E [ex- 1)
2
] =E + h + ... + IB) 2 ]

by symmetry and the fact that /; 2 = !,; and


1
E(/ 1 / 2) = P(b1 and 62 both appear before the first red card) = (Gil).

70
Section 3.6

(Reasoning is similar to above: Consider the positions of bt and b2 relative to the g's. There
are G + 2 positions to fill, and we can choose the pair of positions for b1 and b2 in (G; 2) ways.
Only the pair (1, 2) gives the event we want.)
So
2
B 1 ( B )
Var(X)= G+ 1 +B(B-l)(a; 2 ) - G+ 1

_ __!!___ [ 1 2(B- 1) _ _!!____]


- G+I + 0+2 G+1
= .....!!__ 2
[G + 3G + 2 +. 2BG + 2B - 2G - 2 - BG - 2B]
G+1 {G+2)(G+l)
2
B (G +G+BG) BG(N+l)
= G+1 (G+2)(G+1) = (G+1)2(G+2)

and
BG(N + 1)
SD(X) =
10. No Solution

11. a) P(xt, ... , Xn) = 1/(;) if Xt + · · · + Xn = g and 0 otherwise


b) no c) yes

12. a) There are possible ways to draw n numbers from the set {1· · · N}. Thus the process of
taking a simple unordered random sample -from this set gives the chance of any given subset of
n numbers as If we consider the process of the exhaustive sample, where the X; 's are
the draws in order, the chance of getting any particular ordering is, for example,
N-n n N-n-1
P(Xt = B ' X2 = G ' X3 = B · · · XN
' '
= B) = -N
-- -
N-1 N-2
· · ·-11
where the ordering of the numerators may be different but the product must be equal to
which is the same as it was for the sample of size n from {I··· N}.
b) This is essentially done in the previous problem;

P(Tt = t), ... , Tn = tn) = (N- n)(N- n- 1) · · · (N- n- t1


N!
+ 2)(n) · ··
_ n!{N- n)!
- N!

c) The object here is to count the number of samples have i - 1 elements less than t, t, and n - i
elements bigger than t. This number is (:::) and the total number of samples is
( t-t),(N-t)
the chance is .

d) Given D, writing out the probability of any given sequence of U<tl• Uc 2 l, ... , U(nl gives the same
formula as that in part b). P(D) = NN! NN 2 • •• N-;;+t so P(D) > ( NNn)"- 1 as N- ex> for
n fixed.

13. a) uniform over ordered (n +I)-tuples of non-negative integers that sum toN- n.
(N_:::•) n

C
) Yl' N-w

d) E(W,) = (N- E(T,) =


i((N- +I). For N =52 and n = 4,
E(W,) = 9.6, E(Tt) = 10.6, E{T2) = 21.2, E{T3) =
31.8, E{Tt) = 42.4
e) P(Wt + W2 = t) = P(T2 = t + 2) = · 0 ::::; t < N (for t = N, prob is 0 except
in the trivial case n = 0 when prob. is I).•+•
71
Section 3.6

f) Because the W; are exchanga.ble,


Dn has the same distribution as N - (Wt + W2)- 2.
So P(Dn =d)= P(Wt + Wn+l = N- 2- d). Now use e).
E(Dn) =
E(Tn)- E(Tt) -1 (n- = + 1)- 1
14. No Solution

15. a) W2 through W,. are spacing between two aces (Wt and Wn+t are not). To get two consecutive
aces, at least one of Wt, W2, •.. , Wn must be 0. The expression in the problem = 1 - P(not all
w; = 0, i = 2, ... , n).

b) Put down N places in a row, and color t of them blue, the rest red. { Nn-') is the number of ways
to place the good elements avoiding all the blue places. There is a 1-1 correspondence between
such choices and ways to make (W; t; for each i).

c) h = 0 = tn+t, t2 = t3 = ... = tn = 1. So t = n- 1.

16. No Solution

72
Chapter 3: Review

Chapter 3: Review

1. a) 1- (5/6) 10 b) 10/6 c) 35 d) =

e) t (1 - P(same number of sixes in first five rolls as in second five rolls))

= {1_ [{!)(1/6)"(5/6) -"f} 5

2. a)
P(first 6 before tenth roll)= P(at least one 6 in first 9 rolls)= 1- {5/6) 9 = .806194
b)
P(third 6 on tenth roll)= P(two sixes on first nine rolls, 6 on tenth roll)

= G) 2 7
(1/6) (5/6) (1/6) = .046507

c)
P(three 6's in first ten rolls I six 6's in first twenty rolls)
= P(three 6's in first ten rolls, three sixes in last ten rolls)/ P(six 6's in first twenty rolls)

c3o) (1/6)3(5/6)7 (130) (1/6)3 (5/6) 7 c3o) 2


____
e6o){1/6)6(5/6)H en
= - - = .371517.

d) Want the expectation of the sum of six geometric (1/6) random variables, each of which has
expectation 6. So answer: 36.
e) Coupon collector's problem: the required number of rolls is 1 plus a geometric (5/6) plus a
geometric (4/6) plus etc. up to a geometric (1/6), so the expectation is

1 + (6/5) + (6/4) + (6/3) + (6/2) + (6/1) = 14.7.

2
x)2
P(X=x)=P(X$x)-P(X$x-1)= ( 6 - ( -x -6 - I ) 2x - 1 ... ,6)

2
36 for y = 1, 2

P(Y = y, X= 3) = fG for y =3
{
• 0 else
2
for y = 1, 2

So P(Y = y I X= 3) = t for y = 3
{
0 else
2
36 for 1 $ y < x $ 6

P(X = X I y = y) = 316 for 1 $ y $ x $ 6


{
0 else
E(X + Y) = E(D, + D2) = 7

4. Use the fact that S has the same distribution as 200 - S.

73
Chapter 3: Review

a) Use the convolution formula:

P(S = n) = P(X + Y = n) = L PI X= x)P(Y = n- :r:).


all:r

If 0 $ n $ 100, then

1 1 n+
P(S = n) = P(X = r)P(Y = n- r) = 101 X 101 = 1r·
z=O z=O

If 100 < n $ 200, then 0 $ 200- n $ 100 so by the previous case


201- n
P(S = n) = P(200- S = 200- n) = ·
1012

b) P(S $ n) = P(S = k). If 0 $ n $ 100 then


n n+l
P(S<n)= = = (n+1)(n+2).
- 101 2 101 2 2. 101 2
k=O J=l

The above formula works for -2 $ n $ 100. If 100 $ n $ 200 then -1 $ 199- n $ 100 so by
the previous case

P(S $ n) = 1- P(S;?: n + 1) = 1 - P(200- S $ 199- n)

= 1- P(S $ 199- n)
= _ (199- n + 1){199- n + 2)
1
2. 10!2
(200- n)(201- n)
= 1- 2·101 2 .
5. a) Let G = gain on one spin
6 6
. . .!_
38 6 38 6
i=l i=l

= ---
1
38 · 6
L.21 = -386x7
6

- - = -0.1842
X 6

*= =
i=l

b) 2.111.
c) 12.666
6 ••

G. a) Let Y be the number of times the gambler wins in 50 plays. Then Y has binomial {50, 9/19)
distribution, and
?(ahead after 50 plays)= P(Y > 25) = cn(9/19)'{10/19) 50 -·.
Note that the gambler's capital, in dollars, after 50 plays is 100 + lOY+ ( -10)(50 - Y)
220Y- 400. So
50
P(not in debt after 50 plays) =P(20Y-400 > 0) = P(Y > 20) = '19) -'.

b) E(capital) = E(20Y- 400) = 20E(Y)- 400 = 20(50)(9/19)- 400 = 1400/19 ;


Var(capital) = Var(20Y- 400) = 400Var(Y) = = 1800000/lS
c) Use the normal approximation to the binomial: E(Y) = 23.7, and SD(Y) = 3.53, so
P(Y > 25) <I> = .305.
= I - 4>(.51)

P(Y > 20) <I> =I- 4>{-.91) = .8186.

7. a) P(;?: 4 heads in 5 losses)= f2 = 0.1875

74
Chapter 3: Review

b) P(O, I, or 2 heads in 5 0.5

c) poss. vals
P robs
I I I I
0
26
32
5I
32
21
32 EX = 0.21875
8. a) Let X = D1 + · · ·+Db, where D1 is the number of balls until the first black ball, D2 is the number
of balls drawn after the first black until the second black ball, and so on. When drawing with
replacement, the D; are independent random variables, so X has the negative
binomial( b. distribution on b, b + 1, ....

P(X = k) = (k-
b-1
1) (-b )" b+w
(k ?: b)

b) When drawing without replacement, it is possible to draw the bth black ball on any draw from
b to b + w.
P(X = k) = P(b-1 blacks in first k-I draws)
x P(kth draw is black I b-I blacks in first k-1 draws)

= X.,.-----:--
( b+w) b+ W- k + I
k-1
bw!(k- 1)!
(b$k$b+w)
(k- b)!(b + w)!
9. Let X, Y be the numbers rolled from the two doubling cubes, and let U, V be the numbers rolled from
two ordinary dice. Then (log 2 X, log 2 Y) has the same distribution as (U, V) .

a) P(XY < IOO) = P(log 2 XY < log 2 100) = P(U + V < 6.64) = t [I- P(U + V = 7)] = 5/12.
b) P(XY < 200) = P(U + V < 7.64) = 5/12 + 1/6 = 7/12.
c) E(X) = 2I, so by independence E(XY) = E(X)E(Y) = 441.
d) E(X 2 ) = 9IO, so Var(XY) = E((XY) 2 ] - (E(XY)Jl = 6336I9 and SD(XY}:::::: 796.

10. x (1- ifi:f:j.


b) number of matches= L:;=l !(match occurs at place j) £(number of matches)= n x = l.

11. Assume X has Poisson(2) distribution.


P(X < 2) = .406, P(X = 2) = .271, P(X > 2) = .323.
12. Following the hint,
P1 = p•-l + (1 + p + ... + p'- 2 )qPo = p'- 1 + (1- p'- 1)Po
Po= (1 +q + ... +q1 - 2 )pPI ={I- q1 - }PI,
1

hence
p - •-1 +(I •-1}{1 J-1)P - p'-I p•-I
1-P -p -q 1- 1-(1-p•-1)(1-qf-1) = p•-1 +qf-1-p• lqf I

and

and finally
p•-1(1- qf)
P(A) = pP1 + qPo = p , 1 + q-
1 1 -p , 1 q1 1

13. Note that E(X 2 ) = VarX + [E(X)F = q


2
+1, 2 , same for E(Y) 2 • By independence we have
2 2
E[(XY) 2] = E(X 2 )E(Y 2 ) = (q 2 + 1' )
E(XY) = (E(X}][E(Y}] = ,?
and so
Var(XY) = E{(XY)2]- [E(XY)f = (q2 + 1'2)2- (1'2)2 = q2(a2 + 21'2).
75
Chapter 3: Review

14. a) This is 1 minus the chance that current flows along none of the lines. The chance that current
does not flow on any particular line is the chance that at least one of the switches on that line
doesn't work. So answer:

1- {1- pl)(1- p!).

b) Let X be the number of working switches. Then X is the sum of 10 indicators, corre' :mding
to whether each switch works or doesn't work. All the indicators are independent.

E(X) =PI+ 21'2 + 3pJ + 4p,,


Var(X) = p,q, + 2p2q2 + 3p3q3 + 4p,q,,
and SD(X) is the square root of this.
Alternatively, the number of switches working in line i has the binomial distribution with parameters
i and p;, independently of all other lines. X is the sum of these ten binomials. Formulae for E(X)
and V ar(X) can be read off the binomial formulae.

15. a) Binomial (100, I/38). b) Poisson (100/38)


c) Negative binomial (3, 1/38) shifted to {3,4, ... }. d) 3 x 38

The distribution of D on {1, 2, ... , 9} is symmetric about 5, so E(DID > 0) = 5. Therefore


E(D) = E(DID = O)P(D = 0) + E(DID > O)P(D > 0) = 0+ 5 X (73/100) = 3.65.

17. P(one six 1 all different) = P(one six and all different).
P(aJl different)

a) Numerator= denominator= (t)if*, so the required probability is N/6.


b) The numerator is

5
P(one six and all different IN= n)P(N = n) = ( n -1 ) n!! = 1/6;
6" 6
n=l n=l

t
the denominator is

P(all different IN= n)P(N = n) = t (!) = .46245,

so the required probability is .3604.

18. a) The return (in cents) from the game is

X= I, + h + · · · + ltoo,
where
if the number on deal i is greater than
[; = { those of all previous cards dealt
otherwise
That is, l; is the indicator of the event that a record value occurs at deal i (a record is co:>sidered
to have occured at the first deal), and X simply counts the number of records seen. A counting
argument shows that P( l; = 1) = t, so
100
l 1
E(X)= L
i=l
..,.:::::loglOO+-r+
I 2 X 100
=5.19,

where 1' = .57721 (Euler's constant). [This approximation was used in the Collector's Problem
of Example 3.4.5]. So 5.19 cents is the fair price to pay in advance.
76
Chapter 3: Review

b) The gain (in cents) from 25 plays is

S = X1 + Xz + · · · + X2s.
where the X;'s a.re independent copies of X. The I;'s are independent so that
100 100 2

Var(X) = LVar(I;) = L 7(t-7):::::: 5.19- = 3.54 ==:> SD(X) = 1.88.


i=l i=l

Using the normal approximation, obtain P(S > 25 x 10) :::::: 1 - 4\(12.8) :::::: 0.

19.

Y: number of failures before first success, P(Y;::: y) = q 11


X : Poisson (ll) independent of Y

P(Y;::: X)=
00 "
= e-" L 00 ( )"

k=O k=O
e-"e"" = e-,.(t-q) = 0.6065 if p = 1/2, p, = 1
20. a) Let p = 1/2 + x (where x could be negative). So 1 - p = 1/2- x, and
p(1- p) = (1/2 + x)(1/2- x) = 1/4- x 2
$ 1/4
2
since x ;::: 0.
b) The margin of error in the estimate is Jp(1 - p)fn, where n is the sample size, and p is the
proportion of part time employed students. This assumes sampling without replacement. For
sampling with replacement the margin of error would be smaller, due to the correction factor.
No matter what pis,

by part a), so we should take the smallest n so that $ .05, i.e., n = 100.
21. X has negative binomial {1, p) distribution, Y has negative binomial (2, p) distribution, and X and
Y are independent. Now suppose a coin having probability p of landing heads is tossed repeatedly
and independently. Let X' denote the number of tails observed until the first head is observed; let Y'
denote the additional number of failures until the third head. Then (X, Y) has the same distribution
as does (X', Y'). So Z has the same distribution as X'+ Y' =
the number of tails seen until the third
head: negative binomial distribution on { 0,1, •.. } with parameters r = 3 and p.

22. Suppose the daily demand X has Poisson distribution with parameter..\ > 0. (Here..\= 100.) Suppose
the newsboy buys n (constant) papers per day, n;::: 1. Then min(X, n) papers are sold in a day; the
daily profit 1rn in dollars is
1
lrn = min(X, n)- n;
10
and the long run average profit per day is

E[lrn] = !_E[min(X,n)J-
4
_!_n.
10
Claim: For each n = 1, 2, 3, ... :

(l)E(min(X, n)J = >.P(X $ n- I)+ nP(X n + 1)

(2)E[min(X, n)] =L P(X 2: k ).


k=!
Part (2) holds no matter what distribution X has.

77
Chapter 3: Review

Proof: (1)
00

E[min(X, n)] =L min(k, n)P(X = k)


k=O
n oo
=L kP(X = k) + L nP(X = k)
1:=0
n k

= + nP(X n + 1)
1::1

= >. :L>-.\ (k _ 1)! + nP(X


" ,xk-1
n + 1)
1:=1

= >.P(X ::5 n- I)+ nP(X n +I).

(2) Use the tail sum formula for expectation:


n n

E[min(X, n)] = L P(min(X, n) k) = L P(X k).


k=1 k=l

a) Say >. = 100 and the newsboy buys n = 100 papers per day. Then his long run average daily
profit (in dollars) is
E[lrJoo] = .!_4 E[min(X, 100)]- _.!._
10
• 100

= [100P(X ::5 99) + IOOP(X 101)]- 10 by Claim (1)


= 25(1 - P(X = 100)] - 10
= 15- 25P(X = 100) :::::: 14
since
100
P(X = 100) = e- 100 ( 100) -
1oo! -
1
.J21r. 1oo
:::::: 0.04 by Stirling's approximation.

b) If>. = 100 and the newsboy buys n papers per day, then his long run average profit per day is
1
E[7rn] = n
10

= t k=l
P(X k) -
1
10
n by Claim (2)

= t•(P(X k)-
k=1

Since the function k --> P(X k) - is decreasing, is positive at k = 1, and is negative as


/;; __, oo, it follows that E(lrn] is maximized at n* = the largest k such that P(X k)- 0,
i.e., n* = 102. Thus the newsboy should buy 102 papers per day in order to maximize his daily
profit - assuming that demand has Poisson{lOO) distribution.

23. a) The drawing process can be described in another way as follows: think of the box as initially
containing 2n half toothpicks in n pairs. Then half toothpicks are simply being drawn at
random without replacement. The problem is to find the distribution of H, the number of halves
remaining after the last pair is broken. By the symmetry of sampling without replacement, H
has the same distribution as H', where H' is the number of draws preceding (i.e., not including)
the first time that the remaining half of some toothpick is drawn. This may be clearer by an
analogy. Think of 2n cards in a deck, 2 of each of n colors, and imagine dealing cards one by
one off the top of the deck. Then H corresponds to the number of cards remaining in the deck
after each color has been seen at least once. If the cards had been dealt from the bottom of the
deck, this number would have been the number (corresponding to H') of cards dealt preceding

78
Chapter 3: Review

the first card having a color already seen. Since the order in which we deal the cards makes no
difference, it follows that H and H' have the same distribution. Thus for 1 $ k $ n

P(H = k) = P(H' = k)
= P(first k colors are different and (k + 1)st color has been seen before)
2n 2n - 2 2n - 2( k - 1) k
- 2n · 2n - 1 ··· 2n - ( k - 1) · 2n - k

(2n)k+1 · =
Another way to obtain P(H = k) is through P(H k): For each k = 1, 2, ... ,n + 1 we have

P(H k) = P(H' k) = P(first k colors are all different)


2n 2n- 2 2n- 2(k- 1) z"(n)k
= 2n ·2n-1 ··· 2n-(k-1) = ·
b) As n - oo,
1-1. t-1.
P(H > k) = 1· _ _n_. _ _n_. •.• • n
- 1- 1- 22n 1- "2-;_.1

1
:::::: exp [( - -
n
+-1 ) + ( --
2n
2 +-
n 2n
(k
2) + ... + ( ----
n
(k - -
+- 2n
-1) -1))] ,
2
1 + 2 + ... + (k - 1)) ( k )
= exp ( - 2n :::::: exp - 4n .

Thus, putting k = r.../2n,


P( r) = P(H rv'2n) : : : exp( -r 2
/2).

This limiting distribution of H j $ is called the Rayleigh distribution (see Section 5.3).
c) This approximation suggests

E(H) = 800
P(H k)....., 8
00
exp
(
-
k2)
n = v'2n8
00

....., V2n_ 1 00

dx = v'2nv;J2
That is to say
E(H) ..;rn as n-+ oo.
d) Expect about J100 · 1r:::::: J3i4:::::: 17 or so.
24. a} If P(X = 3, Y = 2, Z = 1) = 1/3,
P(X = 2, Y = 1, Z = 3) = 1/3,
P(X = 1, Y = 3, Z = 2) = 1/3,
then P(X > Y) = P(Y > Z) = P(Z >X) = 2/3.
b) Let p = min{P(X > Y), P(Y > Z), P(Z >X)}. Then
p $ [P(X > Y) + P(Y > Z) + P(Z > X)]
= > Y) + I(Y > Z) + I(Z >X)]
::5 2/3, since /(X > Y) + l(Y > Z) + l(Z >X) ::52.
c) Say each voter assigns each candidate a numerical score 1, 2, or 3, 3 for the most preferred
candidate, 1 for the least preferred. Pick a voter at random, and let X be the score assigned to
A, Y the score of B, Z the score of C for this voter. Then P(X > Y) ="Proportion of voters
who prefer A to B" , etc. So a population of 3 voters with preferences as in a) above would
make each of these proportions 2/3.

79
Chapter 3: Review

d) Let
P(X1 = n, X2 = n- 1, XJ = n- 2, ••. , Xn = 1) = 1/n,
P(X1 = n- 1, }_ -= n- 2, XJ = n- 3, ... , Xn = n) = lfn,

X3 = n- 1, ... ,Xn = 2) = Ifn.


Then all the p bilities are ( n - 1 )/n, and the minimum probability p is bounded above by
(n-I)fn.
e) We have P(X > Y) = I -(I - pt)p2
P(Y > Z) = P2
P(Z >X)= 1- PI·
These will all equal q if I -PI = q, P2 = q, and 1 - (1- PdP2 = q, that is, if
1- q2 = q,

the equation of the golden mean. (L. Dubins)


z 4
25.
a.) P(}'i + Y2 = z) 1/36
b) 10/3
0 if X = 1, 2, 3

c) Onepossibility: f(x)= 1 if x=4,5


{
2 if x=6

26. a) We wish to find n such that (.99)" = .5, so n = 69.


b) This is essentially a. negative binomial with the roles of success and failure reversed. Let Ft
be the number of failures (successful honks) before the 4th success /failure to honk). Then
E(Ft) = to_;;Ol) = 396, but for our problem we must also add 3 non-honks to get a.n answer of
399.
27.
:; <X>

E(X) =L 100xq-"- 1 p + I)5oo + 40x)q5+"'- 1p


z=l :r=l

00 00

s sl
= 100(p+2qp+3q2 p+4q3 p+5q
• t
p)+{500q )+(40q -)
p

= 186.43 + 156.62
= 343.047

28. a) If w 1 and y E A then


P(W1 = w, Y1 = y) = P(X1 ... ,Xw-l rt A, X..,= y)
= P(X1 P(Xw-1 rt A)P(X.., = y)
=(I- P1(A))'"- 1 P1(y)
Sum over w to get P(Y1 = y) = PI(y)f P1(A) and sum over y to get P(W1 = w) = (1 1
- P1 (A))"'- P1(A:
Continue in this wa.y to show that for all k 1
P(W1 = w1, Y, = y1, ... , = = = P(W1 = wt)P(}'i = yt) · · · = wk)P(Yk =
(Idea: Express the event on the left in terms of the independent variables X 1 , X 2 , ••.•

80
Chapter 3: Review

b) By the weak law of large numbers: As n - oo,

I(X; E AB) = l(X; E AB)fn _ P1(AB) = p 1 (BIA).


L:i=l l(X; E A) Li= 1l(X; E A)/n P1(A)
29. c) unifo.rm on {0,1, ... ,n} d) no, yes e)
30. No Solution

31. No Solution

32. No Solution

33. a) If n ;::>: k ;::>: 1:


. b I ·c1c all L ed) _ P(pick all k redlall n red in ba.g)P(all n red in bag)
P(all n red m ag PI " r - P(pick all k red)

1 . (1/2)"
= L:7=o P(pick all k redlj red in bag)P{j red in bag)
- (1/2)"
- L:;=O 5 (j) (1/2)"

= 2"E(X")'
where X has binomial ( n, 1/2) distribution.
b) If k = 1: then E(X) = n/2, so

P(all n red in baglpick red) =

If k = 2: then E(X 2 ) = Var(X) + (E(X)] 2 = +n


P(all n red in baglpick both red)= / ).
2n-2 1 +

c) If the k balls are drawn without replacement from the bag, follow the argument in a), but replace
J.: .h
n" Wit
1l.1&.
(n)a:

. . (1/2)" (n)k
P(all n red m baglptck all k red)= f*()
1 = E(X)

L.-J=O (n " nJ {1/2)"
'\"'" 2" k

where again X has binomial (n, 1/2) distribution. On the other hand, if the k balls are drawn
without replacement from the bag, then the number of red balls seen among the k has binomial
(k, 1/2) distribution: it's as if we drew the k balls directly from the very large collection. So

P(all d ·10 b I· k all k ed) = P(pick all k redlall n red in bag)P(all n red in bag) = 1. (1/2)"
n re ag piC r P(pick all k red) (1/2)" ·

d) If k = 3: Since
(X)J = X(X- I)(X- 2) = X
3
- 3X 2 + 2X,
compute
112 112 11
E(X 3 ) = E(X)J + 3E(X 2 ) - 2E(X) = (n)J +3 + )-- 2 = ( + 3)
8 4 4 2 8

and the result in a) simplifies to

P(all n red in baglpick all 3 red) = 2n-3 / I+-;;-3 ).


81
Chapter 3: Review

e) Let p be the proportion of red balls in the original collection of balls. Repeat the argument in
c) to get

P(all n red in baglpick all k red)= n f*(:)n . 1 _ n-i =


Lj=O n)k j pJ( p)
(where X has binomial (n, p) distribution) and

P(all n red in baglpick all k red) = p:.


p

Therefore E(X)k = (n)kpk.


Check: This formula. gives E(X) = E(Xh = (n)ip = np and E[X(X -1)] = E(Xh = (n)lp 2 =
n(n- 1)p2 from which it follows Var(X) = E[X(X- 1)] + E(X)- (E(X)J2 = np(l- p).

34. No Solution

35. No Solution

36. a) The kth binomial moment bk is just given by

b)

1l=b1 = =np

u
2
= + b1 - = 2(; )p
2
+ np- (np) 2 = n(n- 1)p
2
+ np- (np) 2 2
= np- np = np(1- p)

37. a.)

38. a) Using the kth binomial moment results, we observe that the kth factorial moment is just k!
times the kth binomial moment. Let A; be the event that the ith letter gets the correct address.
Then Mn = A; and the kth factorial moment is just

1 1 1
= k! ;;x-n---1 X···Xn-k+l
'I <l:r<···-s:i.
1
= k! (n) .!_ X - - X · · · X l
k n n-1 n-k+l
=1

b) Let X be Poisson(!), then the kth binomial moment is

/k(X) = E((X)(X -l)···(X- k+ 1})


= L)> X (i- I) X ... X ( i - k + 1)P(X = i)
-1
= 2) i) X (i - 1) X ••• X (j - k + 1) I.
·=k
1
=e L....(i-k)!
t:;;k

=1
82
Chapter 3: Review

c) Use the expression of Exercise 3.4.22 for ordinary moments in terms of factorial moments.
As n -+ oo, the kth moment of Mn are equals the kth moment of X for all k $ n, so the
convergence is immediate.
d) Using the Sieve formula,

P(Mn = k) = t (0
J=k
(-l)k-ibi(Mn)

= t
J=k
(0{-l)k-j
.

( 1)k-j 1
= LJ - (k!)(j- k)!
J=l:
n-1:
= _!_ " ( -1 )j _!_
k! LJ j!
]=0

and as n -+ oo, this goes to <;!• as desired.

39. No Solution

40. The following identities hold:

PI + P2 + •· · + Pn 1
X1P1 + X2P2 + · · · + XnPn P.l

PI + + · · · + x!pn = l-'2

This is the same as p. = pM, where M is the n x n matrix

Zl
Z2
X3
xi
x2
2
2
XI
n-1
x2
n-1
__,)
XJ x3
M= ( 1
2 n-1
Xn Xn Xn

To see that M must have an inverse, it suffices to show that A1 has rank n; equivalently, that the
columns of Af are linearly independent.
Proof: Let co, c1, ... , Cn-1 be real constants such that

0 Co+ C1X1 + C2Xi + ••· +


0 CO+ C1X2 + + · · · + Cn-!X;-l

Then the polynomial f defined by f(x) = co+ctx+c2x 2 +···+cn-1X"- 1 has n distinct roots, namely
Xt, x2, ••• , Xn. But f has degree at most n - 1. Therefore f must be identically zero. Conclude: all
the c's must be zero.

83
Chapter 3: Review

41. Rephrase this problem in terms of the collector's problem: Suppose there are M objects in a complete
collection, and suppose a trial consists of selecting an object at random with replacement from the set
of M objects. So on each trial, the object selected is equally likely to be any one of the M possible,
independently of what the other trials yielded. Now find n so that P(T $ n) = 0.9, where Tis the
number of trials required to see

a) an object specified in advance


b) at least one of the possible objects twice
c) every possible object at least once
d) at least once every object in a specified set comprising half of all possible objects
e) half of all objects.

In the present context, M = 1024, since each repetition of the ten toss experiment results in one of
2 10 possible sequences.

a) T has geometric distribution on {1, 2, ... } with parameter 1/M, so

P(T>n)= (1- n=1,2,3, ....

Therefore
P(T $ n) = 0.9 <===> ( 1 - r = 0.1 <===> n =
log
log 0.1
(1- it)"
For example, if M = 1024, then require n :::: 2350.
b) Let n be a positive integer. Observe that

(T > n) =(the first n trials yielded all different objects).


By comparison with the birthday problem (see index) we obtain
1
P(T > n) = (1 - (1- ... ( 1 _ n ) .

If M is large, the right-hand side is approximately , and this approximation is good


over all values of n. Thus if M is large,

P(T > n):::: e , n = 1, 2, 3, ....

and
n(n- I) _
P(T $ n) = 0.9 <===> exp 2M - 0.1
<===> n2 - n :::: 2M log 10
1 + .../1 + 8M log 10
<=:::>n:::: 2 .
For example, if M = 1024, then require n :::: 70.
c) We claim that for each real :r,

lim P(T $ M(log M + :r)) =


M-oo

First, a heuristic justification: We have T = max(T1 • ".'"'2, ... , TM) where T; is the number of
trials {from the beginning) required to see object i. Then each T; has geometric distribution on
{1, 2, ... } with parameter 1/M, so

P(T, $ M(log M + x)) = 1 - P(T. > .M(log M + :r))


:::: _ ( _ M(log M+z)
1 1

=:::I- (e- 1 )logM+z if M large

84
Chapter 3: Review

e-"
=1-M.
Now if Mis large, the random variables Tt, T2, ... , TM are almost independent; hence
P(T $ M(log M + x))::::: IIf'! 1 P(T; $ M(log M + x))

= (1- e;) M::;:: e-c-"

Now, a more rigorous justification: Let k be a positive integer. We have

(T > k) = (one of the M objects has not been seen in the firstktrials)

= u{'! 1 (object i not seen ink trials).


By inclusion-exclusion,

P(T > k) = P { u{'! 1 (object i not seen in /,;trials)}

M
= L ?(object i not seen in k trials)- L ?(objects i, j not seen in k trials)
i=l i<j

+ ... + ( -1)M- 1 ?(objects 1, ... , M not seen ink trials)

(M;1r _ (M;2r + ... +(-1)u-J(Z)

= t(-1)i-
i=1
1

J
(1- r
Hence

(1-
Let x (real) be fixed. Replace k by M(log M +x) and investigate the behavior of the probability
as M--oo:

M (M) ( ·)
j 1-
LM(log M+.r)J
00

where
_ { (-l)'(M'\ (l- .L)LM(iogM+")j o $)$ M
au,1 - 1 J • M
0 j >M
Note that for each j = 0, 1, 2, ...

We evaluate the limit as M--oo of P(T $ M(log M + x)) as follows:

oo <XI oo ( -s 1
lim P(T $ M(log M +
M-oo
x)) = M-oo
lim
L...,
aM,J = lim aM,J
L..., M-oo
=
L...,
-c. )
)!
= e_,-r
J=O J=O J=O

the movement of the limit inside the summation is justified e.g. by the Weierstrass M-test (no
relation to the present M).
Set e_,-r = 0.9 x::::: 2.25. Then forM large, we have approximately

P(T $ n) = 0.9 <==> n::::: M(log M + 2.25).

For example, if M = 1024, then require n::::: 9400.


85
Chapter 3: Review

d) Without loss of generality, the desired subcollection consists of objects I, 2, 3, ... , M /2. By a
similar analysis to (c), we obtain

P(T > k) = P { ut!{2 (object i has not been seen in k trials)}

M 2:) -1 )i (M /. 2) ( 1 - it)
M . LM(log , .M+z)J
2 + x)) =
P(T $ M (log e-t":-r as M _.., oo.
j=O }

(Heuristically, we may view the collection of M tickets as consisting of half desirable and half
undesirable. T is the time required to obtain all the desirable tickets. Since roughly half of our
trials result in undesirable tickets, it would take twice as long to obtain all the desirable tickets
as it normally would had there been no undesirable tickets at all. In other words, if y-t denotes
the number of trials needed to obtain a complete collection of M /2 when drawing from the set
of M /2 objects, then T is distributed approximately like 2T'; hence if M is large, then
M M
P(T $ M(log
2 + x))::::: P(2T $ M(log
2 + x))
M M
= P(T' $
2 (log
2 + x))
:::::e-e-"'.)
Thus for M large,
M
P(T $ n) = 0.9 -¢:::::> n:::::: M(log 2 + 2.25).
For example, if M = 1024, then require n:::::: 8700.
e) T = Xt + x2 + ... + XM/2 where Xt = 1, and fori= 2, 3, ... , M, Xi is the additional number of
trials [after obtaining the ( i - 1 )st different object] required to obtain a new object. Then Xi
has geometric distribution on {1, 2, ... } with parameter Pi = (M- i + 1)/M, i = 1, ... , M, and
the {Xi} are mutually independent. Hence
M/2 M/2 M/2
E(T) =""'&=1
E(X;)

= t=l •=1
M_
= ""' M -&+1

1 1 I )
=M ( -1
-Mlog2asM-oo,
2 +I
and
Af/2 Af/2
Var(T) =L Var(Xi) = L ( i _!_) P,2
-
p,
•=1 ·=•

=M
2
+ (M 1)2 + ... + ('/1)2) - M ( ;{ M 1+ ... +
"'M
2
M{log 2)) = M(I -log 2) as M-oo.
Observe that the variables Xt' x2' ... ,X M /2 are roughly identically distributed, or at least
roughly of the same size (they have geometric distribution, with parameters between 1/2 a.nd 1;

86
Chapter 3: Review

contrast this with the full set of variables X 1 , X 2, ••• ,XM , where the latter variables have very
high expectation). Hence
T - M log 2 T - E(T)
JM(l-log2) SD(T)
has' approximately normal (0,1) distribution. In fact, it can be shown that for all x:

lim P(T$Mlog2+xJM(l-log2)) =limP( T-Mlog


M-oo M-oo
2
y'M(l -log 2}
$x) ={l(x).

Set ci>(x) = 0.9 x:::::: 1.282. Then forM larl?e, we have

P(T $ n) = 0.9 n:::::: Mlog 2 + 1.282y'M(l -log 2).

For example, if M = 1024, then require n :::::: 730.

87
Chapter 3: Review

88
@1993 Springer-Verlag New York, Inc.
+
Section 4.1

1. a) The desired probability is the area under the standard normal density y = between
z = 0 and z = 0.001. The density is very nearly constant over this interval, so the desired
probability is approximately
1
(width of rectangle)(approx height of rectangle) = 0.001 x = Vfr = .000399.
v2r 1000 211"
b) Similarly the desired probability is approximately

0.001 X
1
tiC X e -1/2 -- 1 -- .000242.
v2r 1000v21re
2. a)

1 1
0Q

and since f(x) is a density function, it must integrate to 1, soc= 3.


.!:..dx =
x• 3x3 1
= c
3

b)
E(X) = 1 1
00

x2_dx
x4
= 1 =
2x 2 1 2
00

= 1 3 = -;-
c)

E(X ) 2 -3, 1
00

x x• dx 2
00

1
=3
Thus Var(X) = E(X 2 ) - (E(X)) 2 = 3- t = t.
3. a) Since f is a probability density, f(x)dx = 1, so

= c J01 x(1- x)dx = c ( ; l


3
1 - z3 ) [ = c = 6.

b) P(X :$ = J} 6x(I- x)dx = 3x


2
- 2:.t
3
1: =
Remark. Once you note that /(:.t) is symmetric about {draw a picture), the answer is clear
without calculation.

c) P(X :$ i) = J0S 6:.t(l- x)dx = 3x 2 - 2x 3 L=ft.


d) By the difference rule of probabilities,
P(k <X:$ = P(X :$ - P(X :$ = 27 = i) t - 7

Remark. Of course you can obtain the same result by integration.

e) E(X) = J f(x)dx = fo1 6x 2 (1 - x)dx = 2x 3


- %x{ = t.
Again, this is obvious by symmetry. But you must compute an integral for the variance:

E(X 2 ) = J01 x 2 f(x)dx = J01 6x 3 (1- x)dx = tx•- =fa


Var(X) = E(X 2
)- [E(XW = fo- i =to
4. a) We know that J 1 2 2
cx (l- x) dx = 1, and

11 11 ( I1= -
0

2 2 2 J
ex (1- x) dx =c (x - 2x + x 4 )dx = c XJ
---
X4
+-
X C

0 0 3 2 5 0 30
soc= 30.
Section 4.1

1
b)
1 1

c)
E(X) =
1 0
30x J (1- x) 2 dx = 30
0
(x J - 2x i +x )dx = 30 ( l --
-
4
2
5
+-1)
6
= -1
2

and so
2 1 1
V ar(X) = - - - = -
7 4 28
s. a) Graph:
0.5 -

0.4

0.3 "

0.2-

0.1 -

0.0
-5 -4 -3 -2 -1 0 2 3 4 5

b)

1 1
P(-1<X<2)=21 ( }_1(1-x)ldx+ 1
o (1+x)2dx ) =21 ( [1-xJ_t+[-1+x]o
1 lo 1 )
1
2
l
2
7
12

c) P(IXI > 1) = 2 X tf
00

1 (l:r)l dx = - t!.r = t
d) No, because E(IXI) = f::'oo dr = J
0
00
(l+,r)> dx = oo (see Example 3).

6. a.) t= P(X 0) = p (X;" < = (-;) ¢::::::> = -.4303;


= P(X 1) = P ( x ;" < <==::>= = .4303.
(
Subtract: -!;: = .8606 <==::> u = 1.162. Hence I' = .4303u = 0.5. Or you rna.y easily see that, by
symmetry, I' must be located halfway between 0 and 1.
b) If P(X 1) = then (t = <==::> t = .6742. This implies that ! = .6742 + .4303 =
1.1045, and u = .9054, and I'= .3896.

7. Let X be the height of an individual picked at random from this population. We know that the
distribution of X is approximately normal (I'. u 2 ), with p = iO {inches), and P(X > 72) = .1 . That
is,
.9 = P(X 72) = 72;;7o):::::: P(Z !) = (!)
(where Z has standard normal distribution). Hence 2/u = 1.28 from the normal table, and the chance
that the height of an individual picked at random exceeds 74 inches is
P(X > 74) = p ( x:
70 > 74 : 70 ):::::: p (Z > = P(Z > 2.56) = l - <1>(2.56) = .0052.
In a group of 100, the number of individuals who are over 74 inches tall therefore has binomial(lOO,
.0052) distribution. By the Poisson approximation, the chance that there are 2 or more such individ-
uals is approximately (with 11 = 100 x .0052 =.52)
I - (e-" + e-"J.I) = I - e- +.52)= .096.

90
Section 4.1

8. a)
ct>(12.2 -12)- ct>(ll.8- 12) = .1443
1.1 1.1
b)
ct>(12.2 -12) _ ct>(11.8- 12) = .9307
.11 .11
Since the Central Limit Theorem says that the average of a large number of measurements will
be normal, it is not necessary for the measurements themselves to be normal, although if the
measurements are extremely skewed then 100 may not be a. large enough number.

9. The distribution of 5 4 is approximately normal with mean 2 and variance 4 x 1


12
= 1·
P(S4 ;::: 3):::::: 1- ct> C\IJ) =
3
1
1- 4>(1.73) = 1-.9582 = .0418

10. a) ct>( - ct>( = 4>(6.45)- 4>(1.29) = 0.0985


b) = ct>( -0.19) = 1- 4>(0.19) = 0.4246
c) 4>{1.28):::::: 0.90, so the weight = 9.7800 + (1.28 x 0.0031) = 9.7840 gm.
1
11. a) 4>(0.43) :::::: so u ..- :::::: 0.43 and u:::::: 0.2325.

b) q.
0.2325
- q.
0.2325
= 24>(0.86)- 1 = 0.6102
c) 4>(0.675):::::: 0.75, so the diameter is 0.675 multiples of u below the mean. 1- (0.675 x 0.2325)::::::
0.84

12. a) Range of X: [-2, 2]


If -2 < x < 2, then
-
f( X )d X= P-(X EX=
d ) 2x(2-!.rl)dz- 1(2
4x(!x2x2) -4 -X
l)d X, I
so f(x) = {2 -lxl)/4. Elsewhere /(x) =0.
b) Range of X: [-2, 1].
If -2 < z < 0, then
f(x )d; = P(X E dx) = <f+:>.u
, 2 )(..,)(
= H2 + x )dx .
If 0 $ x $ 1, then f(x)dx = 2 1
< -;.r>.tz.
Elsewhere f(x) = 0.
c) Range of X: [-1, 2].
The density will be linear on (-1, 0], constant on [0, 1], linear on [1, 2]:

'

/
/ Area== 2h

-I 0 2

To make area= 1, h must satisfy 2h = 1, or h =


Let X denote the length of a rod produced. Note that
the probabilities of interest remain the same under a
13. linear change of scale. So, without loss of generality,
assume X has a triangular density from -1 to 1, as in
the figure.

a) P(IXI > .75) = 1/16 from the figure. .J.Q -0.5 0.0 0.5 1.0
91
Section 4.1

b) P(IXI :5 .5IIXI :5 .75) = = .8.

If the customer buys n rods, the number N of rods which meet his specifications (a-- 'tming
independence of rod lengths) has binomial (n, .8) distribution. Need n such that
P(N ;::: 100) ;::: .95 <=> P(N < 100) :5 .05.

<=> n;::: 134.

14. (continued from Exercise 13)


Again, the probabilities remain the same under a linear change of scale. Let Y be the length of a rod
produced by the current manufacturing process. To get the mean and standard deviation of Y, note
that the density of X is given by

fx(x) = { -lxl lxl :5 1


elsewhere

Therefore E(Y) = E(.\ J = 0 by symmetry; Var(Y) = Var(X) = E(X 2 ) = f x 2 (1 - ixi)dx =



1
1
fo 2 3
2 (x - x )dx =
So Y has normal (0, 1/6) distribution.

a) P(IYI > .75) = 2 x [1- ct>(,ft x .75)] = .066193


b) P(IYI :5 .5) = 2 X «1>(,;6 X .5)- 1 = .779328;
therefore P(IYI :5 .5IIYI :5 .75) = 1 = .834571.
So you should choose the manufacturer of this exercise, since each rod that you buy would have
a greater chance of meeting your specifications (although not by much).

15. a) (0, 1/2) b) erf(x) = 2ct>(J2x)- 1 c) cf>(z) = (erf(z/J2) + 1)/2

92
Section 4.2

Section 4.2

1. Let X denote the lifetime of a.n atom. Then X has exponential distribution with rate -\ =log 2.
a.) P(X > 5) = e- 5 >. = (1/2) 5 = 1/32 .
b) Find t (years) such that

P(X > t) = .1 <==> e-c>. = .1 <====? t = = 3.32

c) Assuming that the lifetimes of atoms are independent, the number Nr of atoms remaining after
t years has binomial (1024, e->.t) distribution. So find t such that
->.t log 1024
E(Nt) = 1 <====? 1024e = 1 <====? t = -\ = 10.

d) N 1o has binomial (1024, 1/1024) distribution, which is approximately Poisson (1). So by the
Poisson approximation,
P(N1o = 0):::::: e- 1 = .3679.

2. a) Let X denote the lifetime of an atom, then X is exponentia.lly distributed with rate -\ = 10§2 =
log 4 per century. Now we have 1020 atoms, and we wish to find the time such that we expect 1
out of 10 18 atoms to survive, which we can do by solving P(X > t) = drr
for t. We know that
1
P(X > t) = e->.t a.nd so -(log 4)(t) = -18log 10 and fina.lly t = 1s [g 10
= 29.9:::::: 30 centuries.
og4
b) This is equivalent to saying that there is about a 50% chance that no atoms are left. Let N 1
be the number of atoms left at time t, a.nd we wish to find t such that P(N, = 0) = .5. Note
that N, will be a binomial (n, p) where n = 1020 • Note further that this will be approximately
a Poisson with Jl = np. Thus we observe that

and so np = log 2 and fina.lly p = !


10 2 • We also know that p = .5 1 since p is the probability
that a given atom will survive for t centuries, and so

.S' = log2
n

lo g (log2)
• n
t= = 66.97 :::::: 67
log.5
So after 67 centuries there is about a 50% chance that no atoms are left.

3. LetT be the time until the next earthquake, then we have in general that P(T < t) = 1 - e->.c.

a) The probability of an earthquake in the next year is 1 - e -J = 0.6321.


b) Similarly, P(T < .5) = 1- e-· 5 = 0.3935.
c) P(T < 2) = 1 - e- 2 = 0.8647
d) P(T < 10) = 1- e-•o = 0.99995

4. Let W be the lifetime of a component. Then W has exponential distribution with rate >. = I/ I 0.

a) P(W > 20) = e- 20 " = e- 2 :::::: 0.135.


b) The median lifetime m satisfies

>. {log 2)
1/2 = P(W > m) =e-rn <==> m = ->.- = 6.93.

93
Section 4.2

c) SD(W) =If>.= 10.


d) If X denotes the average lifetime of 100 independent components, then E(X) = 10 and SD(X) =
10/VlOO =
1 so by the normal approximation

P(X > 11) = 1- P(X $ 11)::::: 1 - <P(1} = .1586


e) Let W1 and W2 denote the lifetimes of the first and second components respectively. Then

P(Wt + W2 > 22) = P(N < 2) = e- 2 •2 + 2.2e-2 "2 = .35457


where N has Poisson (22.\ = 2.2) distribution.
5. a) P(W4 $ 2) = 1 - P(W4 > 2) = 1- e- 2 ::::: .86.
b) P(T4 $ 5) = P(N(O, 5];::: 4) = 1- P(N(0,5] $ 3) = 1- e- 5 (1 + 5 + 25/2 + 125/6)::::: .73
c) E(Tt) = E(Wt + W2 + W3 + Wt) = E(Wt) + E(W2) + E(W3) + E(Wt) = 1 + 1 + 1 + 1 = 4
Note. T 4 has gamma (4, 1} distribution.

6. Let N2 be the number of hits during the first 2 minutes, and N, be the number of hits during the
first 4 minutes. Then N2 has Poisson (2) distribution and Nt has Poisson ( 4} distribution, and

P(2 $ T3 $ 4) = P(T3 > 2)- P(T3 > 4)


= P(N2 $ 2}- P(N, $ 2)
22 42
= (e- 2 + 2e- 2 + e-;, ) - ( e- 4 + 4e-• + e-;, ) = 5e- 2 - 13e- 4 = 0.43857.

7. 1 - e-c,.\ =p tp = -i log(l - p)

8. To compute the density f of X, argue infinitesimally:

f(t)dt = P(X E (t, t + dt))


= P(X E (t, t + dt)i..\ = )P(..\ = + P(X E (t, t + dt)i..\ = )P(>. =
= fy,(t)dt. t + fy,(t)dt. '
where fy1 is the density of an exponential random variable Yi having rate 1/100, and Jy, is the
density of an exponential random variable Y2 having rate 1/200. Hence

f(t) = + > 0).

a) P(X;::: 200) = kP(Yi ;::: 200) + 200) = ke-2ootJoo + .29

b) E(X) = tE(Yt) + fE(Y2) = t X 100 +f 200 =


X

c) E(X 2 ) = kE(Yt2) + fE(Y22) = x 2 x 100 + f x 2 x 200 2 = 6 x 100 2 . Therefore


2
Var(X) = E(X1)- (E(X)]2 = 29oro.

9. a) r(r + 1} = J0 00
xr e-"dx = -xr e-"[ + r J0
00
xr-le-"dx = rf(r)

b) Note 11lat f(I) = J000 c-"dx =I.


So f(. · 1} = d'(r) = r(r- l}f(r- I)=···= r!f(I) = r!.
c) E(T") = [ 0
00
t"e-'dt = f(n +I)= n!.
Var(T) = E(T 1
)- (E(TW = 2- I= 1 SD(T) = 1.
d) P(..\T > u) = P(T > uf>.) = e-luf>. = e-", so >.T has exPQnential (1} distribution, and
E((..\T)"] = n! E(T") = n!/>..".
SD(..\T) =1 SD(T) = 1/..\.
94
Section 4.2

10. a) Let X= int (T). X takes values in {0, 1, 2, ... } and


P(X = k) = P(k < k + 1)T
= P(T < k + 1)- P(T < k)
=·(1- e->.(k+t))- (1- e->.k)
= e-u(l - e-"')
= lP. where p = 1- e->., q = 1 - p.
b) If T has exponential distribution on (0, oo) with rate >., then mT has exponential distribution
with rate >.fm (why?) so by a) mTm = int {mT) has geometric distribution on {0, 1, ... } with
success parameter Pm = 1 - e ->.fm. .

Conversely, if for each m = 1, 2, ... we have that int (mT) has geometric distribution on {0, 1, ... }
with success parameter Pm, then: argue that there exists >. > 0 such that Pm = 1 - e->./m
(set q1 = e->. and consider P(T 1)); next argue that for each rational t > 0 we have
P(T t) = e->.c; finally argue that this last must hold for all real t by using the fact that
P(T > t) is a nonincreasing function of t.
I )) I .!lm. I -A/m 1
c) E(Tm) = ;;;E(int (mT = m. Pm = m. (1.:.-.-.>./m)- X as m- 00
(Use l'Hopital's rule, or series expansions). .
Since Tm T Tm + ;k, we have E(Tm) E(T) E(Tm) + ;!.. Let m--+ oo to see E(T) = t·
Similarly argue that £(1!) --+ -b as m--+ oo, and therefore that = -b·
11. a) Want to show P(T t) e->-•. Write A = 10-6 seconds, and consider t = nA for n = 0, 1, 2, ....
By assumption,

P(T + 1)A IT> nA) =>.A


(n for all n = 0, 1, 2, ... ; equivalently

P(T > (n + l)AIT > nA) = 1- >.A for all n = o, 1, 2, ....


Therefore
P(T 0) = 1,
P(T > A)= P(T > O)P(T > AIT > 0) = 1 X {1 ->.A)= 1 ->.A,
P(T > 2A) = P(T > A)P(T > 2AIT >A)= (1- >.A)(1- >.A)= {1- >.A) 2 •
In general, P(T > nA) = (1 - >.A)n. Use the approximation 1 e->..o. to conclude

P(T > nA) (e->..o.)n = e-n>..o..

Put t = nA: P(T > t) e->.c.

b) P(1 <T 2) ft
2
>.e->.'dt = e->.- e- 2
>..

12. a) Differentiate the gamma (r, >.)density:

If r 1, then the derivative is negative for all t > 0, so fr.>. is maximized at 0.


If r > 1, then the derivative is zero at I* = ( r - 1)/ >., is positive to the left oft*, and is negative
to its right. So t* yields a local maximum for the density. But the density is zero at t = 0 and
tends to zero as t - oo; hence the density achieves its overall maximum at t* .
If r < 1, then the density blows up to oo as t approaches zero.
b) Fork= 0, 1, 2, ... we have :

E(r) = 1oo
0
f(r)
=
f(r)
1oo t<r+k)-le-"''dt =
r(r)
f(r
>.r+k
+ k)
=
I r(r
).k
+ k)
r(r)
0

Hence
E(T) - ! . r(r+;) -.;:;
- >. r(r - >.

95
Section 4.2

E(,.-,2) _ I r(r+j) _ (r+;)r


J. · - X" r(r - >.

Var(T) = r:tr- (x)2 = fi


SD(T) = 4.
13. a) Estimate ..\ = 1/20 = 5% per day.
b) Nd has binomial (10, 000, e-d/ 20 ) distribution. Therefore

From this calculate


E(N1o) =6065; SD(N1o) = 49;
E(N2o) = 3679; SD(N2o) = 48;
E(N3o) = 2231; SD(N3o) = 42.
14. Option b) is correct: The probability that a com'ponent fails in its first day of use is 1- e- 1 120 , which
is approximately 1/20 = 5%, because l - e--= ::::::: x as x --. 0; and is less than 5% because e--= I- x
for all x (See Appendix III). Exact value is 4.877 ... %.

15. a) 80 days
b) 40 days
3
c) P(Tcotal 60) = P(at most 3 failures in 60 days)= e- 3 (1 + 3 + 32' + 36 = 13e-3
) ::::::: .64723
since the number of failures in 60 days has Poisson (.05 x 60) distribution.

16. Say a total of k components will do. Since P(Ttotal 60} = P(N6o k- I), we require

P(N6o k- 0.9

By trial and error, we find P(N6o $ 5)::::::: 0.91608, so a total of six components (five spares) will do.

17. Redoing the satellite problem:

a) 80 days
b) days
c) Guess the answer to c) should be larger, because now 60 days is more standard deviations below
the mean of 80 days. In fact, Trotal has the same distribution as the sum of 8 independent
exponential (.1) random variables, so P(Tcotal 60) = P(N6o < 8)::::::: .744 where now N6o has
Poisson {6) distribution.

Redoing the preceding problem: Since P(N60 .; 10) ::::::: .9161, it follows that four spare components
will do.

96
Section 4.3

Section 4.3

1. a) P(T $b)= 1- P(T >b) = 1 - G(b).


b) P(a $ T $b) = P(T a)- P(T > b) = G(a) - G(b). {Since Tis continuous, P(T a) equals
P(T > a) equals G(a}.}

2. Suppose T has constant hazard rate: Say A(t) = c for all t > 0. Use (7) to get

G(t) = t > 0.

Then the density ofT is, by (5),

__ dG(t) _ -ct t
!( t } - - ce , > Q,
dt
so T has exponential distribution with rate c.
Conversely, if T has exponential distribution with rate A, then for each t > 0:

3. a) (b!rr,t>O.
b) (b:c) (b!Ja = ,t > 0.
4. (i} ==::? (ii}:

exp ( -1' A(u)du) = exp ( -1' AQ'U


0
-
1
du) = exp(-At'") = G(t).

(ii) ==::? (iii}: differentiate G with respect to t:

_!!.G(t) = = Aat"- 1 e-w• = f(t).


dt dt
(iii) ==::? (ii):

(iii) & (ii) ==::? (i):

5. Let k 0.

a) E(r) = J 00

0
t"f(t)dt = J0
00
t"Aat"- 1 e-:l.' .. dt = J 0
"" (f)" 1"e-zdx = A-k/a fo 00
xkfae-zdx =

(Substitute x =At".)
b) By (a),
E(T) = A-lfor + 1);
E('J'l) =r 2
tor + 1);

hence V ar(T) = ,\ -l/a { r ( + 1) - [f ( + I) r} .


6. We have A(t) = 1/20 if 0 $ t $ 10, and A(t) = 1/10 if t > 10.

97
Section 4.3

a) P(T > 15) = G(15) = exp (- fo15 >.(u)du) = exp { -[fo10 (1/20)du + = e- 1
.3679.

b) If 0 $ t $ 10 then G(t) = exp (- J:(l/20}du) = e- 1 120 ;

if t- > 10 then G(t) = exp { -lJo


rrto (1/20)du +
10
(1/IO)du] }= e- (1+•-•o)
2 --ro .

1.0

0 5 10 15 20 25

le-1120 < t < 10


c) f(t) = - r
= { 2 e-<rftO-t/2)
10
0
t > 10

0 5 10 15 20 25

d) E(T) = J
00

0
G(t)dt = J
0
10
e-•1 20 dt + J:: e-(!+ = 20 fo
112
e-"du + 10c J 00

0
e-vdv =
13.93.
(Substitute u = t/20, v = (t- 10)/10 .)
7. a) Integrate by parts the relation E(TZ) = J000 t 2 / ( t)dt.
b) £(7"2) is 400. So SD(T) is v'400- 10011' 9.265.
c) If 1' denotes the average lifetime of 100 components, then E(T) = E(T1 ) = 17.7245 a; 'J(T) =
S D(T.)f VfOO = 0.9265 so by the normal approximal;
P(T > 20) 1- = 0.007.

8. a.) Only for a ?: 0, b ?: 0, and either a > 0 or b > 0.

b) G(t) =exp- +bt).

98
Section 4.3

c) f(t) =(at+ b) exp- bt)

d) E(T) = fooo G(t)dt


= .fo00
exp- + bt) dt
= eb'f'2c. J0
00
exp - (t +
2
dt
= eb'J2c. Jbf.,(Q
roo e-•'/'2 dz
7o
( z = V"'
'a (t + £) )
a

= v /2f<eb'f2a
7
roo
Jbf.,(Q 7h
e-•'/2dz I

= y"l!eb'f2a (1- 4>(bf..;a})


e) Compute E(y'l) using

E + bT) = Ioco (at+ bt) (at+ b) exp- ( + bt) dt = fooo ue-udu = 1;


then use V a r(T) = E(y'l) - [E(TW.

9. a) Put (5) in (6) to get


f(t) 1 dG(t) d
A(t) = G(t) =- G(t) -;It=- dt logG(t).

b) I: .X(u)du =I:- ddu log G(u)du = -log G(t)

<==> logG(t)=-1'.x(u)du

<==> G(t) = exp { -1' .X(u)du}

10. a, b) Smaller, since for each .s, t positive:

P(T = P(T > .s + t, T > s)


> 5 + tiT > 5 } P(T > s)

P(T > s+ t)
P(T > s)

= exp ( -1•+• .X(u)du) I -1'


cxp ( .X(u)du)

= exp (- 1•+\(u)du)

= exp ( -1' .X(x + .s)dx)

:S exp (- 1' -\( x )dx) since A is increasing

= P(T > t).


c) If A is decreasing, then the inequality reverses.

99
Section 4.4

Section 4.4

1. Exponential (>.fc).
2. Use the linear change of variable formula. If T has gamma (r, >.)distribution, with density
)..r tr-1 -:l.t 0
f r,:l. (t ) = r( r) e ,t > ,
then the density of cT (c > 0) is
If { / ) (>.Jet
f.cT (U ) = -;; r,.\ U C =-;;I f{r)
)..r Ur-I
cr-l e
-:l.ufc
= r(r) U
r-1
e
-(:1./c)u
,U
O
> .
So cT has gamma{r, Afc) distribution. Apply this twice to see that T has gamma (r, >.) distribution
iff >.T has gamma {r, I) distribution.
3. The range of Y = U 2 is {0,1). The function y = u 2 is strictly increasing and has derivative
for u E {0, 1). So by the one to one change of variable formula for densities, the density of Y is
= 2u *
Jy(y) =lu(u) I= L = y E {0, 1).

4. The density of X is lx(x) = 1/2 for x E ( -1,1), so by Example 5 the density of Y is

I y (y )
= lx(,fii) + lx( -.;Y)
2../Y
= t2..jY
+t = _1_
2..jY. y
E {0 1)
' .

Note: This distribution is called the beta (1/2, I) distribution.


5. The range of y = X 2 is {0, 4). The density of X is lx(x) = 1/3 for X E (-I, 2), so by Example 5 the
density of Y is
_ lx(.;Y)+Ix(-..jY)
I y ()
y - 2../Y .

If y E (0, 1) then this simplifies to tJ = ./y. 3

If y E {1, 4) then this simplifies to = 6


./y.
For other y this simplifies to 0.
6. Notice that Y Its range is (-oo,oo). The function y is strictly increasing with
derivative sec2 for 1/J E ( -7r /2, 7r /2). So by the one to one change of variable formula for densities,
the density of Y is

Jy(y) = l·(4>) I
= !.../sec
7r
2

1
= 1r(l + yl)' y E ( -oo, oo).
The Cauchy distribution is symmetric since fy(y) = fy(-y).
The expectation of a Cauchy random variable is undefined since

i: lyfy(y)ldy = j_ao 1r{l Y2 ) dy

!
00
1
> --dy
-
1
2rlyl
= 00

so the required integral docs not converge absolutely.


100
Section 4.4

7. If U has uniform {0, 1) distribution, then -,;U has uniform {0, 1r) distribution, so 1rU- 1rj2 has uniform
{-i, i) distribution. (See Example 1.) Now apply Exercise 6.
8. a) Apply the many-to-one change of variable formula for densities: If z = g(y) = ;
1 112
, then the
density fz(z) of Z = g(Y) is given by

fz(z) = (y:E=z} fy(y) I


I: fy(y)/i-2y/(I+y
2 2
) i

fy ( -Jr=l) + fy ( Jr=l)
2z../z- z 2
1
-,;jz(l- z}

Soc= 1/7r, a= -1/2, {3 = -1/2.


b)
P( Z < x) =
-
1z
o 1rjz(l- z)
dz

Now if we let z = sin 2 (u) then dz = 2sin(u)cos(u)du and we have

arcsin ..;; . ( ) ( )d
P(Z$z)=.!.
-,; o 1 arcsin..;; 2sm
2smu cosu u
Jsin 2 (u)(l- sin 2 (u))
. ( u ) cos ( u ) du
= .!_
1r 1 o )sin 2(u) cos2( u)

=- 21
1r 0
arcsin..;;
du

2 . '-
= -arcsm vx
7r

NOTE: using a different trigonometric substitution will give answer that may look very different.
t t
For example, using z = + sin u gives P(Z $ x) = {2x- 1) + t·
It is true, though
not obvious, that this is equal to vz.
c)

E(Z) = E (1-+1-Y2)
=;
2100 0 (1
dy
+ y2)2
=- 21"'2
X" 0
cos 2 8d8

= 1/2
(substitute y =tan 8); this is obvious by the symmetry of the density of Z.
d)

101
Section 4.4

=-
21></2 cos 4
Ode
7r 0

= 3/8
(substitute y =tan B); so Var(Z) = 1/8.
9. Let Q' > 0.
a) The range of Y = T" is {0, oo). The function y = ta is strictly increasing and has derivative
= at 0 - 1 for t > 0. So by the one to one change of variable formula for densities, the density
of Y is

Jy(y) = /T(t) I
= .\crta-1 e ->.r"' I crta-1

= .\e->.r"'
= .\e-" 11 , y > 0.

b) By Example 4, X = -.\- 1 log U has exponential (.\) distribution. So it suffices to show that
T = X 1 fa has Weibull (.\,a) distribution. T has range (0, oo). The function t z 1 la is strictly =
increasing, so by the change of variable formula, the density of T is

/T(t) = Jy(y) I
, -A%/
= ... e 1 {1/a)-1
-x
Q'

= .\crta- 1e ->.r"', t > 0.

10. a) The range of Y = IZI is (0, oo). Use the change of variable formula for many to one functions
to get the density of Y: for y > 0

fy(y) = "2: /z(z) = /z(y) + fz( -y).


%:(z(=!l

Here /z(z) = </l(z) is symmetric, so

Jy(y) = 2</l(y)
{2
=V;e- 11
1 /2
, y > 0.

b) The range of Y = Z 2 is (O,oo). By Example 5, the density of Y is, for y > 0,


f y (y )
= fz(-/Y) + fz(-.JY)
2.JY
<P( -/Y)
= ,;y
I
= -.j2;
-y
-1/2 -!1/2
e

decreasing with derivative


density of Y is
= *
c) The range of Y = 1/Z is (-oo,O) U (O,oo). If z "# 0 then the function y = 1/z is strictly
-fr .
So by the one to one change of variable formda, the

102
Section 4.4

d) The range of Y = l/Z 2 is (0, oo). By Example 5, the density of Y = (1/Z) 2 is, for y > O,

f y (y )
= fttz(,fY) + fttz(-,;y)
2../Y
_ fttz(-../Y)
- .;Y
1 -3/2 _..L
= tn=Y e lv
y21r

11. Duplicate the solution to Problem 1 of Example 7 for a. sphere of radius r:

P(e E dO)= 2ir cos8rd8


A
where A = tota.l a. rca.. Integrate from 8 = -7r /2 to 1r/2 to get

Or: If we accept that the area of a sphere between two parallel planes is proportiona.l to the distance
between the planes, then for a sphere of radius r, the surface area. between two para.llel planes a
distance 1::. apart must be 2x-rl::. (take the planes very close together and near the equator). So the
tota.l surface area of a sphere of radius r is 2rr · 2r = 47rr 2 •

103
Section 4.5

Section 4.5

1. a) IC'x 0 then F(x) = J: >.e-"''dt = 1- e-""".


If x < 0 then F(x) = 0.
b)

1.0 - - - - - - - - - - - - - - - - - - - - - - -

0.5-

0.0 i

0 2 3 4
Multiples of 1()...

2. a)

X 0 1 2 3
Fx(x) 1/8 1/2 7/8

1.0.

0.5-;

0 2 3 4 5 6 7 8 9 10

b) Since P(k) = (1/2)k fork= 1, 2, 3, ... , w; have:


If 1 then F(x) = Lk:Sz P(k) = = 1- (1/2)inr (z) ;
If x < I then F(x) = 0.

0.5.

0.0 ----- -----


0 2 3 4 5 6 7 R 9 10

104
Section 4.5

3. a) Y has the same distribution as X, so

fy(y) = { .j1-Y2 IYI $ 1


otherwise

y < -1

Fy(y) { + [•.J1-Y2 + awiny] IYI $ 1


y > 1

b) If 0 < r < 1 then


FR(r) = P(R $ r)
Area inside radius r
= Area of circle

Differentiate to get /R(r) = frFR(r) = 2r.

4. If a > 0, then

Fax+b(Y) = P(aX + b $ y)

=Fx ( - a -
y- b)
If a < 0, then

Fax+b(Y) = P(aX +b$ y)

= 1-Fx y-
( -a-
b)
assuming Fx(:r:) is a continuous function of :r:.

5. If :r: $ 0 then

Fx(:r:) = = •
If :r: 0 then

Fx(:r:) = Fx(O) + P(O <X$ :r:) = + foz = + t(l- e-z) = 1-


6. a) P(X 1/2) = 1- F(1/2) = 7/8.
:r:$0
b) f(:r:) = F(:r:) = { O$:r:<l
X 1

c) E(X) = J :r:f(x)d:r: = J
0
1
:r:3:r: 2 d:r: = fo 1
3:r: 3 d:r: = 3/4. .
d) Let Y1, Y2, YJ be independent uniform (0, 1) random variables. Then for i = 1, 2, 3
:r:$0
P(Y.$:r:)={; O$:r:$1
:r: 1

105
Section 4.5

so if X= max(Yt, Y2, YJ), then


P(X x) = P(Yi x, Y2 x, Ya x)
= [P(Yi xW
= {x
0 0x<O}
3
xl
1 1
= F(x).
7. fy(y)dy = h(t)dt where t = y
2

2
a) fy(y) = .xe->..li 2y, y 0
b) EY = fo
00
t 112 .xe->.. 1dt = .XJ0""'t 3 f 2 - 1 e->..'dt = = !£ = 0.51 when,\= 3.

c) Find the inverse c.d.f. P(Y y) = P(T $ y 2 ) =1- e->.y' = u, say. So 1 - u = e->..y> so
_ . /log(l-u)
y- v ->..
y = Jlog(l- u)
-.X
8. Let L; denote the lifetime component i, and let L denote the lifetime of the entire system.

a) The system fails when and only when both components fail, so L = max( £ 1 , £2 ). If t > 0 then
P(L > t) = 1- P(max(L 1, £2) t)
= 1 - P(£1 t, £2 t)
= 1- {1- e_,,,.., )(1 - e-' 1"' 2 ).

b) Here L = min(£ 1 , £2 ). By Example 3, L has exponential distribution with rate 1/ fl. I+ 1/ P.2· So
if t > 0 then

106
Section 4.5

c) Here L = max(Lrop, Lbouom), where, by Example 3, Lrop has exponential distribution with
rate (1/p. 1 + 1/p. 2 ) and has exponential distribution with rate (1/p. 3 + l/p. 4 ). Put
a= 1/p. 1 + 1/p. 2 and b = l/P.3 + 1/P.•· By part a), L has survival function

P(L > t) =I- (1- e-at)(l- e- 6 '), t > 0.

d) Here L =min (max(£ 1 , £2), £3). If t > 0 then


P(L > t) = P(max(£1, £2) > t, LJ > t)
= P(max(Lt, £2) > t) P(£3 > t)
= (1- {1- )(I- e-t/" 2 ))

0.0
0 2 3 4 5 6 7 8

9. a) X has the same distribution as F- 1(U), so

E(X) = E(F- 1
(U))

J
= F- 1
(y)fu(y)dy

= 11 F-1(y)dy

= shaded area

(Integrate horizontal strips.) But the shaded area can also be represented, by integrating
vertical strips, as

1""' (1- F(x)]dx = 1""' P(X > x)dx.


b) If X is a discrete random variable with values 0, 1, 2, ... , then

E(X) = 1""' P(X > x)dx

107
Section 4.5

= :t 1"
n=l n-1
P(X > x)dx
00

n=l

Or: the region above the distribution function of X consists of rectangles of width 1 and height

1- F(O) = P(X 1),


1 - F(1) = P(X 2),
1 - F(2) =: P(X 3),
etc.

c) If X has exponential distribution with rate .X, then X 0 so

E(X) = 1 00

P(X > x)dx = 1 00

e-J.zdx =

If X has geometric (p) distribution on {1, 2, 3, ... } then by b)


00 00
1
E(X) = '"""'P(X n) - = !.
= '"""'q"- 1 = -l-q p
n=l n=l

d) X= X[I(X > 0) + I(X < 0)] = XI(X > 0)- (-X)I(X < 0) =X+- X_
where X+= XI(X > 0) and X_= (-X)I(X < 0). So E(X) = E(X+)- E(X-).
Since 0,

E(X+) = 1oo P(X+ > x)dx


= 1 00

P(X > x)dx

= (+)area in diagram
Since o,

E(X-) = 1oo P(X- > x)dx

=.1 00

P(X < -x)dx

= 1° 00
P(X < t)dt
= (-)area in diagram.

108
Section 4.6

Section 4.6

1. Let Xt. X 2 , X 3 , X 4 be the arrival times, in minutes, of persons 1, 2, 3, and 4, translated so that 12
noon= 0. Then X 1 ,X2 ,X3 ,X. are independent with common normal (0,5 2 ) distribution.

a) P(X(Il < -10) = 1- P(X< 1 > -10)


= 1- P(X1 ;:: -10. x2 ;:: -10. x3 ;:: -1o, x. ;:: -10)
= 1- [P(X1 2: -10)] 4
= 1 - (.9772). = .0881
since P(X1 ;:: -10) = P(Xt/5 -2) = = .9772.
b) P(X<•> > 15) = 1 - P(X<•> :5 15)
= 1- P(X1 :5 15, x2 :5 15, x3 :5 15, x. :5 15)
=1- [P(X1 :5 15W
= 1 - (.9986). = .0056
since P(Xt :5 15) = P(Xt/5 :53)= = .9986.
c) Let f and F denote the common density and distribution function of the X;'s. The desired
probability is

P(-1/6 :5 X!ll :5 1/6) = ! 116

-1/6
fxc 21 (x)dx,

where f x< 2 > is the density of the second order statistic:

fxc 2 l(x) = 4/(x)G= F(x)]•-•

= 12/(x)F(x)[1- F(x)] 2

Over the interval [-1/6, 1/6], this density is roughly constant, so the desired probability is
approximately

(length of interval) X (value of /x(l) at 0) =

= 4/(0)F(0)[1- F(0)] 2
2
= 4. . (1/2). (1/2)
1
= tn= :::::: .0399.
10v2r
Remark. To compute the probability exactly, you may integrate by parts. Or you may argue
as follows: Let Nt denote the number of X's which are :5 1/6, and let N 2 denote the number
of X's which are < -1/6. Then N 1 has binomial ( 4, F(1/6)) distribution, and N 2 has binomial
(4, F(-1/6)) distribution. Now argue that

P(-1/6 :5 X(ll :5 1/6) = P(X<•> :5 1/6)- P(X(ll < -1/6) = P(Nt 2: 2)- P(N2 2).
2. b) Let k 2: I integer. If X has beta distribution with integer parameters r, s, then
1
E(X") = J.0 x" B(r,•)
1 xr- 1{1- x)'- 1 dx

= B(r,•)
t J.l xr+k-1(1- x)'-ldx
0
= B(r + k, s)
_ (rtk-1)!(•-1)! /(r-1}!(•-1)'
-
_ (rtk-1}!/(r-1)!
- (r± •± k-1)!/(r± •-1)!
_
- (rt• rt•± ll···(rt•tk-1)·

109
Section 4.6

a) Hence Var(X) = E(X 2 ) - [E(XW


_ r(r+l) _ (-r-)2
- (r+s)(r+s+l) r+s
_ _r_. (r+l)(r+•)-r{r+•+l)
- r+• (r+s+1)(r+•)
- rJ
- (r+s)>(r+s+l) ·

3. a)(y-x)"
b) (1-x)"-(y-x)"
c) y" - (y - x )"
d) 1- (1- x)"- y" + (y- x)"
e) y)"-"
f) { k+1
n ) :z: k+1 ( 1 -y }n-k-1 + {") 1;(
k :z: 1 -y
}n-k + (l<)!l!(n-k)!x
n! I;( y-x }( 1 -y )n-k-1

4. a) P(Z = 1} = P(Z = 0} = 1/2; b) yes, yes, yes; c) the n! possible orders of the n variables are
equally likely, independent of the n order statistics.

5. Let F be the common c.d.f. of the X's.

a) X(k) is less than or equal to :z: if and only if at least k of the X's are less than or equal to :r: (and
the remainder are greater than :r:). Since each X has chance F(x) of being less than or equal to
:z:, and the X's are independent, it follows that the number of X's which are less than or equal
to x is a binomial (n, F(:r:)) random variable, so

P(X(k) :z:) = t
t=k
(:)[F(:r:)]i[I- F(:r:)]"-•.

b) If r, s are positive integers: The rth order statistic of r + s- 1 independent uniform (0, 1} random
variables has beta (r, s) distribution. But by part (a), its c.d.f. must be

L
r+•-1 (
r + )
1 [F(:r:)]'[1- F(xw+·-1-·.
t=r

Finally note that the c.d.f. of a uniform (0, 1} random variable satisfies F(:r:) = :z:, 0 x 1.

c) If 0 u 1 then

f(u) = B(!,s)ur-1(-u+ 1)'-1

= _1_·
B(r,s) 1
1)(-u)'
•=0

110
Chapter 4: Review

Chapter 4: Review

1. X1 has binomial (n, e->. 1 ) distribution, so

a) E(Xr) = ne->. 1

b) Var(Xr) = ne->. 1(1- e->.c)


2. J1
0 (x + x 2 )dx = => c =
x<O
F(x) =c + u2 )du = { ;/5(x /2 + x /3)
2 3
O$x$1
X> 1.
E(X) = 7/10, SD(X) = Vs/10.
2.51
2.0

I
1.5

1.0

0.0 .J.....__--L__ _ _ _

0.0 0.5 1.0 0.0 0.5 1.0


Density Distribution fWlction

3. X is the maximum of 3 independent uniform (0, 1) variables:

if X< 0
Fx(x) = P(X $ x) = P(Yt $ :r, Ye $ x, Y3 $ x) ={ if 0 <X< 1
if X> 1
2
d ifO<x<l
fx(x) = -Fx(:z:)
d:r;
={ 3x
0 otherwise

E(X) = J xfx(x)dx = 1 1
2
x3x dx =

4. a) P(X <I)= foo f(:r:)dx = L 00


(I/2Vdx + J01 {1/2)e-.rd:r; = 1- (1/2)e- 1 •

b) E(X) =0 by symmetry;
Var(X) = E{X 2 ) = J r 2 f(x)dx = 2 J0"" x 1 • (1/2)e-.rdx = J0 00
x 2 e-.rdx = 2.
c) Y = X 2 has range (0, oo ). For each y 0

Fv(y) = P(X 2
$ y) = P(-Jy $X$ Jy) = 2P(O $X$ Jy}

= 21..;; f(:r)dr = 21..;;(1/2)e-"dx = 1- e-...fi.

Ill
Chapter 4: Review

5. a) P(T > 30) = = .4.


b) If o < t $ 30, then P(T > t) = • If 30 < t $ 70, then P(T > t) = .
c) If 0 <t$ = 2/100 . If 30 < t $
30, then fr(t) 70, then fr(t) = 1/100 .

= Jof30 21 d f10 I d
d) B(T) 100 t+ 30 100 t = 29.

E('Tl) = rJo + f1o Ldt = Jroo


Jo wo 100 3 '

Var(T) = E('Tl)- (E(TW = 392.33 SD(T):::::: 19.81.


e) Let I be the distance from one end of the road to the ambulance station, and 7i be the response
time. We may assume, without loss of generality, that I $ 50. Then
1 t $0
1oo-2r 0 < t $ I
P(Tt > t) =
{ 100 l<t$100-1
0 t > 100- I
After a calculation similar to that of part d),
E(Tt) = -I+ 50,
=
which is minimized when I 50. So the expected response time is minimized when the station
is located at the midpoint of the road.

6. T has exponential distribution with rate .X= 1/48.

a) The c.d.f. ofT is, for t >0


Fr(t) = P(T$ t) = 1-P(T> t) = 1-e->.• = 1-e-•t•s.
1.0.,..... - - - - - - - - - - - - - - - - - - - - - - - -

0.5

0 10 20 30 40 50 60 70 80 100
b) U is distributed as rnin(T, 48). If t 48 then

Fu(t) = P(min(T, 48) $ t) = 1


and if 0 $ t < 48 then
Fu(t) = P(min(T, 48) $ t) = P(T $ t) = 1- e-•t•s.
If t < 0 then Fu(t) 0. =
U is neither discrete nor continuous.

1.0 .

0.0 '
0 I0 20 30 40 50 60 70 80 90 100

112
Chapter 4: Review

c) E(T) = 48 by assumption, and

= J:
E(U) = Emin(T,48)

min(t, 48)/T(t)dt

= 1 0
48

tfT(t)dt + 1 00

48
48/T(t)dt

= 1•s 0
t( _!._e-•1• 8 )dt + 48P(T > 48)
48
= 48(1 - 2e- 1 ) + 48e- 1 = 48(I - e- 1 ) = 30.34.

d) No. The policy of replacement at 48 hours is unnecessary, since, by the memoryless property of
the exponential distribution, the remaining time till failure of components which last beyond 48
hours has the same distribution as the lifetime of brand new components.

7. a) I= I f(x)dx = ae-tJI.rldx = 20' Iooo e-tl.rdx = 2


; • So
b) E(X) =I xf(x)dx = 0 since f is an even function;

Var(X) = E(X = x e-t11.r1dx = -Jr .


2 2
)

c) If y >0 then P(IXI > y) = 2P(X > y) = Iyoo {Je-tl.rdx = e-tJy.

d) P(X < x) = { I - (I/2)e-tl.r x;:: 0


- (I/2)etl.r X < 0

8. a) I/ I h(x)dx c) If.../fi, 0, 1 d) 2, 2/3, 1/18 e) 1/10, 5, 100/12 f) 5, 1/5, 1/25

9. a) normal, mean a, variance t. factor= -j;:


b) normal, mean a, variance b2 /2,
c) gamma (6, a)
6 6 a6
mean = -;;, vanance = a 2 , factor=-
5!
d) bilateral exponential, parameter a
. 2 I
mean= 0, vanance = a• , factor=-
2a
e) beta (8, 10)
8 80 17!
mean= , variance = -:-::-::----:-:- factor= - 1-
18 • 182 x19' 7.9!
f)
x = lru, dx = bdu

8b . 80b 2 17! I
mean= lS' vanance = I82 x I9' factor= ( - ) -
7!9! b17

10. Note that (I/fi)e-r' is the density function of a normal (0, 1/2) random variable, say X.

foo e -r' d x -- :fi


a) Jo 2
foo I
-oo -;;-;e -.r' d x -- 2
• 1 -- •
886 •

b) Io1
e-.r' dx = fiP(O <X< I)= fi (<t>(y'2)- <I>(O)j = .746.
c) Iooo xe-r' dx = - !e-r' [ = !·

d) Iooo x e-r dx = 4
2 1
2
:r: -j;e-"' dx =
2
4Var(X) = Lf = .433.

113
Chapter 4: Review

11. a) 1/2
b) Compare with the gamma (r, ,\)density: if r > 0, ,\ > 0

1= 100 tr-1e-l.'dt <==* 100 tr-le-l.'dt =


Put r = 8, ,\ = 2, obtain 00
1 0
1
t e
-2td _ r(8) _ 7!
t - 2s - 28.

c) Compare with the beta function for positive integers: if r and s are positive integers, then

B( T, S ) -1
-
0
1

X
r-1( _
1 X
)'-1d _ (r- 1)!(s- 1)!
X - ( )I .
r+s-1.

So

12. P{1 < T < 2) = P(T > 1) - P(T > 3) = P(N1 $ 2)- P(N3 $ 2)
where N, denotes the number of hits in the time interval [0, t]. Since N, has Poisson {2t) distribution,
we have

P(N1 $ 2) =
2
e-
2
k
L
= e- 2 (1 +2 +
k=O
2
2
!!
2
) = 5e-
2

2 k 2
P(N3 $ 2) = I>-6 = e-6 {1 + 6 + ) = 25e- 6
k=O
and the desired probability is 5e- 2 - 25e-t = 0.6147

14. a) This is equilavent to finding P(Tt > 2) where Tr is gamma(4,3), so


3 k

P(Tt > 2) = L e- 6
= e-6(1 + 6 + 18 + 36) = 0.1512
k=O

b) Because of the memory less property of the Poisson process, 11, T2 - T1, and Ta - T2 are all
independent exponentials, so this is just P(T1 < 1)
3
{1 - e- 3 ) = 0.858
3
=
c) Given that there were 10 arrivals in the first 4 minutes, the times of the actual arrivals are
uniform on (0,4), so the distribution of tlie number of arrivals in the first 2 minutes is binomial
( 10,.5).
8! 3 4 .. 4
15. t!4! 7 7

16. a) As soon as the first car comes, we are waiting for two more cars to come which gives us a
gamma(2,3), call it T2.
I k

P(T2 < 3) =I- 9


=I- e- (1 + 9) = 0.9988
k=O

b) In a ten-minute interval, the probability of 15 cars is e - 30 = 0.001027. Given that there


were 15 cars, the probability of 10 Japanese cars is = 0.0007378, assuming that the
chance of being Japanese is independent from car to car. Further assuming that the number of
cars which appear in a given time interval is independent of whether the cars are Japanese, the
chance of both events occuring is 0.001027 x 0.0007378 7.576 x 10- 7 • =

114
Chapter 4: Review

17. If T has exponential distribution with rate then for all 0 $ t < oo we have

P(T $ t) = j'
-oo
f(u.)du = 1'
o
.>..e->.udu = 1- e-.\t.

Conversely, if T is such that P(T $ t) = 1 - e->.• for all 0 $ t < oo, then for 0 $ a < b < oo we have
P(a < T $b)= P(T $b)- P(T $a)= 1- e->.b- (1 - e->.a) = e-.\a- e->.b
18. a) The lines arc independent, so we just multiply the probabilities of each of the events to get
e-<>.A +>.a+.\c) >.A >.;..I.e
'2

b) The overall rate for all buses is AA +>..a+ .>..c;so we have


c) P(A > t) = e-.\ 6 '
19. a)(20-2)log 2 10
b) 20log 2 10 -log. 2

20. The assumption indicates T = Tt + T2, where T1 and T2 are independent exponential (.>..) random
variables. Hence T has gamma (2, .>..) distribution.

a) f(t) = .>. 2 te-"', t 0.


b) G(t) = (1 + .>..t)e-"', t 0.
c) hazard(t) = = t 0.
d) For.>..= 1, t = 2 the hazard rate is = per hour.
Given survival to t = 2 hours, the probability of failure m the next minute (1/60 hour) is
approximately
rate) x (length of interval) = x to = to.
21. P(Rt $ r) = 1- e _.1,r,
a) Jy(y) = 2ye- 111 , y 0
b) exponential ( 1)
c) 1

22. a) The range of Y is [0, a/2].


If 0 < y < a/2 then P(Y $ y) = P(X $ y) = yfa;
if y a/2 then P(Y $ y) = 1.
b) Not continuous, since the distribution function is not a continuous function of y.

c) E(Y) = E(min(X,a/2)) = J min(x,a/2)/x(x)dx = J°12 x · 0


+ faa/ 2 • = ta.
12
Or use E(Y) = fooo P(Y > y)dy = foa (1- !)dy

23. a) EM= E(M- 3) + 3 = 5, Var(M) = Var(M- 3) = 4.


b) X= eM= e3 eM-J = e3ew, where W = M- 3. So e 3 and

1 3
fx(x) = - -.!.e-!"' =.!. .!.e-pog(ze- ) = .!..!. (xe- 3 } -! = .!. (x e3 )
I I2 X 2 X 2 2 X

c) P(M > 4) = P(M- 3 > 1) = so the probability is ( e-! r = e-• = 0.3679.


24. Let Y denote the time elapsed in minutes between the instant the lights last turned red and the
instant the car arrives at the lights. By assumption, Y has uniform (0, 2) distribution. And

1- y if 0 < y < 1
X-- { 0 if 1 < y < 2.

115
Chapter 4: Review

a) The range of X is [0,1]. If 0 < x < 1 then


P(X::; x) = P(X::; x,O < Y < 1) + P(X::; x,l < Y < 2)
= P(l- Y $ x, 0 < Y < 1) -r P(O $ x,l < Y < 2)
= P(l -X ::; y < 1) + P(1 < y < 2}

= l-{1-x)+2-l
2 2
l+x
--2-.

1.0 i

0.5

0.0 .1...-------------..-------------
0 2

b) neither.
c) X = g(Y) where g(y) =I - y if 0 < y < 1, = 0 if 1 < y < 2. So

E(X) = E(g(Y)) =I g(y)Jy(y)dy = 11

(I- y). +0 =
Similarly E(X 2 ) = f01 (1- y) 2 • tdy = t• so Var(X) = E(X 2 ) - [E(XW =fa.
d) The total delay is D =
X1 + X 2 + · · · + X1o. where X, is the delay at light i. Assume the X's
are independent. Then E(D) = 10E(X!) = 2.5, and Var(D) = IOVar(Xl) = 50/48. So by the
normal approximation,

D-25
P(D > 4) =P ( > 4-25) ::::: I - 4>{1.47)::::: 7%.
\/50/48 \/50/48
25. a) Y is uniform on (0, 1/2].
b) uniform on (0, I]
c) EY =t Var(Y) =-fa.
26. a) E(X e'Y) = E(X)E(e'Y) = 2 X [2 X f_s e'udu] = f(ei.S' - e') .

b) E(W[) = E(X 2 )E(e 2 'y) = e2 ').


SD(W,) = JHe3' _ e2') _ #(e3• + e2< _ 2e2-5t ).

27. a) For each n 2 we have


P(U1 ::; u and N = n) = F· 1
::; u and N > n- 1)- P(U 1 ::; u and N > n)
=Pt '::;---::;Ut $u)-P(U .. ::;U,.-t ::;---::;U1: •d.
Now for each n I we have
1 u"
P(Un::; ···::; Ut::; u) = 1P(U1::;
n.
u,U2::; u, ... ,U .. ::; u) = 1n.
since each of the n! orderings of Ut, ... , Un is equally likely. Therefore
u n-1 u"
P(Ut ::; u and N = n) = (n - 1)I• n!
116
Chapter 4: Review

b)
QO

P(U1 u and N is even) =L P(U1 u and N = 2k)


k=l

c) From a) with u = 1, P(N n) = 1/(n -1)! for n 2, and this is true also for n = 1. So by the
tail sum formula of Exercise 3.4.20

E(N) = f
n=l
(n 1)! = f
k=O
:! = e

28. Let 6 E [0, ;r] be the angle subtended by the (shorter) arc between the random point and the arbitrary
fixed point. Then by assumption, e has uniform (0, 1r) distribution. Observe that sin(6/2) =X (or
use the Law of Cosines); therefore e = 2 arcsin(X).

a) If 0 <x< 1 then P(X x) = P(e 2 arcsin(x)) = arcsin(x).


b) E(X) = E(sin(6/2)) = fo"" sin(8/2)d8 =
c) E(X 2
) = E(1- cos6) = J0"(1- cos9)d0 = 1; Var(X) = 1-
29. a). Suppose c > 0 has been set, and you choose b > 0. Let G be your net gain. Then

1- cb if X> 0
G- -1- cb
- { if X< o
so
E(G) =(I- cb)P(X > 0) + (-1- cb)P(X < 0)
= P(X > 0)- P(X < 0) - cb
= 2P(X > 0)- 1 - cb
= 2-P(b)- 1 - cb
since P(X > 0) = p ( > =<I>( b).
So the option is advantageous when and only when there exists b > 0 such that E(G) > 0, i.e. when
and only when there exists b > 0 such that
• 1 + cb
2cf>(b)- 1- cb > 0 <==> cf>(b) > - -.
2
Such a b exists when and only when < 4>'(0). [One way to see this: Plot <I>( b) and 1 as functions Yb
fo b.] Since 4>'(0) = <P(O) = 1f.J2;, conclude that the option is advantageous to you when c < \1'1.
b). For such c, the expected net gain is maximized (why not minimized?) at b satisfying
a
ab E(G) =0 <==> <P(b) = 2c <==> e =
{if
V 2 c.
Note that the right side of the last equation is less than 1 (why?), so the equation has a solution
b = J-2log(,j"fc).

30. a) Let X be the diameter of a generic ball bearing. Then X has normal distribution with mean
p. = .250 and standard deviation a = .001, so the chance that the bearing meets specification
(i) is
P(IX - i.tl < a) = 0.6828.
We want E(TJ6), where T 1G is the number of trials until the 16th success in Bernoulli(p = .6828)
trials; so E(T16 ) = 16/0.6828 ::::::: 23.4.

117
Chapter 4: Review

b) Let Y be the diameter of a ball bearing that meets specification (i). Then Y has expectation

E(Y) = E(XIIX -1'1 < u) = E(uZ + I'IIZI < 1) =I'


(where Z = x;e has standard normal distribution) and second moment

Therefore
Var(Y) = u 2 E(Z 2 IIZI < 1) => SD(Y) = uJE(Z2IIZI <I).
But

E(Z211ZI < z) = :)z))


2
x .P(x)dx
P(IZI < z)
-2z4J(z) . .P(x)dx
=
P(IZI < z)
2z4J(z)
= I - P(IZI < z);
hence SD(Y) = uJ1- = 5.397 x 10-•.
The required probability is

P(3.995 < Y1 + · · · + Y1a < 4.005)


where Yi, ... , YiG are independent copies of Y. By the normal approximation, this is

4.005 - I61')
:;:: P ( IZI <
4
u = P(IZI < 2.316):;:: .98.
31. No Solution

118
@ 1993 Springer-Verlag New York, Inc.
+
Section 5.1

1. The area of the set of interest is 6.

a) P(X < 1) = indicated area = =± = .L . (Or divide the set into 12 triangles.)
area. of set 6 12

b) P(Y < X2) = indicated area = 1 f,2(x2 _ x)dx =


a.rea of set 6 1 36
.

4 4
(a) {b) i

3 31
I
'
2 2 i
'
y=x y=x

0 02
2. a.) P(within 0.01 inches of l) = ·
0. 2
= 0.1
2
. . . h
b) P(two measurements Within 0.01 mc es) = 1 -. 2(1/2)(0.19)
_
0 22
= 0.0975
3.
J! 7
...1.=-
6 12
4. a.)

P(IX- Yl :5 .25)
= indicated area.

=l-
7
Gf
=·-
16

b)
3/4
P (1;- II :5 .25)
4/5

= 5 - - 3
indicated a.rea.

=l-iG+D
9
= 40
0
Section 5.1

shaded region = {Y 2: X, Y 2: .25}

c)
P (Y ··iy .25)

= indicated

= G-:- 1
32 )
5
= 8"
0

5. a) Tile percentile rank X of a student picked at random has uniform distribution on {0, 1 ), so
P(X > 0.9) = 0.1 .
b) The percentile ranks X, Y of students picked independently at random are independent uniform
(0, 1) random variables, so
P(lX- Yl > 0.1) = (9/10) 2 = • 9/10
9/10

1 2 32
G. a) P(Jack arrives at least two minutes before Jill)= ( / ;; ::::: 0.376
1
b) Let F ={first person arrives before 12:05} and L ={last person arrives after 12:10} Then,
P(FL) = 1- P((FLt) = 1- P(r u Lc)
= 1- P(r)- P(Lc) + P(r n Lc)
= 1- ( 10)10
15 15 + 15
= 0.965
7. a)

.:.'· J
X •

120
Section 5.1

b) If 0 < x < 1 then P(M:::; x) = 1- P(M > x) = 1- (1- x) 2 ;


if x 1 then P(M :::; x) = 1;
if x :::; 0 then P( M :::; x) = 0.

So the density of Misgiven by fM(x) = { x) 0<x<1


elsewhere.

0 0
Distribution function Density

8. a) Let U1, ... , Un be the unordered independent uniforms.


P{U(I} >X and U(n) < y) = P(x < U1,U2, ... Un < y)
= P(x < U1 < y) · · · P(x < Un < y)
= (y- x)"
b)
P(U(l) $X and U(n) < y) = P(U(n) < y)- P(U(l) > x and U(n) < y)
= y"- (y- x)"
9. Standardise the length of the stick to be 1 (the solution clearly will not depend on the length of the
stick!). Look at one end of the stick, and let X and Y denote the distances from that end of the stick
to the break points. Then X and Yare independent uniform (0, 1) random variables. Let L denote
the minimum of X and Y, and R the maximum. (L to suggest left, R to suggest right.) Then the
lengths of the broken pieces can be expessed as L, R- L, 1 - R. To form a triangle, the maximum of
these three should be less than the sum of the rest. That is,
triangle '¢:::::::> ma.x{L, R- L, 1- R} < 1- ma.x{L, R- L, 1- R}
<==> ma.x{L, R- L, 1- R} < 1/2
<==> L < 1/2 and R- L < 1/2 and 1/2 < R
'¢:::::::> (X < 1/2 and Y- X < 1/2 and 1/2 < Y) or (Y < 1/2 and X- Y < 1/2 and 1/2 <X).
Conclude: P(triangle) = shaded area = ! .
total area 4

0.0 0.5 1.0

121
Section 5.2

Section 5.2

. { 1 ifO < IYI < X <1


1. a) fx,y(x, y) = 0 elsewhere.

b) If 0 <x< 1 then fx(x) 1dy = 2x; otherwise fx(x) = 0.

If 0 <y< 1 then fy(y) 1dx = 1 -lyl; otherwise fy(y) = 0.


c) No, because /x,y(x, y) # fx(x)fy(y).
d) E(X) = Io 2x dx = 2/3, E(Y) = 0 by symmetry.
1 2

1/2 if 0 < lxl + IYI < 1


2• a) /x,y(x, y) ={ 0 elsewhere

b) If lxl < 1 then fx(x) = = 1 -lxl- Otherwise fx(x) = 0.


By symmetry, Y has tlte same marginal density as X:

f y(y) = { 01 -IYI if IYI < 1


elsewhere.
c) No, because fx,y(x, y) :f fx(x)fy(y).
d) E(X) = r\Y) = 0.

3. a) 1 =I I f(x, y)dxdy = I:, 0 dy J:=o c(x


2
+ 4xy)dx = c I:,o<t + 2y)dy Soc= 3/4.
b) If 0 < a < 1, then

P(X $a)= 1: la 0
dy
2
c(x +4xy)dx =c 1: 0
(
3
a +2a
3
2
y) dy =c (a +a
3
3
2
) = ( +a2).

c) If 0 < b < 1, then

4. a)

= 1:r 1 :r-
11
P(X $ x, Y $ y) 6e- 2 311
dydx

• = 1:r 6e- "'(-.!_e- )1 dx


0
2

3
311
11

= (1- 1:r 2e- zdx


e- 311 ) 2

= (1- e- 311 )(1 - e- 2 :r)

b)

c)
fy(y) = 1 00

6e-2z- 311 dx = 3e- 311

d) Yes, they are independent since f(x,y) = fx(x)fy(y).


5. nT,;
122
Section 5.2

6. a)

P(Y > 2X) =


.1511 90(y- x) dydx
8

0 2z

1
.5 1
9
= 10(y-x) 1 dx
0 2z
.5
=
1 0
10[(1- x)
9
-
9
x ]dx

= -(1- x)Jo-
= 1- (.5) 9
511
512

b)

1
1 1
fx(x) = 90(y- x) dy
8
= 10(y- x) 9 1 = 10(1- x) 9
%' %'

c) minimum, maximum

7, Without loss of generality, the circle has radius 1.

P(R2 $ (1/2)R!) =
0 0
(2r1)(2r2)dr1 dr2 =
11 0
2
(rJ/2) 2r1 dr 1
1
= -.
8

8. a)

fy(y) = 1: c(y
2
- x )e-lldx = ce-y [xy
2 2
- [=-y
= -4c
3
y3 e -y (y > 0)

Therefore, Y - gamma( 4, 1) and


4c 1 1 1
3 = r(4) =6 so c= 8

b) Let Z = 4Y 3 .
1 z
I Ify «4>
1/3
I z <z > = >
1 1z -(.1.)1/l
----e "
12( f )2 /3 6 4
zl/3e-(z/(l'''
(z > 0)
72 41/3
0

c) Since given Y = y, lXI y,

E(IXI) = E [E (lXI Iy)] $ E(Y] = 4

9. a.) 2..\?e-.\(r+yJ (0 < x < y), no; b) (x > 0, z > 0), yes; c) X is exponential (2..\)
a.nd Z is exponential (..\).
Section 5.2

10. For 0 < v < w,


P(V E dv, WE dw) = P(no X; in (0, v), 1 in dv, n- 2 in (v, w), 1 in dw)
= n(n -1)P{XI E dv, v < X2, .•. , Xn-1 < w, Xn E dw)
2
= n(n -1) (>.e-.\udv) (e-.\u- e-.\w)"- (>.e-.\"'dw)

f( v, w ) = n ( n - 1) "\2 e -.\(v+w) ( e -.\v - e -.\w)n-2 (0 < v < w)

11. a) E(X + Y) =
E(X) + E(Y) = (1/2) + 1 = 3/2
b) E(XY) =
E(X)E(Y) =
(1/2) X 1 1/2 =
c) E[(X- Y) 2 ] = E(X 2 ) - 2E(X)E(Y) + E(Y 2) by linearity and independence
= -2 x t x 1 +2 using Var(X) = 1/12, and Var(Y) =1
= 4/3.
d) E(e 2 y) = f0
00
e 2 Y • e-Ydy = oo. So E(X 2 e2 y) = E(X 2 )E(e 2 y) = oo.
12. For 0 < s < t,
P(T1 E ds, T5 Edt) = P(no arrivals in (0, s), 1 in ds, 3 in (s, t), 1 in dt)

= (e-.\•) (>.ds) ( s)f) (>.dt)

f(s, t) = s) 3 e-.\t (0 < s < t)

13. The distributions are all the same, with density 2(1 - x) for 0 < x < 1.
14. a) Let V = min{U1, ... , U5) and W = max(U1 , .•. , U5 ). Then R = W- V.
P(W w) = 1 - P(W < w) = 1 - w 5

E(W) = ( P(W w)dw = ( 1- w dw = 5


Jo } 0 6
P(V v) = (1- v) 5

£(\') =
10
1
P(V v)dv = 11
0
(1- v) 5 dv =-
6
1

?
E(R) = E(W)- E(V) =
b) For 0 < v < w < 1,
P(V E dv, WE dw) = P(no U; in (0, v), 1 in dv, 3 in (v,w), 1 in dw)
= 20P(U! E dv, v < U2, U3, Ut < w, U5 E dw)
3
= 20dv(w- v) dw
3
f(v,w)=20(w-v) (0<v<w<1)
c)

P(R > 0.5) = 1 1w-o.


0.5
1

0
5
20(w- v) 3 dvdw

=2011 [ ..s dw
0.5 v=O

= 51 0.5
1
w
4
- (0.5} dw
4

=5 [ - w(0.5)
4
L=o.s
31
= 32
124
Section 5.2

15. a) F(b, d)- F(a, d)- F(b, c)+ F(a, c)


b) F(x, y) = f(u, v)dudv.
c) f(x, y) = :
11
F(x, y).
d) F(x, y) = Fx(x)Fy(y).
e) F(x, y) = P(min $ x and max $ y) = P(max $ y)- P(min > x and max $ y)
=y"-(y-x)", O<x<y<l.
f(x,y) =! :
11
F(x, y) = n(n -l)(y- x)"- 2 , 0 < x < y < 1.
16. Considering the three-dimensional space which corresponds to triplets (X1, X2, X 3 ) we wish to find
the volume which gives us X1 < X2 < X3. This volume is given by:

17. a) Note that -R $X$ R. Let 0 $ r $ 1, and -r $ x $ r. If X E dx andRE dr, then the point
(X, Y) lies in one of two (almost) parallelograms:

(R e dr)

So

P(X E dx, R E dr) = 2 x area of

= 2 x dx
area of cucle

x dr
Jl- (xfr)2
I 1'" = -2
1'"
r
./r2 - x2
dxdr

125
Section 5.2

b) Note again that -R :$X:$ R. Let 0 :$ r :$ 1, and -r :$ x :$ r. If X E dx and R E dr, then the
point (X, Y, Z) lies in an "inner tube" formed by rotating the parallelogram in (a) about the
x-axis. Hence

P(X E dx,R E =2x 1r x (distance to x-axis) x (area of parallelogram) I (volum' .>here)

=2x 1!' x y r · - x· x dxdr


Jl- (xfr)2
/4 3
-1!'
3
= -rdxdr
2

So f(x, r) = 0 :$ r :$ 1, -r :$ x :$ r.

18. a) By symmetry, you can argue that the probabiliiy must bet. since P(X1 = X2) = 0 if they are
independent continuous random variables.

Here is an alternative proof using an explicit calculation. Let X1 and X2 have range (a, b).
(Possibly a= -oo or b = +oo.) Then,

= b1:r1
P(Xl < X2)
1 a a J(xt)J(x2)dx1 dx2

= 1b J(x2)[F(x2)- F(a)]dx2

= 1b f(x2)F(x2)dx2- 0

= E[F(X2)] =-21
since F(X2) has the uniform(O, 1) distribution.
b) Again, using symmetry, you can argue that each of the 6 possible orders is equally likely, so the
probability must be t.

19. a) Lon clearly has uniform distribution on (-180, 180): the event (LonE dx) has the same proba-
bility for all x. Hence
-180 <X< 180
f Lon (X ) -- { O
1/360
otherwise.

b) Let e be as in Example 4.4.7; then Lat = 1!0 e. Since the density of e is

fe(B) = { -11'/2 < 8 < 7r/2


otherwise,

it follows by the linear change of variable formula that Lat has density
1!' lr
hac(Y) = 180/e( 180y)
if-f< 1; 0 y<f <==> -90<y<90
otherwise.

c) P(Lon E dx, LatE dy) = P(Lon E dx I LatE dy)P(Lat E dy) = P(Lon E dx)P(Lat E
since Lon still has uniform distribution on (-180,180) given the value of Lat. Henc e joint
density of (Lon, Lat) is

f(x,y) = /Lon(x)/Lac(Y)
= {
03
! 0 • 3; 0 cos( 1; 0 y) if-180 < x < 180 and -90 < y < 90
otherwise.

d) Yes, since the joint density equals the product of the marginal densities.
126
Section 5.2

20. Since (X, Y) has uniform distribution on the unit square, it follows that the probability that (X, Y)
lies in a given subset of the unit square is the area of that subset.

; I

i .
. !

a) If 0 < r < 1 then


/n(r)dr = P(R E dr) = r X '2r x dr

(Recall that arc length = radius x angle subtended) while if 1 < r < -/2 then

/n(r) = P(R E dr) = r x 8 x dr,

where = 9 +2 (arccos has range [O,r] ). So

(r/2)r O<r<l
/R(r) = { r(r/2 -2arccos(l/r)) l<r<-/2

(R E dr)

b) Integrate f R, or observe that if 0 < r < 1 then


Fn(r) = P(R r) = P ((X, Y) is within r of (0, 0)) =
and if 1 < r < .J2 then
Fn(r) =area of sector of circle+ area. of 2 triangles

2Jr 2
=( - arccos
2
r +

127
Section 5.2

c) The following inequality holds for each a, b 0:

a+b
>--
- V2
(P-roof: Rearrange (a- b) 2 0 to get 2(a 2 + 62 ) (a+ b?, then take square roots. Or: Let W
be a random variable taking values a, b with probability 1/2. Then a';b' = E(W 2 ) (EW? =
2
( ) .) Apply the above inequality to X and Y, take expectations to get

E(R)=E)X 2 +Y2 >E(X+Y) = E(X)+E(Y) =-1.


- V2 V2 V2
In fact, the inequality is strict, since P(../X 2 + Y 2 = xJ{) = P(X = Y) = 0, hence R =
JX2 + Y2 > xj{ with probability 1, and so E(R) > E(xJ{).
For the second inequality, note that R >0 implies

d)

=
1
lr2
-r dr +
j,fi 2 1
1
ll"
E(R) 2r (- -arccos-) dr
0
2 1
4 r

=
2
f
Jo
r 2 dr +
2
j,fi
1
r 2 dr - 2 j,fi
1
r 2 arccos !dr
r
=-
2 o
ll" l,fi r
2
dr-2
lrr/4. 3
8sec 8tan8d8(8=arccos-)
1
9=0 - r
To evaluate the second integral, use integration by parts:

J 8 sec 3 tan 8d8 J


= 31 8 · d(sec 3 8)

1
= -8sec
3
1
3 8--
3
J sec 3 8d8

= -31 8 sec 3
8 - -1 · -1[sec 8 tan 8 + log I sec 8 + tan 81 ]
3 2
Hence

E(R)= ( J2)
2 - 3-
ll" {
3
} -2 { 1
3 - 4 (v2)
1r r:: J r:: r::
- 1 [ v2+log(v2+1) l}
6
1
= 3 [J2+ log( Vi+ 1)] = 0.7651957.

You can check that JT < .765 <Jr.


21. a) E(Dcorncr) = .765, E(Dcenter) = E(Dcorncr)/2 = .3825.
b) Let the two points be (X1. Yl), and (X2,Y2).

E(D 2 ) = E[(X1- X2) 2 + (Yi- Y2) 2]


= 4E((X!) 2
]- 4E(X!X2) (by linearity and symmetry)

= 41 1
2
x dx- 4E(X!)E(X2) (by independence)

=4 ° 1/3- 4 ° 1/2 ° 1/2 = 1/3

c) Since Var(D) = E(D 2 ) - [E(DW o, we have E(D) ..jE(D2) = 1/VJ = 0.577 _

128
Section 5.2

d) [f u 2 = Var(D) were known, then a 95% confidence interval for E{D), based on n independent
observations of D, would be
-
D ± (1.96) X
(T
..;n
where D= average of n observations. Here n = 10, 000, D= 0.5197, and we estimate u by

;
n (
Dl - ;
n
Di
)2 = Jo.3310 - {0.5197)2 = 0.2468.

So an approximate 95% confidence interval for.E(D) is

0 5197 ± {1. 96 ) X (0.


2468 ) <=? 0.5197 ± 0.0048.
. v'iQ,'OOO

129
Section 5.3

Section 5.3

1. Using the same notation as in Example 1,

a) P(R $ 1/2) = Fn(1/2) = 1- e-t<W = 1- e-i = .1175.


22
b) tP(1 $ R $ 2) = t(Fn(2)- FR(l)] = t [(1- e-! )- (1- e-!)] = .1178.

c) E(IYI) = IYI*e_Y2/2dy = 2 fooo y-j;;e_¥2f2dy = -Ae-y2/2[ = vT


d) P(IXI $ r} = 2P(O $X$ y'2'10g2} = 2 X (.881- .5) = .762
e) By independence of X andY, P(IXI $ r, IYI $ r) = P(IXI $ r)P(IYI $ r) = .762 2 =.58.
f) By the rotational symmetry of the joint distribution of (X, Y),
2
P(IXI $ ::/fr, IYI $ ::/fr) = = (2 x .2967) = .3521.
[P(IXI $ y'l(ig2)]
2

g) P(IXI $ r, 0 $ Y $ r) = P(IXI $ r)P(O $ Y $ r) = .762 X .381 = .29.


2. a) E(X 2 ) = V ar(X) + E(X) 2 = 4 and E(Y 2 ) = Var(Y) + E(Y) 2 = 8, so
E(IOX
2
+ 8Y2 - XY + BX + 5Y- 1) = 40 + 64-0 + 8 + 10- 1 = 121.
b) 2X- (3Y- 5) is normal with mean 1 and variance 48, so

P(2X > 3Y- 5) = P(2X- 3Y + 5 > 0) = 1 -<I> ( = 0.5574

3. a) 1- 4>(0.5) b) 1/2 c) 5 d) Jl4


4. a) By symmetry, this must be 0.5.
b) P(Y- .2 < X < Y) = P( -.2 <X- Y < 0) and X- Y is normal with mean 0 and variance .4,
so

P(-.2 <X- Y < 0) =<I> (__E._)-


v'A
2
4> (-· )
v'A
= .1241.
5. Say Y has standard deviation CT. Then X has normal (0,1) distribution, Y has normal (l,CT 2 ) distri-
bution, and X- Y has normal (-1,1 +CT2 ) distribution; hence
2
"j = P(X $ Y)
= P(X- Y $ 0)

=P(X-Y+I I )
JI + q2 $ JI + q2

yl+a2
.43 and q = ((1/.43) 2 - 1)1/2 2.1.

G. a) 3X + 2Y has the normal(O, 13) distribution.

P(3X +2Y > 5) = P (JX + 2 y > - 5- )


V13 V13
=I- 4>(1.39) = 0.0823
b) By using inclusion-exclusion,

P(min(X, Y) < 1) = P(X <I)+ P(Y <I)- P(X < 1 and Y < l)
= 24>(1)- [4>(1)] = 0.97482

130
Section 5.3

c) By using the symmetry of the standard bivariate normal distribution,

P(lmin(X,Y)I < 1) = P(-1 <X< 1 andY> -1} + P(-1 < Y < 1 and X> 1}
= P( -1 < X < 1 and Y > -1) + P( -1 < X < 1 and Y < -1)
= P(-1 <X< I)
= il>(1)- it>( -I)= 0.6826

d) By using the rotational symmetry of the standard bivariate normal distribution,

P(min(X, Y) > max(X, Y)- 1) = P(max(X, Y)- min( X, Y) < I) = P(IX- Yl < 1)

=P(-
= (1>(0.71)- il>(-0.71) = 0.5222

7. Let X be the (actual) arrival time of the bus, measured in minutes relative to 8 AM (so X = 1 means
that the bus arrived at 8:01 AM). Let Y be my arrival time, also measured in minutes relative to 8
AM. Then X has normal distribution with mean 10 and standard deviation 2/3, and Y has normal
distribution with mean 9 and standard deviation 1/2.

a) P(Y < 10) = P ( ;}2 < 9


= = 0.9772.
b) Assume X and Y are independent. Then Y- X has normal distribution with mean -1 and
variance (2/3) 2 + (1/2) 2 = 25/36, and

P(Y < X)= P(Y- X < 0) = P Y - X +1 < 1 ) r.;;;-;;;;


=it>( y 36/25) = = 0.8849.
( y25/36 y25/36

c) P(X < 91 X '1. [9,12]} = = = .9795


since P(X < 9) =P ( < = (1>(- = .0668

and P(X > 12) =P ( > 1


;7; 0
) = 1- = .0014.
8. Let X be Peter's arrival time and Y be Paul's arrival time, both measured in minutes after noon.

a) Since X- Y has the normal( -2, 34) distribution,

P(X < Y) = P(X --;.+


34
2
< ;.,..)
v34
= il>(0.34) = 0.6331

b) By independence,

P( -3 <X< 3 a.nd - 3 < Y < 3) = P( -3 <X < 3)P( -3 < Y < 3)


= [ G) - it> ( 1[ (D- ( 1)1
= 0.2626
c)

P(IX-YI<3)=P(-3<X-Y<3)=il>
(3+2)
v'J4 (-3+2)
v'J4 -4>

= <1>(0.86) - 4>( -0.17} = 0.3726


9. Let X; denote the height in inches of the ith person. Then each X; normal distribution with
mean 70 and variance 4.

a) P(max(Xt,X2, ... ,Xtoo) > 76) = l - P(max(Xt, ... ,Xtoo) $ 76)


=l - [P(Xt $ 76)) 100
=l- (<1>(3)P 00 =I - (.9986}
100
= .1307.
131
Section 5.3

b) Write Y = We want P(Y > 70.5). Now E(Y) = E(Xt) = 70, SD(Y) sDg(d = =
.1.
10
= 0.2 and Y !us normal distribution since it is a linear combination of independent normal
'
random variables. Hence the desired probability is
y- 70 70.5- 70)
P(Y > 70.5} = P ( - - > = 1- 41(2.5) = 1-0.9938 = .0062.
0.2 0.2

c) The answer to (b) will be approximately the same as before, since Y still has expected value
70 and SD 0.2; we assume the normal approximation is good. On the other hand, the answer
to (a.) will be exactly 1 - [P(Xt ::; 76}]1 00 , but now there is no guarantee that P(X1 ::; 76) is
still .9986; hence raising the actual probability to the 100th power could make the answer very
different from before .

.10. a.) Let X 1 and X2 be the two incomes in dollars.

P (X1 +X2 > 65 ooo)


2 '
= P (X- 6o,ooo > 6s,ooo- 60,ooo)
10,000/./2 10, 000/./2
= 1 - 41(0.71} = 0.2389

b) Let X and Y be the younger and older persons' income respectively.


X - Y has the normal( -$20, 000, $14, 1422 ) distribution.

P(X > Y) = p (X- Y +20,000 > 20,000 )


.ji4,f42 .ji4,f42
= 1 - 41(1.41) = 0.0793

c) By independence,

P(min(X, Y) >50, 000} = P(X >50, OOO)P(Y > 50, 000)


=(I- 41(1)}(1- 41( -1)}
= 0.1335
11. a) Say that X, has normal distribution with mean m(t) and variance v(t). Clearly Xo = 0, so
m(O) = v(O} = 0. Next argue that m(s) + m(t) =
m(s + t) and v(.!) + v(t) = v(.! + t) for all
.!, t > 0; conclude that m(t) = 0 for all t > 0 and that v(t) = tu 2 for all t > 0. Therefore X, has
normal distribution with mean 0 and variance tu 2 •
b) X, and Y. are independent normal (0, tu2 ) random variables, so R = ..B.r..
avr
has Rayleigh distri-
bution. Therefore
E(Rt) = u.fiE(R) = u{ii
and
Var(Rc) = tu2Var(R) = tu 2(4-x-)
- - SD(R,) = uv/t (4-r)
2
- - .
2
c) P(R 1 > 2) = P(R > 2) = e-! 2' = 0.1353.
12. Let the coordinates of the point where the first shot strikes be (X 1 , Yt). Let the coordinates of the
point where the second shot strikes be (X2 , }'2). Assume the two shots strike independently. Then
Xt, X2, Yt, Y, are mutually independent normal (p.,1) random variables (for some 1'), and the distance
between the points where the two shots strike, say D, is given by

where U = (Xt- X2)/./2. V = (YI- Y,)f./2. But U and V are independent normal (0,1) random
variables, so ./U2 + V2 has Rayleigh distribution. Therefore

a) E(D) = ./2E(./U 2 + V 2 ) = v'2Jf = ,fi-:::: 1.7725.


132
Section 5.3

13. We may assume a= 1 by considering X' = Xfa and Y' = Yfa.


a) Refine the argument used to derive the density of R: The event (R E dr, e E dO) corresponds
to (X, Y) falling in an infinitesimal region having area x 2ndr = rdrd8. (Argue that (X, Y)
must fall in an annulus around the origin of infinitesimal width dr and radius r, but the angle
subtended by (X, Y) is restricted to be between (} and (}+dO.) The joint density of (X, Y) has
2 ! r2
nearly constant value c e- 2 over this region, so

Conclude: ( R, e) has joint density


1 !r2
!R,e(r,8) = rre-'2 r > 0,0 < 8 < 211".
2
Integrate out to get the marginal densities of Rand 9, see that R has Rayleigh distribution, e
has uniform(O, 2r) distribution, and Rand 0 are independent [ f R,e(r, 8) = /R{r)fe(O) ].
b) Let X and Y be independent normal (0, a 2) random variables. Define R and e by R =
../X2 + Y 2 and 9 = arctan{Y/X). By part (a), we know Rand e are independent, and Rfa
has Rayleigh distribution, 9 has uniform{O, 2lf) distribution. Since X and Y are independent
normal (0, a 2) variables, and since the joint distribution of X and Y is determined by the joint
distribution of R and 9 (namely, via the transformation X= Rcos e, Y = Rsin 9), the same
conclusion (that X and Yare indep normal (0, a 2 ) variables) holds whenever Rand 9 have this
joint distribution.
c) Using the results of (b), it suffices to find h, k such that h(U) has Rayleigh distribution and
k(V) has uniform(O, 2r) distribution. Clearly we may take k(v) =
2rv. As for h, note that the
distribution function of a random variable having Rayleigh distribution is

F(r) = 1- e-r,/'2 for r > 0.


The inverse ofF is F- 1 (u) = J-2log{l- u) . By the result given in Section 4.5 on inverse
distribution functions and simulation of random variables, F- 1 (U) is a random variable having
distribution function F. In other words, F- 1 (U) has Rayleigh distribution. So it suffices to take
h = F- 1 , that is, h(u) = J-2log(l- u).

14. a) Since the standard bivariate normal density is rotationally symmetric, 9 is distributed uniform(O, 2r).
This implies that 29 mod 2r is also distributed uniform(O, 2r).
b) Since cos(a mod 2'lr) = cos(a) and sin(a mod 2r) = sin(a), Exercise 13 part b) implies that,
R cos 29 and R sin 29 are independent st;_ndard normals.
c) Since R = ../X 2 + Y2,

2XY = 2(Rcos9)(Rsin 9) = Rsin 29


JX2 + y2 R

They are independent standard normals.

15. a.) Let Y = Z2• Clearly fy(y) = o for y o. For y > o, use Example 5 of Section 4.4:
fy(y) = = = -Ji;y-1/2e-y/2 = *(I/2 )1/2yl/2-le-y/2;
has ga.mma.(l/2, 1/2) distribution. The fact that f(l/2) = fo follows from J0 =
00
that is, Y Jy(y)dy
I.

133
Section 5.3

b) Write n = 2k- 1, and use induction on k = 1, 2, .... It's true fork= 1 by part (a). If true for
k, then

r ck: 1) = ck; I 1)
r +
2\- 1 ck; 1)
= r
2k- 1 ,fi(2k - 2)!
= -2- · 22k-2(k- 1)!
2k(2k- 1). ,fi(2k- 2)!
= 2: 22k-2. 2k(k- 1)!
- ,fi(2k)!
- 22kk! •

therefore true for k + 1.


c) (X/u) 2 has gamma (1/2, 1/2) distribution by (a). That is, Y = X 2/u 2 has density
fy(y) = 7,.0)1f2ylf2-le-yf2,y > o.
So by a linear change of variable, X 2 = u 2 Y has density

f x> (w) = I f Y ( w) = --;;;


1 ( I )lf2wl/2-le-wf2a'.
'

that is, X 2 has gamma (1/2, distribution.


d) Zl, .•. , are independent gamma {1/2, 1/2) variables, by (a). Since Zl + ·· ·+ has gamma
(n/2, 1/2) distribution.
c) For each j, Y, has gamma (kj/2, 1/2) distribution (by definition of x 2 ). The Y,-'s are independent,
Y1 + · · · + Yn has gamma ((k1 + · · · + kn)/2, 1/2) distribution, i.e., has x2 distribution with
k1 + · · · + kn degrees of freedom.
16. Since has the gamma(m, 1/2) distribution, it has the same distribution as the time of the
mth arrival of a Poisson process with rate 1/2.

Let Tm be the time of the mth arrival of a Poisson process with rate 1/2. For x ;?: 0, has
c.d.f.

F(:r:) = :5 x) = P(Tm :5 x)
= I- P(Tm > x) = 1- P(at most m- 1 arrivals in (0, :r:))
mL-1
=1-
k!
k=O

c) Form= I,

:5 :r:) = I -

r
P(R 4 < r)
17. a) Skew-normal approximations: 0.1377, 0.5940, 0.9196, 0.9998, 1.0000
Compare to the exact values: 0.0902, 0.5940, 0.9389, 0.9970, 1.0000
b) 0.441, 0.499. Skew-normal is better.

134
Section 5.3

18. The X;'s are independent and identically distributed uniform [-1/2, 1/2] random variables, so E(X;) =
0 and Var(X;) = 1/12. Hence E(X) = 0 and Var(X) = I/ 12n . Since X is the average of iid random
variables having finite variance, it follows that for large n, X has approximately normal distribution
with mean 0 and variance I/12n, and Jl2ri'X has approximately standard normal distribution. This
argument applies to Y and Z, too. Since X, Y, and Z are independent, we then have that

has approximately chi-squared distribution with 3 degrees of freedom. Hence

0.95 = P(12nR
2
:5 7.82) = P ( R :5 JYJI) .
tr.i2- 0 ·807
So R is 95o/co sure to be smaller than VTin--rn-·

135
Section 5.4

Section 5.4

b) Sec the pictures below: If 0 :5 z :5 I,


P(X 1 + X 2 E dz) = (1/'2)·(z+drr-{1/'2)·z
2
:::::: z·;• (ignoring (dz) 2 as negligible in comparison to
dz);
if 1 :5 z :5 2,
P(X1 + X2 E dz) = {area of a parallelogram);
if 2 :5 z :5 3,
(1/2)·(3-z) 2 -(1/2){3-z-dz) 2 (3-z)dz { • • • (d z )2 ) .
P(x 1 +X2 Ed z ) = 2 :::::: 2 agam 1gnonng
Therefore

{'''
0$z:51
1/2 1:5z:52
fx.+x,(z) = _ z)/ 2 2:5z:53
elsewhere
dz
A
2 2 2

dz

<

z
dz <; > z -1

z•

----
0 0 0

c) You could either integrate b) or argue from the pictures below:

136
Section 5.4

0 z$0
z2 /4 O$z$1
Fx,+x,(•) { (2z- 1)/4 l$z$2
1- (3- z) 2 /4 2$z$3
1 z;:::3

2 ,.---------,

· :>- 3 -z

2. a) J0
1
tdt + f.s(2- t)dt = 7/8.
Or: $ 1.5) = 1- P(S2 $ 0.5) = I - f0112 tdt = 7/8.
b) 1/2, by symmetry of the density of 83.
c) J0
112 2
t dt + f·' {I- t(2- t) 2 - t(t- 1)2 ) dt = 0.2213.
d) Approximately /s3 (1) x 0.001 = (1/2) x O.OOI = 0.0005.

3. Let X be the waiting time in the first queue, Y the waiting time in the second. Then X and Y are
independent exponential random variables with rates a and f3 respectively.

a) From the convolution formula,


fxty(z) = fx(x)fy(z- x)dx = J0z ae-a"'pe-!3(•-:rldx, z > 0.
If a= /3, the integral becomes
a 2 e-a• J0' dx = a 2 ze-a• (density of gamma (2,a)).
If a f /3, the integral becomes
J0
'ae-a:rpe-(J(z-:r)dx = a{Je-fJ• J 0
' e-<a-fJ):rdx = - e- 0 '). Sketch of the density in
case a = 1 and f3 = 2:

0.0 .:...i' _ _ _ _ __
0 2 3 4 5

b) E(X + Y) = E(X) + E(Y) = +i .


c) Var(X + Y) = Var(X) + Var(Y) = + =? SD(X + Y) = Ja,ot/ 2

137
Section 5.4

4. a) Let T, be the time to the first failure, T2 the time till both fail. Then T1 has constant hazard
rate 2>., so has exponential distribution with rate 2>.. Also, - T, is independent of T 1 with
the same distribution. So T2 = Tt + (T2 - T1 ) has gamma ('2. '! \) -iistribution.
b) Use the fact that T2 is the sum of two independent exponential (2>.) random variables.
E(T,) = 2 XA = ±'
V ar(T2) = 2 X (l!)2 = 2'h ·
c) For each t > 0:
P(T $ t) = f '(2>.)
0
2
te_ 2 ... 'dt = f02 .>.' xe-"dx = 1- {1 + 2>.t)e-2.1.•.
We want t such that
1- (I+ 2>.t)e- 2 .>.' = .9 <==> e2.1. 1 = 10(1 + 2>.t).
The equation e 11 = 10(1 + y) has one positive solution, namely y*::::: 3.88972, so the desired t is
t -- .L. - !.:!!.!ill.
2.1.- ).

5. a) The range of X+ Y is the interval [1, 7]. If 1 < t < 7 then

P(X +Y Edt)= P(X = int(t), t- int(t) < Y < t- int(t) + dt) =

So X + Y has uniform distribution over [1, 7] and lO(X + Y) has uniform distribution over
[10, 70].
b) = 0.483
6. gamma (rt + r2 + · · · + rn, >.). Use the result for two variables given in Example 2, and induction on
n.

7. a) Write Z = XY. Let z > 0:

z dz
y=-+-
/ X X

- z
y=-
I I x
0

The event (X E dx, Z E dz) is represented by the shaded region. This region is approximately
a parallelogram, with left side having length distance between the vertical sides ;x, so
the area of the parallelogram is approximately fzJdxdz. The joint density of X and Y on this
small parallelogram has nearly constant value f X,Y(x, . Conclude:

1 -
P(X E dx,XY E dz) = ;)dxdz.

This argument works for z < 0 too. Integrate out x to get

!
00
1 ;;
fxy(z) = -oo

138
Section 5.4

b) Write Z =X- Y. Let z be arbitrary.

y = x-z-dz

(Z e dz)

From diagram, we get

P(X E dx, X- Y E dz) = Jx,y(x, x- z)dxdz,

whence
fx-y(z) = 1: Jx,y(x, x- z)dx.

c) Let Z =X+ 2Y. For arbitrary z we have:

I
I I
i I

z -x dz
.-Y=-2-+-:;-
1 ""
,I
'-. z-x
y=--
2
From diagram, obtain

z- x dxdz
P(X E dx, X+ 2Y E dz) = fx.y(x,

and
/X+2Y(z) = j
-co
oo 1 z-x
2/x,y(x, - -)dx.
2

8. If z > 0, then

Fx 1 y(z) =P ( z) = P(Y;::: .;x) (since Y > 0)

139
Section 5.4

= 11 00

z=O
00

y=zfz
fx(x)fy(y)dydx = 1 00

z=O
ae-az (1 00

y=z/z
{3e-{jydy) dx

=
1 oo
z=O
ae
-a:r _l.,d a az
e • x = - - = ---.
az + {J

If z $ 0 then Fxty(z) = 0.
9. fx(x) = -log(x) (O<x<l)
10.
1/2 ifO<y<I
Jy(y) = . if y 1
{
otherwise

11. uniform (0, 1)

12. X = -log{U(l - V)} = -log U -log(l- V). Since I - V is also uniform on [0, 1], we just need to
find the distribution of -log U. For x 0,

so -log U is an exponential random variable with rate 1. Since X is the sum of two independent
exponentials with rate I, it has the gamma(2, I) distribution. Its density is J(x) = xe-" (r > 0).
E(X) = ? and Var(X) = 2.

13. iz(z)

14. If the joint distribution of (X, Y) is symmetric under rotations, then the probability that (X, Y) falls
in a sector centered at (0, 0) subtending an angle 8 is 8/270. So let c E R. We have

P P(Y $ cX,X > O)+P(Y X< 0).

y y

\
lc
y= CJC

\.
X

Casec > 0 Casec<O

From diagram, the corresponding region in the plane is a pair of sectors centered at (0, 0), each sector
subtending an angle arctan c, where arctan cis that unique angle in tangent c.
Hence

P ( $ c) = 2P ( (X, Y) falls in a sector sub tending an angle of +arctan c)


= 211"
= -2I + 1
- arctan c.
ll"

So Y/ X is a continuous random variable having density


d 1 1
fYtx(z) = -P(Y/X
dz
$ z) = - - -2
1r 1+ z

140
Section 5.4

15. a) Let X 1 =>.X and 'Yi = >.Y. Then Xt and 'Yi are independent exponential (I) variables, and
Z = min(Xt. 'Yi )/ max(Xt, 'Yi ).
b) P(Z $ z) = 2z/(1 + z)
c) /z(z) 2/(1 + z) 2 for 0 < z < 1.

16. a) Fix r and t.


F(r, >., t) = P(T $ t) = P(>.T $ >.t) = P(Tt $ >.t)
where Tt has gamma (r, 1} distribution. This last quantity is an increasing function of>., since
any c.d.f. is nondecreasing.
b) Fix>. and t. Let 0 < r' < r. LetT have gamma (r,,>.) distribution, and T', T'' be independent
with gamma (r', >.) and gamma {r - r', >.) distribution respectively. Since T has the same
distribution as T' + T", and since T' + T'' T', we have

F(r, >., t) = P(T $ t) = P(T' + T'' $ t) $ P(T' $ t) = F(r', >., t)


17. a) Let 0 < t < 1. The plane that cuts at (t/3, t/3, t/3) has equation x + y + z = t. The plane that
cuts at ((t + h)/3, (t + h)/3, (t + h)/3), h small positive, has equation x + y + z = t +h. These
two planes determine a "slab" of uniform thickness hf.../3. The desired cross-sectional area is

lim volume of slab =lim volume{(x,y,z):t<x+y+z<t+h}


h-o thickness of slab h-o hf.../3
= lim P( t < X + Y + Z < t + h)
h-o hf.../3
= Vafx+Y+z(t)
where X, Y, Z are independent uniform [0,1] random variables. Hence the area of the cross-
section at (t/3, t/3, t/3) is

O<t<l
l<t<2
2<t<3

b) If t $ 1, the plane x + y + z = t intersects the edges of the cube at the three points

(t, 0, 0)(0, t, 0}(0, 0, t)

(find these coordinates e.g. using the equations of the edges of the cube); hence the cross section
is an equilateral triangle having side length t../2 and area · 2t 2 = L.}t 2 •
If t = 3/2, the plane x + y + z =
3/2 intersects the edges of the cube at six points forming the
vertices of a (plane) regular hexagon having side length 1/ ../2; hence the cross-sectional area is
6-.il!- 3
t 2 - h73'
18. a) Duplicate the derivation of the density of the sum of three independent uniform random variables:
If n 2 then

/n(x)

= 1:
= fs._,+x .. (x)
/n-t(u)fxn(x- u)du

= 1z
z-1
/n-i(u)du

= Fn-t(x)- Fn-1(X- 1).


b) Claim: For each n 1: For all i = I, ... , n:
(1): On the interval (i- l,i), /n(x) is a polynomial having leading term
(II): On the interval (i- 1, i), Fn(:r) is a polynomial having leading term (-tJ;-'
141
Section 5.4

Proof: Induction. True for n = 1. If holds for n = k, where k;:::: I, then


(I): For each i = I, ... , k + 1: On the interval (i- 1, i) we have

/ktt(x) = -1)
1 2
- (-I)'- ("- 1) "- (-1)'- ("- 1) - " terms of degree at most k - 1
- k! i- 1 x k! i - 2 (x 1) +
= . + termsofdegreeatmostk-1

= ( - 1)'-1 ( . k ) xk + terms of degree at most k - 1


k! '- 1 .

1 "alh . lea.di (-l)i-1 (k+1-1) k+1-1


= a po ynomt avmg ng term (k + _ )!
1 1
i _
1
x ·

Note that this argument works for the "boundary" cases i = 1 and i = k + 1.
(II): For each i =
1, ... , k + 1: On the interval (i- 1, i) we have

Fkt1(x) = 1'- 0
1
!ktt(t)dt + jr •-1
/ktt(t)dt

= canst+ { (- (i 1) terms of degree at most k- 1} dt


1
=( 1)'-
1
( k )
+ )! i _
1
x"+ + terms of degree at most k
1
1 1
=a "al h .
po1ynomi
1 ad" (-1) - (k+i _1-1) x ktl ·
avmg e mg term (k + )!
1 1

So claim holds for n =k + 1.


c) Claim: For each n = 1, 2, ... we have: If 0 <x <1 then

(ii) Fn(x) = x:.


n.
Proof: Holds for n = 1. If holds for n = k, where k;:::: 1, then for x E (0, 1)

and

= 1 0
r •
/ktt(t)dt = 1"
0
t"
k!dt = (k+x" 1)!
so claim holds for n = k + 1.
d) Claim: For each n = 1, 2, ... we have: If n- 1 < x < n then

. (n-x)n-1 (n- x)" (n- x)"


(s) fn(x) = (n _ )! ; 1
( ii) Fn(x)=1-
n.1
.en(x)=I-
n.1

Proof: Holds for n = I. If holds for n = k, where k;:::: 1, then for x E (k, k + 1)

(k-(x-I))k) (k+ -x)k


/ktt(x)=Fk(x)-Fk(x-1)=1-Fk(x-1)=1- ( 1- k! = k!

and
+ 1- x)"+ 1
Fktt(x) = P(Sk+t :5 x) = 1- P(Sk+t > x) = 1- " j k+l
= 1-
(k
(k + )!
1
so claim holds for n = k + 1.
142
Section 5.4

e) P(O:::; S4 :::; 1) = F4(1) = t by (c).


f) P(l :::; S 4 $ 2) = P(S4 $ 2)- P(S4 $ 1) = t (by symmetry and part (e))
g) P{l.5 $ S 4 $ 2) = J;\ j4(t)dt. Now if 1 <t< 2 then

{2-tf2 (t-I?
!J(t) = F2(t)- F2(t- 1) =1 - 21
-
21
;

F3(t) = FJ(1) + j' /J(u)du


= (t- 1) + (2- t)3 - (t- 1)3.
3! 3! '
3
2(t-I) 3
f 4 (t) = F3(t)- F3(t -1) = (t -1) + (2-t)
31 31
Hence

P(Ls s4 2) =
1
2

1.
5
h(t)dt = 1 2

1.5
[
(t- I)+ (2 t)
3

- ;! 1) 2(t
3
]
dt = o.2994s.
Aside: Integrate / 4 to get, for 1 < t < 2,
(2-t) 4 2(t-1) 4 1
4! - 4! + 4!'
19. If X, Y have joint density Jx,y(x, y) then argue as in Exercise 7 that X and X+ Y have joint density

/X,X+Y(x, z) = Jx,y(x, z - x)
and X/Y , Y have joint density

fxJY,y(u, y) = IYI!x.Y(uy, y).


Therefore the joint density of X/(X + Y) and X+ Y is

fxt(X+Y),X+Y(w, z) = lzlfx,X+Y(wz, z) = lzlfx,y(wz, z - wz).


Here
..), • ,_, -1.!1 0
and ( )
f yy=r(s)Y e ,y>.
So the joint density of X/(X + Y) and X+ Y is

fxJ(X+Y),X+Y(w,z) = lzlfx,y(wz,(l-:- w)z)


= zfx(wz)Jy((l- w)z)
=z (wzt-te-1.wz [(I - w)z]'-t e-1.(t-w)z

= f(r + s) w r-1 (1 -w )'-1 ).r+• z r+,-1 e -1.z


f(r)f(s) f(r + s)
if 0 <w < 1 and z > 0 ; the density is zero otherwise. Integrate out to see that

(we knew this already); and that

f(r + s) r-1(
f X/(X+Y) (W ) = f{r)f{s) )-•-1
W 1- U! ,0 < W < 1;

and that fxt(X+Y),X+Y(w, z) = fxt(X+Yj(w) · fx+y(z). Thus X/(X + Y) has beta distribution with
parameters r, -'• and is independent of X+ Y.

143
Chapter 5: Review

Chapter 5: Review

1. Draw a 'picture of the unit square; then compute

P(Y > I/2IY > X2) = P(Y > 1/2, y > X2) = 1 - -./2
- - P(Y;::: X 2 ) 4 '

since numerator = f 1
1 12
-../Ydy =HI- 4>
and denominator = J0
1
..jYdy = i.
2. a) 3/4.
b) Density of Z =IX+ Yl is fz(z) = 1- z/2, 0 < z < 2. So

EIX + Yl = E(Z) =
120
zfz(z)dz =
120
(z-
2

2
2
)dz = -.
3

3. a) Assume center of coin lands uniformly on disc of radius 2.5"

1" lr( l2 )2
P(lands in disc of radius - )
2
= -22( = 0.04
11"
1
)
2

b) Center lands uniformly in square of side 4.5

x-(0.25)
-(
)2
4.5
= 0.03878 ...

_ area of square with side (4.5- iv'2) =


0 • 2896
c) 1 ( 4 •51 2

4. a) P(X 2 + Y 2 r 2 ) is one fourth the area of a circle of radius r inside a square with the same
center and side length 2. For r > v'2, this is 1. For r 1, this is ";'. For I < r v'2, the
circle extends beyond the square on each side. On each side, the square intersects the circle in a
chord with length 2v;:r=t. For one side, the area of excess is the area of a wedge of the circle
minus the area of a triangle. That is

2 arccos(1/r) 2 I I r:;--:;
Area of excess= (rr ) - -(2y r 2 - 1}(1) = r 2 arccos(-)- y r 2 - 1
2r 2 r

Using this we see that for 1 r -./2,


P(X 2 + Y 2 I (rr
r2 ) = 4 2 - 4(r 2 arccos(-;=-)-
I y r2 - 1)
2
ll"r 1
= -
4
2
- r arccos(-)
r
+y r2 - 1

b)
0 for x <0

()- ( - )- 7-
•r:r
2 - for 0 x
Fx-PR<x-
x arccos( -j;)
t
+ for I <x
{
for x >2
c) Differentiate the c.d.f. to find the density.

0 for x <0
!-
1f

F' (X ) -- 4 for 0 x I
arccos( -j;)
X -
f( ) - for 1 < x 2
{
for x > 2

144
Chapter 5: Review

5. a) P(D;:: 1/2) =shaded area= 1- 4•lf · (4) 2


= 1- lf/8.

b) Let the coordinates of the point be (X, Y), where without loss of generality X has uniform
distribution on (0, 1), and Y has uniform distribution on ( -1/2, 1/2). Take the midpoint of the
given side to be (0, 0). Then D 2 = X 2 + Y 2 by Pythagoras' theorem, and

6. The charge C per call in dollars is

c-{I + - 1 0.01 X 60(T- 1) = .4 + .6T


ifT $I
if T;:: I

where Tis the call duration in minutes. So

E(C) = 1/(t)dt + 1: (.4 + .6t)dt =I+ .61: (t- 1)/(t)dt

where f is the density ofT.

a) Here f(t) = e- 1 , soJ::, (t- I)f(t)dt = e- So E(C) = 1 + .6e- = 1.22.


1

1

b) Here f(t) = (I/2)e- 1 so J.':, (t- 1)/(t)dt = 2e- 1 So E(C) = 1 + 1.2e- 1 = 1.73.
1 2
,
1 2

1 2

c) Here f(t) = 4te- so J::, (t- 1)/(t)dt = 2e- So E(C) = 1 + 1.2e- = 1.16.
21
,
2

2

7. X has normal distribution with mean ll and SD 0.1, so

P(IX -Ill;:: .25) = 2 (I- .P(.25/.I)) = 2(1- .P(2.5)) = 2(1- .9938) = .0124.

8. By the central limit theorem, the distribution of X is approximately normal, with the same mean and
variance. So the required probability is now approximately 0.0124 . [n is 100, which is pretty large.
So the normal approximation should be good.]

u;;
9. a)
O<x<l
1<x<2
2<x<3
otherwise
b) Try a. uniform (-l/2, 1/2) distribution for U.

10. a) for z >0,


P(Z > z) = P(X > z, Y > z) = P(X > z)P(Y > z) = e-).re- = e-().+l•)r; 1"

hence P(Z $ z) = 1- e-().+p)z and

145
Chapter 5: Review

c) P {t < <2) = 1- [P(t 2)] = 1- 2P(X 2Y) by symmetry.


But X has exponential distribution with rate ..\, and 2Y has exponential distribution with rate
>.{2, so by part (b) the desired probability is

1 - 2. (1/2)>. = .!. .
.,\ + (1/2)>. 3
11. a) If 0 < x < 1 then P(X > x) = P(U/V > x) = P(U > xV) = 1- {1/2)x. [Draw a picture.)
O<x<l
b) Fx(x) ={ J_
X> 1
2.::

O<x<l
c) fx(x) = tzFx(x) = {
X> 1
1.0 1

Disuibution function

o 2 3 4 5 6

1.0

Density

0.0 --------------=======::;:::::::::::=========""""""'"""""""'
0 2 3 4 5 6

12. The distance from the first marksman's shot to the center is

and the distance from the second marksman's shot to the center is

where R, R' are independent Rayleigh. random variables. The desired probability is therefore

P(bR' < aR) = P(R > !!_R')


a

= fR(x)fR'(y)dxdy
{CQ b
= Jy=O fR'(y)P ( R> -;;Y) dy

1
00

= J. l
ye-,Y e-,!(.£"Y )l dy
y=O

1
00 2
=
y=O
l[J
ye-, + "
(.£)2] 2
Y dy = ---.
a
a2 + b2

Intuitively, if a<< b then the desired probability should be small.

146
Chapter 5: Review

13. Let e E (0, 1r] be the angle subtended by the (shorter) arc between {X, Y) and the point {0, 1). Then
e has UI!iform distribution on (0, lr], and X= cos e.

a) If lxl 1 then
' 1
Fx(x) = P(X x) = P(cose x) = P(e arccos(x)) = 1 - 11'- arccos(x),
where arccos has range (0, 1r].
b) Y has the same distribution function as X.
c) P(X + Y $ z) = P(X $ z/V'i) by rotational symmetry of the uniform distribution on the circle.
So
if z <-Vi
if lzl -J2
if z > V2

14. a) I/6, by symmetry: all 3! orderings of Ut, U2, U3 are equally likely.
b) E(U1U2U3) = E(UI) · E(U2) · E(U3) = [E(Ui)]3 = i.
c) Var(U1U2U3) = E[(UtU2U3) 2
]- [E(U1U2U3W = [E(U[W- {i)2 =if- 6
1
, •

d) If 0 <z< 1 then P(U1U2 > z) = J:=z(1- = 1- z + zlog z, so

P(U1U2 > U3) = jj z>u


/u 1 ul(z)fu3 (u)dzdu

= . fu 1 ul(z)/u3 (u)dzdu

=}..
reo=-eo P(U1U2 > u)fu (u)du = Jot (1- u + ulogu)du = 4·1
3

e) > UJ) = 1- P(max(Ut,U2) < U3) = 1-P(U3 is largest)= 1--§- = i


15. a) P(Zt < Z2 < Z3) = 1/6 by the same argument as in (a).
b) E(ZtZ2Z3) = E[(Zt) 3 ] =
0.
3
c) Var(Z1Z2Z3) = [E(Zf}] - [E(Zt)] 6 = [Var(Z W- o = 1.
1

d) P(Z1Z2 > Z3) = 1/2, since Z1Z2- Z3 is a symmetric (about 0) random variable .
e) P(max(Zt,Z2) > Z3) = 2/3 by the same argument as in (e).
I
f) P(Z? + > I)= e-2 by Section 5.3.
g) zl + z2 + z3 is normally distributed with mean 0 and variance 3, so the desired probability is

P ( Zt + + z3 < = <P(2/-./3) = o.s7s9.

h) P(Zt/Z2 < 1) = P(Zt < Z2, Zt > 0) + P(Zt < Z2, Zt < 0) = , by the rotational symmetry
of the joint density (draw a picture).
i) 3Zt - 2Z2- 4Z3 is normally distributed with mean 0 and variance 29, so the desired probability
is
3Zt - 2Z2- 4Z3 I )
p ( v'29 < .J29 = <I>( 1/ vr.;;;
29) = 0.5737.

16. a) R 2 has the chi-square distribution with 3 degrees of freedom, so it also the gamma(3/2, 1/2)
distribution. Hence it has density

(t > 0)

147
Chapter 5: Review

b) By the change of variable formula,

1 2r r-;; _.2, _.2,


/R(r) = .,P2 2
= tn=vr2e = -r e
2
(r > 0)
(1/2 r ) v 211" ll"

c) Letting u = r 2 /2,

E(R) = 1 00

= 1 00

2u
2
-
1
e-"du = =
d)
E(R2 ) = E(X 2 + Y 2 + Z2 ) = 3
2 2
Var(R ) = 3- (2...j2'k)
= 3-
ll"

17. Xt, Xi, ... , X?oo are independent identically distributed random variables having mean 1 and variance
2. [In fact, X 2 has gamma (1/2, 1/2) distribution: see the section on gamma densities.] Hence the
sum has mean 100 and variance 200; and by the normal approximation, the sum is approximately
normally distributed.

a) So if Z denotes a standard normal variable, then

P(X2 + ... + X2I 00 > ) =p (sum - 100 >


I - 80 v'20o _ 80v'20o
- 100)

::::: P(Z -J2) = .92.

b) Find c so that
.95 = P(100- c $ Xt + · · · + x;oo $ 100 +c)
= p (--c- <
v'20o -
sum - 100
v'20o
< _c_)
- v'200
P(IZI $ cf../Wo) = 1
<=> = .975 <=> = 1.96 <=> c = 27.7.
y200
18. Let r > 0. Note that if a, b 0 then the event

aX +bY$ rVa 2
+b2
occurs if and only if (X, Y) lies in the shaded region:

where the line has slope -afb and the distance from the origin to the line is r. Hence the desired

148
Chapter 5: Review

event occurs if and only if (X, Y) lies in the region formed by intersecting all such regions:

This region is the union of 3 (infinite) "rectangles" and one quarter circle. The desired probability is
then

P(X < o, Y < 0) + P(o <X< r, Y < O) + P(X < o, o < Y < r) + + y2:::; r),

the last term by rotational symmetry of (X, Y). Use independence of X, Y, and the fact that
.../X 2 + y2 has Rayleigh distribution to evaluate:
1
P(X < O)P(Y < 0) + 2P(O <X< r)P(Y < 0) + 4P(R r)

= <P(O)<P(O) + 2[<P(r)- <P(O)]<P(O) + ( 1-


2
= .!_4 + 2 (<P(r)- .!.)
2 2
.!. +.!. (1- e-tr
4
2
) = <P(r)- !e-tr .
4

19. a) (K1 = k) = (Wk < min;# Wi); b) Pk = Ak/(AI +···Ad); c) use the memoryless property of
the exponential waiting times;· d) the answer to g) must be Pk by the law of large numbers; e)
AkT; f) (A1 + · · · Ad)T; g) Ak/(AI +···Ad)·
20. a) If t >0 then
P(Tmin > t) = P(T1 > t,T2 > t) = P(T1 > t)P(T2 > t) = e->.,ce->.,c = e-<>-,+>.2 )'
so Tmin has exponential distribution with rate A1 + A2.
b) P(Xmin = 1) = P(T1 < T2) = f.": roo A2e->. 2
y-0 J=y
.r A1e->.' 11 dxdy = f.":
Y- 0
A!e->. 111 e->. 211 dy =
>.,+>.2.
Get P(Xmin = 2) by interchanging 1 a.nd 2, or by subtraction from 1.
c)

P(Xmin = 1, Tmin > t) = P(T1 < T2, T1 > t)


=
100 100 \ -"'•" \
y=t z=y
"2e .... ,e -"-•lid x d y

=
=
! I
.. 00

>..,
\
,e
-.l.,y
e
->.,yd

-("-o+"-2)1
y

A1 + >..2 e
= P(X.,.. " = l)P(Tmm > t).
Similarly P(Xmin = 2, Tmin > t) = P(Jfmon = 2)P(Tmon > t}.
149
Chapter 5: Review

=
d) Claim: For each n 2, 3, 4, ... :
If T1 , T2, .... Tn arc independent exponential random variables with rates A1, A2, .. \n respec-
tively, and if Tmin denotes the minimum of Tt,. ''ld if X onin denotes the such that
T; = Tmin, then:
(i) The distribution of Tmin is exponential with rate At +···+An ;
(ii) P(Xmin = i) = >:,/!+>.,. ;
(iii) Tmin and Xmin are independent.
Proof: Induction. True for n = 2, by (a), (b), (c).
If holds for n = k, where k 2, then: for each i = 1, •.. , k + 1, define
U; = min{T;: 1 $ j $ k + l,j i}.
That is, U; is the minimum of k independent exponentials; so by the indue. . hypothesis, U;
has exponential distribution with parameter >.;- >.; ). Moreover, for ea(;n i = 1, ... , k + 1
, Tmin is the same as min(T;, U;), and T; and U; are independent.
(i) By (a), Tmin = min(TJ:+t,Uk+t) has exponential distribution with rate >. 1 •
(ii) For each i = 1, ... , k + 1:
. by (b) A;
P(Xmin = •) = P(T; < U;) =
+ <E. >.; -
k+I ·
A; A;)
(iii) For each i = 1, ... , k + 1: For each t >0:
P(Xmin = i, Tmin > t) = P(T; < U;, Tmin > t)
by (c) .
=
P(T; < U;)P(Tmin > t) = P(Xmin = •)P(Tmin > t).
So claim holds for n = k + 1 .
21. a) The distribution function is, for r > 0,
P(R $ r) = 1- P(R > r) = 1- P(no point inside circle of radius r) = 1- e->.o<r'.
The density is, for r > 0,
d (R < r ) = 2 .....'
-d -.\1rr'
r -

1
1
0.5
I
j
Multiples of ,fiJ;i I

2 3 4
Multiples of

b) Write R 1 = ,;v:;R. Then


P(R1 $ rt) = P(R $ rt/..n:>:;) =1-
This is the distribution function of the Rayleigh distribution.
150
Chapter 5: Review

c) From Section 5.3 E(Ri) = 2, and E(Ra) = y'f. (See Example 1). So

1
E(R) = E(Rl)/V'ifi = 2v..X
".

SD(R) = SD(R1 )/vr.:;.-


M7r = 1"
2v..X
R- --
1r
lr

d) The density is maximized at the mode. Suppose first .../2);; = 1, so R = R1 • The density is
1
re-2r , and
d
dr
-re2 = e 2 - r2e 2
which is zero for r = 1, clearly giving a maximum. Thus r = 1 is the mode of R 1 , so r = 1/.../2);;
=
is the mode of R RI/vv::;.
To find the median value r, solve

P(R :5 r) = 1/2

22. a) Note that ( K) is distributed like. W = .JX 2 + Y 2 + Z 2 , where X, Y, Z are independent


112

normal (0, 1) random variables. To find the density of W, duplicate the argument in the text
for the density of .JX 2 + Y2 :
The joint density of X, Y, Z is

1/J(:z;)q,(y)q,(z) = __
(2x-)3/2 .

The event (W E dw) corresponds to (X, Y, Z) falling in a spherical shell about the origin of
infinitesimal width dw, radius w, surface area 4lrw 2 and volume 4lrw2 dw. And P(W E dw) is
the "volume" over this infinitesimal shell beneath the joint density. Over this shell, the joint
2
density has nearly constant value e-t "' ; conclude that

2 1 _!..,2
P (w E dw ) = 41rw ( 1r) 3 / 2 e 2 dw
2
and the density of W is

w > 0.

The density of K = W 2 is, by a change of variable,

fK(Y)= /2.-1
2..JY
·fw ( /2y)
=
V
/2.- 1
-
2..JYV; mcr2
= 2 1 1/2 -yJmc 2
..ji (mcr2)3/2 y e 'Y > 0.

b) E(K) = TE(V}) + TE(Vy2 ) + TE(V}) = 3 ; ' E(V}) = 3 ;'Var(Vz) = 3


;'cr2 • For the mode of
the distribution, note that

/K(Y) =const. ylf2e-yfma';


2
f K '( y } =cons t · [I2 y -1/2 e
-yfmc 1 e -yfmc 2 ] = const · y -1/2 e -y/mc 2 (I - mu
- y 1/2 mcr Y ) •
2 2 2

The derivative of the density equals zero only at y = tmcr 2 • This value of y must correspond
to a global maximum (why?), so the mode of the energy distribution is tmcr 2.

151
Chapter 5: Review

23. a) P(Y>1{3,Z<2{3) _ _ 1/8 '


P( Y2;1{3) - (2/3) -
b) P(Z<2{3)-P(Y>I/3,Z<2{3) _ 7/19
1 P(Y2;1/3J - '

24. a) Assume the center of the coin is uniformly distributed relative to the grid. The probability is
then the ratio of relevant areas: <•-;:>'
b) Assuming also that the coin is fair, and lands heads or tails independent of where its center is
-.:;r-
located: 21 · <•-d)
2

c) Assume the tosses are independent of each other, with assumptions as above for each toss. Then
X and Y are independent, X has binomial (n = 4,p = 1/4) distribution, and Y has binomial
(n = 4, p = 1/2) distribution. Compute probabilities by considering all possible pairs of values
of (X, Y) and adding over the appropriate paiis: P(X = Y) = 886/4096
d) P(X < Y) = 2685/4096
e) P(X > Y) = 525/4096 (subtract the others from 1).
25. a) As in Example 5.3.3, for 0 < :r < y < 1,
P(Vk E d:r, Vm E dy)
= P(k- 1 in (0, :r), 1 in d:r, m- k- 1 in (:r, y), 1 indy, n- m- 1 in (y, 1))
- n!:rk-ld:r(y- :r)m-k-ldy(l- y)n-m-1
- (k- 1)!(m- k- 1)!(n- m- 1)!
Dropping the d:r and dy leaves the joint density for 0 < :r < y < 1. The joint density is zero
elsewhere.
b) beta(m- k, n- mk + 1)
c) beta(k, m- k + 1)
26. See solution to Exercise 6.Rev.30.
27. No Solution
28. No Solution
29. a) Say that the spacing between the parallel lines is 2a. Let Y be the distance from the center of
the needle to the closest line. Let e be the (acute) angle between the needle and a. line running
perpendicular to the parallel lines. Then (Y, 8) is uniformly distributed on the rectangle

R = {(y, 8): o $ y $a, 0 $ 8 $ 211' }.


For all positive x we have
y
{X$ x} = {cose -} = {Y $ xcos6}.
X

(Fix the value of Y first; then decide what e can be.)

y
' /

_l_ . _,.,
/"
____ I

Hence if 0 < x $ a then


w/2

P(X $ x) =
J. 0
xcos8d8
area. o f R
152
Chapter 5: Review

X 2
=-=-x
¥a 1ra

and if x ;::: a then


rarccos .._ r"/2
Jo "' ad8 + Jarccos ._ x cos 8d8
P(X $ x} = area of R "'

= _!_
Jra
[a x
+ x- J x2 - a2] ,
where arccos has domain [0, 1] and range [0, ¥1 .
b)
if 0 < x $a
if x;::: a

30. a) Xn is the sum of n independent uniform(-!, 1} variables, with mean 0 and variance 1/3. So Xn
has mean 0 and SD .;;;TJ = 10 for n = 300. Using the normal appoximation, P(IXnl > 10} ::::::
32% for n = 300.
b) Same as a): 32%
c) This event occurs if and only if both the events in a) and b) occur. Because the components
of a point picked at random from a square are independent, the n uniform variables whose sum
is Xn are independent of the n uniform variables whose sum is Yn. There for Xn and Yn are
independent, and the required probability is

P(iXnl > 10 and IYnl > 10) = P(IXni > lO)P(IYnl > 10):::::: (0.32) 2
d) From the previous part, for n = 300 the joint distribution of Xn/10 and Yn/10 is approximately
that of (X, Y) where X and Y are independent standard normal. Let R = ./X2 + 1'2. Then
the required probability is

+ y,: > 100) = P((Xn/10) 2 + (Yn/10) 2 > 1}:::::: P(X 2 + Y 2 > 1) = P(If > 1) = e- 1 / 2
31. a) For large n, Xn in this problem as well as X .. in the previous problem will be approximately
normal. It is also clear that they will have the same mean, so it remains only to find r so that
they have the same variance, or, equivalently, that E[Xfl is the same in both problems. For the
previous problem,

E[X 12 ] = !1
-1
x 21
-dx
2
1 311
= -x
6 -1
= -1
3
And for the current problem, if we consider R2 = Xr + 'Y;2 , and we observe that E[Xl] = E(Y12 ],
then we see that E[Xfl =
tE[R2 ].

2
E[R] = 1-
1rr 2
lr
0
x 2 21rxdx 1 x { lr =r2
= 24r2
-
o
-
2

And so E[Xfl = r•', but we want this to be equal to t. so r = J{TI.


b) No, since 0 = P(Yn = ;riXn = nr) :f: P(Yn = = 0} > 0.
c) For both the circle and the square, we let R =X n cos 8 + Yn sin 8, and we have

E(R) = cosBE(Xn) +sin BE(Yn) = 0


and
E(R 2 ) = cos2 +sin BcosBE(XnYn) +sin 2 E(Y,;) =
since E(XnYn) = 0 by symmetry and = E(Y;}. Now from a) we know that for large
n the distribution of Xn from the squares is nearly the same as Xn from the circles, so R =
Xn cos B + Yn sin 8 must have nearly the same distribution for both cases.

153
Chapter 5: Review

d) Since the distribution of Xn cos 8 + Yn sin 8 is approximately normal, we can use the result to
see that the answer must be the same as the answer to part d) of the previous problem, using
squares, which is e- 1 / 2 •

32. a) r = .J213 b) no, X1 and Yi are not independent d) e-1/2

33. a)r = Vlfi. b) no, X 1 and Yi are not independent d) e- 1 / 2 g) r"

34. No Solution

154
@1993 Springer-Verlag New York, Inc.
+
Section 6.1

1. a.) The unconditional distribution of X is binomial (3, 1/2):

P(X
X
= x)
I 0
1/8 3/8
1
3/8
2
1/8
3

b) Given X= x, Y is distributed like x plus a binomial (3- x, 1/2) variable. That is, Y has the
binomial (3 - x, 1/2) distribution shifted to { x, x + 1, ... 3}.

y 0 1 2 3
P(Y = YIX = 0) 1/8 3/8 3/8 1/8

y 1 2 3
P(Y = YIX = 1) 1/4 1/2 1/4

y 2 3
P(Y = yiX = 2) 1/2 1/2

y
P(Y = YIX = 3) I
c) Use P(X = x, Y = y) = P(X = x)P(Y = yiX = x):

values x for X Marginal


distn
0 1 2 3 ofY
1 1 1
values 0 ii . ii 0 0 0 6i
Joint distribution table for (X, Y) y l.1 3
ii ...
1
0 0 9
8 8 6i
1 3 3 I 3 1 27
for Y 2 8'8 8. 2 8. 2 0 6i
3 I
8 . ii
I 3
ii'"i
I 3
8'2
1
k. 1 27
64
I 3 3 1
II
I
Marginal ii 8 8 i 1
distn of X

d) Use P(Y = y) = Lr P(X = x, Y = y) a.qd the result of c):

y 0 2 3
P(Y = y) 1/64 9/64 27/64 27/64
e) Use P(X = riY = y) = P(X = x, Y = y)/P(Y = y):

P( X
r
= X Iy = 0)
I 0
1

P(X = xiY = l) 1/3 2/3

P(X = xiY = 2)
I 0
1/9 4/9 4/9
2 -

2 3
2/9 4/9 8/27
Section 6.1

f) For each y = 0, 1, 2, 3: Guess the value x that gives the highest probability P(X = xjY = y).
2 3
1 or 2 2

g)

P(guess correctly) = L P(guess correctly, Y = y)


ll

= L ?(guess correctly IY = y)P(Y = y)


ll

·• Co:.j:.._un on the value ofT:


4

P(G =g) = L: P(G = giT = t)P(T = t)


t=O

Now given T = t, G has binomial (t, 1/2) distribution, so

P(G=giT=t)= G){l/2)', g=O, ... ,t.

Conclude:

P(G = 0) = 1 · 0.1 + 1/2 · 0.2 + (1/2) 2 • 0.4 + (1/2) 3 • 0.2 + {1/2) 4 • 0.1 = 0.33125
0 + 1/2.0.2 + 2 · (1/2) • 0.4 + 3 · (1/2) • 0.2 + 4 • (I/2) • 0.1 = 0.4
2 3 4
P(G =I) =
P(G = 2) = o + o+ (1/2) 2 • 0.4 + 3 • (I/2) 3 • 0.2 + 6 · (I/2) 4 • O.I = 0.2125
0 + 0 + 0 + (1/2) • 0.2 + 4 · (1/2) • 0.1 = 0.05
3 4
P(G = 3) =
P(G = 4) = o + o + o + o + {1/2) 4 • 0.1 = 0.00625

G 0 1 2 3 4
P(G =g) 0.33125 0.4 0.2125 0.05 0.00625

3. a) Suppose there are n families in all. Then there are a total of

0 X .1 X n +1X .2 X n +2 X .4 X n +3 X .2 X n +4 X .I X n = 2n

children in the town. There are 0 x .1 x n who come from 0-child families, and 1 x .2 x n
children who come from !-child families, and 2 x .4 x n children who come from 2-child families,
and so on. So the distribution of U is given by:

u 1 2 3 4
P(U = u) .2n/2n =.I .8n/2n = .4 .6n/2n = .J .4n/2n = .2
The distribution ofT is different from the distribution of U because we are choosing children
at random rather than families at random; so families with many children are more likely than
before.
b) The number of families having two girls and a boy is .2 X n x (3/8) (This is n times tl··· chance
that a family picked at random has two girls and a boy.) Su t!a:. _ ... c three times that many
children who come from a family consisting of two girls and a boy. The desired probability is
r 3><.2><n><(3/8) _
lh cre.ore 2" - • 1125 •

Or: Let A =(child's family has 3 kids), B =(child's family has two girls and a boy). We want

P(AB) = P(A)P(BIA) = .3 .1125.

156
Section 6.1

c) The probability that a family picked at random has two girls and a boy is (.2)(3/8} = .075. This
is different from the probability in b) be<:ause the two experiments are different.

5. a) Given XI + x'J = n, the possible values of XI are 0, 1, ... 'n. For 0 k n, justify the
following:
_ kiX X _ ) _ P(X1 = k)P(X2 = n- k}
P(x 1 -
1
+ 2- n - P(Xt + X2 = n}

e . e k!
=----.......
e-<At +A2) (At +A2)"
n!

n! ( .X 1 )" ( .X 2 ) n-k
= k!(n- k}! >.t + >.2 At + .X2
So given XI+ x2 = n, X! has binomial (n, distribution.
b) Assume that the number of eggs laid by one insect is independent of the number laid by the
other; and that they both lay the same expected number of eggs. Letting X; represent the
number of eggs laid by insect i, we have XI' x2
independent, each with Poisson (.>.)distribution
for some .X. By a),

To evaluate approximately, use the normal approximation to the binomial. Here I' = np = 75,
a= ...;npq = 6.124 and the desired probability is approximately

1- q; (89.5 -75)
. 6.124
= .0089.
6. a) The probability is only positive when i 1 + · · · + im = n.
Let N = N1 + · · · + N m and .X = >.1 + · · · + .X"'.
_ . N _ . IN_ ) _ P(N1 =it, ... , Nm = im, N = n)
P(N1 - 11, ••• , m - lm - n - P(N = n)

= ft
lc=l
P(Nic = I

= ( it!···im!
n!
T
)(.X 1 )i' ... T
So the conditional distribution is multinomial (n, •... , k-)
}__l 1 l_,l 1

b) Assuming that .X= At +···+Am,


00

P( N1 = i1, ... , N m = im) =L P( N1 = i1, ... , N m = im I N = n }P{ N = n)


n=O

The unconditional distribution is independent Poisson(A" ).


In general, even if .X =1- >.1 + .. · + >.m, a similar calculation shows that the are unconditionally
independent Poisson(>.>.k/(>.t + .. · + .>.,.)).
157
Section 6.1

7. a) For each k 0:
00

P(X = k) = L P(X = k, 0. = n)
n=k
00

=L P(N = n)P(X =kiN= n)

->.)." n! k n-k
=we lkl(
n. on -k)•p
• (l-p)
n=k

So X has Poisson (.>.p) distribution.


b) Let N denote the number of breakages, and X the number of healed breakages. So N has Poisson
(.4) distribution. And given N = n, X has binomial (n, .2) distribution. By a) the unconditional
distribution of X is Poisson (.08)0 So the required probability is = .0000016.
8. The event (X= x, Y = y) is contained in the event (N = x + y). Hence

(X= x, y = y) = (N =X+ y, X= X, y = y)
Now for x 0, y 0,

P(X =X, y = y) = P(N =X+ y, X =X, y = y)


= P(N =X+ y)P(X = x, y =YIN= X+ y)
= e->. .>..r+ll (x + y)! {1/Gt(S/6} 11
(x + y)! x!y!

= [e-s>.J 6 (s;
6)
1
11
]

= P(X = x)P(Y = y)

because by the previous exercise, X has Poisson (.>./6) distribution, and Y has Poisson (5>.j6)
distribution. So X and Y are independent.

9. Write (for example) P(ylx, z) for P(Y = yiX = x, Z = z)o


P(x,yjz) = P(xjz)P(yjz)
P(x, y, z) P(x, z) P(y, z) (definition of conditional probability)
P(z) = P(z) P(z)
P(x,y,z) P(y,z) • I
-<:::=? P(x,z) = P(z) 1
(crossmuhpy)
-<:::=? P(yjx, z) = P(ylz) (definition of conditional probability)

The symmetry of the first equation indicates that we may obtain a further equivalent C< dition by
interchanging x and y in the last equation:

P(xjy,z) = P(xjz)o
10. a) By conditional independence,

P(R = r,Xt = x,,X2 = x2) = P(R = r)P(Xt = Xt I R = r)P(X2 = x2l R = r)


= lrr (;.) (r/10}""' (1- r/10)"-z' (;J (1-

158
Section 6.1

b)

P(R = riXt = xl) = P(R = r, XI =xi) I P(XI = xl)

= (r/10}-"'{1- r/10)"-"'' r, (;J (s/10}"'' (1- s/10)"-"' 1

c)

P(X2 = X21R = r, XI = xl) = P(X2 = X'liR = r) (by conditional independence)

= (;)(r/10)"' 2 (1- r/10)"-"' 2

d) P(X2 = x2IX1 =xi)= Lr P(R = riX1 = x!)P(X2 = x2IR = r, X1 =xi)


e) If ll'r = 1/11, n = 1: Put x2 = Xt = 1 in b) and c):
10

P(R = rjX1 = 1} = {1/ll}(r/10)


I 2)1/ll)(s/10)
•=0
= r/55;

P(X2 = liR = r, Xt = 1) = r/10.


So
10
r r
P(X2 = IIXt = 1) = L - .-
55 10
= 385/550 = .7.
r=O
Similarly
10
10 -r r
P(X2 = 11Xt = 0) = L....J ---ss-- · 10 = .3.
r=O
By taking complements, see P(X2 = O!Xt = 1) = .3; and P(X2 =
OIXt = 0) .7. By symmetry, =
P(X2 = 1) = 1/2. So Xt and X2 are not independent: P(X2 = 1IXt = 1) 'f. P(X2 = 1).

159
Section 6.2

Section 6.2

E(YIX = x) = 13 2x + I:
y=z+1
y. 13 2x

= 1 [x+ 2 .(x+7)(6-x)] 42- x 2


13- 2x 2 13- 2x ·

X 1 2 3 4 5 6
E(YIX = x) 41/11 38/9 33/7 26/5 17/3 6/1
b) For each y =I to 6 we have

P(Y = y) = P(Y $ y)- P(Y $ y- 1)


y 2
= (6) y- 1
- ( --)
6
2
= 2y- 1
_ 1{36 __ 1_ if
P(X = xiY = y) = P(Y=y)
P(X=z.Y=y)
- (211 1){36
2{36
- 2 11 -1
__ 2_ jf X
x = Y
< y.
{
P(Y=y) - (211 1){36 - 2y-1

2 y 2 (y- 1)y y y2
E(XIY = y) = L...J X. 2y- 1 + 2y- 1 = 2y- 1 . 2 + 2y- I = 2y- 1.
:r::::l

y 2 3 4 5 6
E(XIY = y) 4/3 9/5 I6/7 25/9 36/11

2. a) For each x = I to n we have


y<x
y=x
y>x
and

X 2y
E(YIX=x)= 2 (n-x)+I + 2 (n-x)+1
11=1 y=z+l
n(n +1) _ z(z +1)) 2
X+ 2 ( 2 2 _ n(n +I)- x
2( n - x) +1 - 2( n - x) + 1 ·

b) For each y = 1 ton we have


2
2yl-l
x<y
P(X = xiY = y) = x=y
{ x>y
and
!1-1 2
E(XIY = y) = + _Y_ = (y- I)y + _Y_ = _Y__
2y - I 2y - I 2y - I 2y - I 2y - I
x=l

Notice that this conditional expectation is the same for all n.

160
Section 6.2

3. Assume n is at least 2. There are possible values of (X, Y), all equally likely: they consist of all
pairs of numbers of the form (x, y), where 1 :5 x < y :5 n • So

if 1 :5 x < y :5 n
P(X = :r, Y = y) = { otherwise

P(X = x) = t P(X = x, Y = y) = { if1:5x:5n-1


otherwise

if 2 :5 y :5 n
P(Y = y) = P(X = x, Y = y) = .{
otherwise.

a) If 1 :5 x :5 n - 1 then P(Y = yiX = x} = n_:z , X + 1 :5 y :5 n , so

E(YIX=:r)= LYP(Y=yiX=x)=
all 11
t
11=.r+l
= [ty-ty]
I I

= _1_ [n(n+1) _ x(x+1)] = n+x+1.


n-x 2 2 2

Intuitively, given X = x, Y has uniform distribution on { x + 1, ... , n} , so the expected value


of Y is the average of x + 1 and n.
b) If 2 :5 y :5 n then P(X = xiY = y) = 1 :5 x :5 y- 1, so

Sl-1
1
E(XIY = y) = '"'xP(X
L....J
= xiY = y) = '"'x-- = (y -I)(y)
L....J y-1 2(y-1)
= JL.
2
all .r z=l

Intuitively, given Y = y, X has uniform distribution on { 1, ... , y - 1} , so the expected value of


X is the average of 1 and y - 1.

4. a) E(Y) = E(E(Y I X)]= E (!"tX) = t + t (!:p) =


b)

E(Y
2
) = E[E(Y 2 I X)] = E (t.l
= E ( X(2X + 1)) = E ( (2X + + 1)) = iE(X2) + iE(X) +

=.!_(n(n+l)(2n+l)) .!.•(l+n) .!_=4n 2 +15n+l7


3 6n +2 2 +6 36

c) Var(Y) = E(Y2)- [E(YW = rnl1::-13, so SD(Y) = yrn2{26n-u


d)

P(X + Y = 2) = P(X + Y = 2 I X= x)P(X = x) = P(X + V = 21 X= I )P(X =I)=.!..


11
k=l

5. a) Observe that given A, we have X= X 1 , while given A\ we have X= X 2.

F(x) = P(X :5 x) = P(X :5 xiA)P(A) + P(X :5 xiAc)P(Ac)


= P(X1 :5 x)P(A) + P(X2 :5 :r)P(A')
= F1(:r)p + F2(:r)(I- p).
b) E(X) = E(XIA)P(A) + E(XIAc)P(A<) = E(Xt )p + E(X2)(l- p).

161
Section 6.2

c) E(X 2 ) = E(X 2 IA)P(A) + E(X 2 1Ac)P(Ac)


= [(E(Xt)? + Var(Xt)]p+ [(E(XtW + Var(Xt)](l- p). Hence

Var(X) = E(X2 ) - [E(} = Var(Xt)p + Var(X2)(l- p) + p(l- p)(E(Xd- E' · 2


2 ))

6. a) P(all X's < t I N = n) = t' ) < t < 1)


b)

P(all X's < t) =L P(all X's < t I N = n)P(N = n)


n=O

c) P(SN = 0) = P(N = 0) = e-", since in the event N > 0, the probability that all X's will be 0
is 0 .
.i) E(SN) = E[E(SN IN))= =
Condition on N : If 1 $ k $ n then

E(SNIN = k) = E(Xt + · · · + = k) = E(XtiN = k) + · · · + = k) = kll.

If k = 0 then E(SNIN = k) = 0. Hence E(SNIN = k) = kp., k = 0, ... , n . and


n n
E(SN) = LE(SNIN = k}P(N = k) = LkllP(N = k) = p.E(N).
k=O k=O

8. Define Do = 1 and for n = 1, 2, 3, ..• let Dn be the number of individuals in the. nth generation. Then
for each n ?: 0, _.
x<">+···+X<"> if Dn = k, k = 1, 2, ...
Dn+J ={ 0
1
A:
if Dn = 0
where •... are independent and identically distributed random variables having mean ll· [

X}"> is the number of children produced by the ith individual of generation n; generation 0 corresponds
to the original individual. We assume that for each n, each X}"> is independent of Dn. ]
=
Claim: E(Dn) = p.", n 1, 2, 3, ....
Proof: Induction on n. True for n = 1, since Dt = xp>. If true for n = k, where k?: 1, then

= E(XiA:l
= ll · by Exercise 7
= I'. 1'1: = l'k+t.

so claim holds for n =k +1.


9. a) Given T2 = j, T 1 is distributed like the place at which the first good element appears in a
random ordering of j - 1 elements, of which 1 is good; hence

E(Tt!T2 =J) = 1 . X
(i- 1+ 1) = -.j
1+1 2

b) Given T1 = j, T2 is distributed like j plus the place at which the first good element appears in
a random ordering of N- j elements, of which k- 1 are good; hence

E( ,.... ,.... .) . ( N - i + 1) . N - j + 1
.ll.lt=J
1 =J+ 1 X k-1+1 =J+ k .

162
Section 6.2

c) If h < i : Given T; = j, Th is distributed like the place at which the hth good element appears
in a random ordering of j - 1 elements, of which i - 1 are good; hence

E(Th!T; = J). = h X (j- 1+ 1) j


i - 1 + 1 =hi.

As a check, note that (a) is a special case.


If h > i : Given Tt = j, Th is distributed like j plus the place at which the (h- i)th good
element appears in a random ordering of N - j elements, of which k - i are good; hence
1
E(Th IT• = i) = j + (h - i) ( {: 1 ) •
As a check, note that (b) is a special case.
10. Since the number of black balls among the n balls drawn from the box is the sum of n indicators, we
have
£(number of black balls among then balls drawn)= n x P(first ball drawn is black).
Condition on X, the number of black balls among the d balls drawn from the other box (which
contains bo black balls and wo white balls):
d

P(first ball drawn is black) = L P(first ball drawn is black! X= k)P(X = k)


k=O

="""' =k)
d
k+b P(X
k=O

since the box of b + w + d has k + b black balls when X= k. The last sum is E (,;.;;d).
Since X
has hypergeometric distribution (X is the number of good elements [black balls] in a sample of size d
from a population of size bo + wo containing bo goods), we have E(X) = d X = , 0'+?;, 0 • Hence

£(number of black balls among the n balls drawn)= n x ( b' +b )


b+w+d
whereb'= bo+cuo
11. The number of aces among the 5 cards drawn can be written as

where
if ith card drawn is an ace
X;= { otherwise.
Therefore the expected number of aces among the 5 cards drawn is 5E(Xt ). The number of aces
among the pack of 39 cards is H +I, where H is the number of aces contributed by the 13 cards
coming from the top half of 26 cards. Then H has hypergeometric distribution with G = 3, B = 23,
and n = I3, so E(H) = 13 x k = 3/2. Use this fact to compute
3

E(Xl) = P(Xt =I)= LP(Xt = 1IH = k)P(H = k)


k=O
3

= """'k + 1 · P(H =
L..t 39
k)
k=O

=E (\;1)
I 3 1
=-x-+-
39 2 39
5
78'
So the expected number of aces among the 5 cards drawn is 5 x E(X 1 ) = 25/78 .

163
Section 6.2

12. a) E(Bn+tiBn) = Bn ("+:;280 ) +(Bn + 1) = *B"


b) Use induction on n. Assume E(Bn) = E(Bt) = 1 = .!.:p
E(Bntd = E[E(BntdBn)] = E (*Bn) =
c) E.(!+2) = =k
13. c) (n-m) m!. (n-A:)
(n-1) n n

14. P(Xt = it, ... , Xn = i,. I Sn = k) will be 0 unless L:;=l ij = k.

P(x 1 =.1}, ••• , X n =.In ISn = k) = ,P(Xt = it, P(Sn=k)


... ,Xn = in,Sn = k)

pl:i;(l _ p)n-I:i;
= _ p)n-A:

1
=

15. a) E(S) = nE(II)


b) Var(S) = nE(II)(I- E(II)) + n(n- I)Var(II)
c) II constant makes Var(S) smallest, II with values only 0 and 1 makes Var(S) largest. This
follows from b) and

o $ Var(II) = E(II 2 ) - E(II) 2 $ E(II)- E(II) 2 ,


(since o $ II $ 1, so II 2 $ II).

16. Use the formula for E(h(X)Y) by conditioning on X:

E(h(X)Y) =L E(h(X)YIX = x)P(X = x).


all z

But given X= x, h(X) equals h(x), which can be treated as a constant:

E(h(X)Y) = L h(x)E(YIX = x)P(X = x) = E[h(X)E(YIX)J


all z

17. We know that


E[(Y- g(X))
2
) =L E{(Y- g(X)) 1X
2
= x)P(X = x).
:r

All the terms in this sum are non-negative. So E[(Y- g(X)) 2 ) will be minimized if each of the terms
E{(Y- g(X)) 2 1X = xJ is minimized. Given X = x, the random variable g(X) can be treated as a
constant g(x), so

E[(Y- g(X}) 2 1X = x) = E[(Y- g(x))2 IX = xJ


= E(Y 2 IX = x)- 2g(x)E(YIX = x) + [g{x)] 2 •
Taking g(x) = EP"IX = x) minimizes this last quantity. Therefore

E[(Y- g(X)) 2 ] L 2
E[(Y- E(YIX = x)) IX = x]P(X = x) = E[(Y- E(YIX)}
2
].

18. Write
Y- E(Y) = (Y- E(YiX)) + (E(YIX)- E(Y)).
Next, square and take expectations in both sides of the equality:

Var(Y) = E[(Y- E(Y)) 2 ] = E[(Y- E(YIX)) 2 ) + E[(E(YIX)- E(Y)) 2 ].


164
Section 6.2

Observe that the cross term vanishes, since applying Exercise 16 with h(x) = E(YIX)- E(Y) gives
E(Y- E(YIX))(E(YIX)- E(Y)) = E(Yh(X))- E[E(YIX)h(X)] = o.
= E(E(ZIX)) for any random variable Z; therefore
Finally,. note that E(Z)

Var(Y) = E[E((Y- E(YIX)) 2 1X)] + E[(E(YIX)- E(E(YIX))) 2 )


= E(Var(YIX)) + Var(E(YIX)).

165
Section 6.3

Section 6.3

1. By the integral conditioning formula,

P(A) = J P(AIX = z)fx(x)dx = 1 1

x
2
• Idx = I/3.
1
1
2. a) fx(x)=J0 2x+2y-4xydy=2xy+y2 -2xy2 1 =I
y=O
By symmetry, Y is also uniform(O,l).

b) Jy(y I X= t> = = t + 2y- h = t + y (0 < y <I)

c) =f2
y=O

3. Let 0 < x < 2 . Given X = x, it is intuitively clear that Y is uniformly distributed on the interval
(0,2-x),so
if y < 0
P(Y ::=; yiX = x) = { x) if0<y<2-x
ify>2-x
You can be more rigorous: follow the approach of Example I.

4. a) Jy(y) = J 0
Y
3
>. 3 xe-l.lldx = ; y2 e-l.y So, Yhas the Gamma(3, >.)and E(Y) = f·
3
2 (0 I) •
b) !( X I Y = I) = /y(l) = (l.3f2)e
l. :z:e->
I = X <X<

E(X I Y = 1) = fo x(2x)dx = i
1

:r+::;;/2 if-1/2::=;x$0
if lxl $ 1/2
5. a) P(Y I/2IX = x) = { if 0 ::=; x ::=; I/2
otherwise
otherwise
b) P(Y < l/2IX = x) = 1- P(Y 1/21X = x).
c,d) Let - I $ x $ 1. Given X = x, Y is uniformly distributed on (0, 1 - lxl), hence has mean
(1 - lxl)/2 and variance ( 1 - lxi) 2 /12.

6. a) By making the substitution x = ys,

1
Y e-y/2 e-y/2"11 1 e-y/211 .l_l .1_1
jy(y) = dx = -- ds = - - x (I - x) dx
o 21rjx(y-x) 211" o Js(l-s) 21r 0

Evaluating this beta integral gives

- e-y/ 2 r(t)f(t)- 1 -y/2


f Y (y ) - f(l) - 2e
So Y is exponential(\).
b) First find the conditional density o( X given Y = 1.
fx(x I Y = 1) = f(x, I) = e-! /.!.e-! = 1 (0 < x < 1)
Jy(l) 2li'Jx(1- x) 2 1rjx(1- x)

Since this density is symmetric around E(X I Y = 1) = t.

166
Section 6.3

7. We require

JJ z=O
1• y=O
k(z-y)dydz=kj
1

z=O
[-(z;y) ]1z 2

y=O
dz=k
lo
so k = 6.
2
3(1-y) 0<y<1
a) Jy(y) = J J(y, z)dz = Jz=y6(z-
1
y)dz =
{
0 elsewhere.
b) The conditional density of Z given Y =y (0 < y < 1) is

( ly
f zz
= ) = !Y,z(y, z) 6(z- y) y$z$1.
y Jy(y) 3(1-y)2'

Hence the conditional density of Z given Y = 1/2 is

fz(z!Y = 1/2) = B(z -1/2), 1/2$z$1,


and
P(Z < 2/3IY = 1/2) = 1 213

1/2
B(z- 1/2)dz = 1/9
8. a) E(Y) = E[E(Y I X)] = E(5X) =
E(Y 2 ) = E[E(Y 2 I X)]= E[(5X)2 + 5X{l- X)]= 20E(X 2 ) + 5E{X) = 55
6

b) P(Y = y,x <X< x +dx) = (!)xY(l-x) 5 -Ydx


c) P(x <X< x + dx I Y = y) = P(Y = y, x <X< x + dx)f P(Y = y)
The conditional density of X given Y = y is proportional to P(Y = y, x < X < x + dx), so it
must be Beta(y + 1, 6- y).

9. p;riP = P(AB!Y=p)P(YEdp) _ Jp-oP ' 1


l
dp _ !.E._ _ 1
a) P(AIB) = P(B!Y=p)P(YEdp) - f' p·ldp
p=O
- 1/2 - 3 .

b) If 0 < p < 1 then


P(Y d !ABc) = P(Y E dp, AW)
E p P(ABc)
= P(ABcjY E dp)P(Y E dp) = p(l - p)dp =6 ( _ p)dp.
1
f P(ABcjY E dp)P(Y E dp) f 1
0
p(l - p)dp P

10. Let N be the number of arrivals in (0, 1).

a) Since disjoint intervals of a Poisson process are independent,

P(N = 10, t < T1 < t + dt) = P(O arrivals in (0, t), I in dt, 9 in (t, 1)))
e-.l.(l-c)(..\(1 - t))9
=
9!
->..x1o
= t)IO-Idt (0 < i < 1)
Since the conditional density is proportional to this probability, T 1 given N = 10 has the
Beta{!, 10) density.
b)

P(N = 10, t < Ts < t + dt) = P(4 arrivals in (0, t), 1 in dt, 5 in (t, 1)))
e->.'(.Xt) 4 e-.1.( 1-'l(..\(1- t)) 5
= 4! 5!
e->..>.10
= -4!5!
-t -
5 1
(1- t) 6 - 1 dt (0 < t < 1)
The conditional density of Ts given N = 10 is Beta(5, 6).
167
Section 6.3

11. No: if (X, Y) truly is uniformly distributed over the unit disk, then X is not uniformly distributed on
( -1, 1). In the pr('sent situation (for instance) the chance that X is near ±1 is higher than it should
be when (X, Y) is uniform over the unit disk. Here's an explicit calculation: In the present situation,
the chance that X is greater than 1/2 is 1/4, while if (X, Y) 'vere uniform over the disk, this chance
would be
l2 { 23"" 3
:::::: .1955

Another way: Compute the joint density of X and Y:


if x 2 + y2 < 1
f(x, y) ={
otherwise
Compare this with the density of random variables which are truly uniformly distributed over the
unit disk:
if x 2 + y 2 < 1
f(x,y) = { otherwise

12. a) T 1 is the minimum of ten exponential (1) random variables, hence


fTa (x) = 10e- 10 r,x > 0.

b) Suppose the first emission occurred at time x. Then each of the remaining particles is known
to have lived at least x units of time. Given that a particle has lived x units of time, its
remaining lifetime still has exponential distribution with rate 1 (by the memoryless property).
Thus the remaining time (beyond x) until a new emission occurs is the minimum of9 independent
exponentials. That is,
P(T2- T1 > y!T1 = x) = e- 9 y, y > o.
So given T 1 =x (x > 0), T2 has conditional survival function
P(T2 > ziT1 = x) = P(T2- T1 > z- xiT1 = x) = e-9 (:-:rJ, z > x;
and conditional density
9e-9(z-:r) if Z >X
/T,(ziTI = x) = { 0 otherwise.

c) By the average conditional probability formula, we have, for z > 0 :

/T2 (z) = j /T,(ziT• = x)/T, (x)dx


= 1' 9e- 9 (:-:r)l0e- 10 :rdx

= 90e-9 ' 1' e-"dx

= 90e- 9 '(l- e-').


Or you may use the formula for the density of the kth order statistic.
168
Section 6.3

13. a) Use the formula for average conditional probabilities:


00

P(X Y) = 2::: P(X yjY = y)P(Y = y)

= P(X O)P(Y = 0) + P(X 1)P(Y = 1) + P(X 2)P(Y = 2)


-.1. 2 , -.1. 1 .>.?e-.1.
=lxe +JXAe + 3 x -- o
2
Therefore
P(X < Y) = 1- + 2..\ + ..\2 /2)0
b) P(X E dxjX < Y) _ P(XEdz,z<Y) _ P(XEdz)P(Y>z)
- P(X<Y) - P(X<Y) o

If 0 < x < 3 then P(X E dx) = ; otherwise P(X E dx) = 0 0


If 0 :5 x < 1 then P(Y > x) = P(Y 1) = 1- e-\
If 1 :5 x < 2 then P(Y > x) = P(Y 2) = 1 - e-.1.(1 + ..\);
If2 :5 x < 3 then P(Y > x) = P(Y;::: 3) = 1- e-.1.{1 + >..+..X 2/2)o
Therefore
if 0 :$X< 1
ifl:$x<2

if 2 :$X< 3

O:$x<1 1:$x<2 2:$x<3


..\=1 o65 027 o08
..\=2 .49 o33 o18
). = 3 .41 034 025
Remark: Having computed P(Y > x), we can redo (a):

P(X < Y) = J P(Y >XIX= x)P(X E dx)

= j P(Y > x)P(X E dx) (independence)

= J Jo P(Y > x )dx

= [P(Y ;:::-1) + P(Y;::: 2) + P(Y 3))

= [3- e-.1.{3 + 2). + ).: >] 0

c) Apply (b):

E(XIX < Y) = J xP(X E dxiX < Y)


_J 0
3
xP(X E dx)P(Y > x)
from (b)
- P(X < Y)
_ P(Y;::: 1) J:_ 0 xdx + P(Y;::: 2) J:_ 1
xdx + P(Y;::: 3) J: 2
xdx
- 3P(X < Y)
_ P(Y;::: I)+ 3P(Y > 2) + SP(Y > 3)
- 6P(X < Y)

-
9- e-.1.(9 + 8.\ + 2
- 6- 2e-.1.(3 + 2). + t..X2).
169
Section 6.3

14. Let X1 + •· · + Xn = k.
_ X_ )- ... ,X.. =xn)
P(n E d p IX 1 - X1, • • ·, n - Xn - • X )
.r'{.\1=Xt, ... , n=Xn

p"(l - p)n-1: f(p)dp


P(Xt = Xt, ... ,Xn = Xn)

The posterior density of n given x1 =Xi' ••• ' Xn = Xn is proportional to pk(l - p)n-1: f(p), which
depends only on k.

15. a)
P(1r E dp, Sn = k) = P(lr E dp)P(Sn = k I 1r = p)

= p)"-k

Hence
_ P(wEdp.Sn=k)
f >r (P ISn = k)· = P( 1r E dP I Sn = k)/d
. P - dpP(Sn k)

= c(r, s, n, k)pr+k-1 (1 - p)'+n-k-1

for some constant c(r, s, n, k). Now use the hint.


b)
P(S.. = k) = f 01 P(1r E dp, S,. = k)
_ r(r+kll'(•+n-k)
- l'(r)l'(•) k l'(r+•+n)
c) Taker= s = 1. Use r(m) = (m- 1)! for integer m:

1 n! k!(n- k)! 1
P(Sn = k) = 1-=-1 k!(n- k)! (n + 1)! = n + 1

d) If X has beta (a, b) distribution,


ab
E(X) = _a_ and
Var(X)= (a+b)2(a+b+l)
a+b
Applied a = r +k and b = s +n - k this gives

E(1r I Sn = k)
_ (r+A:)(•+n-A:)
Var(1r I Sn = 1:) - (r+•+n)'(r+•+n+l)

IG. a) By using the substitution u = .A(t +a),


P(N = k) = 1 00

P(N = k I A= .A)f,..(.A)d.A
= 1oo e-"''(.At)k ar.Ar-le-a.\ d.A
k! r(r)
0

where p = and q =

170
Section 6.3

b) Using r(x + 1) = xf(x),


k-1
r(r + k) = (r + k- 1) r(r + k- 1) = ... = II(r + i)
r(r) r(r) .
t=O

When r is a postive integer,

r(r + k) = (r + k- 1)! = (r + k- 1)
f{r)k! (r- 1)!k! r- 1

c)

Substituting u = A(t(l- z) +a),

ar 1oo -u r-1 d
=r(r)(t(l-z)+at 0
e u u

= c+:-ztr

= C -zt;(t+a))r
= pr{l _ zq)-r

d) E(N) = E[E(N I A)]= E(At) = !:J: =


E(N
2
) I A)]= E[At + (At) 2 )
= E[E(N 2 = !:J: + + (;-?) = r(l;e> + r(r + 1){7)2
Var(N) = E(N 2 ) - [E(NW =
J>
e) Let g(z) = E(zN). Then,

E(N) = g'(l) = pr qr(l- qz)-(r+t)lz=l = r{1; p)

E[N(N- 1)] = g"(1) = pr lr{l' + 1)(1- qz)-(r+


2t=l r(r + 1)(1- p) 2
p2

Var(N) = g"(l) + g'(1)- (g'(I)] 2 = r(I-


p2
p)

f)

P(A
E
dA IN= k) = P(A E dA, N = k)
P(N = k)
= (ar Ar-te-o>.dA)
r(r)
(e--''(k!At)k) /P(N = k)
= (consta.nt)Ar+k-te-.\(a+<)dA
So given N = k, A has the gamma(r + k, a+ t) distribution.
17. a)Independenl negative binomial {r;,p) (i =I, 2) b) negative binomial (r 1 + r2,p) c) negative
binomial (L, r,,p)

171
Section 6.4

Section 6.4

1. a) 0.5 b) positively dependent c) 0.2 d) 0.356

2. a) If P(AIB) = P(AIBc) then

P(A) = P(AIB)P(B) + P(AIBc)P(Bc)


= P(AIB)[P(B) + P(Bc)] = P(AIB),

which implies that A and B are independent.


b) If P(AIB) < P(AIBc) then

P(A) = P(AIB)P(B) + P(A!Bc)P(Bc)


< P(A!B)[P(B) + P(B")] = P(A!B)
c) By interchanging the roles of Band Be in part b), we get the result immediately.
d) Independence implies P(A!B) = P(A). Plugging this into the given formula, we get

P(A!B) = P(A) = P(AIB)P(B) + P(AIBc)P(Bc)


P(AIB)[l- P(B)] = P(A!Bc)P(Bc)
P(AIB) = P(A!Bc) so long as P(B) < l.
e) Positive dependence implies

P(AIB) > P(A) = P(AIB)P(B) + P(AIBc)P(Bc)


P(AIB)(I- P(B)] > P(AIBc)P(Bc)
P(AIB) > P(AIBc) so long as P(B) < l.
f) This is just the same as the previous part with the inequality reversed.

3. Let F, denote the event that the first component fails, and W1 denote the event that the first com-
ponent works, etc. Then Ff = Wt, and P(F2IF!) > P(F2) implies that

Also,

P(W IW) = P(W, W2)


2 t P(W,)
_ 1- P(Fl)- P(F2) +
P(F1F2)
- 1- P(Ft)
> 1 - P(Fi)- P(F2) + P(Fl)P(F2)
1- P(Fl)
= I - P(F2)
= P(W2)
since P(F,F2)/P(F2) = P(F,jF2) > P(Ft).
4. Joint distribution table /or (X, Y)
Section 6.4

values x for X Marginal


distn
-1 0 1 ofY
values -1 0 1/4 0 1/4
y 0 1/4 0 1/4 1/2
for Y 1 0 1/4 0 1/4
Marginal 1/4 I 1/2 1/4 1 I
distn of X
Now you can easily check that X and Yare uncorrelated but not independent:

E(XY) = E(X) = E(Y) = o- Cov(X, Y) = o, but

P(X = 0, Y = 0} :/= P(X = O)P(Y = 0).


5. Joint di.stribution table for (X, Y)
value.s x for X Marginal
di.stn
-1 0 1 of Y
values y 0 0 1/3 0 1/3
for Y 1 1/3 0 1/3 2/3
Marginal 1/3 lt/3 1/3 II
di.stn of X
So compute E(X) = E(XY) = o Cov(X, Y) = o.
But P(X = -1, Y =
0) = 0 :/= P(X =
-1)P(Y = 0), etc., so X andY are not independent. In fact,
Y =
X 2 implies that X and Y are not independent.
G.

Cov(X, Y) = E[(Xt- X2)(X1 + X2)]- E(Xt - X2)E(Xt + X2)


= E(X;- sinceE{Xt) = E(X2)
= 0 sinceE(X;) =

So X, Y are uncorrelated. But X, Y are not independent: For example, you can easily check that
P(Xt - X2 = 0, Xt + X2 = 3) = 0, while P(Xt - X2 = 0) = 1/6, P(Xt + X2 = 3) = 1/18.
Intuitively, if we know that Xt - X2 = 0, then.X 1 + X 2 must be even, since the two dice landed the
same way. So given that the difference is 0, we have some information about the sum.

7. a) Joint di.stribution table for (X2 + XJ, X2- X3)


value.s for X2 +Xa Marginal
distn
0 1 2 of X2- Xa
value.s -1 0 1/6 0 1/6
for 0 1/3 0 1/6 1/2
X2 -Xa 0 1/3 0 1/3
Marginal 1/3 1/2 1/6 II
di.stn of x2 + x3
3
b) E(X2- X3) = (-1) 3 • (1/6) + (0) 3 · (I/2) + (1) 3 • (1/3) = 1/6
c) E(X2Xa) = P(X2 = 1, Xa = 1) = P(X2 = 1)P(Xa = 1} by independence
=£(X2)E(Xa), so X2 and Xa are uncorrelated.

173
Section 6.4

8. Begin by noting that x1 and XN each are distributed binomial(k, 1/N).


E(X1XN) = E[E(X1XN I X!)]= E[X1E(XN I X1)]
= E[X k- X1] = kE(Xl) _ Z(Xi)
1
N-l N-1 N-1
2
k ( k fl ) k(k - 1)
= N(N- 1) - N2 + N 2 (N- 1) = N2
X X ) _ E(X1XN)- E(Xl)E(Xn) _ E(X1XN)- {E(X!))2
corr ( 1
' N - SD(Xt)SD(XN) - Var(Xl)
(k(k- l)/N2 ) - (k/N) 2 1
k(l/N){(N- 1)/N) =- N- 1

9. Use the results of Example 7 to obtain

a) k{n+1)/2
1
b) k(n -1Xn-k)
12(n-1)

Note: Brief solution in text is incorrect.


10. Condition on H1oo.
E(H1ooH3oo) = E[E(H1ooH3oo I H1oo)]= E[H1ooE(H3oo I H1oo)]
1 2
= E(H1oo{H1oo +200 · 2)) = E(H 10o) + 100E{H1oo)

= (1oo. + 5o2 ) + 100(50) = 7525


H H ) E(H1ooH3oo)- E(H10o)E(H3oo)
corr ( 100
'
300
= SD(H1oo)SD(H30o)
7525 - 50{150) 1
= (5){5.,13) = .,13

11. ,jlj3
12. X, Y, and Z are indicator random variables.

E(X) = P(select a Democrat)= a, E(Y) = P(select a Republican)= {3


8) Var(X) = a{1- a), Var(Y) = /3(1- P)
c) Cov(X, Y) = E(XY)- E(X)E(Y) = o- a{J
d) E(Dn - Rn) = nE(X)- nE{Y) = n(a- P)
e) Dn- Rn = E: 1 {X;- Y.).
Var(Dn- Rn) = nVar(X- Y) = nVar(X) + nVar(Y)- 2nCov(X, Y)
= no(l- a)+ n/3(1- P) + 2naP = n(a + P- (a- /3) 2 )
13. For each i = 1, ... , n let A; denote the event that result A occurs on trial i , similarly for 8;. Then
n

NA = Lltt;;
1::1

J=l
n

NANB=EL:
·=• J=l
n

1A;1B; = LLIA;B;·
i:::;:.l J:::;:l

174
Section 6.4

Hence
n

E(NA) = LP(Ai) = nP(A!)


i:::l

i=l j=l

P(A;B;) = L" P(A;Bj) + 2 L < jP(A;)P(Bj)


i=l . i

= nP(A1B!) + n(n- l)P(Al)P(Bt).


Therefore NA and Ns are uncorrelated if a.nd only if E(NANs) = E(NA)E(Ns) if a.nd only if
P(AtBt) = P(A!)P(Bt), i.e. if and only if lA; and Is; are independent for a.ll i.
It is intuitively clear that N,.. a.nd Ns must then be independent. In more detail, if lA; and lB;
are independent for each i, and the n pairs ( 1A 1 , I s 1 ), ••• , ( 1 A., 1 B.) are independent, then the 2n
variables 1,..,, Is,, ... , lA., ls. a.re independent. So N A a.nd N B must be independent, since they are
functions of disjoint blocks of independent variables.

14. From the inequa.lity -I Corr(X, l') 1 we get the following equivalent statements:

-SD(X)SD(Y} S Cov(X, Y) SD(X)SD(Y)

-2SD(X)SD(Y) S 2Cov(X.Y) 2SD(X)SD(Y)


-2SD(X)SD(Y) S Var(X + l')- Var(X)- Var(Y) 2SD(X)SD(Y)
Var(X)- 2SD(X)SD(Y) + Var(Y)
Var(X + Y) Var(X) + 2SD(X)SD(Y) + Var(Y)
Var(X + Y) (SD(X) + SD(Y)) 2
2
(SD(X)- SD(Y))
ISD(X)- SD(Y)I SD(X + Y) ISD(X) + SD(Y)I
15. a), b) are special cases of c), so it suffices to prove c) only.

c)

Cov(LX;,LYi)
t J
=E [L:x•-
I
E(LX•)]
t
[:L}j- E(LYJ)l
) J

= E [Ex•-
I • I
[:L¥J- :LE<YJ>]
J J

E{ E(X;)]} { E(Yi)]}

=E E(X;)J[}j - E(}j)]}

= L E[X;- E(Xi)][}j- E(}j)]


•.J
=L Cov(X;, Yj).
l.J

d) We have NR = X; and Ns = :L;=I


}j, where X, = I if the ith spin is red,= 0 otherwise;
and Y, = 1 if the jth spin is black,= 0 otherwise. Note that E(X,) = r, E(Yj) = b, and that

E(X, }j) = P(ith spin is red and jth spin is black)= { rb


0
if i
if i
:f=
=j
j

175
Section 6.4

Hence
if i i= j
Cov(X;, Yj) ={ if i = j
Now use c) to obtain

"
Cov(NR, Ns) = L Cov(X;, Y;) + L Cov(X;, Yj) = -nrb.
i=l

16. Note that

Cov(aX + b, cY +d) = acCov(X, Y), SD(aX +b) = la!SD(X), SD(cY +d)= !c!SD(Y).
Therefore
Corr(X, Y) if ac > 0
Corr(aX + b, cY +d)= ac Y) ={ -Corr(X, Y) if ac < 0

17. Note that lAc = 1- lA, and lac= l - 1s, and apply the result of Exercise 16.

18. If X 1 , ••• , Xn are exchangeable, then (X,, ... , Xn) has the same distribution as (Xw(l)• ... , X..-(n))
for any permutation lr of (1, 2, ... , n). In particular, by considering the first coordinate of the above
relation, each X; has the same distribution as X 1; and by considering the first two coordinates of the
above relation, each pair (X;, Xi) (i#j) has the same distribution as (X1,X2). Therefore

"
Var(LX;) = LVar(X;)+ LCov(X;,X 1)

"
= LVar(X!)+ LCov(Xl,X2)

= nVar(Xt)+n(n-l)Cov(X,,X2)·

19. a,b) Let Y denote the value (in cents) obtained on any particular draw. Then E(Y) = 18.75,
SD(Y) = 8.1968, so (measuring X in cents)
E(X) = 20; E(Y) = 375
SD(X) = J 40 20
-
40-1
· V2o · SD(Y) = 26.25.

c) P(X :5 300)::::: cfl(-2.86)::::: .0021. Exact value: .002152141.


d) The quarters are bigger than the other coins, so maybe they'll be drawn more readily. This
suggests that E(X) will be higher than calculated, and P(X :5 300) will be lower than calculated.

20. a) :z:,p, + :z:2p2


b) I'! PI + IL2P2
c) l:z:, - :r2ly'PiP2
d) v' + +(I'! - 1'2 FP1P2 = J E(Var(YIX)] + Var[E(YIX)]
c) (:z:!- X2)(1'1 - /L2)PIP2

In general:

E(X) = L:r;p;
n

E(Y) =L I'·P·
i=l

176
Section 6.4

Var(X) = L:)x;- x,fp;pi


i<i
n

Var(Y) = + L)JL;- JLi) 2 PiPJ


i=l i<i
Cov(X, Y) = 2,)x;- Xj )(JL;- JLi )PiPi
i<j

21. a) 15/13 b) 24/13 c) 100/169


d) Use the fact that X+ Y = 3 to conclude

o = Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y).


But X+ Y = 3 implies that Var(X) = Var(Y), so Cov(X, Y) = -Var(X) = -100/169.

22. X= X 1 + X 2 + · · · + Xm, where X;= I(ith couple survived), and

a) E(x) -_ m X
_
-----p;) -
(2m-d)(2m-d-l)
2(2m-l) ·

b)

Var(X) =m X
2
emd- ) [
(2;;') 1 -
e":t-2
C::')
) l + m( m- 1) X
{ e;) -
emd-·)
2
[ emd-·) ] }
(2;;')
d(d- 1)(2m- d)(2m- d- 1)
= 2{2m- 1)2(2m- 3)

23. a) MSE = E[Y- ({3X + -rW = E(Y2 ) - 2E[Y(PX + -r)] + E[(PX + -r) 2 ]
= /3 2
E(X 2 ) - 2{3E(XY) + 2/3-rE(X)- 2-rE(Y) + -r2 + E(Y2 )
b) The derivative of MSE with respect to -r is 2PE(X) - 2E(Y) + 2-r. Set to zero, obtain 7 =
E(Y)- PE(X). The second derivative with respect to-r is positive, so MSE is minimised at')-.
If {3 = 0, the MSE is minimised at t =
E(Y) and the resulting minimal MSE is

E[(Y- (PX + i'))] = E[Y- = Var(Y).


2 2
E(Y)]

c) Substitute -r = .Y(P) = E(Y)- PE(X) in part a) to obtain


MSE = {3 E(X ) - 2/JE(XY) + 2{3E(X)[E(Y)- {3E(X)]
2 2

2E(Y)[E(Y)- PE(X)J + [E(Y)- PE(X)f + E(Y 2 ).


The derivative with respect to {3 is

2PE(X
2
)- 2[E(XY)- E(X)E(Y)]- 4/3[E(X)] 2 + 2E(X)E(Y)- 2[E(Y)- ,BE(X)]E(X)

= -2Cov(X, Y) + 2{3 [E(X 2 ) - [E(X)J 2 )


= -2Cov(X, Y) + 2{3Var(X).
The second derivative with respect to .B is 2Var(X) > 0.
d) Let /3 and -r be arbitrary. From parts b) and c) we have

MSE(P, .Y(P)) $ MSE(/3, t(.B)) $ MSE(/3, -y),

so the value of MSE at (,8, .Y(P)) is at least as small as the value at any other pair (/3, -r). But
from part a) we know that MSE is a quadratic function of /3 and -r, so it has a unique minimum.
177
Section 6.4

e) Since /3 and i' can be treated as constants,


E(Y) = E(/JX + t) = /JE(X) + [E(Y)- /JE(X)) = E(Y)

and
Var(Y) = Var(/JX + t) = ,IFVar(X).
For the last equality note that

E(YY) = E[(/JX + i')Y) =


PE(XY) + .:YE(Y)
= /3[E(XY)- E(X)E(Y)) + [E(Y)Y = {3 2 Var(X) + [E(Y}Y
and

f) We have

Var(Y) = E(Y2 ) - [E(Y)) 2 = E[Y- Y + YJ 2 - [E(Y)) 2


= E(Y - Y) 2
+ 2E[Y(Y- Y)] + E(Y) 2 - [E(Y)] 2 •

By part c), E[f'(Y- Y)] = o and E(Y) = E(Y). Conclude that Var(Y) = E[(Y- Y) 2 ]+Var(Y).
Finally observe that by part c),

• ·2 [Cov(X, YW •
Var(Y) = {J Var(X) = [Var(X)]2 Var(X)

[Cov(X,' W 2
= Var(X)V ar(Y) Var(Y) = p Var(Y).

g) By part c),

{3 = Cov(X, Y) = Cov(X, Y) . ylVar(Y) = P SD(Y)


Var(X) JVar(X)ylVar(Y) JVar(X) SD(X)

If tis such that the line y = Px+.Y passes through the point (E(X), E(Y)), then this requirement
yields the equation

E(Y) = /3E(X) +t <=> 1 = E(Y)- /JE(X).


h) By the previous part, the best linear predictor oC y• based on x· has slope

Corr(X•, y•) · = Corr(X, Y) · T= p


(since the correlation coefficient is invariant under linear transformations), and passes through
the point (E(X.), E(Y•)) = (0,0).

178
Section 6.5

Section 6.5

1. Let X and Y denote the PSAT and SAT scores in standard units.

a) P(Y > 0 I X= -2) = P [Y-(0.&)-{-


../1-0.62
2 ) > O--(O.S)·(- 2 ) I X=
../1-0.62
-2] = P(Z > 1.5) = 0.0668 .
b) P(Y > OIX < 0) = =
2P(X < 0, y > 0).
Transform to independent random variables, as in Example 2.

arctan 4/3
P(X < 0, Y > 0) = P(X < 0, 0.6X + 0.8Z > 0) =
2
lr = 0.1476.

(since the corresponding region in the (X, Z) plane is a sector centered at (0, 0) subtending an
angle of arctan 4/3) . So the desired probability is 0.2952.
c) We need

P(SAT score> PSAT score+ 50)= P(90Y + 1300 > lOOX + 1200 +50)
= P [90(0.6X + 0.8Z) - lOOX > -50)
= P( -46X + 72Z > -50).

Now -46X + 72Z has normal distribution with mean zero and SD 85.44 (it's a linear combination
of independent normals); hence the desired probability is 4>(50/85.44} = 0.72.

2. Let X and Y be the heights in standard unils of a randomly chosen daughter-mother pair from this
population. Then Y = tX + where X and Z are independent standard normals.

P(X y IX 0) = P(O < X, X < Y)


< > P(X > 0)

= P(o < x, x < ix + V,: Z) I P(X > o)

= P(o < x, Jax I < Z) P(X > o)

By using the rotational symmetry of the bivariate normal distribution, the probability in the nu-
merator is = t· Hence, %of the daughters with above average height are shorter than their
mothers.

3. Let X and Y denote the weight and height in·standard units of a person chosen at random. The
event (person is above 90th percentile in height) is the same as (Y > 1.282), and the event (person
is in 90th percentile in weight) is the same as (X = 1.282). Hence the desired probability is P(Y >
1.282IX = 1.282). Now given X= x, Y has normal (px, 1- p2 ) distribution, so

P(Y>:z:iX=:z:)=P(R> R I X = x )
1- p2 1- p2

=P(Z>

If x = 1.282 and p = 0.75 then the desired probability is P(Z > 0.484) = .314 .

4. V ar(X + 2Y) =V ar(X) + 4 V ar( Y) + 2Cov(X, Y). If X and Y are independent, then the variance
of this sum is 5. If the correlation of X and Y is then the variance of the sum is 6. Therefore,
P(X + 2Y :::; 3) is either 4>( or 4>( 76 ).

179
Section 6.5

5. a) As in Example 2, transform to independent variables X and Z:

q = P(X;:: o, o) = [x o, px]
Y P z;::
1- p2

This last quantity is the probability that (X, Z) lies in the sector centered at {0, 0) subtending
an angle of

[ h] = + arcsin(p),

(where arcsin and arctan have range (-r/2, ;r/2] ). Hence, by the rotational symmetry of (X, Z),
this probability is

angle subtended
q = = -1 + -1 arctan
[ P ]
= -41 + 1 · ( )
arcsm p
2ll" 4 2ll" J l - p2 "

Note that 0 :::; q :::; 1/2.


b) Solve for p in the above formula for q. The arcsin version is most convenient for this:

This can be obtained from the arctan version, too, but you have to argue about the sign of p.

G. a) X - k Y is normal with mean 0 and variance 1 + k 2 •


1
P(X < kY) = P(X- kY < 0) = 2

b) U = ../3X + Y, V =X- ../3Y.


P(U > kV) = P(VaX + Y > k(X- VJY)) = P((Va- k)X + (1 + kVa)Y > 0) =
c)
2
P(U + V < 1)
2
= P ((VaX + Y) 2 +(X- J3Y) 2 < 1)
=p ( X2 + y2 < ±) = p ( J x2 + y2 < i)
= _.lr2d
2 r = e -ud u = -e
0 0 u:O

= 1- e-t
since J X2 + Y2 has the Rayleigh distribution (or using polar co-ordinates).
d) Var(V) = Var(X- ../3Y) = 4, so V/2 has the standard normal distribution

Cov(X, V) = E[X(X- v'JY)] = E(X 2 ) - v'JE(XY) = 1


Cov(X, V) 1
corr(X, V/2) = corr(X, V) = SD(X)SD(V) 2
fx(x IV= v) == fx(x I Vf2 = v/2)
13ecause X and V/2 have the standard bivariate normat distribution with p = l; (X I V =
v) = and Var(X IV= v) = 1- Therefore given V = v, the .. 1Stribution
of X is normal( f, t ).

180
Section 6.5

7. a) As always (Section 6.4) independent implies uncorrelated. Conversely, let X*= (X- Jlx)/ux
and Y* = (Y- JlY)/uy be the standardised versions of X and Y. Then X" and y• have
standard bivariate normal distribution with p = 0 . (See box on bivariate normal.) But in the
Case p = 0 I inspection ShOWS that the joint density of x·
and yo equals the product of the
marginal densities, so X" and y• are independent. (Another way: appeal to the definition of
the standard bivariate normal distribution in terms of independent standard normal variables;
if p = 0, then we return to the original pair of independent variables.) Being functions of x·
and y• respectively, it follows that X and Y are also independent.
b)

E(YIX = x) = E (JlY +. uy y· I x· = X- JlX)


ux
=py+uyE(Y·I x· =
= Jly +uy •p• (X 1

since given x· = t, y· has normal (pt, 1 - p 2 ) distribution.


c)

Var(YIX = x) = Var (JlY +uyY• I x· =X

= 2
l!y v (y· I x· = ----;;;;---
ar x- Jlx)

= p2).
d) In terms of the standardized versions, we have

aX+ bY+ c = a(Jlx + uxX") + b(JlY + uy y•) + c


= a(Jlx + uxX") + b [JlY + uy(pX· + +c
= (aux + bp11y)X" + + aJlx + bJly + c,
where X" and Z are independent standard normal variables. Hence aX+ bY+ c is a linear
combination of independent normals, so is itself normally distributed. Its parameters are:
E(aX +bY+ c) = aE(X) + bE(Y) + c = aJlx + bJly + c,
Var(aX +bY+ c)= a Var(X)
2
+ b2 Var(Y) + 2abCov(X, Y) = a 2 u3c + + 2abpuxuy.
e) Let V = X cos 8+ Y sin 8 , W = -X sin II+ Y cos 8 . Then V and W are each normally distributed
by part (d). To show independence, it suffices to show, by part (a), that E(VW) = 0. Note
that

VW = -X 2 sin8cos8 + XY cos2 8- XYsin 2 8 + Y 2 sin 8cos8


= XYcos28-

Therefore E(VW) =cos 28E(XY)- sin 28E(X 2 - Y 2 )/2 = 0 if and only if

E(X 2 - Y 2 ) u3c-
cot 28 = = ...::::'----!...
2E(XY) 2puxuy

I)= -1 arccot [ u3c - .


2 2 puxuy
Geometric significance of 8? If a point (:r, y) in the plane is rotated clockwise by an angle t/J about
the origin, its coordinates become (x cos t/J + y sin t/J, y cos t/J- x sin t/J ) •• Thus if a bivariate normal
distribution in the plane, centered at the origin, is rotated clockwise by the above angle 8 , one
obtains a bivariate normal distribution whose axes of symmetry coincide with the coordinate
axes. {Reason: By the above calculation, the new variables are independent, so the ellipses of
constant density now have the form ax 2 +by 2 =canst.) So 8 = (1/2)arccot[(u3c
is the inclination of one of the axes of the ellipses of constant density for (X, Y) .

181
Section 6.5

8. a) Since E(Yi) = E(l';) = 0,


Cov(Yi, 1'2) = E[(X, + X2)(aXt + 2X1 )]
= aE(Xn + 2E(Xi) +(a+ 2)E(XtX2)
=o-+2=0

so, a= -2. Therefore 1'2 has the normal(O, 16) density.


b) Cov(X2. 1'2) = E[X2(2X2- 2Xt)] = 2E(Xi)- 2E(XtX'2) = 2

9. a) Means: "' al' +b. Variances: u 2 , a 2 u 2 + r 2 • Correlation: auf../a 2 u 2 + r 2 b) normal (ap. +


b, a 2u 2 + r 2 ) c) normal (I'+ au2 (z- al'- b)/(a 2 u 2 + r 2 ), u 2 r 2 /(a 2u 2 + r 2 ))
10. a) Let corr(V, W) = p.
W-p.w =p(V-p.v)
crw uv
so,

Now show that a V + bW is a sum of independent normal random variables (plus a constant),
and hence is normal.

aV +bW = aV + buwp (V +
=(a+ b::p) V +
b) We need to show that cV +dW = a(aV +bW)+P(Z') where aV +bW and Z' are independent.

11. See solution of Exercise 6.Rev.I8.

12. a) Cov(S, V) = E [(S- E(S))(V- E(V))] = E ((bV + W)(V)] = E(bV 2 + VW) = buv 2 + 0;
h C (S V)
ence orr ,
Cov(S,V)
= SD(S)SD(V) 175 = =
bo-y
..,jf>2 ..v>+,.w> •
b) Since Sand V are two different linear combinations of independent normal variables, they have
a bivariate normal distribution with
E(S) = I'S =a SD(S) =us= ../b2uv 2 + aw 2
E(V) = I'V = 0 SD(V) = uv

Let X and Y be standardized versions of S and V, i.e.,

X= S }' = V -JLv.
lTS lTV

Then X and Y have standard bivariate normal distribution with correlation p, so given X = x, Y
has normal (px, 1- p2 ) distribution, and V = uv Y has normal (puvx, uv 2 (1- p 2 )) distribution.
The event (S = s) is the same as the event (X = •:;s); thus, given that S = s, the distribution
of V is normal with mean
s -JLs) buv 2 (s- b)
(JI7V ( - - -
175
= -:-=--::-'----':-
b2 uv 2 + uw 2
and variance
2 2 uv 2 aw 2
uv ( 1 - p ) = b2 2
17V + lTW 2 •
c) The best estimate of V given S = s is E(VIS = s) = puv ( •-,...,)
t7S
= S""V;c;-bl,
a"V a,v

d)
as
o-yo-w
Vb:Javl+aw l
J!.
"'

182
Section 6.5

13. The standard bivariate normal distribution with correlation p can be obtained by taking two indepen-
dent standard normal variables X and Z and transforming (X, Z) into (X, Y) via the transformation

x=x
y = x cos 8 + z sin 8

where 0 is that unique angle in [0, 1r] whose cosine is Pi the joint distribution of (X, Y) is then the
required distribution. Under this transformation, the image of the point (x, z) = (cos(0/2),sin(B/2))
is
=
(x,y) (cos(0/2),cos(0/2) cosO +sin(0/2)sin 8) = (cos(0/2), cos(0/2)),
a point on the 45° line in the (X, Y) plane. Similarly, under this transformation, the image of the
= (-·
point (x, z) = (cos(0/2 + 1r /2), sin(0/2 + 1r /2)) sin(0/2), cos(B/2)) is

(x, y) = (- sin{0/2),- sin(0/2) cos 0 + cos(0/2) sin 8) = (- sin(0/2), sin(0/2)),

a point on the -45° line in the (X, Y) plane. These two images determine the axes of an ellipse of
constant density for the distribution of (X, Y), so the ratio of the lengths of these axes is

2
J2cos {8/2) = v'1 +cosO= fl+P.
j2sin 2 (8/2) v'l- cosO V

183
Chapter 6: Review

Chapter 6: Review

1. a) {o·, 1, 2, ... , 100}.


b) For k between 0 and 100,

P(X = k I X+ y = 100) = P(X = k, X+ y = 100)


P(X + Y = 100)
= _P.!....(X-:-.,-:=:-:-k.:....,Y-=-=--=-1-,oo....,..-,----:..k)
P(X + Y = 100)
-.\ 1 .>.100-k
- e I:! x e (too-l:)!
-
100!

100! ( ,\1 ) " ( ,\2 ) 1oo-k


= k!{100- k)! AJ + A2 .A, + .A2
This is the binomial distribution with parameters 100 and .Atf(,\1 + .A2).
c) Given X+ Y = 100, X has the binomial distribution with parameters 100 and I/100, by part
b). This is approximately the Poisson (I) distribution. The chance is about
1
e- {1/4! + 1/5! + 1/6!} = e- 1 {1/24 + 1/120 + 1/720} = 0.018472
2. For each integer x ≥ 0, y ≥ 0 with x + y ≥ 1 we have
      P(X = x, Y = y) = P(X = x, N = x + y)
                      = P(X = x | N = x + y)P(N = x + y)
                      = C(x + y, x)(1/2)^{x+y}(1/3)(2/3)^{x+y−1}
                      = C(x + y, x)(1/3)^{x+y}(1/2).
   Hence
      P(X = x, Y = 0) = (1/3)^x(1/2),  x = 1, 2, 3, ...
      P(Y = 0) = Σ_{x≥1} (1/3)^x(1/2) = 1/4
      P(X = x | Y = 0) = P(X = x, Y = 0)/P(Y = 0) = 2(1/3)^x = (2/3)(1/3)^{x−1},  x = 1, 2, 3, ...
   So X has geometric (p = 2/3) distribution on {1, 2, 3, ...} given Y = 0. The most likely value of X given Y = 0 is then 1, and E(X | Y = 0) = 3/2.
3. First note that A_n + B_n = 2μ, hence A_n − B_n = 2A_n − 2μ; then apply the results of Example 6.4.7 to get E(A_n) and SD(A_n). Hence
      E(A_n − B_n) = 2E(A_n) − 2μ = 2μ − 2μ = 0;
      SD(A_n − B_n) = 2SD(A_n) = 2(σ/√n)√(n/(2n − 1)) = 2σ/√(2n − 1).

4. No Solution


5. The joint density looks like that of a bivariate normal distribution. Indeed, try to find positive constants a, b so that the distribution of (X/a, Y/b) is standard bivariate normal with correlation ρ. The joint density of (X/a, Y/b) is
      ab f(ax, by) = abc exp{−(a²x² + abxy + b²y²)}.
   Comparing this with the standard bivariate normal joint density yields the equations
      a² = 1/(2(1 − ρ²))
      ab = −2ρ/(2(1 − ρ²))
      b² = 1/(2(1 − ρ²))
      abc = 1/(2π√(1 − ρ²))
   which have solution a = b = √(2/3), ρ = −1/2, c = √3/(2π). The correlation between X and Y is the same as the correlation between X/a and Y/b (since correlation is invariant under linear transformations), which is ρ = −1/2.
6. a) Use the technique of Section 5.4 (Operations).
   [Figure: the strips u < x + y < u + du and v < x − y < v + dv in the (x, y) plane.]
      P(X + Y ∈ du, X − Y ∈ dv) = f_{X,Y}((u + v)/2, (u − v)/2) (du/√2)(dv/√2).
   Since the joint density of (X, Y) is f_{X,Y}(x, y) = e^{−x}e^{−y} (x ≥ 0, y ≥ 0), the joint density of (X + Y, X − Y) is
      f(u, v) = (1/2) f_{X,Y}((u + v)/2, (u − v)/2)
              = (1/2) e^{−(u+v)/2} e^{−(u−v)/2}   if (u + v)/2 ≥ 0 and (u − v)/2 ≥ 0
              = 0 otherwise.
   This simplifies to
      f(u, v) = (1/2) e^{−u}   if u ≥ 0 and −u ≤ v ≤ u
              = 0 otherwise.
   Check: Compute the marginals! Do you get what you expect?
b) If X and Y are independent with the same distribution, then
      Cov(X + Y, X − Y) = E[(X + Y)(X − Y)] − E(X + Y)E(X − Y)
                        = E(X² − Y²) − E(X + Y)E(X − Y) = 0,
   so the correlation between X + Y and X − Y is zero.
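A simulation sketch of both parts (my own check; X and Y are taken to be exponential(1), matching the joint density e^{−x}e^{−y} above):

```python
import numpy as np

# X, Y i.i.d. exponential(1); U = X + Y, V = X - Y.
rng = np.random.default_rng(0)
x = rng.exponential(1.0, size=1_000_000)
y = rng.exponential(1.0, size=1_000_000)
u, v = x + y, x - y

# b): correlation of X+Y and X-Y should be ~0. Note they are NOT
# independent: the support constraint |v| <= u ties them together.
print("corr(U, V):", np.corrcoef(u, v)[0, 1])

# a) marginal check: integrating f(u, v) = (1/2)e^{-u} over -u <= v <= u
# gives u e^{-u}, the gamma(2, 1) density, so E(U) = Var(U) = 2.
print("E(U), Var(U):", u.mean(), u.var())
```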



7. The joint density of (X, Y) is
      f(x, y) = 2   if 0 < x < 1 and 0 < y < 1 and x + y < 1
              = 0   otherwise.
   a) The marginal density of X is
         f_X(x) = 2(1 − x)   if 0 < x < 1
                = 0 otherwise,
      so E(X) = 1/3, Var(X) = 1/18.
   b) Given Y = 1/3, X has uniform distribution on (0, 2/3), so
      E(X | Y = 1/3) = 1/3, Var(X | Y = 1/3) = (2/3)²/12 = 1/27.
   c) The density of max(X, Y) is
         f(z) = 4z         if 0 < z < 1/2
              = 4(1 − z)   if 1/2 < z < 1
              = 0 otherwise
      (i.e., it is triangular on (0, 1)), so max(X, Y) has mean 1/2 and variance 1/24.
   d) The density of min(X, Y) is
         f(w) = 4(1 − 2w)   if 0 < w < 1/2
              = 0 otherwise,
      so min(X, Y) has mean 1/6 and variance 1/72.
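A rejection-sampling sketch to confirm these moments (my own check, not from the text):

```python
import numpy as np

# Sample uniformly from the triangle 0 < x, 0 < y, x + y < 1 by rejecting
# points of the unit square on or above the diagonal.
rng = np.random.default_rng(0)
pts = rng.random((3_000_000, 2))
pts = pts[pts.sum(axis=1) < 1]
x, y = pts[:, 0], pts[:, 1]

print("E(X), Var(X):", x.mean(), x.var())          # ~1/3 and ~1/18
m, w = np.maximum(x, y), np.minimum(x, y)
print("max: mean, var:", m.mean(), m.var())        # ~1/2 and ~1/24
print("min: mean, var:", w.mean(), w.var())        # ~1/6 and ~1/72
```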

8. a) Given:
         f_Y(y) = 2e^{−2y},  y > 0
         f_X(x | Y = y) = (1/y)e^{−x/y},  x > 0.
      Hence f_{X,Y}(x, y) = f_X(x | Y = y)f_Y(y) = (2/y)e^{−x/y−2y}, x > 0, y > 0. Recall that if T has exponential distribution, then Var(T) = [E(T)]² and E(T²) = 2[E(T)]². So
         E(Y) = 1/2,  E(Y²) = 2(1/2)² = 1/2;
      and
         E(X | Y = y) = y,  E(X² | Y = y) = 2y².
   b) E(X | Y) = Y, so E(X) = E[E(X | Y)] = E(Y) = 1/2.
   c) E(XY | Y = y) = E(Xy | Y = y) = yE(X | Y = y) = y², so E(XY | Y) = Y². Hence
         E(XY) = E[E(XY | Y)] = E(Y²) = 1/2.
      Next, E(X² | Y) = 2Y², so
         E(X²) = E[E(X² | Y)] = E(2Y²) = 2E(Y²) = 1
      and
         Var(X) = E(X²) − [E(X)]² = 1 − (1/2)² = 3/4.
      Therefore
         Corr(X, Y) = Cov(X, Y)/(SD(X)SD(Y)) = [E(XY) − E(X)E(Y)]/√(Var(X)Var(Y))
                    = [(1/2) − (1/2)(1/2)]/√((3/4)(1/4)) = (1/4)/(√3/4) = 1/√3.
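A short simulation check of Corr(X, Y) = 1/√3 ≈ 0.577 (a sketch under the stated model: Y exponential with rate 2, and X exponential with mean y given Y = y):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.exponential(scale=0.5, size=1_000_000)   # rate 2 <=> scale 1/2
x = rng.exponential(scale=y)                     # given Y = y, mean y
print("corr:", np.corrcoef(x, y)[0, 1], "vs", 1 / np.sqrt(3))
```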


9. Condition on Z:
      P[(X/Y) > (Y/Z)] = P(X > Y²/Z) = ∫₀¹ P(X > Y²/z) dz.
   Now
      P(X > Y²/z) = ∫₀^{√z} (1 − y²/z) dy = [y − y³/(3z)]₀^{√z} = (2/3)√z.
   Substituting into the first integral,
      P[(X/Y) > (Y/Z)] = ∫₀¹ (2/3)√z dz = (2/3)(2/3) z^{3/2} |₀¹ = 4/9.
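Since X/Y > Y/Z is the same event as XZ > Y² (everything is positive), the answer 4/9 is easy to confirm by Monte Carlo (assuming, as the integrals above do, that X, Y, Z are independent uniform(0, 1)):

```python
import numpy as np

rng = np.random.default_rng(0)
x, y, z = rng.random((3, 2_000_000))
print((x * z > y**2).mean(), "vs", 4 / 9)   # ~0.444
```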

10. a) exponential (α + β + γ).
    b) γ/(α + β + γ).
    c) By the memoryless property of the exponential distributions of T_A and T_B, it's the same as the distribution of min(T_A, T_B), i.e., exponential (α + β).
    d) Call this time T₂. To get the c.d.f., use P(T₂ ≤ t) = 1 − P(T₂ > t), and condition on which component fails first:
       P(T₂ > t) = [α/(α + β + γ)]e^{−(β+γ)t} + [β/(α + β + γ)]e^{−(γ+α)t} + [γ/(α + β + γ)]e^{−(α+β)t}.
11. a) Let N_t = number of claims in time t. Given N_t = n,
          X_t = Y₁ + ··· + Y_n
       where the Y_i are independent exponential (μ), with mean 1/μ. So
          E[X_t | N_t = n] = n/μ
          E[X_t] = E[E[X_t | N_t]] = E(N_t/μ) = E(N_t)/μ = λt/μ.
    b) Similarly
          E[X_t² | N_t = n] = [E(Y₁ + ··· + Y_n)]² + Var(Y₁ + ··· + Y_n) = (n/μ)² + n/μ²,
       so
          E(X_t²) = E[(N_t/μ)² + N_t/μ²] = (1/μ²)[Var(N_t) + [E(N_t)]² + E(N_t)] = (λ²t² + 2λt)/μ².
    c) Var(X_t) = (λ²t² + 2λt)/μ² − (λt/μ)² = 2λt/μ², so SD(X_t) = √(2λt)/μ.
    d) For s < t,
          Cov(X_s, X_t) = Cov(X_s, X_s + X_t − X_s) = Cov(X_s, X_s) + Cov(X_s, X_t − X_s) = Var(X_s)
       because X_s and X_t − X_s are independent. So
          Corr(X_s, X_t) = [SD(X_s)]²/(SD(X_s)SD(X_t)) = SD(X_s)/SD(X_t) = √(s/t).
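A simulation sketch of the compound Poisson total (the parameter values λ = 2, μ = 0.5, s = 1, t = 4 are illustrative choices of mine, not from the exercise):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, mu, s, t, trials = 2.0, 0.5, 1.0, 4.0, 200_000

def totals(duration):
    """Claim total over an interval: Poisson count, exponential(mean 1/mu) claims."""
    n = rng.poisson(lam * duration, size=trials)
    out = np.zeros(trials)
    pos = n > 0
    out[pos] = rng.gamma(shape=n[pos], scale=1 / mu)  # sum of n exponentials
    return out

xs = totals(s)
xt = xs + totals(t - s)            # independent increment over (s, t]
print("E(X_t):", xt.mean(), "vs", lam * t / mu)
print("SD(X_t):", xt.std(), "vs", np.sqrt(2 * lam * t) / mu)
print("corr(X_s, X_t):", np.corrcoef(xs, xt)[0, 1], "vs", np.sqrt(s / t))
```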


12. a) Let X_i be the weight of the ith person.
          P(total weight > 4000) = P(Σ_{i=1}^{26} X_i > 4000)
                                 = 1 − Φ((4000 − 26(150))/(√26 × 30)) = 1 − Φ(0.65) = 1 − 0.7422 = 0.2578.
    b) Let W_i be the weight of the object the ith person is carrying.
          E(X_i + W_i) = E(X_i) + E[E(W_i | X_i)] = 150 + E(0.05X_i) = 157.5
          E[(X_i + W_i)²] = E(X_i²) + 2E(X_iW_i) + E(W_i²)
                          = E(X_i²) + 2E[E(X_iW_i | X_i)] + E[E(W_i² | X_i)]
                          = E(X_i²) + 2E(0.05X_i²) + E[(0.05X_i)² + 4]
                          = (1 + 2(0.05) + (0.05)²)(150² + 900) + 4 = 25802.5
          Var(X_i + W_i) = 25802.5 − (157.5)² = 996.25
          P(total weight > 4000) = P(Σ_{i=1}^{26} (X_i + W_i) > 4000)
                                 = 1 − Φ((4000 − 26(157.5))/√(26 × 996.25))
                                 = 1 − Φ(−0.59) = Φ(0.59) = 0.7224.
    c) Ignore any packing constraints and assume that the people will fit as long as the total area they need is at most 54 × 92 = 4968 square inches. We need
          Φ((4968 − 20μ)/(0.1μ√20)) = 0.99,  i.e.  (4968 − 20μ)/(0.1μ√20) = 2.33.
       Thus
          μ = 4968/(2.33(0.1)√20 + 20) ≈ 236 square inches.
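The three normal-approximation computations can be reproduced with scipy (a sketch; scipy's norm.cdf and norm.ppf stand in for the Φ table):

```python
from math import sqrt
from scipy.stats import norm

# a) 26 people, each weight mean 150, SD 30:
print(1 - norm.cdf((4000 - 26 * 150) / (sqrt(26) * 30)))       # ~0.258

# b) person plus carried object: mean 157.5, variance 996.25:
print(1 - norm.cdf((4000 - 26 * 157.5) / sqrt(26 * 996.25)))   # ~0.722

# c) solve Phi((4968 - 20*mu) / (0.1*mu*sqrt(20))) = 0.99 for mu:
z = norm.ppf(0.99)                                             # ~2.33
print(4968 / (z * 0.1 * sqrt(20) + 20))                        # ~236
```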
13. Cov(X − Y, X + Y) = E[(X − Y)(X + Y)] − E(X − Y)E(X + Y) = Var(X) − Var(Y),
    which is 0 iff Var(X) = Var(Y).
    a) Since Var(X) = Var(Y) here, X − Y and X + Y are uncorrelated by the above; and since X − Y and X + Y are jointly normal, they are independent, not merely uncorrelated. So
          P(X − Y < 1, X + Y > 2) = P(X − Y < 1)P(X + Y > 2)
                                  = Φ(1/SD(X − Y))[1 − Φ(2/SD(X + Y))]
                                  = (0.8687)(1 − 0.8687)
                                  = 0.114.
14. Pick a person at random from the population. Let H be that person's height, and let G (= m, f) be the gender of that person.

    a) E(H) = E(H | G = m)P(G = m) + E(H | G = f)P(G = f) = 67·(1/2) + 63·(1/2) = 65.


b)
   E(H²) = E(H² | G = m)P(G = m) + E(H² | G = f)P(G = f)
         = (3² + 67²)(1/2) + (3² + 63²)(1/2)
         = 4238


so Var(H) = 4238 − (65)² = 13. Or use
   Var(H) = E[Var(H | G)] + Var[E(H | G)] = (3² + 3²)·(1/2) + ((67 − 63)/2)² = 9 + 4 = 13.

c)
   P(63 < H < 67) = P(63 < H < 67 | G = m)P(G = m) + P(63 < H < 67 | G = f)P(G = f)
                  = (1/2)P((63 − 67)/3 < Z < (67 − 67)/3) + (1/2)P((63 − 63)/3 < Z < (67 − 63)/3)
                  = Φ(4/3) − Φ(0)
                  = 0.4088.

d) P(63 < H < 67) = Φ((67 − 65)/√13) − Φ((63 − 65)/√13) = 0.4209.


The answers are different because in c), the distribution of H is not normal: it's a mixture of normals, which results in a normal distribution only if the two distributions being mixed are identical. The answers to c) and d) differ only slightly because the distributions in c) and d) are both unimodal and symmetric about 65, with the same standard deviation, and we are seeking the probability of an event which is symmetric about 65.
Remark: You can show by calculus that a half-and-half mixture of a normal (a, σ²) distribution with a normal (b, σ²) distribution will have a bimodal density if and only if |a − b| > 2σ.
e) Let X be the man's height, and Y the woman's height. Argue that X − Y = X + (−Y) has normal distribution with mean 67 − 63 = 4 and variance 3² + 3² = 18, and therefore
      P(X > Y) = P(X − Y > 0) = P((X − Y − 4)/√18 > −4/√18) = Φ(4/√18) ≈ 0.83.
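A simulation sketch of parts c) and e) under the stated model (heights normal with SD 3, means 67 for men and 63 for women, half-and-half gender split):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
male = rng.random(n) < 0.5
h = np.where(male, rng.normal(67, 3, n), rng.normal(63, 3, n))
print("P(63 < H < 67):", ((h > 63) & (h < 67)).mean())   # ~0.409, part c)

x = rng.normal(67, 3, n)   # a man's height
y = rng.normal(63, 3, n)   # an independent woman's height
print("P(X > Y):", (x > y).mean())                       # ~0.83, part e)
```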

15. a) By rotational symmetry, the desired probability is one-quarter the chance that (X, Y) falls in the square {(x, y): |x| + |y| ≤ 1}. The square has side length √2, so after a 45° rotation its probability is seen to be [2Φ(1/√2) − 1]². Hence the desired probability is
          P(X ≥ 0, Y ≥ 0, X + Y ≤ 1) = (1/4)[2Φ(1/√2) − 1]² = [Φ(1/√2) − 1/2]².
    b) By the same argument, we have for t > 0
          P(X + Y ≤ t, X ≥ 0, Y ≥ 0) = [Φ(t/√2) − 1/2]²
       and
          P(X + Y ≤ t | X ≥ 0, Y ≥ 0) = [2Φ(t/√2) − 1]².
       Differentiate to get
          P(X + Y ∈ dt | X ≥ 0, Y ≥ 0) = 2√2 φ(t/√2)[2Φ(t/√2) − 1] dt.


c) For the median, find t* so that
      [2Φ(t*/√2) − 1]² = 1/2
      Φ(t*/√2) = (1 + 1/√2)/2 = 0.8536
      t*/√2 = 1.05,  t* = 1.49.
d) For the mode, ignore the 2√2, let x = t/√2 in the density, differentiate with respect to x and set the result equal to 0: We have
      (d/dx) φ(x)[2Φ(x) − 1] = φ′(x)[2Φ(x) − 1] + 2φ(x)²,
   where φ′(x) = −xφ(x). Thus x solves
      Φ(x) = φ(x)/x + 1/2.
   Trial and error gives x = 0.876901, hence the mode is t = √2 x = 1.240125.
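Instead of trial and error, the mode equation Φ(x) = φ(x)/x + 1/2 can be solved numerically (a sketch using scipy's root finder):

```python
from scipy.optimize import brentq
from scipy.stats import norm

g = lambda x: norm.cdf(x) - norm.pdf(x) / x - 0.5
x = brentq(g, 0.1, 3.0)      # g changes sign once on this interval
print(x, 2**0.5 * x)         # ~0.876901 and mode t ~ 1.240125
```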

16. a) Let X denote the rainfall in any particular year. X has gamma (r = 3, λ = 3/20) distribution. So
          P(X > 35) = P(N₃₅ ≤ 2) = e^{−5.25}(1 + 5.25 + 5.25²/2) ≈ 0.1051,
       where N₃₅ has Poisson ((3/20) × 35) = Poisson (5.25) distribution.
    b) (1 − 0.1051)¹⁰ ≈ 0.3294.
    c) 1 − [1 − P(X > R₂₀)]¹⁰; calculate P(X > R₂₀) as in part a).
    d) Let M₁ be the maximum rainfall over the period from 20 years ago to 10 years ago, let M₂ be the maximum rainfall over the last 10 years, and let M₃ be the maximum rainfall over the next 10 years. The M's are independent and identically distributed, by assumption. We want
       P(M₃ > max[M₁, M₂]), which is 1/3 by symmetry.
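Part a) is a one-liner with scipy's gamma and Poisson distributions (a sketch; scale = 20/3 is the reciprocal of the rate λ = 3/20):

```python
from scipy.stats import gamma, poisson

p = gamma.sf(35, a=3, scale=20 / 3)   # P(X > 35) for gamma(r=3, rate=3/20)
print(p)                              # ~0.1051
print(poisson.cdf(2, 5.25))           # same number via P(N_35 <= 2)
print((1 - p) ** 10)                  # part b): ~0.3294
```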

17. Fact: (X, Y) has a joint distribution which is symmetric under rotations if and only if, for all θ, (X, Y) has the same distribution as (X cos θ − Y sin θ, Y cos θ + X sin θ).
    Proof: The point (x, y), when rotated by θ about the origin, is mapped to the point (x cos θ − y sin θ, y cos θ + x sin θ).

    a) X and Y are not necessarily independent. For example (anticipating part (b)), let (X, Y) be a point (in the plane) picked uniformly at random from the unit circle {(x, y): x² + y² = 1}. Then, given X = x, the value of Y is constrained to at most two points, whereas unconditionally, Y has a continuous distribution on the interval (−1, 1).
       In general, X and Y are uncorrelated: Put θ = π in the above Fact to see that (X, Y) has the same distribution as (−X, −Y); hence E(X) = E(Y) = 0 (this is intuitively clear). Put θ = π/2 to see that (X, Y) has the same distribution as (−Y, X); hence X has the same distribution as Y. (Again, this is intuitively clear.) Put θ = π/4 to see that (X, Y) has the same distribution as ((X − Y)/√2, (X + Y)/√2). Hence
          E(XY) = E[((X − Y)/√2)((X + Y)/√2)] = E[(X² − Y²)/2] = 0 = E(X)E(Y).
       (The third equality follows from the fact that X has the same distribution as Y.)
    b) The distribution of (X, Y) is clearly symmetric under rotations. Since X has the same distribution as Y (see part (a)), we have E(X²) = E(Y²). But X² + Y² = 1. Hence E(X²) + E(X²) = 1, and E(X²) = E(Y²) = 1/2. Since X and Y are uncorrelated, we have E(XY) = 0 (see part (a)).


c) (X, Y) is uniformly distributed on the unit circle, as in (b).
   [Reason: Let Θ (−π < Θ ≤ π) be the angle subtended by the point (X, Y). Show that Θ is uniform on (−π, π): if −π/2 < θ < π/2, then
      P(θ < Θ < θ + dθ) = P(θ < tan⁻¹(Y/X) < θ + dθ)
                        = P(tan θ < Y/X < tan(θ + dθ))
                        = P(tan θ < tan 2πU < tan(θ + dθ))
                        = P(θ < 2πU < θ + dθ)
                        = (1/2π) dθ;
   and similarly for the remaining θ.]
   X and Y are uncorrelated, not independent: see part (a).

18. a) P(max ∈ dx) = 2Φ(x)φ(x) dx, hence
          E(max) = ∫_{−∞}^{∞} 2xΦ(x)φ(x) dx
                 = ∫_{−∞}^{∞} 2[−(d/dx)φ(x)]Φ(x) dx   (since φ′(x) = −xφ(x))
                 = ∫_{−∞}^{∞} 2φ(x)φ(x) dx            (integrating by parts)
                 = 1/√π.
       By symmetry, E(min) = −1/√π.
    b) By scaling from a),
          E(max) = μ + σ/√π,  E(min) = μ − σ/√π.
    c)
          P(max ∈ dx) = 2P(X ∈ dx)P(Y ≤ x | X = x) = 2φ(x)Φ((x − ρx)/√(1 − ρ²)) dx.
       Integrating by parts as in a) gives
          E(max) = √((1 − ρ)/π).
19. a) The fraction F of elements seen in the sample is
          F = (1/n) Σ_{i=1}^n I(element i is seen at least once).
       The indicators are identically distributed, so F has expectation
          E(F) = (1/n) × n × P(element 1 is seen at least once)
               = 1 − P(element 1 is not seen) = 1 − (1 − 1/n)ⁿ.
    b) About 1 − e⁻¹.
    c) The number of elements not seen is N = Σ_{i=1}^n J_i where J_i = I(element i not seen). The J_i's are identically distributed (not independent). Argue that
          Var(J_i) = E(J_i) − [E(J_i)]² ≤ 1/4
       and (if i ≠ k)
          Cov(J_i, J_k) = E(J_iJ_k) − E(J_i)E(J_k) = (1 − 2/n)ⁿ − (1 − 1/n)²ⁿ.
       Note that Cov(J_i, J_k) ≤ 0 for all n (if n ≥ 2, use the fact that 0 ≤ 1 − 2/n ≤ (1 − 1/n)²); so
          Var(N) = Var(Σ_{i=1}^n J_i) = nVar(J₁) + n(n − 1)Cov(J₁, J₂) ≤ n/4,
       and
          Var(F) = Var((n − N)/n) = Var(N)/n² ≤ 1/(4n).
    d) Chebychev gives
          P(|F − E(F)| > .01) ≤ Var(F)/(.01)² ≤ 2500/n.
       If n = 2.6 × 10⁶, this last is ≈ .00096.
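A small simulation of part a) (a sketch; n = 1000 is an arbitrary illustrative size):

```python
import numpy as np

# Draw n times with replacement from n elements; F = fraction seen.
rng = np.random.default_rng(0)
n, trials = 1000, 2000
fracs = [np.unique(rng.integers(0, n, size=n)).size / n for _ in range(trials)]
print(np.mean(fracs), 1 - (1 - 1 / n) ** n, 1 - np.exp(-1))   # all ~0.632
```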

20. Let X₀ be the result of the first roll. Then
       E(Y) = Σₓ E(Y | X₀ = x)P(X₀ = x).
    If x = 2, 3, 7, 11, or 12, then E(Y | X₀ = x) is 1. Otherwise it equals 1 + E(Gₓ), where Gₓ denotes the number of throws of a pair of dice required to see either an x or a 7. Gₓ has the geometric distribution on {1, 2, ...} with parameter p = P(x) + P(7); for x = 4, 5, 6 this is (x + 5)/36, so E(Gₓ) = 1/p = 36/(x + 5), and the cases x = 10, 9, 8 give the same values by symmetry. Hence
       E(Y) = 1·(12/36) + (1 + 36/9)·(3/36)·2 + (1 + 36/10)·(4/36)·2 + (1 + 36/11)·(5/36)·2 = 3.3757576.
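A direct simulation of the number of rolls agrees with 3.3757576 (a sketch of the play described above: stop at once on 2, 3, 7, 11, or 12; otherwise roll until the point x or a 7 appears):

```python
import numpy as np

rng = np.random.default_rng(0)

def rolls():
    x = rng.integers(1, 7) + rng.integers(1, 7)   # first roll of two dice
    count = 1
    if x in (2, 3, 7, 11, 12):
        return count
    while True:                                   # keep rolling until x or 7
        count += 1
        if rng.integers(1, 7) + rng.integers(1, 7) in (x, 7):
            return count

print(np.mean([rolls() for _ in range(200_000)]))  # ~3.376
```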
21. a) E(W_H) = 1/p.
    b) Let E(W_HH) = x. Then
          x = (1 − p)(1 + x) + p[2p + (2 + x)(1 − p)]
            = q + qx + 2p² + 2pq + xpq
          x = (q + 2pq + 2p²)/(1 − q − pq).
    c) W_HHH = k + W if W_T = k, k = 1, 2, 3, and W_HHH = 3 if W_T > 3, where given that W_T = k, the r.v. W has the same distribution as W_HHH. So, with x = E(W_HHH),
          x = (1 + x)q + (2 + x)pq + (3 + x)p²q + 3p³
            = q + 2pq + 3p²q + 3p³ + (q + pq + p²q)x
          x = (q + 2pq + 3p²q + 3p³)/(1 − q − pq − p²q).
    d) With x = E(W) for a run of m heads,
          x = Σ_{k=1}^m (k + x)p^{k−1}q + mp^m,
       so
          x = (Σ_{k=1}^m kp^{k−1}q + mp^m)/(1 − Σ_{k=1}^m p^{k−1}q).
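The formulas in b) and c) can be spot-checked at p = q = 1/2, where the classical answers are E(W_H) = 2, E(W_HH) = 6, E(W_HHH) = 14 (a sketch; plain arithmetic only):

```python
p = 0.5
q = 1 - p
w_hh = (q + 2 * p * q + 2 * p**2) / (1 - q - p * q)
w_hhh = (q + 2 * p * q + 3 * p**2 * q + 3 * p**3) / (1 - q - p * q - p**2 * q)
print(1 / p, w_hh, w_hhh)   # 2.0 6.0 14.0
```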
22. (i) Let Wₙ denote the number of trials until n heads in a row are observed. Then
           Wₙ = Wₙ₋₁ + 1        with probability 1/2
           Wₙ = Wₙ₋₁ + 1 + Wₙ′  with probability 1/2,
    where Wₙ′ is an independent copy of Wₙ. Therefore
           E(Wₙ) = (1/2)E(Wₙ₋₁ + 1) + (1/2)E(Wₙ₋₁ + 1 + Wₙ′),
    which implies
           (1/2)E(Wₙ) = E(Wₙ₋₁) + 1.
    Solve the above difference equation with E(W₁) = 2: get E(Wₙ) = 2ⁿ⁺¹ − 2.
    Put n = 100: then E(W₁₀₀) = 2¹⁰¹ − 2 seconds ≈ 8.04 × 10²² years.


(ii) Use Markov's inequality to get a rough answer: since P(W₁₀₀ > n) ≤ E(W₁₀₀)/n, it will do to take n such that
        E(W₁₀₀)/n = .01,  i.e.  n ≥ 10² × (2¹⁰¹ − 2) seconds,
     roughly 8.04 × 10²⁴ years.
23. Let X denote the total number of eggs laid, Y the number that hatch and Z the number that don't hatch. Then X = Y + Z.
       P(Y = y) = Σ_{x=y}^∞ C(x, y) p^y (1 − p)^{x−y} e^{−λ}λ^x/x!
                = e^{−pλ}(pλ)^y/y!.
    So Y has a Poisson (pλ) distribution. Similarly Z has a Poisson ((1 − p)λ) distribution.
    Independence:
       P(Y = y, Z = z) = P(Y = y, Z = z, X = y + z)   since (Y = y, Z = z) ⊂ (X = y + z)
                       = P(Y = y, Z = z | X = y + z)P(X = y + z)
                       = C(y + z, y) p^y (1 − p)^z × e^{−λ}λ^{y+z}/(y + z)!
                       = [e^{−pλ}(pλ)^y/y!][e^{−(1−p)λ}((1 − p)λ)^z/z!]
                       = P(Y = y)P(Z = z).
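A simulation sketch of this thinning result (λ = 10 and p = 0.3 are illustrative values of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, p, n = 10.0, 0.3, 1_000_000
x = rng.poisson(lam, n)       # eggs laid
y = rng.binomial(x, p)        # eggs that hatch
z = x - y                     # eggs that don't

print(y.mean(), y.var())               # both ~ p*lam = 3 (Poisson)
print(z.mean(), z.var())               # both ~ (1-p)*lam = 7 (Poisson)
print(np.corrcoef(y, z)[0, 1])         # ~0, consistent with independence
```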

24. T = Σᵢ i Nᵢ, where Nᵢ = number of dice that land i.
       E(T) = 42, Var(T) = 364, SD(T) = 19.08.
25. a) N₁ and N₂ are independent Poisson variables having parameters λp₁ and λp₂ respectively: Let n₁ and n₂ be nonnegative integers. Then for each n ≥ n₁ + n₂ we have
          P(N₁ = n₁, N₂ = n₂ | N = n)
            = P(n₁ of the accidents result in 1 injury and n₂ of the accidents result in 2 injuries | N = n)
            = [n!/(n₁!n₂!(n − n₁ − n₂)!)] p₁^{n₁} p₂^{n₂} (1 − p₁ − p₂)^{n−n₁−n₂}
       and
          P(N₁ = n₁, N₂ = n₂) = Σ_{n=n₁+n₂}^∞ P(N = n)P(N₁ = n₁, N₂ = n₂ | N = n)
            = [e^{−λ}(λp₁)^{n₁}(λp₂)^{n₂}/(n₁!n₂!)] e^{λ(1−p₁−p₂)}
            = [e^{−λp₁}(λp₁)^{n₁}/n₁!][e^{−λp₂}(λp₂)^{n₂}/n₂!].

b) We have M = Σ_{k=0}^∞ k N_k. Generalize part (a) to see that the variables N₀, N₁, N₂, ... are independent, and N_k has Poisson (λp_k) distribution. Therefore
      E(M) = Σ_{k=0}^∞ k E(N_k) = λ Σ_{k=0}^∞ k p_k
   and
      Var(M) = Σ_{k=0}^∞ k² Var(N_k) = λ Σ_{k=0}^∞ k² p_k.

26. Let N denote the number of points falling in the unit square. The key idea is that given N = n with n ≥ 1, the n points of the scatter are distributed like n points picked uniformly and independently at random from the square. (See the following exercise for a calculation which verifies this.) In particular, given N = 2, the two points are independently and uniformly distributed on the unit square. Draw a circle of radius δ about one of them. The chance that the other is outside is 1 − πδ², provided the first point is not within δ of a side of the square; if the first point is within δ of a side, then the chance is bigger than 1 − πδ². So overall, P(D | N = 2) ≥ 1 − πδ².
    Given N = 3, the three points are independent and uniformly distributed on the unit square. So
       P(D | N = 3) = P(two points not within δ) ×
                      P(third point more than δ away from both | first two not within δ)
                    = (1 − πδ²)[1 − P(third point within δ of one of the first two | first two not within δ)]
                    ≥ (1 − πδ²)(1 − 2πδ²)
                    ≥ 1 − (πδ² + 2πδ²) = 1 − 3πδ².
    In general,
       P(D | N = n) ≥ [1 − (πδ² + 2πδ² + ··· + (n − 2)πδ²)][1 − (n − 1)πδ²]
                    ≥ 1 − (1 + 2 + ··· + (n − 1))πδ² = 1 − [n(n − 1)/2]πδ².
    Therefore
       P(D) = Σ_{n=0}^∞ P(D | N = n)P(N = n) ≥ Σ_{n=0}^∞ [1 − n(n − 1)πδ²/2]P(N = n) = 1 − (πδ²/2)E[N(N − 1)];
    for a Poisson (λ) scatter, E[N(N − 1)] = λ², so P(D) ≥ 1 − λ²πδ²/2.

27. By considering B_{j+1} = S − ∪_{k=1}^{j} B_k, we may consider, without loss of generality, a partition B₁, ..., B_j of S. Since the X_i's are conditionally independent and identically distributed, given N = n, (N(B₁), ..., N(B_j)) has a multinomial distribution with parameters P(X_i ∈ B₁) = Q(B₁), ..., P(X_i ∈ B_j) = Q(B_j). Therefore
       P[N(B₁) = n₁, ..., N(B_j) = n_j]
         = P[N(B₁) = n₁, ..., N(B_j) = n_j, N = n₁ + ··· + n_j]   (since the B_k's partition S)
         = P[N(B₁) = n₁, ..., N(B_j) = n_j | N = n₁ + ··· + n_j] × P(N = n₁ + ··· + n_j)
         = [(n₁ + ··· + n_j)!/(n₁!···n_j!)][Q(B₁)]^{n₁} ··· [Q(B_j)]^{n_j} × e^{−λ}λ^{n₁+···+n_j}/(n₁ + ··· + n_j)!
         = [e^{−λQ(B₁)}(λQ(B₁))^{n₁}/n₁!] ··· [e^{−λQ(B_j)}(λQ(B_j))^{n_j}/n_j!].

28. a) P(X = Y) = P(X − Y = 0). Now X and Y are independent, each approximately normally distributed with mean N/2 and variance N/4, so X − Y is approximately normally distributed with mean 0 and variance N/2. Therefore
          P(X − Y = 0) ≈ 1/√(2π(N/2)) = 1/√(πN).
       To have P(X = Y) ≈ 1/10 we need N ≈ 32.
    b) P(X = k | X = Y) = P(X = k, X = Y)/P(X = Y) = P(X = k, Y = k)/P(X = Y) = P(X = k)²/P(X = Y).
       By the normal approximation, this is roughly
          [1/√(2π(N/4)) e^{−(k−N/2)²/(2(N/4))}]² / [1/√(πN)] = 1/√(2π(N/8)) e^{−(k−N/2)²/(2(N/8))}.
       That is, given X = Y, X has approximately normal distribution with mean N/2 and variance N/8: the same mean as unconditionally, but with variance half as big. So the conditional probability that |X − N/2| ≤ √N/2 is somewhat larger than 68%.
       Another way: The probability in question is approximately P(|X*| ≤ 1 | X* = Y*) where X* and Y* are independent standard normal variables. Slice the standard bivariate normal density along the line {(x, y): x = y}. The resulting curve is again a standard normal density, by rotational symmetry. The probability in question is approximately the area under this curve above the line segment {(x, y): x = y, |x| ≤ 1}, divided by the total area under the curve (which is 1). So the probability in question is approximately the area under the standard normal density from −√2 to √2, which is greater than the area from −1 to 1.
       Yet another way: Let X*, Y* be independent standard normal variables.
       Let U = (X* + Y*)/√2, V = (X* − Y*)/√2. Then U and V are again independent standard normal variables, so
          P(X* ∈ dx | X* = Y*) = P(U/√2 ∈ dx | V = 0) = P(U/√2 ∈ dx).
       That is, conditional on X* = Y*, X* has normal distribution with mean 0 and variance 1/2.
That is, conditional on X*= Y *• X* lias .normal distribution with mean 0 and variance 1/2.
29. a) Fix k ≥ 1, and suppose N ≥ k + 1. Let Good₁ denote the event that there is a good element in the first place, Bad₁ the event that there is a bad element in the first place. Given Good₁, T₁ − 1 equals zero; and given Bad₁, T₁ − 1 has the same distribution as T₁*, where T₁* is the place of the first good element in a random arrangement of k good elements among N − 1 elements. So
          E[(T₁ − 1)(T₁ − 2) | Bad₁] = E[T₁*(T₁* − 1)] = a(k, N − 1),
       and
          a(k, N) = E[T₁(T₁ − 1)]
                  = E[T₁(T₁ − 1) | Bad₁]·P(Bad₁) + 0
                  = E[(T₁ − 1)(T₁ − 2) + 2(T₁ − 1) | Bad₁]·P(Bad₁)
                  = {E[(T₁ − 1)(T₁ − 2) | Bad₁] + 2E[T₁ − 1 | Bad₁]}·P(Bad₁)
                  = [a(k, N − 1) + 2μ(k, N − 1)]·(N − k)/N
                  = [a(k, N − 1) + 2N/(k + 1)]·(N − k)/N.


b) Claim: For each k ≥ 1, the formula a(k, N) = 2(N + 1)(N − k)/((k + 1)(k + 2)) holds for each N = k, k + 1, k + 2, ....
   Proof: Induction on N. If N = k, then T₁ = 1, so T₁(T₁ − 1) = 0, so
      a(k, N) = E[T₁(T₁ − 1)] = 0 = 2(N + 1)(N − k)/((k + 1)(k + 2)).
   If the claim holds for N = m, where m ≥ k, then
      a(k, m + 1) = [(m + 1 − k)/(m + 1)]·[2(m + 1)/(k + 1) + a(k, m)]
                  = (m + 1 − k)[2/(k + 1) + 2(m − k)/((k + 1)(k + 2))]
                  = 2(m + 2)(m + 1 − k)/((k + 1)(k + 2)),
   so the claim holds for N = m + 1.
c)
   Var(T₁) = E(T₁²) − (ET₁)²
           = E[T₁(T₁ − 1)] − (ET₁)² + ET₁
           = 2(N + 1)(N − k)/((k + 1)(k + 2)) − ((N + 1)/(k + 1))² + (N + 1)/(k + 1)
           = [(N + 1)/((k + 1)²(k + 2))]·[2(N − k)(k + 1) − (N + 1)(k + 2) + (k + 1)(k + 2)]
           = (N + 1)(N − k)k/((k + 1)²(k + 2)).
d) If k = 1 then T₁ has uniform distribution on 1, ..., N, so Var(T₁) = (N² − 1)/12, which agrees with the formula in (c).
e) Let k ≥ 1. Write T_i = W₁ + ··· + W_i, where the W_j are the spacings between successive good elements. If 1 ≤ i ≤ k + 1 then
      Var(T_i) = Var(Σ_{j=1}^i W_j) = iVar(W₁) + 2·[i(i − 1)/2]·Cov(W₁, W₂) = iVar(T₁) + i(i − 1)Cov(W₁, W₂)
   by exchangeability. Put i = k + 1 to get
      0 = Var(N + 1) = (k + 1)Var(T₁) + (k + 1)k Cov(W₁, W₂)
      ⟺ Cov(W₁, W₂) = −Var(T₁)/k.
   So
      Var(T_i) = iVar(T₁) + i(i − 1)Cov(W₁, W₂)
               = iVar(T₁)[1 − (i − 1)/k]
               = [i(k + 1 − i)/k] × (N + 1)(N − k)k/((k + 1)²(k + 2))
               = i(k + 1 − i)(N + 1)(N − k)/((k + 1)²(k + 2)).

f) The distribution of T_i is the same as the distribution of N + 1 − T_{k+1−i}, since T₁, ..., T_k would have the same distribution if we had "read" the N elements in the reverse order.


g) Here k = 4, N = 52, so
      E(T_i) = i(N + 1)/(k + 1) = 53i/5 = 10.6i   and
      Var(T_i) = i(5 − i)·(N + 1)(N − k)/((k + 1)²(k + 2)) = 16.96 i(5 − i).

      i    E(T_i)    Var(T_i)    SD(T_i)
      1    10.6       67.84       8.24
      2    21.2      101.76      10.09
      3    31.8      101.76      10.09
      4    42.4       67.84       8.24
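A quick simulation of part g) (a sketch: shuffle a 52-card deck and record the positions of the 4 aces):

```python
import numpy as np

rng = np.random.default_rng(0)
Ts = np.array([np.sort(rng.choice(52, size=4, replace=False)) + 1
               for _ in range(100_000)])         # T_1 < T_2 < T_3 < T_4
print(Ts.mean(axis=0))   # ~[10.6, 21.2, 31.8, 42.4]
print(Ts.std(axis=0))    # ~[ 8.24, 10.09, 10.09,  8.24]
```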

30. Write A_ave for the sample average of the n uniform(0, 1) observations, A_ext = (V₁ + V_n)/2 for the average of the extremes, and A_med for the sample median.
    a) For large n, A_ave has approximately the normal (1/2, 1/(12n)) distribution.
          P(|A_ave − 1/2| < ε) = P(|A_ave − 1/2|/√(1/12n) < ε/√(1/12n))
                               ≈ Φ(√(12n) ε) − Φ(−√(12n) ε) → 1 as n → ∞.
       The distribution of order statistics of independent uniforms is described in Section 4.6: V_(k) has beta (k, n − k + 1) distribution.
          E(A_ext) = (1/2)E(V₁ + V_n) = (1/2)(1/(n + 1) + n/(n + 1)) = 1/2
          E(A_ext²) = (1/4)E(V₁²) + (1/2)E(V₁V_n) + (1/4)E(V_n²)
                    = (1/4)·2/((n + 1)(n + 2)) + (1/2)E(V₁V_n) + (1/4)·n/(n + 2)
          E(V₁V_n) = E[E(V₁V_n | V₁)] = E[V₁E(V_n | V₁)]
                   = E[V₁(V₁ + ((n − 1)/n)(1 − V₁))] = (1/n)E(V₁²) + ((n − 1)/n)E(V₁)
                   = 2/(n(n + 1)(n + 2)) + (n − 1)/(n(n + 1)) = 1/(n + 2)
          E(A_ext²) = 1/(2(n + 1)(n + 2)) + 1/(2(n + 2)) + n/(4(n + 2)) = (n² + 3n + 4)/(4(n + 1)(n + 2))
          Var(A_ext) = (n² + 3n + 4)/(4(n + 1)(n + 2)) − (1/2)² = 1/(2(n + 1)(n + 2)).
       Now use Chebychev's inequality to establish
          P(|A_ext − 1/2| < ε) ≥ 1 − Var(A_ext)/ε² = 1 − 1/(2(n + 1)(n + 2)ε²) → 1 as n → ∞.
       For n odd, the median has beta ((n + 1)/2, (n + 1)/2) distribution, so
          E(A_med) = ((n + 1)/2)/(n + 1) = 1/2
          E(A_med²) = [((n + 1)/2)((n + 1)/2 + 1)]/[(n + 1)(n + 2)] = (n + 3)/(4(n + 2))
          Var(A_med) = (n + 3)/(4(n + 2)) − (1/2)² = 1/(4(n + 2)).
       Again, using Chebychev's inequality,
          P(|A_med − 1/2| < ε) ≥ 1 − Var(A_med)/ε² = 1 − 1/(4(n + 2)ε²) → 1 as n → ∞.
    b) A_ext is expected to be much closer to 1/2, since its variance is much smaller (of order 1/n² rather than 1/n).


c) The Central Limit Theorem allows you to use the normal approximation for A_ave. The beta (r, s) distribution is approximately normal for r and s large, so A_med can also be approximated by the normal distribution. With n = 101:
      P(0.49 < A_ave < 0.51) ≈ Φ(0.01/√(1/(12·101))) − Φ(−0.01/√(1/(12·101)))
                             = Φ(0.35) − Φ(−0.35) = 0.2736
      P(0.49 < A_med < 0.51) ≈ Φ(0.01/√(1/(4·103))) − Φ(−0.01/√(1/(4·103)))
                             = Φ(0.21) − Φ(−0.21) = 0.1664.
   Another method is necessary for A_ext because its distribution is not close to normal. The pair (V₁, V_n) has joint density
      f(x, y) = n(n − 1)(y − x)^{n−2}   (0 < x < y < 1).
   Let Z = V₁ + V_n. Then for 0 ≤ z ≤ 1,
      f_Z(z) = ∫₀^{z/2} n(n − 1)(z − 2x)^{n−2} dx = (n/2)z^{n−1}.
   For 1 ≤ z ≤ 2,
      f_Z(z) = ∫_{z−1}^{z/2} n(n − 1)(z − 2x)^{n−2} dx = −(n/2)(z − 2x)^{n−1} |_{x=z−1}^{x=z/2} = (n/2)(2 − z)^{n−1}.
   Since A_ext = Z/2, it has density
      f(x) = n(2x)^{n−1}       for 0 ≤ x ≤ 1/2
           = n(2 − 2x)^{n−1}   for 1/2 ≤ x ≤ 1.
   Now the probability can be computed directly:
      P(0.49 < A_ext < 0.51) = 2 ∫_{0.49}^{0.50} (101)2^{100}x^{100} dx = 2^{101}x^{101} |_{x=0.49}^{0.50}
                             = 1 − (0.98)^{101} = 0.8700.
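A simulation sketch of part c) comparing the three estimators at n = 101:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 101, 100_000
v = rng.random((trials, n))
ave = v.mean(axis=1)
ext = (v.min(axis=1) + v.max(axis=1)) / 2
med = np.median(v, axis=1)
for name, a in [("A_ave", ave), ("A_ext", ext), ("A_med", med)]:
    print(name, ((0.49 < a) & (a < 0.51)).mean())
# expect roughly 0.27, 0.87, 0.17 respectively
```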
31. No Solution

32. No Solution

33. No Solution

34. No Solution

35. No Solution
