
4. (a) Find E(2^X) for X ∼ Pois(λ) (simplify).

By LOTUS,

    E(2^X) = \sum_{k=0}^{\infty} 2^k e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=0}^{\infty} \frac{(2\lambda)^k}{k!} = e^{-\lambda} e^{2\lambda} = e^{\lambda}.

(b) Let X and Y be independent Pois(λ) r.v.s, and T = X + Y. Later in the course, we will show that T ∼ Pois(2λ); here you may use this fact. Find the conditional distribution of X given T = n, i.e., find the conditional PMF P(X = k | T = n) (simplify). Which "important distribution" is this conditional distribution, if any?

For k ∈ {0, 1, ..., n},

    P(X = k \mid T = n) = \frac{P(X = k, X + Y = n)}{P(T = n)} = \frac{P(X = k)\,P(Y = n-k)}{P(T = n)} = \frac{e^{-\lambda} \frac{\lambda^k}{k!} \cdot e^{-\lambda} \frac{\lambda^{n-k}}{(n-k)!}}{e^{-2\lambda} \frac{(2\lambda)^n}{n!}} = \binom{n}{k} \frac{1}{2^n},

which is the PMF of the Bin(n, 1/2) distribution.

(c) Again let X and Y be Pois(λ) r.v.s, and T = X + Y, but now assume that X and Y are not independent, and in fact X = Y. Prove or disprove the claim that T ∼ Pois(2λ) in this scenario.

The r.v. T = 2X is not Poisson: it can only take even values 0, 2, 4, 6, ..., whereas any Poisson r.v. has positive probability of being any of 0, 1, 2, 3, .... Alternatively, we can compute the PMF of 2X, or note that Var(2X) = 4λ ≠ 2λ = E(2X), whereas for any Poisson r.v. the variance equals the mean.
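Both parts can be sanity-checked numerically. Below is a minimal simulation sketch (not part of the original solution), assuming NumPy is available; the values λ = 1.3 and n = 4 are arbitrary choices for illustration.

    import math
    import numpy as np

    rng = np.random.default_rng(0)
    lam = 1.3
    X = rng.poisson(lam, size=10**6)
    Y = rng.poisson(lam, size=10**6)

    # (a) By LOTUS, E(2^X) should be close to e^lambda.
    print(np.mean(2.0 ** X), np.exp(lam))          # both approx 3.67

    # (b) Conditional PMF of X given X + Y = n, compared with Bin(n, 1/2).
    n = 4
    cond = X[X + Y == n]
    for k in range(n + 1):
        print(k, np.mean(cond == k), math.comb(n, k) / 2**n)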

2. A group of 360 people are going to be split into 120 teams of 3 (where the order of teams and the order within a team don't matter).

(a) How many ways are there to do this (simplified, in terms of factorials)?

Similarly to the Strategic Practice problem about partnerships, imagine lining the people up and saying the first 3 are a team, the next 3 are a team, etc. This overcounts by a factor of (3!)^{120} · 120! since the order within teams and the order of teams don't matter. So the number of ways is

    \frac{360!}{6^{120} \cdot 120!}.

(b) The 360 people consist of 180 married couples, and a random split into teams is chosen, with all possible splits equally likely. Find the expected number of teams containing married couples. (You can leave your answer in terms of binomial coefficients and a product of a few terms, but you should not have summations or complicated expressions in your final answer.)

Let I_j be the indicator for the jth team having a married couple (taking the teams to be chosen one at a time, or with respect to a random ordering). By symmetry and linearity, the desired quantity is 120E(I_1). We have

    E(I_1) = P(\text{first team has a married couple}) = \frac{180 \cdot 358}{\binom{360}{3}},

since the first team is equally likely to be any 3 of the people, and to have a married couple on the team we need to choose a couple and then any third person. So the expected value is

    \frac{120 \cdot 180 \cdot 358}{\binom{360}{3}}.

(This simplifies to \frac{120 \cdot 180 \cdot 358}{360 \cdot 359 \cdot 358 / 6} = \frac{360}{359}. Another way to find the probability that the first team has a married couple is to note that any particular pair in the team has probability 1/359 of being married to each other, so since there are 3 disjoint possibilities the probability is 3/359.)
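As a check on the answer 360/359 ≈ 1.003, here is a minimal simulation sketch (not part of the original solution), assuming NumPy; couples are encoded as consecutive pairs of labels, which is a hypothetical convention introduced just for this check.

    import numpy as np

    rng = np.random.default_rng(0)
    trials, total = 10**4, 0
    for _ in range(trials):
        order = rng.permutation(360)             # persons 2i and 2i+1 form couple i
        couple_ids = order.reshape(120, 3) // 2  # couple ID of each team member
        # a team contains a married couple iff two of its couple IDs coincide
        total += sum(len(set(team)) < 3 for team in couple_ids)
    print(total / trials, 360 / 359)             # both approx 1.003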

3. Let X ∼ Bin(100, 0.9). For each of the following parts, construct an example showing that it is possible, or explain clearly why it is impossible.

(a) Is it possible to have Y ∼ Pois(0.01) with P(X ≥ Y) = 1?

This is impossible since there is a nonzero chance that Y is greater than 100, whereas X must be less than or equal to 100.

(b) Is it possible to have Y ∼ Bin(100, 0.5) with P(X ≥ Y) = 1?

This is possible since we can have a sequence of trials where we "set the bar" for success at 2 different places. That is, X and Y are based on the same trials but with X having an easier threshold for success. For example, consider U_1, ..., U_100 i.i.d. Unif(0,1), and define "success" on the jth trial to be U_j ≤ 0.9 for X and U_j ≤ 0.5 for Y. As another example, consider the chicken and egg story, letting the probability of hatching be 0.9 and of hatching and surviving be 0.5, and let X be the number of chicks that hatch and Y be the number of chicks that survive.

(c) Is it possible to have Y ∼ Bin(100, 0.5) with P(X ≤ Y) = 1?

This is impossible since if Y − X ≥ 0 has probability 1, then E(Y − X) ≥ 0 (an average of nonnegative numbers can't be negative!). But this would imply E(Y) ≥ E(X), contradicting E(X) = 90, E(Y) = 50. Intuitively, it would be ridiculous if a success probability of 0.5 rather than 0.9 on the same trials could somehow guarantee doing at least as well. Alternatively, note that P(X ≤ Y) = 1 would imply P(X = 0 | Y = 0) = 1, but we have

    P(X = 0 \mid Y = 0) = \frac{P(X = 0, Y = 0)}{P(Y = 0)} \le \frac{P(X = 0)}{P(Y = 0)} < 1.

6 Stat 110 Midterm from 2012

1. A book has N typos. Two proofreaders, Prue and Frida, independently read the book. Prue catches each typo with probability p_1 and misses it with probability q_1 = 1 − p_1, independently, and likewise for Frida, who has probabilities p_2 of catching and q_2 = 1 − p_2 of missing each typo. Let X_1 be the number of typos caught by Prue, X_2 be the number caught by Frida, and X be the number caught by at least one of the two proofreaders.

(a) Treating N as a known constant, find the distribution of X (simplify).

Each typo is missed by both proofreaders with probability q_1 q_2, independently across typos, so by the story of the Binomial, X ∼ Bin(N, 1 − q_1 q_2).

(b) For this part only, assume that p_1 = p_2. Find the conditional distribution of X_1 given that X_1 + X_2 = t, again treating N as a known constant (simplify).

This has the exact same structure as the Fisher exact test, so X_1 | (X_1 + X_2 = t) ∼ HGeom(N, N, t). Alternatively, use Bayes' rule directly. Let p = p_1 = p_2, q = 1 − p, and T = X_1 + X_2 ∼ Bin(2N, p). Then

    P(X_1 = k \mid T = t) = \frac{P(T = t \mid X_1 = k)\,P(X_1 = k)}{P(T = t)} = \frac{\binom{N}{t-k} p^{t-k} q^{N-(t-k)} \binom{N}{k} p^k q^{N-k}}{\binom{2N}{t} p^t q^{2N-t}} = \frac{\binom{N}{k}\binom{N}{t-k}}{\binom{2N}{t}}

for k ∈ {0, 1, ..., t}, so again the conditional distribution is HGeom(N, N, t).

(c) Now suppose that N ∼ Pois(λ). Find P(X_1 = a, N − X_1 = b) for all nonnegative integers a, b (simplify).

This has the exact same structure as the chicken-egg story. So X_1 ∼ Pois(λp_1) and N − X_1 ∼ Pois(λq_1), independently. Thus,

    P(X_1 = a, N - X_1 = b) = \frac{e^{-\lambda p_1} (\lambda p_1)^a}{a!} \cdot \frac{e^{-\lambda q_1} (\lambda q_1)^b}{b!}.
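A minimal simulation sketch (not part of the original solutions) to check parts (a) and (c) of the proofreader problem, assuming NumPy; the values N = 20, p_1 = 0.7, p_2 = 0.4, and λ = 5 are arbitrary.

    import numpy as np

    rng = np.random.default_rng(0)
    p1, p2 = 0.7, 0.4
    q1, q2 = 1 - p1, 1 - p2

    # (a) With N fixed, each typo is caught by at least one reader with
    # probability 1 - q1*q2, so X ~ Bin(N, 1 - q1*q2).
    N = 20
    caught = (rng.random((10**5, N)) < p1) | (rng.random((10**5, N)) < p2)
    print(np.mean(caught.sum(axis=1)), N * (1 - q1 * q2))   # both approx 16.4

    # (c) With N ~ Pois(lam), chicken-egg: X1 ~ Pois(lam*p1) and
    # N - X1 ~ Pois(lam*q1), independently.
    lam = 5.0
    Ns = rng.poisson(lam, size=10**5)
    X1 = rng.binomial(Ns, p1)                # typos caught by Prue
    print(np.mean(X1), lam * p1)             # both approx 3.5
    print(np.mean(Ns - X1), lam * q1)        # both approx 1.5
    print(np.corrcoef(X1, Ns - X1)[0, 1])    # approx 0, consistent with independence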

2. A survey is being conducted in a city with a million (10^6) people. A sample of size 1000 is collected by choosing people in the city at random, with replacement and with equal probabilities for everyone in the city.

(a) Find the probability that no one gets chosen twice (do not simplify).

This has the exact same structure as the birthday problem. By the naive definition of probability, the probability of no match is

    \frac{10^6 (10^6 - 1)(10^6 - 2) \cdots (10^6 - 999)}{(10^6)^{1000}} = \left(1 - \frac{1}{10^6}\right)\left(1 - \frac{2}{10^6}\right) \cdots \left(1 - \frac{999}{10^6}\right).

(b) Find a simple, accurate approximation to the probability from (a) (don't spend much time on arithmetic, but for full credit you should obtain a simple answer, not involving any big numbers. Indicator r.v.s are useful here, but creating 1 indicator for each of the million people is not recommended since it leads to a messy calculation. Feel free to use the fact that 999 ≈ 1000).

Let I_{ij} be the indicator of the ith and jth sampled people being the same person, for 1 ≤ i < j ≤ 10^3, and let X be the sum of the I_{ij}. We want to approximate P(X = 0). By symmetry and linearity,

    E(X) = \binom{1000}{2} \frac{1}{10^6} = \frac{1000 \cdot 999}{2 \cdot 10^6} \approx \frac{1}{2}.

By the Poisson Paradigm, X is approximately Pois(λ) with λ = E(X). So

    P(X = 0) \approx e^{-\lambda} = e^{-1/2}.

(It turns out that (a) evaluates to 0.6067 and this approximation evaluates to 0.6065, so the approximation is very accurate. It is also surprising that there is almost a 40% chance of sampling someone twice, even though the sample size is only 0.1% of the population size!)
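A minimal simulation sketch (not part of the original solution) of the probability in part (a), assuming NumPy, for comparison with e^{−1/2} ≈ 0.6065.

    import math
    import numpy as np

    rng = np.random.default_rng(0)
    trials = 10**4
    no_repeat = sum(
        len(np.unique(rng.integers(10**6, size=1000))) == 1000
        for _ in range(trials)
    )
    print(no_repeat / trials, math.exp(-0.5))   # both approx 0.607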
10 Stat 110 Midterm from 2017

1. Monty Hall is trying out a new version of his game, with rules as follows. The contestant gets to choose one of four doors. One door has a car behind it, another has an apple, another has a book, and another has a goat. All 24 permutations for which door has which prize are equally likely. In order from least preferred to most preferred, the contestant's preferences are: goat, apple, book, car.

Monty, who knows which prize is behind each door, will open a door (other than the contestant's initial choice) and then let the contestant choose whether to switch to another unopened door. Monty will reveal the least preferred prize (among the 3 doors other than the contestant's initial choice) with probability p, the intermediately preferred prize with probability 1 − p, and the most preferred prize never.

The contestant decides in advance to use the following strategy: Initially choose Door 1. After Monty opens a door, switch to one of the other two unopened doors, randomly choosing between them (with probability 1/2 each).

(a) Find the unconditional probability that the contestant will get the car. (Simplify.)

Let W be the event that the contestant wins the car, and G_j, A_j, B_j, and C_j be the events that Door j has a goat, apple, book, and car, respectively. If the car is behind Door 1, switching guarantees a loss; otherwise Monty never reveals the car, so the random switch lands on the car with probability 1/2. By LOTP,

    P(W) = \sum_{j=1}^{4} P(W \mid C_j) P(C_j) = \left(0 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2}\right) \frac{1}{4} = \frac{3}{8}.

(b) Find the unconditional probability that Monty will reveal the apple. (Simplify.)

Let A be the event that Monty reveals the apple. By LOTP,

    P(A) = P(A \mid G_1)P(G_1) + P(A \mid A_1)P(A_1) + P(A \mid B_1)P(B_1) + P(A \mid C_1)P(C_1) = \frac{p + 0 + (1-p) + (1-p)}{4} = \frac{2-p}{4}.

(c) Monty now opens a door, revealing the apple. Given this information, find the conditional probability that the contestant will get the car. (Simplify.)

By Bayes' rule,

    P(C_1 \mid A) = \frac{P(A \mid C_1) P(C_1)}{P(A)} = \frac{(1-p)/4}{(2-p)/4} = \frac{1-p}{2-p}.

For the contestant to win the car, C_1 must not occur, and then the contestant's random choice of which door to switch to must be the door with the car. So

    P(W \mid A) = \left(1 - \frac{1-p}{2-p}\right) \cdot \frac{1}{2} = \frac{1}{2(2-p)}.
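All three answers can be checked together with a minimal simulation sketch (not part of the original solution), assuming NumPy; p = 0.3 is an arbitrary choice, and encoding prizes by preference rank (0 = goat, 1 = apple, 2 = book, 3 = car) is a convention introduced just for this check.

    import numpy as np

    rng = np.random.default_rng(0)
    p, trials = 0.3, 10**5
    wins = apples = wins_and_apple = 0
    for _ in range(trials):
        doors = rng.permutation(4)      # doors[0] is behind the contestant's Door 1
        others = sorted(doors[1:])      # the three unchosen prizes, by preference
        # Monty reveals the least preferred with prob p, else the intermediate one
        reveal = others[0] if rng.random() < p else others[1]
        unopened = [x for x in doors[1:] if x != reveal]
        final = unopened[rng.integers(2)]   # switch uniformly at random
        wins += final == 3
        apples += reveal == 1
        wins_and_apple += (final == 3) and (reveal == 1)
    print(wins / trials, 3 / 8)                         # P(W)
    print(apples / trials, (2 - p) / 4)                 # P(A)
    print(wins_and_apple / apples, 1 / (2 * (2 - p)))   # P(W | A)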

Joshua Doolan's Cheat Sheet

Discrete Distributions

Distributions for four sampling schemes:

                              Replace            No Replace
    Fixed # trials (n)        Binomial           HGeom
                              (Bern if n = 1)
    Draw until r successes    NBin               NHGeom
                              (Geom if r = 1)

Bernoulli Distribution

The Bernoulli distribution is the simplest case of the Binomial distribution, where we only have one trial (n = 1). Let us say that X is distributed Bern(p). We know the following:

Story: A trial is performed with probability p of "success", and X is the indicator of success: 1 means success, 0 means failure.

Example: Let X be the indicator of Heads for a fair coin toss. Then X ∼ Bern(1/2). Also, 1 − X ∼ Bern(1/2) is the indicator of Tails.

Binomial Distribution

Let us say that X is distributed Bin(n, p). We know the following:

Story: X is the number of "successes" that we will achieve in n independent trials, where each trial is either a success or a failure, each with the same probability p of success. We can also write X as a sum of multiple independent Bern(p) random variables. Let X ∼ Bin(n, p) and X_j ∼ Bern(p), where all of the Bernoullis are independent. Then

    X = X_1 + X_2 + X_3 + \cdots + X_n.

Example: If Jeremy Lin takes 10 free throws and each one independently has a 3/4 chance of going in, then the number of free throws he makes is distributed Bin(10, 3/4).

Properties: Let X ∼ Bin(n, p), Y ∼ Bin(m, p) with X ⊥ Y.

• Redefine success: n − X ∼ Bin(n, 1 − p)
• Sum: X + Y ∼ Bin(n + m, p)
• Conditional: X | (X + Y = r) ∼ HGeom(n, m, r)
• Binomial-Poisson Relationship: Bin(n, p) is approximately Pois(λ) with λ = np if p is small.
• Binomial-Normal Relationship: Bin(n, p) is approximately N(np, np(1 − p)) if n is large and p is not near 0 or 1.

Geometric Distribution

Let us say that X is distributed Geom(p). We know the following:

Story: X is the number of "failures" that we will have before we achieve our first success. Our successes have probability p.

Example: If each pokeball we throw has probability 1/10 to catch Mew, the number of failed pokeballs will be distributed Geom(1/10).

First Success Distribution

Equivalent to the Geometric distribution, except that it includes the first success in the count. This is 1 more than the number of failures. If X ∼ FS(p) then E(X) = 1/p.

Negative Binomial Distribution

Let us say that X is distributed NBin(r, p). We know the following:

Story: X is the number of "failures" that we will have before we achieve our rth success. Our successes have probability p.

Example: Thundershock has 60% accuracy and can faint a wild Raticate in 3 hits. The number of misses before Pikachu faints Raticate with Thundershock is distributed NBin(3, 0.6).

Hypergeometric Distribution

Let us say that X is distributed HGeom(w, b, n). We know the following:

Story: In a population of w desired objects and b undesired objects, X is the number of "successes" we will have in a draw of n objects, without replacement. The draw of n objects is assumed to be a simple random sample (all sets of n objects are equally likely).

Examples: Here are some HGeom examples.

• Let's say that we have only b Weedles (failure) and w Pikachus (success) in Viridian Forest. We encounter n Pokemon in the forest, and X is the number of Pikachus in our encounters.
• The number of Aces in a 5-card hand.
• You have w white balls and b black balls, and you draw n balls without replacement. The number of white balls in your sample is HGeom(w, b, n); the number of black balls is HGeom(b, w, n).
• Capture-recapture: A forest has N elk; you capture n of them, tag them, and release them. Then you recapture a new sample of size m. The number of tagged elk in the new sample is HGeom(n, N − n, m).

Poisson Distribution

Let us say that X is distributed Pois(λ). We know the following:

Story: There are rare events (low probability events) that occur many different ways (high possibilities of occurrences) at an average rate of λ occurrences per unit space or time. The number of events that occur in that unit of space or time is X.

Example: A certain busy intersection has an average of 2 accidents per month. Since an accident is a low probability event that can happen many different ways, it is reasonable to model the number of accidents in a month at that intersection as Pois(2). Then the number of accidents that happen in two months at that intersection is distributed Pois(4).

Properties: Let X ∼ Pois(λ_1) and Y ∼ Pois(λ_2), with X ⊥ Y.

1. Sum: X + Y ∼ Pois(λ_1 + λ_2)
2. Conditional: X | (X + Y = n) ∼ Bin(n, λ_1/(λ_1 + λ_2))
3. Chicken-egg: If there are Z ∼ Pois(λ) items and we randomly and independently "accept" each item with probability p, then the number of accepted items Z_1 ∼ Pois(λp), the number of rejected items Z_2 ∼ Pois(λ(1 − p)), and Z_1 ⊥ Z_2.

Formulas

Geometric Series:

    1 + r + r^2 + \cdots + r^{n-1} = \sum_{k=0}^{n-1} r^k = \frac{1 - r^n}{1 - r}

    1 + r + r^2 + \cdots = \frac{1}{1 - r} \quad \text{if } |r| < 1

Exponential Function (e^x):

    e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n

Example Problems

Contributions from Sebastian Chiu

Calculating Probability: A textbook has n typos, which are randomly scattered amongst its n pages, independently. You pick a random page. What is the probability that it has no typos? Answer: There is a 1 − 1/n probability that any specific typo isn't on your page, and thus a (1 − 1/n)^n probability that there are no typos on your page. For n large, this is approximately e^{−1} = 1/e.

Linearity and Indicators (1): In a group of n people, what is the expected number of distinct birthdays (month and day)? What is the expected number of birthday matches? Answer: Let X be the number of distinct birthdays and I_j be the indicator for the jth day being represented. Then

    E(I_j) = 1 - P(\text{no one born on day } j) = 1 - (364/365)^n.

By linearity, E(X) = 365(1 − (364/365)^n). Now let Y be the number of birthday matches and J_i be the indicator that the ith pair of people have the same birthday. The probability that any two specific people share a birthday is 1/365, so E(Y) = \binom{n}{2}/365.

Linearity and Indicators (2): This problem is commonly known as the hat-matching problem. There are n people at a party, each with a hat. At the end of the party, they each leave with a random hat. What is the expected number of people who leave with the right hat? Answer: Each hat has a 1/n chance of going to the right person. By linearity, the average number of hats that go to their owners is n(1/n) = 1.

Linearity and First Success: This problem is commonly known as the coupon collector problem. There are n coupon types. At each draw, you get a uniformly random coupon type. What is the expected number of coupons needed until you have a complete set? Answer: Let N be the number of coupons needed; we want E(N). Write N = N_1 + · · · + N_n, where N_1 is the number of draws to get our first new coupon, N_2 is the additional number of draws needed to get our second new coupon, and so on. By the story of the First Success, N_2 ∼ FS((n − 1)/n) (after collecting the first coupon type, there's a (n − 1)/n chance you'll get something new). Similarly, N_3 ∼ FS((n − 2)/n), and in general N_j ∼ FS((n − j + 1)/n). By linearity,

    E(N) = E(N_1) + \cdots + E(N_n) = \frac{n}{n} + \frac{n}{n-1} + \cdots + \frac{n}{1} = n \sum_{j=1}^{n} \frac{1}{j}.

This is approximately n(log(n) + 0.577) by Euler's approximation, where 0.577... is the Euler-Mascheroni constant.
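As an illustration of the coupon collector result, here is a minimal simulation sketch (not part of the original cheat sheet), assuming NumPy; n = 20 is an arbitrary choice.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 20
    exact = n * sum(1 / j for j in range(1, n + 1))     # n * (1/1 + ... + 1/n)
    counts = []
    for _ in range(10**4):
        seen, draws = set(), 0
        while len(seen) < n:
            seen.add(int(rng.integers(n)))              # uniformly random coupon type
            draws += 1
        counts.append(draws)
    print(np.mean(counts), exact, n * (np.log(n) + 0.577))   # all approx 72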
