
STAT 2 Lecture 21: Expected value and standard error

Recall: The law of averages

Which is more likely?
When I roll sixty dice, between 15% and 20% are sixes
When I roll six hundred dice, between 15% and 20% are sixes

Recall: The law of averages

Which is more likely?
When I roll sixty dice, more than 20% are sixes
When I roll six hundred dice, more than 20% are sixes

Recall: The law of averages

Which is more likely?
When I roll sixty dice, exactly 10 are sixes
When I roll six hundred dice, exactly 100 are sixes
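The three questions above can be checked exactly, because the number of sixes in a fixed number of rolls follows a binomial distribution. The sketch below is only illustrative: the helper names are made up here, and reading "between 15% and 20%" as including both endpoints is an assumption.

```python
from fractions import Fraction
from math import comb

def chance_of_sixes(n_rolls, k):
    """Exact chance of exactly k sixes in n_rolls rolls of a fair die."""
    return Fraction(comb(n_rolls, k) * 5 ** (n_rolls - k), 6 ** n_rolls)

def chance_in_range(n_rolls, lo, hi):
    """Exact chance that the number of sixes is between lo and hi inclusive."""
    return float(sum(chance_of_sixes(n_rolls, k) for k in range(lo, hi + 1)))

# Between 15% and 20% sixes (assumed inclusive): 9..12 of 60, 90..120 of 600.
band_60 = chance_in_range(60, 9, 12)
band_600 = chance_in_range(600, 90, 120)

# More than 20% sixes: 13 or more of 60, 121 or more of 600.
over_60 = chance_in_range(60, 13, 60)
over_600 = chance_in_range(600, 121, 600)

# Exactly one sixth sixes: 10 of 60, 100 of 600.
exact_60 = chance_in_range(60, 10, 10)
exact_600 = chance_in_range(600, 100, 100)

print(f"15% to 20% sixes:  {band_60:.3f} (60 dice)  {band_600:.3f} (600 dice)")
print(f"more than 20%:     {over_60:.3f} (60 dice)  {over_600:.3f} (600 dice)")
print(f"exactly one sixth: {exact_60:.3f} (60 dice)  {exact_600:.3f} (600 dice)")
```

With more dice, the percentage of sixes is more likely to be close to 1/6, but the count is less likely to hit the expected number exactly.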

Recall: Box models

Model a random variable as a draw of tickets from a box
Work out how many tickets are in the box, and what they say
Work out how many draws there are, and whether they're with or without replacement
What do we do once we've drawn the numbers?

Today

Expected value
Standard error of a sum
The Central Limit Theorem
Some tips and tricks

Expected value

The box model for rolling one die

On average, what do we get when we roll one die?
Like one draw from a box containing the numbers [ 1 2 3 4 5 6 ]

The box model for rolling one die

Our best guess (giving the lowest RMS error) will be the mean of the tickets in the box
We expect, on average, to get 3.5

The box model for rolling 12 dice

On average, what will be the sum when we roll twelve dice?
Like 12 draws with replacement from a box containing the numbers [ 1 2 3 4 5 6 ]

The box model for rolling 12 dice

We expect to get about 2 ones, 2 twos, 2 threes, ...
We expect to get a sum of 2*1 + 2*2 + 2*3 + ... + 2*6 = 42
Note this is just 12 times 3.5

Expected value of a sum

The expected value for the sum of draws made at random with replacement from a box is: (number of draws)*(average of box)
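As a minimal sketch of this formula, the following computes the expected sum for the die box. The function names `box_average` and `expected_sum` are chosen here for illustration and are not part of the lecture.

```python
def box_average(box):
    """Average of the tickets in the box."""
    return sum(box) / len(box)

def expected_sum(box, n_draws):
    """Expected sum of n_draws draws made at random with replacement."""
    return n_draws * box_average(box)

die_box = [1, 2, 3, 4, 5, 6]

one_roll = expected_sum(die_box, 1)       # 3.5
twelve_rolls = expected_sum(die_box, 12)  # 42.0

print(one_roll, twelve_rolls)
```

The box is written out ticket by ticket, so repeated tickets simply appear more than once in the list.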

Example

In a certain gambling game, each time you either win $2 or lose $1. Each time, the chance of winning is 25%. I play the game 100 times. What are my expected winnings?

Example

Make the box: [ $2 -$1 -$1 -$1 ]
Average of box is -$0.25
Expected sum of 100 draws will be 100 * (-$0.25) = -$25

Example

In a certain gambling game, each time you either win $10 or lose $1. Each time, the chance of winning is 7%. I play the game 100 times. What are my expected winnings?

Example

Make the box: 7 tickets saying +$10, 93 tickets saying -$1
Average of box is (7*10 - 93*1)/100 = -$0.23
Expected sum of 100 draws will be 100 * (-$0.23) = -$23
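When a game is described by chances rather than by tickets, the box average can be computed by weighting each payoff by its chance. This is a sketch under that reading; `expected_winnings` is a hypothetical helper, not something defined in the lecture.

```python
def expected_winnings(outcomes, n_plays):
    """Expected total winnings over n_plays independent plays.

    outcomes is a list of (payoff, chance) pairs whose chances add up to 1.
    """
    total_chance = sum(chance for _, chance in outcomes)
    if abs(total_chance - 1.0) > 1e-9:
        raise ValueError("chances must add up to 1")
    # Chance-weighted payoff is the same as the average of the box.
    per_play = sum(payoff * chance for payoff, chance in outcomes)
    return n_plays * per_play

# Win $2 with chance 25%, otherwise lose $1.
game_1 = expected_winnings([(2, 0.25), (-1, 0.75)], 100)

# Win $10 with chance 7%, otherwise lose $1.
game_2 = expected_winnings([(10, 0.07), (-1, 0.93)], 100)

print(round(game_1, 2), round(game_2, 2))
```

Both values are negative, matching the two worked examples: the player expects to lose money in each game.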

Example

My friend and I play a game where a player rolls two dice, losing $1 if they roll 8 or less, and winning $x if they roll 9 or more. We want the game to be fair: that is, the expected winnings should be zero. What should x be?

Example

Make a box: 26 tickets saying -$1, 10 tickets saying +$x
Average of box is (-26 + 10x)/36
Set average of box to 0 and solve: -26 + 10x = 0
x = $2.60
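The ticket counts in this box can be verified by listing all 36 equally likely outcomes of two dice. The sketch below does that and then solves for the fair payoff; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely totals for a roll of two dice.
totals = [a + b for a, b in product(range(1, 7), repeat=2)]

n_lose = sum(1 for t in totals if t <= 8)  # tickets saying -$1
n_win = len(totals) - n_lose               # tickets saying +$x

# Fair game: (-1 * n_lose + x * n_win) / 36 = 0, so x = n_lose / n_win.
fair_x = Fraction(n_lose, n_win)

print(n_lose, n_win, float(fair_x))
```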

II

Standard error

Why standard error?

We would like to know how close a sum is likely to be to the expected sum
We could find the sum millions of times, and find the SD of these sums
This is the standard error of the sum
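The "repeat the sum many times" idea can be tried directly by simulation. The sketch below uses 2000 repetitions of 100 die rolls instead of millions to keep it quick; the repetition count, the seed, and the function name are arbitrary choices made here.

```python
import random
from statistics import mean, pstdev

def simulate_sums(box, n_draws, n_repeats, seed=0):
    """Draw n_draws tickets with replacement, n_repeats times; return the sums."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.choices(box, k=n_draws)) for _ in range(n_repeats)]

die_box = [1, 2, 3, 4, 5, 6]
sums = simulate_sums(die_box, n_draws=100, n_repeats=2000)

average_of_sums = mean(sums)  # estimates the expected sum
simulated_se = pstdev(sums)   # SD of the simulated sums estimates the SE

print(f"average of sums: {average_of_sums:.1f}")
print(f"SD of sums:      {simulated_se:.1f}")
```

The SD of the simulated sums is only an estimate, so it will vary a little with the seed and the number of repetitions.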

Standard error

Fortunately we don't have to find millions of sums: there's a formula

Standard error for the sum of draws = sqrt(no. of draws) * (SD of box)

Standard error

In the same way that the SD is the typical (RMS) error of a set of numbers compared to the mean, the standard error of a sum is the typical error of a sum compared to the expected sum

Example: die rolls

Find the SE of the sum of 100 die rolls
Box is [ 1 2 3 4 5 6 ]
Average of tickets = 3.5
Average of (tickets squared) = 15.17
SD of box = sqrt(15.17 - 3.5^2) = 1.7
SE of sum = sqrt(100) * 1.7 = 17
The sum of 100 die rolls will be around 350, give or take 17

Example: die rolls

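Find the SE of the sum of 1000 die rolls
SD of box = 1.7
SE of sum = sqrt(1000) * 1.7 = 54
The sum of 1000 die rolls will be around 3500, give or take 54
Consistent with the law of averages: the SE grows in absolute terms (17 to 54) but shrinks relative to the expected sum (about 4.9% to about 1.5%)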
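To see how the standard error behaves as the number of rolls grows, the sketch below tabulates the expected sum, the SE, and the SE as a fraction of the expected sum. The 10000-roll row is an extra case added here for illustration, and the helper names are not from the lecture.

```python
from math import sqrt

def box_average(box):
    return sum(box) / len(box)

def box_sd(box):
    """SD of the tickets in the box (divide by the number of tickets)."""
    avg = box_average(box)
    return sqrt(sum((t - avg) ** 2 for t in box) / len(box))

die_box = [1, 2, 3, 4, 5, 6]

rows = []
for n_rolls in (100, 1000, 10000):
    expected = n_rolls * box_average(die_box)
    se = sqrt(n_rolls) * box_sd(die_box)
    rows.append((n_rolls, expected, se, se / expected))

for n_rolls, expected, se, relative in rows:
    print(f"{n_rolls:>6} rolls: expected {expected:>8.0f}, "
          f"SE {se:>6.1f}, SE/expected {relative:.2%}")
```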

Example

Find the SE of the sum of 100 draws from the box [ +$4 +$1 -$2 -$2 -$2 ]

Example

Average of tickets = -$0.20
Average of (tickets squared) = 5.8
SD of tickets = sqrt(5.8 - (-0.2)^2) = $2.40
SE of sum = sqrt(100) * 2.4 = $24
The sum of 100 draws will be around -$20, give or take $24
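The steps in this example use the shortcut SD = sqrt(average of squares - square of average). A short sketch of the same arithmetic, with variable names chosen here:

```python
from math import sqrt

box = [4, 1, -2, -2, -2]
n_draws = 100

average = sum(box) / len(box)                            # -0.2
average_of_squares = sum(t * t for t in box) / len(box)  # 5.8

# Shortcut for the SD of the box.
sd = sqrt(average_of_squares - average ** 2)

expected = n_draws * average
se = sqrt(n_draws) * sd

print(f"expected sum {expected:.0f}, give or take {se:.0f}")
```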

III

Return of the Central Limit Theorem

Remember the CLT?

Central Limit Theorem: the sum of a large number of independent random variables with the same distribution will have an approximately normal distribution

Example: rolling dice

I roll a die 100 times. The expected sum is 350 and the standard error of the sum is 17
By the CLT, the actual sum will have an approximately normal distribution with mean 350 and SD 17

Example: rolling dice

By the 68-95-99.7 rule:
68% chance sum will be between 333 and 367
95% chance sum will be between 316 and 384
99.7% chance sum will be between 299 and 401
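These intervals are the expected sum plus or minus 1, 2 and 3 standard errors. The sketch below rebuilds them and checks the chances against a normal curve using Python's standard `statistics.NormalDist`; treating the sum as exactly normal is the approximation the CLT provides, not an exact statement.

```python
from statistics import NormalDist

expected, se = 350, 17
sum_dist = NormalDist(mu=expected, sigma=se)

intervals = {}
for n_se in (1, 2, 3):
    low = expected - n_se * se
    high = expected + n_se * se
    # Area under the normal curve between low and high.
    chance = sum_dist.cdf(high) - sum_dist.cdf(low)
    intervals[n_se] = (low, high, chance)

for n_se, (low, high, chance) in intervals.items():
    print(f"within {n_se} SE: {low} to {high}, chance about {chance:.1%}")
```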

Example: roulette

I bet on red 3600 times in American roulette. Each time, I have an 18/38 chance of winning 1 chip, and a 20/38 chance of losing 1 chip
What will be the distribution of my winnings?

Example: roulette

Box is [18 +1s, 20 -1s]
Average of numbers in box = (1*18 - 1*20)/38 = -0.05263
Average of (numbers squared) = (38*1)/38 = 1
SD of numbers = sqrt(1 - (-0.053)^2) = 0.9986

Example: roulette

Expected sum = 3600 * (-0.05263) = -189.5 chips
SE of sum = sqrt(3600) * 0.9986 = 59.9 chips

Example: roulette

68% chance of losing between 130 and 249 chips
95% chance of losing between 70 and 309 chips
99.7% chance of losing between 10 and 369 chips
Not much chance of winning
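"Not much chance of winning" can be given a rough number by asking how much of the approximating normal curve lies above zero. This sketch uses the normal approximation without any continuity correction, so the result is only approximate; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_bets = 3600
box = [1] * 18 + [-1] * 20  # 18 winning tickets, 20 losing tickets

average = sum(box) / len(box)
sd = sqrt(sum(t * t for t in box) / len(box) - average ** 2)

expected = n_bets * average
se = sqrt(n_bets) * sd

# Approximate chance that the net winnings are above zero.
chance_ahead = 1 - NormalDist(mu=expected, sigma=se).cdf(0)

print(f"expected {expected:.1f} chips, SE {se:.1f} chips")
print(f"approximate chance of coming out ahead: {chance_ahead:.4%}")
```

Zero is a little more than 3 standard errors above the expected sum, which is why the chance is so small.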

IV

Tips and tricks

The SD of a two-valued box

If the tickets in the box have only two different numbers on them, the SD of the numbers is
(higher number - lower number) * sqrt(fraction with higher number * fraction with lower number)

Example

I have a box with 15 tickets marked -2 and 10 tickets marked -5. What is the box SD?
Use formula: (-2 - (-5)) * sqrt(15/25 * 10/25) = 1.47

Example: roulette

Higher number is +1 (fraction 18/38)
Lower number is -1 (fraction 20/38)
SD of box is (1 - (-1)) * sqrt(18/38 * 20/38) = 0.9986, as before
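The two-valued shortcut can be checked against a direct SD calculation on the full list of tickets. In this sketch, `two_valued_sd` is a name invented here, and `statistics.pstdev` (which divides by the number of tickets) stands in for the direct calculation.

```python
from math import sqrt
from statistics import pstdev

def two_valued_sd(high, low, n_high, n_low):
    """SD of a box holding n_high tickets marked high and n_low marked low."""
    total = n_high + n_low
    return (high - low) * sqrt((n_high / total) * (n_low / total))

# Box with 15 tickets marked -2 and 10 tickets marked -5.
shortcut = two_valued_sd(-2, -5, 15, 10)
direct = pstdev([-2] * 15 + [-5] * 10)

# Roulette box: 18 tickets marked +1 and 20 tickets marked -1.
roulette = two_valued_sd(1, -1, 18, 20)

print(f"shortcut {shortcut:.4f}, direct {direct:.4f}, roulette {roulette:.4f}")
```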

Counting problems

Instead of finding a sum, we might be asked to find a count
e.g. instead of being asked for the sum of 100 die rolls, we might be asked for the number of sixes
Different box, same formulae

Box models for counting problems

The tickets in the box will all be zeros or ones: 1 if we count the outcome, 0 if we don't
e.g. for counting sixes, the tickets are [ 0 0 0 0 0 1 ]: different from the box for the sum

Counting problems

What is the expected number of sixes in 100 die rolls, and what is the standard error of the count?
Box average = 1/6
Box SD for two-valued box = (1 - 0) * sqrt(1/6 * 5/6) = 0.3727

Counting problems

Expected count for 100 rolls = 100 * 1/6 = 16.7
SE of count = sqrt(100) * 0.3727 = 3.7
Number of sixes will be about 17, give or take four
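The only new step in a counting problem is relabelling the tickets as 0 or 1. The sketch below builds the counting box from the die faces and then applies the same formulae as for a sum; the variable names are chosen here.

```python
from math import sqrt

die_faces = [1, 2, 3, 4, 5, 6]

# Relabel each ticket: 1 if the outcome is counted (a six), 0 otherwise.
count_box = [1 if face == 6 else 0 for face in die_faces]

n_rolls = 100
average = sum(count_box) / len(count_box)  # fraction of 1s in the box
sd = sqrt(average * (1 - average))         # two-valued shortcut with values 1 and 0

expected_count = n_rolls * average
se_count = sqrt(n_rolls) * sd

print(count_box)
print(f"expected count {expected_count:.1f}, SE {se_count:.1f}")
```

Changing the condition `face == 6` changes what is counted without touching the rest of the calculation.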

Comparison with binomial

But we recall the number of sixes in 100 rolls has a binomial distribution
We can find the exact probability for each possible number of sixes
We find that most of the time (77%), we do get 13-21 sixes
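The exact binomial chance of 13 to 21 sixes can be computed and set beside a normal-curve approximation. In this sketch the half-count widening of the interval (12.5 to 21.5) is a continuity correction added here as an assumption; it is not something stated in the lecture.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_chance(n, k, p):
    """Chance of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n_rolls, p_six = 100, 1 / 6

# Exact chance of getting 13, 14, ..., 21 sixes.
exact = sum(binomial_chance(n_rolls, k, p_six) for k in range(13, 22))

# Normal curve with the expected count and SE of the count.
expected_count = n_rolls * p_six
se_count = sqrt(n_rolls) * sqrt(p_six * (1 - p_six))
curve = NormalDist(mu=expected_count, sigma=se_count)
approx = curve.cdf(21.5) - curve.cdf(12.5)

print(f"exact binomial chance: {exact:.3f}")
print(f"normal approximation:  {approx:.3f}")
```

The exact value should land close to the 77% quoted above.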

NB: the "give or take" approximation only works well for a large number of rolls

Recap

Recap: Expected value of a sum

The expected value for the sum of draws made at random with replacement from a box is: (number of draws)*(average of box)

Recap: Standard error of a sum

Standard error for the sum of draws = sqrt(no. of draws) * (SD of box)
Typical error of a sum compared to the expected sum

Recap: Counting problems

Can also use box models to study counting problems
Tickets in boxes are all 0 or 1
Same formulae as for sums

Recap: SD of a two-valued box

If the tickets in the box have only two different numbers on them, the SD of the numbers is
(higher number - lower number) * sqrt(fraction with higher number * fraction with lower number)

Recap: The Central Limit Theorem

The sum of a large number of draws with replacement from a box will have an approximately normal distribution, with mean equal to the expected value, and SD equal to the standard error

Tomorrow:

The Central Limit Theorem: Using the normal distribution
