Chicago Booth BUSINESS STATISTICS 41000 Midterm Exam Fall 2010

Name: Section:

I pledge my honor that I have not violated the Honor Code Signature:
This exam has 16 pages. You have 90 minutes to complete this exam. There are 7 questions. Each part of each question is worth 2 points unless noted otherwise. You may use a calculator and one letter size (both sides) cheat sheet of your own notes. Present your answers in a clear and concise manner.

Question 1: 4 parts, 8 points
Question 2: 6 parts, 12 points
Question 3: 3 parts, 6 points
Question 4: 5 parts, 10 points
Question 5: 3 parts, 6 points
Question 6: 5 parts, 10 points
Question 7: 4 parts, 8 points

Total: 60 points

Question # 1.
This problem is based on the Bread and Peace model proposed by Douglas Hibbs to explain voting in U.S. Presidential elections.

Each point on the scatter plot below is a U.S. Presidential election from 1952 to 2008 (n = 15). On the vertical axis is a variable voteshare, the percentage of the total votes cast for Republican and Democratic candidates that went to the incumbent party.[1] On the horizontal axis is a variable rigrowth, a weighted average of per capita real GDP growth over the previous presidential term (four years prior to the election).

[Figure: scatter plot of voteshare (vertical axis, 40 to 60) against rigrowth (horizontal axis, -0.5 to 4.5). Each point is labeled with its election year; 1968 is also labeled "Vietnam" and 1952 is also labeled "Korea".]

[1] The incumbent party means the party that is currently in office. For example, since the current President Barack Obama is a Democrat, Democrats are currently considered the incumbent party. For the purposes of this model, votes cast for third party or independent candidates are ignored.

Answer parts (a)-(d) using the scatterplot on the previous page. 2 points each.

(a) The sample correlation between voteshare and rigrowth is approximately:

(i) 0.73 (ii) -0.51 (iii) 0.97 (iv) 0.17 (v) 0.05

Answer:

(b) The sample mean of rigrowth is approximately:

(i) 45 (ii) 2.1 (iii) 3.8 (iv) 52

Answer:

(c) The largest voteshare in the sample occurred in the 1972 election, when incumbent President Richard Nixon defeated George McGovern with 61.8% of the popular vote. Approximately what was rigrowth over the 4 years prior to November, 1972?

(i) 1.08 (ii) 4.30 (iii) 52.5 (iv) 3.62 (v) 61.3

Answer:

(d) The two data points labeled 1968 Vietnam and 1952 Korea correspond to two elections that followed the beginnings of major U.S. military conflicts. If we removed these two points from the sample, the sample correlation between voteshare and rigrowth would be:

(i) exactly the same (ii) larger (iii) smaller (iv) not enough information to tell

Answer:

Question # 2.
The data in this question is from a study conducted in 1975 of the sleeping habits of n = 239 Americans between the ages of 25 and 60. The variables in this data set include:

age: Person's age in years;
educ: Years of education;
sleep: Average minutes per week spent sleeping;
workhrs: Average time spent per week at work, in hours.

Summary statistics for each variable are given in the tables below:

Summary measures for selected variables

                      age     educ    sleep     workhrs
Mean                  39.0    13.1    3369.7    36.4
Standard deviation    11.1    2.9     502.8     15.4

Table of correlations

          educ      sleep     workhrs
age       -0.231    0.044     -0.133
educ                -0.105    0.081
sleep                         -0.354
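The summary tables report standard deviations and correlations, while some parts ask for variances and covariances. A minimal sketch of the standard identities linking them (variance = sd squared, covariance = correlation times the two standard deviations) follows. The function name `cov_from_corr` is illustrative, and the sleep/workhrs pair is used only as an example.

```python
def cov_from_corr(corr, sd_x, sd_y):
    """Sample covariance implied by a correlation and two standard deviations."""
    return corr * sd_x * sd_y

# Values read from the summary tables (sleep and workhrs).
SD_SLEEP = 502.8
SD_WORKHRS = 15.4
CORR_SLEEP_WORKHRS = -0.354

cov_sleep_workhrs = cov_from_corr(CORR_SLEEP_WORKHRS, SD_SLEEP, SD_WORKHRS)
var_workhrs = SD_WORKHRS ** 2  # variance is the squared standard deviation
print(round(cov_sleep_workhrs, 1), round(var_workhrs, 2))
```

The same two identities apply to any other pair of variables in the tables.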

(a) What is the sample variance of age?

(b) What is the sample covariance between educ and age?

(c) Assume that the histogram of workhrs is approximately bell-shaped. About how many people in the sample worked between 21 and 52 hours per week on average? [Reminder: There are n = 239 people total in the sample.]

Suppose that for each person in our sample, we define

leisure = 168 - (1/60) sleep - workhrs

There are 24*7=168 hours in a week, so this is a measure of hours per week spent on activities other than sleep or work (leisure time).
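As a small illustration of the unit conversion, here is a sketch that assumes the definition reads leisure = 168 - sleep/60 - workhrs, with sleep in minutes per week and workhrs in hours per week. The function name `leisure_hours` and the example person are hypothetical, not taken from the data set.

```python
HOURS_PER_WEEK = 24 * 7  # 168

def leisure_hours(sleep_minutes, workhrs):
    """Weekly hours left after sleep (given in minutes) and work (given in hours)."""
    return HOURS_PER_WEEK - sleep_minutes / 60 - workhrs

# HYPOTHETICAL person: 3,360 minutes of sleep and 40 hours of work per week.
example = leisure_hours(3360, 40)
print(example)
```

Because leisure is a linear function of sleep and workhrs, its sample mean and variance follow from the usual rules for linear combinations.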

(d) What is the sample mean of the 239 leisure values?

(e) What is the sample variance of the 239 leisure values?

(f) Suppose we survey each person in the sample exactly six years later. Let age81 be their age as of the same date in 1981 (the original data were collected in 1975). What is the sample covariance between age and age81?

Question # 3.
Below are histograms of four variables labeled A, B, C, and D. Each is based on a sample of n = 500 observations.

[Figure: four histograms, panels A, B, C, and D; the vertical axes show counts.]

(a) Which histogram is highly left skewed?

(i) A (ii) B (iii) C (iv) D

Answer:

(b) Which histogram shows i.i.d. draws from a N(1, 4) distribution?

(i) A (ii) B (iii) C (iv) D

Answer:

(c) The Netherlands is one of my favorite countries in the world. Suppose I believe the i.i.d. Normal model is a good description of the Dutch returns from the countries dataset (conret.xls).

If I am correct, which of the time series plots below shows the Dutch returns?

Answer:

(i) A (ii) B (iii) C (iv) D

[Figure: four time series plots, options A through D; each horizontal axis is marked from 20 to 100.]

Question # 4.
This question is about a survey technique known as randomized trials. Suppose we are conducting a survey where the question being asked is sensitive in nature. For example, suppose we want to ask, what percentage of Booth students smoked marijuana in college? For a variety of reasons, a lot of people may not answer this question honestly.

Instead we do the following. Each student who participates in the survey is given a sheet of paper with TWO questions:

Question #1: Is the last digit of your social security number odd?
Question #2: Did you smoke marijuana in college?

They are then told to secretly flip a coin (they see whether the coin lands heads or tails, but we, the people asking the question, do not). If the coin lands heads, they answer Question #1. If the coin lands tails, they answer Question #2. The point of doing this is that a student can answer Yes and the person conducting the survey still doesn't know whether they smoked marijuana (so we assume they answer honestly).

For each student participating in the survey, define two random variables:

Q = 1 if the student answered Question #1, Q = 2 if they answered Question #2.
Y = 1 if the student answers Yes and Y = 0 if they answer No.

Assume that the coin is fair, so that P (Q = 1) = 0.5.

Also assume that 50% of Booth students have social security numbers that are odd (the other half are even), so P (Y = 1|Q = 1) = .5 and P (Y = 0|Q = 1) = .5.
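The survey mechanism can be sketched in code under the stated assumptions (fair coin, half of social security numbers odd, honest answers). The true rate passed in as `p_sensitive` is a HYPOTHETICAL input chosen only for illustration, and the function names are not from the exam.

```python
import random

def prob_yes(p_sensitive, p_coin_heads=0.5, p_odd=0.5):
    """P(Y = 1) by the law of total probability over which question is answered."""
    return p_coin_heads * p_odd + (1 - p_coin_heads) * p_sensitive

def simulate_yes_rate(p_sensitive, n, seed=0):
    """Simulate n respondents and return the fraction who answer Yes."""
    rng = random.Random(seed)
    yes = 0
    for _ in range(n):
        if rng.random() < 0.5:          # heads: answer Question #1
            yes += rng.random() < 0.5   # last SSN digit is odd
        else:                           # tails: answer Question #2
            yes += rng.random() < p_sensitive
    return yes / n

# HYPOTHETICAL true rate of 0.30, chosen only for illustration.
print(prob_yes(0.30), simulate_yes_rate(0.30, 100_000))
```

The simulated Yes rate should sit close to the value from `prob_yes`, which is what lets the surveyor work backward from the observed Yes rate without seeing any coin.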

(a) What is the joint probability P (Y = 1, Q = 1)? Put your answer in the box in the table below.

(b) Suppose that we conduct the survey described above using a sample of 250 Booth students. We find that 145 of them answer yes (note that we do not know which of the two questions they were answering). Based on this, assume that the marginal probability that a randomly selected Booth student answers yes to our survey is P(Y = 1) = .58.

What is the joint probability P(Y = 1, Q = 2)? (Hint: It may help you to look at the table above; I filled in the marginal probabilities for you.)

(c) Remember, the purpose of our survey is to determine: what is the probability a randomly chosen Booth student smoked marijuana in college? Given how we defined Y and Q, which conditional probability involving Y and Q gives us the answer to this question? Briefly explain.

(d) Based on your answers to (b) and (c), what is the probability that a randomly chosen Booth student smoked marijuana in college?

(e) Are the random variables Q and Y in this question independent? Are they identically distributed? Briefly explain.

Question # 5.
In class we talked about modeling a firm's sales as a discrete random variable. Suppose instead we modeled firm sales as a continuous random variable, S ~ N(3.5, 0.64), where the random variable S denotes sales over the next quarter in thousands of units.

Further, suppose that our firm charges a price of $750 per unit, but pays a fixed production cost of $1,425,000. The equation for profit, P, in thousands of dollars is therefore:

P = -1,425 + 750S

(a) What is the firm's expected profit this quarter, E[P]?

(b) What is the variance of the firm's profit this quarter, V[P]?

(c) Assume that P is normally distributed (this is actually true by the way we've defined P). What is the probability that profits this quarter are positive?
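The rule behind this question is that a linear function a + bX of a normal variable X ~ N(mean, variance) is itself normal, with mean a + b*mean and variance b^2 * variance. A hedged sketch using Python's `statistics.NormalDist` follows. It assumes that N(3.5, 0.64) lists the variance as its second argument and that the fixed cost enters the profit equation with a minus sign; the helper name `linear_normal` is illustrative.

```python
from statistics import NormalDist

def linear_normal(a, b, mean, variance):
    """Distribution of a + b*X when X ~ N(mean, variance)."""
    return NormalDist(mu=a + b * mean, sigma=abs(b) * variance ** 0.5)

# Assumption: N(3.5, 0.64) lists the variance second, and the fixed cost
# enters with a minus sign, so P = -1425 + 750 * S (thousands of dollars).
profit = linear_normal(a=-1425, b=750, mean=3.5, variance=0.64)
prob_positive = 1 - profit.cdf(0)
print(profit.mean, profit.variance, round(prob_positive, 4))
```

If the second argument were instead a standard deviation, only the `variance` input would change; the structure of the calculation stays the same.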

Question # 6.
The company that you own, ACME, Inc., has an order from Wile E. Coyote for two ACME Giant Rubber Bands (For Tripping Road Runners).

Let X = 1 if the first Giant Rubber Band is defective and 0 otherwise. Let Y = 1 if the second Giant Rubber Band is defective and 0 otherwise.

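[Table: joint probability distribution of X and Y, each taking the values 0 and 1. Cell probabilities: .42, .28, .18, .12; the row and column layout is not recoverable here.]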

(a) What is the marginal distribution of Y ?

(b) What is the conditional distribution of P (Y |X = 1)?

(c) What is the conditional distribution of P (Y |X = 0)?

(d) Is Y independent of X?

(e) Are X and Y i.i.d.?

Question # 7.
Below are the probability distributions of three random variables: W , X, and Y

w       0        1        2        3
p(w)    0.042    0.239    0.444    0.275

x       0        1        2        3
p(x)    0.125    0.375    0.375    0.125

y       0           1          2           3         4           5          6
p(y)    0.015625    0.09375    0.234375    0.3125    0.234375    0.09375    0.015625

These random variables have the following probability distributions (but NOT necessarily in the same order as they are given above):

binomial(3, 0.5)

binomial(6, 0.5)

binomial(3, 0.65)
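One way to match the tables to these distributions is to compute each binomial probability mass function directly from P(K = k) = C(n, k) p^k (1 - p)^(n - k). The sketch below does that with the standard library; the helper names `binomial_pmf` and `mean_and_variance` are illustrative.

```python
from math import comb

def binomial_pmf(n, p):
    """Return [P(K = 0), ..., P(K = n)] for K ~ binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def mean_and_variance(pmf):
    """Mean and variance of a distribution on 0, 1, ..., len(pmf) - 1."""
    mean = sum(k * pk for k, pk in enumerate(pmf))
    variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    return mean, variance

for n, p in [(3, 0.5), (6, 0.5), (3, 0.65)]:
    print(n, p, [round(pk, 6) for pk in binomial_pmf(n, p)])
```

Passing any of the resulting lists to `mean_and_variance` gives values that can be checked against the shortcuts np and np(1 - p).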

(a) What are the expected value and variance of W ?

Now suppose we are about to flip a fair coin six times. Assume the coin flips are i.i.d.

(b) Before the first coin flip, what is the probability that two of the six coin flips come up heads?

(c) Suppose we just did the first three flips and they all came up tails. Now what is the probability that two out of our six coin flips come up heads?

(d) Suppose the San Francisco Giants and the Texas Rangers are playing in the World Series. The two teams play seven games, and the first team to win four games is the 2010 Champion of Major League Baseball.

Furthermore, suppose the two teams have already played four of the seven possible games, and the Rangers have won three of the four games played so far (in other words they lead the series 3-1).

Now suppose I believe the two teams are evenly matched (the Rangers have a 0.5 chance to win each game) and the outcomes of each game are i.i.d.

What is the probability the Rangers win the World Series?

(Hint: There are multiple ways to answer this question, all giving the same answer. Remember that to win, the Rangers must win ONE OR MORE of the THREE games left to be played. While in reality the series may end early, one possible solution uses the binomial probabilities on the previous page if you act as if all three games will be played.)
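The "act as if all three games will be played" idea in the hint can be sketched by brute-force enumeration of every win/loss sequence for the remaining games. This is one possible approach, not necessarily the intended solution, and the function name `prob_at_least` is illustrative.

```python
from itertools import product

def prob_at_least(wins_needed, games_left, p_win=0.5):
    """P(at least wins_needed wins in games_left i.i.d. games), by enumeration."""
    total = 0.0
    for outcome in product([0, 1], repeat=games_left):  # 1 = win, 0 = loss
        wins = sum(outcome)
        if wins >= wins_needed:
            total += p_win**wins * (1 - p_win) ** (games_left - wins)
    return total

# The team leading 3-1 needs one more win in at most three remaining games.
print(prob_at_least(wins_needed=1, games_left=3))
```

Calling `prob_at_least(3, 3)` gives the complementary case, where the trailing team must win all three remaining games.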
