
ECON-E370 Exam 1 Review

General Announcements (read this right now!)

Thanks for utilizing TutoringZone's ECON-E370 Exam 1 Review. Please read the very important information below regarding this review. Right now, you should have in your possession the following items:
1. Review Handout (which is in your hands, and you are reading it, so we'll go ahead and call this one a yes).
2. A Coupon Code to access the review videos on our video application (more below).
3. A Formula Sheet, which can be used as a guideline to copy information to your colored Tool Cards.

This is a comprehensive review packet of the exam 1 ECON-E370 material. This packet grants you access to an online review in which our experienced instructor will guide you through all material laid out here. Your TZ tutor's name is Jon Aleman and his phone number is 3522756496. Call or text him with ANY questions that you have WHENEVER you want (seriously).

Through TutoringZone's online video platform, tens of thousands of students across the country have utilized our instructional videos to get the grade they want in difficult classes such as this one. In order to make sure the videos are as effective as possible, please keep the following tips in mind:
1. Don't cram: each segment is purposely kept to 10-20 minutes in length. Why? Well, why are TV commercials every 15 minutes or so? That's our attention span, so now you can watch the videos when you want, where you want, and for as long as you want.
2. Take full advantage of the ability to hit pause, rewind, and watch the segments as many times as you want!
3. Ask questions: for many of you, these videos will provide better interaction than live instruction. You don't have to be afraid of raising your hand, asking a question, and feeling judged by your classmates. Simply post your question on the wall. We will be checking frequently and responding to your requests.

Thanks and enjoy the online review!

Summarizing Data

1. In a recent survey of 100 IU students, it was found that 60% of them have season tickets to all home IU basketball games. However, the database at the IU ticket office indicates that 62.3% of IU students have season tickets to all home IU basketball games. The values of 60% and 62.3% are
(A) a statistic and parameter, respectively (B) a parameter and statistic, respectively (C) both statistics (D) both parameters

The value of 60% is a _________________ because it comes from the ____________________

The value of 62.3% is a _________________ because it comes from the ____________________

2. Which of the following variables are qualitative?
I. An IU student's gender
II. An IU student's GPA
III. An IU student's social security number

(A) I only (B) I and II only (C) I and III only (D) II and III only (E) I, II, and III

3. Which of the following variables are discrete quantitative?
I. The number of times an IU student gets high each day
II. The number of bongs present in an IU student's dorm room
III. The number of ounces in a can of beer

(A) I only (B) I and II only (C) I and III only (D) II and III only (E) I, II, and III

4. Which of the following variables are continuous quantitative?
I. The amount of time spent drinking by an IU student
II. The number of bars an IU student visits in one night
III. An IU student's height

(A) I only (B) I and II only (C) I and III only (D) II and III only (E) I, II, and III

5. Suppose that you intend to take a cruise for Spring Break. Identify each of the following as either an ordinal or nominal variable (or neither).
P. The cruise line you selected (e.g. Carnival, Royal Caribbean, etc.)
Q. The class of cabin purchased on a cruise ship (e.g. interior, balcony, suite, etc.)
R. The credit card number you used to book the cruise
S. The size of the ship you selected (in tonnage)

      Nominal     Ordinal
(A)   P and Q     R and S
(B)   P and R     Q
(C)   P and R     Q and S
(D)   Q and R     P
(E)   None of the above

Sampling Methods

6. Which of the following is NOT true of a stratified random sample?
(A) Randomization is employed within the sampling procedure (B) The population is divided into groups of units that are similar in some respect (C) The division of items into strata is done so based upon information obtained prior to sample selection (D) Each individual in the population can belong to multiple strata

7. The manager of a local Best Buy wants to estimate the mean battery life of all remote controls in a shipment of remote controls from a supplier. The shipment consists of 40 boxes each containing 25 remote controls. The manager lists all boxes from 1-40, randomly selects 6 boxes, and determines the battery life of all remote controls within the boxes selected. What type of sampling is being used?
(A) simple random sample (B) systematic sampling (C) stratified sampling (D) convenience sample

8. A statistician is attempting to gather a systematic random sample of n = 20 football coaches for all BCS division I teams (e.g. Indiana, Ohio State, Florida, etc.). All 360 such coaches have been identified, sorted by school and numbered from 1 to 360 in alphabetical order based first upon the state in which they reside (e.g. Alabama) and then, within each state, by the name of the school itself (e.g. Alabama, Auburn, etc.). The first school in this sequence is assigned as number 1. The statistician wishes to estimate the mean number of wins per season for these coaches. The 266th of all coaches was selected as the fourth observation in the sample. What is the number of the first observation in the sample?
(A) 200th (B) 212th (C) 230th (D) More information is needed to answer this question

\[ k = \frac{N}{n} = \frac{360}{20} = 18 \]
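The interval k = 18 can be used to step backwards from a known pick. The sketch below is only an illustration: the function name is made up, and it assumes consecutive picks are exactly k apart with no wrap-around between the first and fourth observations.

```python
def first_systematic_pick(known_id, known_position, population_size, sample_size):
    """Back out the first selected ID from a later pick in a systematic sample.

    Assumes picks are spaced exactly k = N / n apart (no wrap-around).
    """
    k = population_size // sample_size  # sampling interval
    return known_id - (known_position - 1) * k


# Question 8: the 266th coach is the 4th observation, N = 360, n = 20.
first = first_systematic_pick(known_id=266, known_position=4,
                              population_size=360, sample_size=20)
print(first)  # 266 - 3 * 18
```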

Summary Statistics Exercise

Suppose a group of IU students is asked how many nights per week they go out drinking. The results are shown below.

4 2 5 3 4 5 7 8

(a) Determine the mean for these values assuming these students represent (i) a random sample and (ii) the entire population.

sample mean:
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{4 + 2 + \dots + 8}{8} = \frac{38}{8} = 4.75 \]

population mean:
\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{4 + 2 + \dots + 8}{8} = \frac{38}{8} = 4.75 \]

(b) Determine the median for these data.

First, let's order these values: 2 3 4 4 5 5 7 8

\[ \text{sample median} = M = \frac{4 + 5}{2} = 4.5 \]

(c) Determine the sample range.

\[ \text{sample range} = \text{Max} - \text{Min} = 8 - 2 = 6 \]

4 2 5 3 4 5 7 8

(d) Determine the variance for these values assuming these students represent (i) a random sample and (ii) the entire population.

sample variance:
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(4 - 4.75)^2 + (2 - 4.75)^2 + \dots + (8 - 4.75)^2}{8 - 1} = \frac{27.5}{7} = 3.929 \]

population variance:
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{(4 - 4.75)^2 + (2 - 4.75)^2 + \dots + (8 - 4.75)^2}{8} = \frac{27.5}{8} = 3.4375 \]

(e) Determine the standard deviation for these values assuming these students represent (i) a random sample and (ii) the entire population.

sample SD:
\[ s = \sqrt{s^2} = \sqrt{3.929} = 1.98 \]

population SD:
\[ \sigma = \sqrt{\sigma^2} = \sqrt{3.4375} = 1.85 \]
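As a cross-check on parts (a) through (e), the same quantities can be computed with Python's standard `statistics` module. This is a minimal sketch rather than part of the handout's method; the variable names are illustrative.

```python
import statistics

nights = [4, 2, 5, 3, 4, 5, 7, 8]

mean = statistics.mean(nights)                 # same formula for sample and population
median = statistics.median(nights)
data_range = max(nights) - min(nights)
sample_var = statistics.variance(nights)       # divides by n - 1
population_var = statistics.pvariance(nights)  # divides by N
sample_sd = statistics.stdev(nights)           # square root of the sample variance
population_sd = statistics.pstdev(nights)      # square root of the population variance

print(mean, median, data_range)
print(round(sample_var, 3), round(population_var, 4))
print(round(sample_sd, 2), round(population_sd, 2))
```

The only difference between the sample and population versions is the divisor (n - 1 versus N), which is why the sample figures come out slightly larger.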

4 2 5 3 4 5 7 8

(f) Determine the interquartile range for these data values.

First, let's order these values: 2 3 4 4 5 5 7 8

Recall:
\[ L_p = (n + 1)\frac{p}{100} \]

Q1:
\[ L_{25} = (8 + 1)\frac{25}{100} = 9(.25) = 2.25 \]

lower + (decimal)(upper − lower) =

Q3:
\[ L_{75} = (8 + 1)\frac{75}{100} = 9(.75) = 6.75 \]

lower + (decimal)(upper − lower) =

IQR = Q3 − Q1 =
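The location rule and the "lower + (decimal)(upper − lower)" interpolation can be sketched in code to check your work. The function below is a hypothetical helper, not a library routine; it follows the (n + 1)p/100 rule from the handout, which is also why its quartiles can differ from the ones the calculator reports.

```python
def percentile(values, p):
    """Percentile using L_p = (n + 1) * p / 100 with linear interpolation."""
    ordered = sorted(values)
    n = len(ordered)
    location = (n + 1) * p / 100
    index = int(location)        # whole-number part: 1-based position of the lower value
    decimal = location - index   # fractional part used for interpolation
    if index < 1:                # location falls before the first value
        return ordered[0]
    if index >= n:               # location falls at or past the last value
        return ordered[-1]
    lower, upper = ordered[index - 1], ordered[index]
    return lower + decimal * (upper - lower)


nights = [4, 2, 5, 3, 4, 5, 7, 8]
q1 = percentile(nights, 25)
q3 = percentile(nights, 75)
iqr = q3 - q1
print(q1, q3, iqr)
```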

TI-83/84 Instructions for finding summary statistics

Step #1: Press the STAT key, and press ENTER while on Option #1 of EDIT.

Step #2: Enter data values into L1. If there is already data in L1, arrow up until L1 is highlighted, press CLEAR and arrow back down.

Step #3: Press the STAT key, arrow over to CALC and press ENTER while on Option #1.

Step #4: If you have your data in L1, just press ENTER. Otherwise, specify the list, e.g. press 2nd 2 for L2, and press ENTER.

Step #5: You're done! \( \bar{x} = 4.75 \) & s = 1.98. [Ignore ]

If you arrow down, you'll find the 5 number summary. However, we CANNOT use the given values of Q1 and Q3.

We can see that: sample mean = \( \bar{x} \) = 4.75 (which is the same as the population mean), sample standard deviation = s = 1.982, & population standard deviation = \( \sigma \) = 1.85. To find the sample variance, just square s, i.e. \( s^2 = (1.982)^2 = 3.928 \) times\(^2\). [Don't forget when reporting the variance that the variance is in SQUARED units.]

Note: If you accidentally deleted one or more of your lists by pressing DEL instead of CLEAR, to get your lists back, press the STAT key and press ENTER while on Option #5 of EDIT, i.e. SetUpEditor. You should now have your lists back.

Summary Statistics

9. Suppose a stockbroker is deciding between investing the same amount of money in either stock A or stock B. He estimates that the mean and standard deviation for the returns resulting from the investment in stock A are $100 and $30, respectively. The mean and standard deviation for the returns resulting from the investment in stock B are $500 and $100, respectively. Which stock is riskier?
(A) Stock A (B) Stock B (C) Stocks A and B exhibit the same amount of riskiness (D) It is impossible to determine which stock is riskier without knowing the median return for each stock

\[ CV_A = \frac{s}{\bar{x}} \times 100\% = \]

\[ CV_B = \frac{s}{\bar{x}} \times 100\% = \]

10. At a local Chick-fil-A, the mean amount of money spent per order (in one day) is determined to be $12.50 with a standard deviation of $2. Within this same day, a person is standing in line waiting to order a chicken/waffle fry goodness. The standard deviation will increase the most if the order is for _________ and it will decrease the most if the order is for _________.
(A) $0; $0 (B) $30; $12.50 (C) $0; $12.50 (D) $30; $0

11. A sample of 10 observations has a variance of 16. The sum of squared deviations from the mean is
(A) 160 (B) 144 (C) 40 (D) 36

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]
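The two formulas used in questions 9 and 11 can be checked numerically. The sketch below is an illustration only: the helper name is made up, and the second calculation simply rearranges the sample variance formula to solve for the sum of squared deviations.

```python
def coefficient_of_variation(sd, mean):
    """CV as a percentage: relative spread, comparable across different scales."""
    return 100 * sd / mean


# Question 9: compare relative risk of the two stocks.
cv_a = coefficient_of_variation(sd=30, mean=100)
cv_b = coefficient_of_variation(sd=100, mean=500)
print(cv_a, cv_b)

# Question 11: s^2 = SS / (n - 1), so SS = s^2 * (n - 1).
sample_variance = 16
n = 10
sum_squared_deviations = sample_variance * (n - 1)
print(sum_squared_deviations)
```

A larger CV means more variability per dollar of expected return, which is why the stock with the smaller standard deviation can still be the riskier one.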

Use the following information for the next FOUR questions. A large travel agency is interested in the price and length of various cruises that depart from Miami, Florida. To this end, the agency randomly samples several departures and determines the pretax price of the cruise (in dollars) as well as the duration of the cruise (in days). The following Excel output is generated for the aforementioned sample.

                     Price      Duration
Mean                 608.73
Median               592.54     5.29
Standard Deviation   157.62
Sample Variance                 4.65
Minimum              299.00
Maximum              1249.00
Sum                             200.00
Count                           32

12. If an adventurous individual insisted upon going on all of the cruises sampled, what would be the approximate pretax price he/she would have to pay?
(A) $17,210 (B) $19,480 (C) $15,690 (D) Excel is needed to answer this question

13.

The mean duration for the first 30 cruises sampled was 6.10 days. What was the mean duration for the remaining cruises sampled? (A) 8.2 days (B) 8.5 days (C) 7.6 days (D) Excel is needed to answer this question

sum of last 2 cruises = 200 − (30)(6.10) = 17
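Questions 12 and 13 both rely on the identity sum = mean × count. A minimal sketch of that arithmetic follows, assuming the price column covers the same 32 cruises as the duration column; the variable names are illustrative.

```python
count = 32

# Question 12: total price of all sampled cruises = mean price * count.
mean_price = 608.73
total_price = mean_price * count
print(round(total_price, 2))

# Question 13: subtract the first 30 cruises' total duration from the overall sum.
total_duration = 200.00
first_30_sum = 30 * 6.10
remaining_count = count - 30
remaining_mean = (total_duration - first_30_sum) / remaining_count
print(round(remaining_mean, 2))
```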


                     Price       Duration
Mean                 608.73      6.25
Median               632.17      5.29
Standard Deviation   157.62      2.16
Sample Variance      24844.06    4.65
Minimum              299.00      ???
Maximum              1249.00     ???
Sum                  19479.36    200.00
Count                32          32

14. On which characteristic are these cruises the least symmetrical, price or duration?
(A) Price (B) Duration (C) The symmetry for price and duration are identical (D) Excel is needed to answer this question

Symmetry measured by Pearson's 2nd Skewness Coefficient

For Price:
\[ \frac{3(\bar{x} - \text{median})}{s} = \]

For Duration:
\[ \frac{3(\bar{x} - \text{median})}{s} = \]
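Pearson's second skewness coefficient can be evaluated for both columns with a short helper. This sketch uses the values from the completed output printed with question 14 (price median 632.17); the earlier printout lists a different price median, so treat the price inputs as an assumption. The function name is illustrative.

```python
def pearson_second_skewness(mean, median, sd):
    """3 * (mean - median) / sd; zero means the mean and median coincide."""
    return 3 * (mean - median) / sd


price_skew = pearson_second_skewness(mean=608.73, median=632.17, sd=157.62)
duration_skew = pearson_second_skewness(mean=6.25, median=5.29, sd=2.16)

# The characteristic with the larger absolute coefficient is the less symmetrical one.
less_symmetrical = "price" if abs(price_skew) > abs(duration_skew) else "duration"
print(round(price_skew, 3), round(duration_skew, 3), less_symmetrical)
```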
15.

Suppose that we considered all of these cruises to represent the entire population of all such cruises. What would be the variance for the durations? (A) less than 4.65 (B) equal to 4.65 (C) greater than 4.65 (D) Cannot be determined solely by this information


Use the following information for the next TWO questions. A random sample of IU students is taken and each student is asked to report how many hours per week they spend Facebook stalking. The data are summarized in the table below.

Number of hours   Number of students
1-4               35
5-7               30
8-9               22
10-12             13

16. What are our estimates for the mean, median, and mode for this data set?
(A) mean 5.975, median 6, mode 2.5 (B) mean 6, median 5.975, mode 2.5 (C) mean 5.975, median 6, mode 6 (D) mean 6.25, median 6.5, mode 2.5

mean:
\[ \bar{x}_{est} = \frac{\sum M_i f_i}{\sum f_i} \]

median:
\[ L_{50} = (100 + 1)\frac{50}{100} \]

17. Assume that the estimated mean of this data set is 7. What is our estimate for the variance of this data set (rounded to two decimal places)?
(A) 9.27 (B) 9.96 (C) 10.06 (D) None of the above is/are correct

variance:
\[ s^2_{est} = \frac{\sum f_i (M_i - \bar{x}_{est})^2}{n - 1} \]
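The grouped-data formulas above use class midpoints in place of the unknown raw values. The following sketch applies them to the table, assuming the class limits read 1-4, 5-7, 8-9, and 10-12; the names are illustrative, and question 17's mean of 7 is taken as given by the question rather than computed.

```python
# (lower limit, upper limit, frequency) for each class
classes = [(1, 4, 35), (5, 7, 30), (8, 9, 22), (10, 12, 13)]

midpoints = [(low + high) / 2 for low, high, _ in classes]
freqs = [f for _, _, f in classes]
n = sum(freqs)

# Estimated mean: frequency-weighted average of the midpoints.
mean_est = sum(m * f for m, f in zip(midpoints, freqs)) / n

# Estimated median: midpoint of the class containing location (n + 1) * 50 / 100.
location = (n + 1) * 50 / 100
running = 0
for m, f in zip(midpoints, freqs):
    running += f
    if running >= location:
        median_est = m
        break

# Estimated mode: midpoint of the class with the highest frequency.
mode_est = midpoints[freqs.index(max(freqs))]

# Question 17: estimated variance using the mean the question tells us to assume.
assumed_mean = 7
var_est = sum(f * (m - assumed_mean) ** 2 for m, f in zip(midpoints, freqs)) / (n - 1)

print(mean_est, median_est, mode_est, round(var_est, 2))
```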


Graphical Displays Exercise

Suppose a random sample of 12 Purdue students is asked the question "What is three plus five?" Their responses are shown below (as an ordered array).

1 1 4 4 8 9 9 9 10 10

Note: Two students did not respond but instead began crying uncontrollably. They were removed from the sample.

(a) Create a frequency distribution for the responses.

Response   f
1
4
8
9
10

(b) Create a relative frequency distribution for the responses.

Response   Relative Frequency
1
4
8
9
10

(c) Create a cumulative frequency and cumulative relative frequency distribution for the responses.

Response   Cum. Frequency   Cum. Rel. Freq.
1
4
8
9
10
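The three tables in this exercise can be built mechanically from the raw responses. A minimal sketch using only the standard library is shown here so you can check your filled-in tables; the variable names are illustrative.

```python
from collections import Counter
from itertools import accumulate

responses = [1, 1, 4, 4, 8, 9, 9, 9, 10, 10]
n = len(responses)

counts = Counter(responses)
values = sorted(counts)                      # distinct responses in order
freq = [counts[v] for v in values]           # (a) frequency
rel_freq = [f / n for f in freq]             # (b) relative frequency
cum_freq = list(accumulate(freq))            # (c) cumulative frequency
cum_rel_freq = [c / n for c in cum_freq]     # (c) cumulative relative frequency

print("Response  f  Rel.  Cum.  Cum. Rel.")
for row in zip(values, freq, rel_freq, cum_freq, cum_rel_freq):
    print("{:>8}  {}  {:.1f}  {:>4}  {:.1f}".format(*row))
```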


Graphical Displays

18. A Kazakhstan-based airline is attempting to determine what most of their customer complaints reference in an effort to improve the overall quality of air travel on their airline. They noted that the two most common complaints (as a percentage of overall complaints) were relating to the smell of the aircraft and the presence of livestock on board the aircraft. Which of the following displays would be the most appropriate and informative display for these data?
(A) ogive (B) pie chart (C) histogram (D) Pareto diagram

19. A recent study attempted to discern the level of douchebaggery at three Big 10 schools. For the most recent school year, Ohio State admitted 3,000 douche bags, Michigan admitted 4,200 douche bags, and Purdue admitted 39,726 douche bags (their entire student body!). Which of the following displays would be the most appropriate and informative display for these data?
(A) ogive (B) pie chart (C) histogram (D) Pareto diagram

20. Which of the following statements regarding qualitative displays is FALSE?
(A) The choice of utilizing a bar chart vs. a pie chart is subjective (B) A bar chart can be used for ordinal data (C) Pareto diagrams are organized left to right from smallest frequency to largest frequency (D) It is inappropriate to use a Pareto diagram for interval data because interval data already have an inherent order


21. The following is a histogram showing the opening prices on the exchange of a particular stock for 50 randomly selected days (rounded to the nearest $10). Based on the frequency histogram for the NASDAQ, the 80th percentile is closest to which of the following values?
(A) $20 (B) $30 (C) $40 (D) $50

[Figure: Histogram of Stock Prices. x-axis: Stock Prices ($10 to $100); y-axis: Frequency (0 to 16).]

22. After the Florida Gators beat Ohio State twice for a National Championship, many residents immediately spent money on anti-OSU paraphernalia. A sample of n = 30 UF students was each asked how much money they had spent on their most recent purchase of said items. The data was used to construct the cumulative frequency plot shown below.

[Figure: cumulative relative frequency plot. x-axis: Amount ($), 20 to 100; y-axis: Cumulative Relative Frequency, 0 to 100.]

Which of the following statements regarding this display is TRUE?
(A) The median for this data set is approximately $60 (B) The IQR for this data set cannot be approximated by this display alone (C) The mean for this data set can be approximated by this display alone (D) There were no observations in this data set between $80 and $90


Use the following information for the next TWO questions. The Pug Factory turns regular dogs into fantastically smushy-faced pugs (illustration is to the right). However, there is variation in the process. In order to understand this variation, a random sample of observations is taken where the amount of time required to adequately smush the dog's face (in seconds) is recorded. The following stem and leaf diagram resulted.

0 3 6 2 5 1 1 2 8 5 2 6 4 1 1 3 0 4 6 9

23. What is the 80th percentile of these observations?
(A) 13.0 (B) 25.6 (C) 29.2 (D) 30.0

\[ L_{80} = (15 + 1)\frac{80}{100} = 16(.80) = 12.8 \]

lower + (decimal)(upper − lower) =


24.

Suppose that this data set were to be transformed into a frequency distribution in which the classes were 10 seconds wide. What would be the upper bound of the third class? (A) 29 (B) 29.9 (C) 30 (D) 19.9


Chebychev's Theorem and the Empirical Rule

Use the following information for the next FOUR questions. Suppose the amount of money spent getting shwasted on a typical Saturday night by all IU students has a mean of $30 and a true standard deviation of $5.

25. Assume that the distribution of observations is approximately normal. According to the Empirical Rule, what percent of observations should fall between $20 and $40?
(A) 68% (B) 95% (C) 99.7% (D) 16% (E) None of the above

How many standard deviations above/below mean?

\[ \text{\# SDs above} = \frac{40 - 30}{5} = \]

\[ \text{\# SDs below} = \frac{30 - 20}{5} = \]

26. Assume that the distribution of observations is bell-shaped and symmetrical. If there are 40,000 students at IU, how many of these students (to the nearest whole number) does the Empirical Rule predict will spend between $25 and $45 getting shwasted?
(A) 33,540 (B) 20,000 (C) 19,940 (D) 38,000 (E) This quantity cannot be determined exactly

How many standard deviations above/below mean?

\[ \text{\# SDs above} = \frac{45 - 30}{5} = \]

\[ \text{\# SDs below} = \frac{30 - 25}{5} = \]
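When the interval is not symmetric about the mean, as in question 26, the Empirical Rule is applied one side at a time. The sketch below illustrates that idea; the lookup table and function name are assumptions made for this example, and it only works when each bound sits a whole 1, 2, or 3 standard deviations from the mean.

```python
# Percent of a bell-shaped distribution within 1, 2, and 3 SDs of the mean.
WITHIN = {1: 68.0, 2: 95.0, 3: 99.7}


def empirical_percent(low, high, mean, sd):
    """Percent of observations between low and high under the Empirical Rule."""
    sds_below = round((mean - low) / sd)
    sds_above = round((high - mean) / sd)
    # By symmetry, half of each band lies on each side of the mean.
    return WITHIN[sds_below] / 2 + WITHIN[sds_above] / 2


q25_percent = empirical_percent(20, 40, mean=30, sd=5)   # 2 SDs on each side
q26_percent = empirical_percent(25, 45, mean=30, sd=5)   # 1 SD below, 3 SDs above
q26_students = round(40_000 * q26_percent / 100)
print(q25_percent, round(q26_percent, 2), q26_students)
```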


27. According to Chebychev's Theorem, what percent of observations should fall within 4 standard deviations of the mean?
(A) approximately 89% (B) approximately 97% (C) at least 89% (D) at least 93.75% (E) at least 96%

\[ \text{At least } \left(1 - \frac{1}{k^2}\right) \times 100\% = \]

28. Chebychev's Theorem suggests that at least 60% of students will spend how much money getting shwasted (to the nearest dollar)?
(A) between $20 and $40 (B) between $25 and $35 (C) between $19 and $46 (D) between $18 and $43 (E) between $22 and $38

\[ k = \sqrt{\frac{1}{1 - p}} = \]

Now, let's go up/down that many SDs from the mean:
$30 + ____________ ($5) =
$30 − ____________ ($5) =
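Both directions of Chebychev's Theorem (from k to a minimum percentage, and from a percentage back to k) can be sketched in a few lines. The helper names below are illustrative, and the second one simply solves 1 − 1/k² = p for k.

```python
import math


def chebyshev_min_fraction(k):
    """Minimum fraction of observations within k SDs of the mean (k > 1)."""
    return 1 - 1 / k ** 2


def chebyshev_k(p):
    """Number of SDs needed to guarantee at least fraction p of observations."""
    return math.sqrt(1 / (1 - p))


# Question 27: within 4 standard deviations.
q27_percent = 100 * chebyshev_min_fraction(4)

# Question 28: at least 60% of students, with mean $30 and SD $5.
mean, sd = 30, 5
k = chebyshev_k(0.60)
low, high = mean - k * sd, mean + k * sd
print(q27_percent, round(k, 3), round(low), round(high))
```

Unlike the Empirical Rule, these bounds hold for any distribution shape, which is why the answer choices are phrased as "at least".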


Covariance and Correlation

Use the following information for the next THREE questions. A statistician takes a random sample of IU students playing Sink the Biz at Nick's and each is given the following three-question survey:
How many times in the past month have you played Sink the Biz? _____________
How many minutes have you been sitting here playing Sink the Biz? _____________
How many fingers am I holding up? _____________

The following covariance matrix is computed in Excel.

              Times/Month   Minutes   Fingers
Times/Month   31.695
Minutes       95.364        448.495
Fingers       1.062         10.209    3.695

29. As per the covariance matrix, the covariance between the first and third responses is 1.062 times-fingers. Which of the following is a correct statement regarding this value?
(A) Generally, those students who play Sink the Biz more often per month saw more fingers and the relationship is quite strong.
(B) Generally, those students who play Sink the Biz more often per month saw more fingers and the relationship is quite weak.
(C) Generally, those students who play Sink the Biz more often per month saw fewer fingers, though we can't comment as to the strength of the relationship.
(D) Generally, those students who play Sink the Biz more often per month saw more fingers, though we can't comment as to the strength of the relationship.

30. The researchers realized that although only 3 fingers were actually being held up, nearly all responses were well above 5, implying that the participants were so drunk, they were seeing double. Therefore, the researcher took all responses to this question and cut them in half. After this alteration, the covariance between minutes and fingers will be
(A) 10.209 (B) 10.209 (C) 5.105 (D) 5.105


              Times/Month   Minutes   Fingers
Times/Month   31.695
Minutes       95.364        448.495
Fingers       1.062         10.209    3.695

31. Between which pairs of variables is the linear relationship the strongest?
(A) Times/month and minutes (B) Times/month and fingers (C) Minutes and Fingers (D) Excel is needed to answer this question

In order to determine the strength/direction of the relationship, we need to get the correlations, not covariances. Let's start off by getting the standard deviation for each variable:

Times/Month SD = \( \sqrt{31.695} = 5.629 \)
Minutes SD = \( \sqrt{448.495} = 21.178 \)
Fingers SD = \( \sqrt{3.695} = 1.922 \)

Now, let's get the correlation for each pair of variables using
\[ r = \frac{s_{XY}}{s_X \cdot s_Y} \]

Times/month and minutes: r =
Times/month and fingers: r =
Minutes and Fingers: r =
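Converting each covariance to a correlation is a matter of dividing by the two standard deviations, which are the square roots of the diagonal entries. The sketch below does this for all three pairs, taking the matrix values exactly as printed; any minus signs lost in the printout would change the direction but not the strength, so the comparison uses absolute values. The dictionary layout is an assumption made for this example.

```python
import math

variances = {"times": 31.695, "minutes": 448.495, "fingers": 3.695}
covariances = {
    ("times", "minutes"): 95.364,
    ("times", "fingers"): 1.062,
    ("minutes", "fingers"): 10.209,
}

correlations = {}
for (x, y), s_xy in covariances.items():
    s_x = math.sqrt(variances[x])   # standard deviation of the first variable
    s_y = math.sqrt(variances[y])   # standard deviation of the second variable
    correlations[(x, y)] = s_xy / (s_x * s_y)

for pair, r in correlations.items():
    print(pair, round(r, 3))

# Strength is judged by |r|, regardless of sign.
strongest = max(correlations, key=lambda pair: abs(correlations[pair]))
print(strongest)
```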


Linear Regression

Use the following information for the next FOUR questions. Ten IU guys were randomly selected at a local bar and each was asked how many beers they had consumed and how many of the women in the bar looked good to them. The following information resulted from the analysis:

r = .698
women = 22.316 + 2.508(beers)

32. To the nearest whole number, how many women are predicted to look good to a guy who consumes 5 beers?
(A) 32 (B) 33 (C) 35 (D) 39

x = 5: 22.316 + 2.508(5) =

33. Suppose an individual in this sample drank 8 beers and thought that 40 women in the bar looked good. How much (to the nearest whole number) did this individual contribute to the sum of squared errors?
(A) 2 (B) 4 (C) 6 (D) 12

x = 8: 22.316 + 2.508(8) =

error = \( y - \hat{y} \) =

\[ SSE = \sum (Y - Y')^2 \]
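A prediction and a single observation's contribution to SSE can be checked with a few lines of code. This is a minimal sketch using the fitted line given above; the function name is illustrative.

```python
INTERCEPT = 22.316
SLOPE = 2.508


def predict_women(beers):
    """Predicted number of women who look good, from the fitted regression line."""
    return INTERCEPT + SLOPE * beers


# Question 32: prediction for 5 beers.
pred_5 = predict_women(5)
print(round(pred_5))

# Question 33: one observation's contribution to SSE is its squared residual.
observed = 40
residual = observed - predict_women(8)   # actual minus predicted
squared_error = residual ** 2
print(round(residual, 2), round(squared_error))
```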


r = .698

women = 22.316 + 2.508(beers)

34. Which of the following statements is FALSE regarding this model?
(A) This model predicts that for each additional 2 beers consumed, approximately 5 more women are predicted to look good.
(B) This model predicts that before a guy consumes any beers, about 22 of the women in a bar are predicted to look good.
(C) Supposing there were 500 women in the bar at the time this survey took place, if the data values were gathered as the percent of women in the bar who look good as compared to the number who look good, the correlation between these variables would no longer be r = .698
(D) The covariance between number of women and number of beers could have been 225.6 beer-women.

35. The regression line for these data was given earlier as women = 22.316 + 2.508(beers). Which of the following is true regarding this line?
(A) This is the line (of all possible lines) that passes through the largest number of actual data points.
(B) This is the line (of all possible lines) that minimizes the sum of the vertical distances between the actual values and those values predicted by the regression line.
(C) This is the line (of all possible lines) that minimizes the square of the sum of the vertical distances between the actual values and those values predicted by the regression line.
(D) This is the line (of all possible lines) that minimizes the sum of the squared vertical distances between the actual values and those values predicted by the regression line.


Probability Rules

Use the following information for the next FOUR questions. Suppose we are interested in the association between the school a student goes to and the gang he/she is in. The data from a random sample of students is shown below in this two-way contingency table.

                         Gang
School      Tree Trail   Eastside Boys   Da Woods   TOTAL
Eastside    10           5                          40
Santa Fe
Buchholz                 20              25         60
TOTAL       40                           55         200

36. What is the probability that a randomly selected student either went to Santa Fe High School or was in Da Woods?
(A) .225 (B) .775 (C) .750 (D) .025

P(A or B) = P(A) + P(B) − P(A and B)


37.

If we know a student did NOT attend Buchholz High School, what is the probability that they were NOT in Tree Trail? (A) .821 (B) .179 (C) .659 (D) None of the above answers is/are correct


                         Gang
School      Tree Trail   Eastside Boys   Da Woods   TOTAL
Eastside    10           5               25         40
Santa Fe    15           80              5          100
Buchholz    15           20              25         60
TOTAL       40           105             55         200

38. Among the choices below, which of the following is least likely to occur?
(A) A randomly selected student is in Da Woods and attends Buchholz.
(B) A randomly selected student attending Santa Fe High School is in Tree Trail.
(C) A randomly selected student in Tree Trail attends Santa Fe High School.
(D) A randomly selected student attends Buchholz High School.

39. If gang and school are, in fact, independent, how many students should we expect to see in the greyed cell?
(A) 15 (B) 20 (C) 25 (D) 30

\[ \text{expected} = \frac{\text{row total} \times \text{column total}}{\text{grand total}} \]
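The probability rules in this section can be checked against the contingency table in code. The sketch below rests on two assumptions: the cell values are the ones implied by the row and column totals (Santa Fe: 15, 80, 5), and the greyed cell in question 39 is taken to be Santa Fe / Tree Trail, which the text does not state. All names are illustrative.

```python
GANGS = ["Tree Trail", "Eastside Boys", "Da Woods"]
table = {
    "Eastside": {"Tree Trail": 10, "Eastside Boys": 5, "Da Woods": 25},
    "Santa Fe": {"Tree Trail": 15, "Eastside Boys": 80, "Da Woods": 5},
    "Buchholz": {"Tree Trail": 15, "Eastside Boys": 20, "Da Woods": 25},
}

row_total = {school: sum(cells.values()) for school, cells in table.items()}
col_total = {gang: sum(table[school][gang] for school in table) for gang in GANGS}
grand_total = sum(row_total.values())

# Question 36: P(A or B) = P(A) + P(B) - P(A and B)
p_santa_fe_or_da_woods = (
    row_total["Santa Fe"] + col_total["Da Woods"] - table["Santa Fe"]["Da Woods"]
) / grand_total

# Question 37: condition on "not Buchholz" by restricting to the other two rows.
not_buchholz = grand_total - row_total["Buchholz"]
not_buchholz_in_tree_trail = table["Eastside"]["Tree Trail"] + table["Santa Fe"]["Tree Trail"]
p_not_tree_trail_given_not_buchholz = (not_buchholz - not_buchholz_in_tree_trail) / not_buchholz

# Question 39: expected count under independence (assumed cell: Santa Fe / Tree Trail).
expected = row_total["Santa Fe"] * col_total["Tree Trail"] / grand_total

print(p_santa_fe_or_da_woods, round(p_not_tree_trail_given_not_buchholz, 3), expected)
```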


Discrete Random Variables

Use the following information for the next FIVE questions. Barney Stinson is looking to insure his $10,000 diamond suit. An insurance company has determined the following distribution for the percentage of value lost on the value of such a legen-wait-for-it-dary suit.

Percent Loss   Probability of Loss
Total Loss     0.02
50% loss       0.03
25% loss       0.15

40. From the perspective of the insurance company, what is the expected loss?
(A) $0 (B) $725 (C) $2,500 (D) $5,000

\[ E(X) = \mu = \sum \left[ X \cdot P(X) \right] \]

41.

What premium should the insurance company charge Barney if they wish to make an expected profit of $200 per policy? (A) $525 (B) $925 (C) $1,025 (D) Excel is needed to answer
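The expected loss and the matching premium follow directly from the distribution. The sketch below assumes the remaining probability (0.80) belongs to a $0 loss, so that the probabilities sum to 1, and converts each percent loss to dollars on the $10,000 suit; the names are illustrative.

```python
SUIT_VALUE = 10_000

# (fraction of value lost, probability)
distribution = [(1.00, 0.02), (0.50, 0.03), (0.25, 0.15), (0.00, 0.80)]

# E(X) = sum of X * P(X), with X measured in dollars.
expected_loss = sum(SUIT_VALUE * fraction * prob for fraction, prob in distribution)

# Expected profit per policy = premium - expected loss.
target_profit = 200
premium = expected_loss + target_profit

print(round(expected_loss, 2), round(premium, 2))
```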

42.

Again, from the perspective of the insurance company, what is the value of the median expected loss? (A) $0 (B) $725 (C) $10,000 (D) Excel is needed to answer this question


Loss      Percent Loss   Probability of Loss
$10,000   Total Loss     0.02
$5,000    50% loss       0.03
$2,500    25% loss       0.15
$0        0% loss        .80

43. What is the shape of this probability distribution?
(A) Strongly positively skewed (B) Strongly negatively skewed (C) Slightly positively skewed (D) Slightly negatively skewed

44. Assume that the insurance company's expected loss is $1,200. What is the expected standard deviation of the loss (to the nearest $10)?
(A) $350 (B) $2,420 (C) $3,100 (D) $3,710

\[ \sigma^2 = \sum \left[ (X - \mu)^2 \cdot P(X) \right] \]

\[ \sigma = \sqrt{\sum \left[ (X - \mu)^2 \cdot P(X) \right]} \]


Random Variable Transformations

45. A large data set has a mean of 100 and a variance of 20. If each data value is divided by 2, the new mean and variance of the data set are
(A) 50 and 10, respectively (B) 50 and 5, respectively (C) 100 and 10, respectively (D) 100 and 5, respectively

46. A professor at a large university is instituting a curve by applying a transformation to all students' current scores on a recent exam. The transformation is as follows:

Curved Score = 2.5(current score) + 10

Which of the following is NOT true?
(A) The mean of the curved scores is 2.5(the mean of the current scores) + 10
(B) The standard deviation of the curved scores is 2.5(the standard deviation of the current scores)
(C) The variance of the curved scores is 2.5(the variance of the current scores)
(D) None of the above. That is, all of the above are true.
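For a linear transformation Y = aX + b, the mean becomes a·mean + b, the standard deviation is multiplied by |a|, and the variance is multiplied by a². The sketch below checks this numerically; the four exam scores are made-up illustration data, not values from the handout.

```python
import math
import statistics

# Question 45: dividing every value by 2 is the transformation a = 0.5, b = 0.
a, b = 0.5, 0
new_mean = a * 100 + b
new_variance = a ** 2 * 20
print(new_mean, new_variance)

# Question 46: verify the rules on hypothetical scores with a = 2.5, b = 10.
scores = [60, 70, 80, 90]                      # made-up data for illustration
curved = [2.5 * s + 10 for s in scores]

mean_ratio_ok = math.isclose(statistics.mean(curved), 2.5 * statistics.mean(scores) + 10)
sd_ratio = statistics.stdev(curved) / statistics.stdev(scores)
variance_ratio = statistics.variance(curved) / statistics.variance(scores)
print(mean_ratio_ok, round(sd_ratio, 4), round(variance_ratio, 4))
```

Adding the constant b shifts every value equally, so it moves the mean but leaves the spread untouched.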
