AP Statistics Ch 7: Coefficient of Determination (r²) and Residuals Notes

Wording recipe for the coefficient of determination (r²): Approximately r² (written as a percent) of the variation in y can be explained by the
least-squares regression line (LSRL, or linear model) of y on x.
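
The recipe can be tied to a computation: r² is the fraction of the total variation in y (about its mean) that the least-squares line accounts for, r² = 1 - SSE/SST. Below is a minimal Python sketch of that idea. The data are made up for illustration and the helper name `r_squared` is hypothetical, not part of these notes.

```python
def r_squared(x, y):
    """Return r^2 for the least-squares line of y on x (computed as 1 - SSE/SST)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # SSE: variation left over after fitting the line (sum of squared residuals)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # SST: total variation of y about its mean
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst


# Illustrative (made-up) data, NOT taken from the worksheet.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r2 = r_squared(x, y)
print(f"Approximately {100 * r2:.1f}% of the variation in y "
      "can be explained by the LSRL of y on x.")
```

For a single explanatory variable this value equals the square of the correlation coefficient r.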

Residuals: the differences between the observed data values and the corresponding values predicted by the regression model (observed minus predicted): e = y - ŷ

Example 1:
x 200 210 220 230 240
y 813.7 785.3 960.4 1118.0 1076.2

a) Find the predicted y and the residuals for each point:

ŷ __________ ___________ ___________ ___________ ___________

e __________ ___________ ___________ ___________ ___________
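
One way to check hand or calculator work for part a) is a short Python sketch that fits the least-squares line to the Example 1 data and lists ŷ and e = y - ŷ for each point. The variable names are illustrative choices, and the worksheet itself may intend a graphing calculator instead.

```python
# Example 1 data from the worksheet
x = [200, 210, 220, 230, 240]
y = [813.7, 785.3, 960.4, 1118.0, 1076.2]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

# Predicted values and residuals (observed minus predicted)
y_hat = [intercept + slope * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

for xi, yh, e in zip(x, y_hat, residuals):
    print(f"x = {xi}: y-hat = {yh:.2f}, e = {e:.2f}")
```

A quick sanity check: the residuals from a least-squares line sum to zero, apart from rounding.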

b) Sketch the residual plot by graphing e vs. x.

c) Now make a sketch of e vs. ŷ. (How does it compare to part b?)

d) What does a residual plot tell you about the data?

e) Here are some residual plots. What can you conclude about the relationship between the explanatory and response variables in each case?

Example 2: The following chart displays the data about the association between the amounts of fat and calories in fast-food hamburgers.
Fat (g) 19 31 34 35 39 39 43
Calories 410 580 590 570 640 680 660
a. Create a scatterplot of calories vs. fat content.

b. Write the equation of the line of regression.

c. Find the correlation coefficient: _________. Interpret this value in the context of this problem.

d. Find the coefficient of determination: _________. Interpret this value in the context of this problem.

e. How effective do you think the amount of fat would be in predicting the number of calories in fast-food hamburgers? Explain.
f. Predict the number of calories for a burger with 36 grams of fat.

g. Would it be reasonable to use this regression line to predict the number of calories for a burger with 55 grams of fat?

h. Use the residual plot to explain whether your linear model is appropriate.

i. Explain the meaning of the y-intercept of the line.

j. Explain the meaning of the slope.

k. A new burger containing 28 grams of fat is introduced. According to this model, its residual for calories is +33. How many calories
does the burger have?

l. A new burger containing 40 grams of fat and 650 calories is introduced. According to this model, find its residual.

m. A new burger containing 600 calories is introduced. According to this model, its residual for calories is -25. How many grams of fat
does the burger have?
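
The calculations in Example 2 can be checked with a short Python sketch that fits the least-squares line to the hamburger data and computes r and r². The helper names `predict` and `residual` are hypothetical, introduced only for this illustration.

```python
import math

# Example 2 data from the worksheet
fat = [19, 31, 34, 35, 39, 39, 43]                 # grams
calories = [410, 580, 590, 570, 640, 680, 660]

n = len(fat)
x_bar = sum(fat) / n
y_bar = sum(calories) / n
sxx = sum((x - x_bar) ** 2 for x in fat)
syy = sum((y - y_bar) ** 2 for y in calories)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(fat, calories))

slope = sxy / sxx                   # calories per gram of fat
intercept = y_bar - slope * x_bar   # predicted calories at 0 g of fat
r = sxy / math.sqrt(sxx * syy)      # correlation coefficient
r2 = r ** 2                         # coefficient of determination


def predict(fat_grams):
    """Predicted calories from the fitted line."""
    return intercept + slope * fat_grams


def residual(fat_grams, observed_calories):
    """Residual e = observed - predicted."""
    return observed_calories - predict(fat_grams)


print(f"calories-hat = {intercept:.2f} + {slope:.2f} * fat")
print(f"r = {r:.4f}, r^2 = {r2:.4f}")
print(f"predicted calories at 36 g of fat: {predict(36):.1f}")
```

For parts k and m, rearranging e = y - ŷ gives y = ŷ + e and ŷ = y - e. In part m the fitted line is then solved for the fat value that produces that ŷ. The data only cover 19 to 43 grams of fat, so a prediction at 55 grams is an extrapolation.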

Example 3:
Given the following two scatter plots, circle the outlier on each plot. Determine whether the correlation would be stronger or weaker
without the outlier. Explain.

a) b)

Ch 7 Homework
1. People who responded to a July 2004 Discovery Channel poll named the 10 best roller coasters in the United States. A regression to
predict duration (seconds) from drop (feet) has R² = 12.4%, and the least-squares regression line is: predicted duration = 91.033 + 0.242 × drop
a. What are the variables and units in this regression?

b. What units does the slope have?

d. Write a sentence in the context of the problem summarizing what the R² says about this regression.

e. Explain what the slope means in context of this problem.

f. A new roller coaster advertises an initial drop of 200 ft. How long would you expect it to last?

g. Another coaster with a 150 ft initial drop advertises a 2-minute ride. Is this longer or shorter than you’d expect? By how much? What’s
that called?
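
Parts f and g come down to plugging a drop into the given line and comparing with an observed duration. Below is a minimal Python sketch using the coefficients stated in problem 1. The function names are hypothetical, and durations are assumed to be in seconds, so a 2-minute ride is entered as 120.

```python
def predicted_duration(drop_ft):
    """Predicted ride duration (seconds) from the line given in problem 1."""
    return 91.033 + 0.242 * drop_ft


def residual(observed_seconds, drop_ft):
    """Observed minus predicted duration; negative means shorter than predicted."""
    return observed_seconds - predicted_duration(drop_ft)


print(f"200 ft drop: predicted {predicted_duration(200):.1f} s")
print(f"150 ft drop, 120 s ride: residual {residual(120, 150):.1f} s")
```

With R² of only 12.4%, individual predictions from this line should be treated as rough.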

2. A sociology student investigated the association between a country’s literacy rate and life expectancy, then drew the conclusions listed
below. Explain why each statement is incorrect. (Assume that all calculations were done properly.)
a. The literacy rate determines 64% of the life expectancy for that country.

b. The slope of the line shows that an increase of 5% in literacy rate will produce a 2-year improvement in life expectancy.

3. The cost-to-charge ratio (the percentage of the amount billed that represents the actual cost) for inpatient and outpatient services at 11
Oregon hospitals was found to have a moderate correlation (r = 0.7299). The average cost-to-charge ratio for outpatient care was 59.6%
with a standard deviation of 9.59%, while the average cost-to-charge ratio for inpatient care was 75.6% with a standard deviation of 16.9%.
Determine the regression line to predict inpatient cost-to-charge ratio.
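
When only summary statistics are available, the least-squares line follows from slope = r · (s_y / s_x) and intercept = ȳ - slope · x̄. The Python sketch below applies those formulas. It assumes the outpatient ratio is the explanatory variable (since the problem asks to predict the inpatient ratio), and the function name `lsrl_from_summary` is hypothetical.

```python
def lsrl_from_summary(r, x_bar, s_x, y_bar, s_y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    slope = r * s_y / s_x
    intercept = y_bar - slope * x_bar   # the line passes through (x_bar, y_bar)
    return slope, intercept


# x = outpatient ratio (%), y = inpatient ratio (%): an assumption about roles
slope, intercept = lsrl_from_summary(
    r=0.7299, x_bar=59.6, s_x=9.59, y_bar=75.6, s_y=16.9
)
print(f"predicted inpatient = {intercept:.3f} + {slope:.3f} * outpatient")
```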

4. Do #58 and #60 in Ch 7, page 207.
