Biol/Stat 2250
Exam 2010
Time Limit:
• 2 hours
Aids permitted:
• Course notes, handouts, your own personal notes, material printed from website (including old
tests and assignments), stats textbooks, scientific calculator
Directions:
• SHOW YOUR WORK for part marks!
• The exam has 7 pages.
• There are 8 major questions on 6 pages after this one.
• The exam is marked out of 60 [marks per question are shown in square brackets].
• For short and long answer questions, a better answer gets a higher mark.
• If you need more space to organize your thoughts, use the back of the previous page.
• Answers must be given in the spaces provided.
Grades:
Question Grade
1 /12
2 /4
3 /8
4 /6
5 /1
6 /10
7 /10
8 /9
Total /60
1. Black wheatears are small birds of Spain and Morocco. Males of the species demonstrate an exaggerated
sexual display by carrying many heavy stones to the nesting cavities. A study was done to determine whether
males that carry heavier stones are healthier. The response variable is a measure of health called tcell, which is
related to strength of the immune system. The explanatory variable, mass, represents the mass of the stone in
grams. A plot of the data and the output of a simple linear regression model are shown below. [From Exercise 7.29 in
The Statistical Sleuth.]
Coefficients:
              Value    Std.Error  t value  Pr(>|t|)
(Intercept)   0.0875   0.0787     1.1121   0.2800
mass          0.0328   0.0106     3.0843   0.0061
a) [4] Comment on whether the assumptions required for linear regression appear to be satisfied.
b) [3] What conclusions can you draw about the relationship between tcell response and the mass of
stones carried? (address significance, direction and magnitude).
c) [2] Give a formula for a (2-sided) 95% confidence interval for the slope parameter.
d) [1] Is the estimated value for the intercept meaningful for this data set? Explain.
e) [1] What is the estimated regression equation? Please give numerical values for the parameter
estimates and use variable names.
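As a check on the regression output in Question 1, the sketch below recomputes the t ratio for the slope and builds a two-sided 95% interval of the form estimate ± t_crit × SE. The estimate and standard error are taken from the coefficient table. The critical value is passed in as an argument because the residual degrees of freedom are not stated in the output shown; the value 2.093 (the 97.5th percentile of t with 19 DF) is an illustrative assumption only and should be replaced by the value that matches the actual DF.

```python
def slope_t_ratio(estimate, std_error):
    """t ratio for testing H0: slope = 0."""
    return estimate / std_error


def slope_confidence_interval(estimate, std_error, t_crit):
    """Two-sided interval: estimate +/- t_crit * SE."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


# Values from the 'mass' row of the coefficient table.
mass_estimate = 0.0328
mass_se = 0.0106

# ASSUMPTION: 2.093 corresponds to 19 residual DF, which the exam
# output does not state. Substitute the correct critical value.
assumed_t_crit = 2.093

t_ratio = slope_t_ratio(mass_estimate, mass_se)
ci_low, ci_high = slope_confidence_interval(mass_estimate, mass_se, assumed_t_crit)

print(f"t ratio for mass = {t_ratio:.3f}")
print(f"95% CI for slope = ({ci_low:.4f}, {ci_high:.4f})")
```

The ratio of the rounded table entries is close to, but not exactly, the printed t value, because the table entries themselves are rounded.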
2. A study was designed to test the effects of a pesticide (factor A, with 3 levels of formulation) and aerial
application procedures (factor B, with 2 levels) on spruce budworm abundance in a Canadian forest. A
total of 72 forest sites were randomly selected. Each of the three pesticide formulations (A) was applied
to 24 randomly selected forest sites. For each pesticide formulation, half of the 24 sites were randomly
chosen to be treated by one application method (B) and the remaining half were treated by the other
application method.
a. [1] How many experimental units are there for each level of A: pesticide formulation?
b. [1] How many experimental units are there for each level of B: aerial application?
c. [2] What will be the numerator and denominator DF for testing the interaction of A and B?
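The bookkeeping behind Question 2 can be sketched as follows. The sketch assumes a balanced, completely randomized two-factor design analysed as a fixed-effects ANOVA with an interaction term, so that the interaction DF is (a - 1)(b - 1) and the error DF is N - ab. The variable names are illustrative and not part of the exam.

```python
# Design constants stated in the question.
levels_a = 3        # pesticide formulations
levels_b = 2        # application methods
total_sites = 72    # experimental units

# Balanced design: units split evenly across levels and cells.
units_per_level_a = total_sites // levels_a
units_per_level_b = total_sites // levels_b
units_per_cell = total_sites // (levels_a * levels_b)

# ASSUMPTION: fixed-effects two-way ANOVA with interaction.
interaction_df = (levels_a - 1) * (levels_b - 1)   # numerator DF
error_df = total_sites - levels_a * levels_b       # denominator DF

print("Units per level of A:", units_per_level_a)
print("Units per level of B:", units_per_level_b)
print("Units per A x B cell:", units_per_cell)
print("Interaction DF:", interaction_df, "Error DF:", error_df)
```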
3. Canopy cover (X), and undergrowth density (Y), were recorded at each of n = 27 randomly selected
sites in a forest. Pearson’s correlation coefficient was found to be r = -0.67.
A test will be conducted with
H0: There is no association between canopy cover and undergrowth density, versus
HA: There is an association between canopy cover and undergrowth density.
Determine the strength of the evidence against H0 using the t_data method shown in the course, using the
following steps:
b. [2] The numeric value for the test statistic is ________ with ________ DF.
c. [2] Assuming that the P-value for the evidence against H0 is P = 0.0034, the biological
conclusion is:
d. [2] What assumptions are required for the above test to be valid? Give at least two assumptions.
e. [1] The proportion of variation in undergrowth density that is related to canopy cover in this
dataset is:
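For reference, the usual textbook test statistic for a Pearson correlation is t = r·sqrt(n - 2) / sqrt(1 - r²) with n - 2 DF, and the proportion of variation shared by the two variables is r². The sketch below applies these formulas to the values in Question 3. It is only an assumption that this matches the layout of the t_data method taught in the course.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic and DF for H0: no linear association (rho = 0)."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    return t, df


r = -0.67   # Pearson correlation from the question
n = 27      # number of sites

t_stat, df = correlation_t_statistic(r, n)
r_squared = r * r   # proportion of variation in Y related to X

print(f"t = {t_stat:.3f} with {df} DF")
print(f"r squared = {r_squared:.4f}")
```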
4. Phosphorus is implicated in the invasion of native vegetation by exotic weeds. Clements (1983)
investigated how phosphorus varies with topographic location and soil type, in an area around Sydney,
Australia. Two types of SOIL (shale-derived and sandstone-derived) and four different topographies
(TOPO) (valleys, north-facing slopes, south-facing slopes and hilltops) were examined. There were
three plots in each of the eight combinations of soil type and topography. The response variable was
total phosphorus per plot in ppm.
                  Topographies (TOPO)
SOIL         VALLEY   NORTH   SOUTH   HILLTOP
SHALE            98      78     117        83
                172      77      54        12
                185     100      96        14
SANDSTONE        19      27      28        55
                 39      49      53        21
                 25      24      72        19
b. [1] Does the effect of Topography (TOPO) depend on soil type? Support your answer.
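One informal way to look at the interaction asked about in Question 4 is to compare cell means across soil types within each topography. The sketch below does this with the phosphorus values from the table. It assumes the flattened table reads as three plots per soil type stacked in rows, with one column per topography. It is a descriptive check only and does not replace the ANOVA interaction test.

```python
from statistics import mean

# Total phosphorus (ppm); three plots per SOIL x TOPO combination.
# ASSUMPTION: rows 1-3 are SHALE and rows 4-6 are SANDSTONE.
phosphorus = {
    ("SHALE", "VALLEY"): [98, 172, 185],
    ("SHALE", "NORTH"): [78, 77, 100],
    ("SHALE", "SOUTH"): [117, 54, 96],
    ("SHALE", "HILLTOP"): [83, 12, 14],
    ("SANDSTONE", "VALLEY"): [19, 39, 25],
    ("SANDSTONE", "NORTH"): [27, 49, 24],
    ("SANDSTONE", "SOUTH"): [28, 53, 72],
    ("SANDSTONE", "HILLTOP"): [55, 21, 19],
}

cell_means = {cell: mean(values) for cell, values in phosphorus.items()}

# Difference between soil types within each topography. Roughly
# constant differences would suggest little interaction.
topographies = ["VALLEY", "NORTH", "SOUTH", "HILLTOP"]
shale_minus_sandstone = {
    topo: cell_means[("SHALE", topo)] - cell_means[("SANDSTONE", topo)]
    for topo in topographies
}

for topo in topographies:
    print(f"{topo:8s} shale - sandstone = {shale_minus_sandstone[topo]:.1f}")
```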
5. [1] A study estimated the relationship between age (days) and body weight in pigeons from hatching
to molting in a wild population. Five pigeon eggs were randomly sampled from the population and
grown under standardized natural conditions. At hatching (day 0) and every second day thereafter until
molting (day 28), each fledgling was weighed. Simple linear regression was used to assess the
relationship between the body weights of chicks at each time and age (n=75 observations). Identify the
flaw in this analysis.
b. [1] Measuring a covariate after applying the treatment is always as good as measuring the covariate
before applying the treatment in ANCOVA.
True False
c. [1] A completely randomized design (CRD) requires as much work to set up as a randomized
complete block design (RCBD), all else being equal.
True False
d. [1] The purpose of a randomized complete block design is to remove the effects of a categorical
confounding variable (or a continuous confounding variable classified into categories) in the
study.
True False
e. [1] ‘Randomization’ in the randomized complete block design refers to the choice of blocks
used.
True False
f. [1] In regression, interpolation refers to prediction within the range of observed data values.
True False
g. [1] One important purpose of good experimental design in multi-factor studies is to add or include
variation in each factor that is independent of all other factors of interest.
True False
h. [1] A principal goal of both ANCOVA and RCBD ANOVA analyses is to reduce the error sums of
squares in the analysis.
True False
True False
j. [1] Suppose that the 95% confidence interval for a coefficient (β) in a multiple linear regression is
(-2.0, 3.2). From this we can infer that the true coefficient is significantly different from 0.
True False
7. Researchers were interested in determining if hair color was linked to gender in humans. A 2 by 4
contingency table is provided below showing the frequency of individuals from a random sample of
humans by sex and hair color category.
Sex Black Brown Blond Red Total
Male 32 43 16 9 100
Female 55 65 64 16 200
Total 87 108 80 25 300
b. [2] Provide the formula and calculate the expected value for the Female Blond category only, assuming
H0 is true.
e. [2] Provide the formula for a residual, and complete the table of residual values for this data set,
given that the expected value for the Male Blond category is 26.67.
Formula:
f. [2] The value of X²_data in this case was 8.987 with a P-value = 0.029. Provide your biological
conclusions, and describe the direction of the effect.
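The contingency-table quantities in Question 7 can be reproduced with the sketch below, which computes expected counts as (row total × column total) / grand total, Pearson residuals as (observed - expected) / sqrt(expected), and the chi-square statistic as the sum of squared residuals. The Pearson form of the residual is an assumption; the course may instead define a residual as observed - expected.

```python
import math

# Observed counts from the 2 x 4 table (sex by hair color).
observed = {
    "Male": {"Black": 32, "Brown": 43, "Blond": 16, "Red": 9},
    "Female": {"Black": 55, "Brown": 65, "Blond": 64, "Red": 16},
}

colors = ["Black", "Brown", "Blond", "Red"]
row_totals = {sex: sum(counts.values()) for sex, counts in observed.items()}
col_totals = {c: sum(observed[sex][c] for sex in observed) for c in colors}
grand_total = sum(row_totals.values())

# Expected count under H0 (no association between sex and hair color).
expected = {
    sex: {c: row_totals[sex] * col_totals[c] / grand_total for c in colors}
    for sex in observed
}

# ASSUMPTION: Pearson residuals; raw residuals would be observed - expected.
pearson_residuals = {
    sex: {
        c: (observed[sex][c] - expected[sex][c]) / math.sqrt(expected[sex][c])
        for c in colors
    }
    for sex in observed
}

chi_square = sum(
    pearson_residuals[sex][c] ** 2 for sex in observed for c in colors
)
chi_square_df = (len(observed) - 1) * (len(colors) - 1)

print(f"Expected Female Blond = {expected['Female']['Blond']:.2f}")
print(f"Chi-square = {chi_square:.3f} on {chi_square_df} DF")
```

The sign of each residual shows which categories are over-represented (positive) or under-represented (negative) relative to H0.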
A General Linear Model was fit to the data and the output is shown below:
INHIBIT ~ UVB * DEEPV
Coefficients:
              Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)      2.967        9.726    0.305    0.7651
UVB            258.936      309.612    0.836    0.4181
DEEPV           -1.467       10.538   -0.139    0.8914
UVB:DEEPV      980.039      381.539    2.569    0.0234
Residual standard error: 8.521 on 13 degrees of freedom
Multiple R-squared: 0.7289, Adjusted R-squared: 0.6663
F-statistic: 11.65 on 3 and 13 DF, p-value: 0.0005498
c. [2] Make a sketch showing the geometry of this model with UVB on the x axis and INHIBIT on
the y axis. Be sure to label lines and axes.
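The geometry of an interaction model of this kind can be worked out by writing one line per level of the second predictor. The sketch below assumes DEEPV is a 0/1 indicator variable, which the output shown does not confirm, and derives the intercept and slope of the INHIBIT versus UVB line for each level from the fitted coefficients.

```python
# Fitted coefficients from the model output INHIBIT ~ UVB * DEEPV.
coef = {
    "(Intercept)": 2.967,
    "UVB": 258.936,
    "DEEPV": -1.467,
    "UVB:DEEPV": 980.039,
}


def line_for_deepv(deepv):
    """Intercept and slope of INHIBIT vs UVB at a given DEEPV value.

    ASSUMPTION: DEEPV is coded 0 or 1 (indicator variable).
    """
    intercept = coef["(Intercept)"] + coef["DEEPV"] * deepv
    slope = coef["UVB"] + coef["UVB:DEEPV"] * deepv
    return intercept, slope


for level in (0, 1):
    intercept, slope = line_for_deepv(level)
    print(f"DEEPV = {level}: INHIBIT = {intercept:.3f} + {slope:.3f} * UVB")
```

Under this coding the two lines have nearly the same intercept but very different slopes, which is what a sketch with UVB on the x axis would need to show.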