STAT443 Assignment # 1 Solution Winter 2017 Instructor: S. Chenouri

Due: January, 26, 2017

You may work in pairs if you choose; both names and ID numbers should appear on it, and both will receive
the same mark. (No extra credit will be given for working alone.)
For any parts involving R, you should hand in the R code and output, as well as your interpretations of the
output. You will NOT receive marks for uncommented R code or output. You must submit your assignment
through CrowdMark, one per pair.

Problem 1. In an R session you can load a dataset, called name, using data(name). Load the time series
objects Nile, UKgas, co2, nhtemp, and JohnsonJohnson. For each time series, comment briefly on the
following aspects, justifying your comments where possible. [25 marks]

a) What is the period of the time series? [5 marks]

b) Is there a seasonal effect and, if so, is it additive or multiplicative? [5 marks]

c) What can you say about the level and trend? [5 marks]

d) Do you think that there are any change points? [5 marks]

e) Are the time series stationary? [5 marks]

You may use the R function decompose() for exploring additive and multiplicative forms.
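As a quick first pass at part a), the period of every series can be read off with frequency(). A minimal sketch (the get() lookup by name is just a convenience, not part of the required solution):

for (nm in c("Nile", "UKgas", "co2", "nhtemp", "JohnsonJohnson")) {
  # frequency() gives observations per unit of time: 1 = yearly, 4 = quarterly, 12 = monthly
  cat(nm, ": frequency =", frequency(get(nm)), "\n")
}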

data(Nile)
frequency(Nile)
[1] 1
plot(Nile,ylab="Annual flow", xlab="year",main="Annual flow of the river Nile at Aswan")

[Figure: Annual flow of the river Nile at Aswan — annual flow (600-1400) against year, 1880-1960]
For the Nile time series, the frequency is 1, and therefore the period is yearly. There is no obvious seasonal
component. There seems to be a downward trend prior to 1910, which flattens out after 1920. In addition,
the variability seems to be greater prior to 1920, signalling a possible change in variability around 1920. The
mean of the process also seems to have changed after 1920. From all of this we conclude that the process
does not look like a stationary process.
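A rough numerical check of the suspected change point (splitting at 1920 is our visual guess, not a formal changepoint test):

pre1920  <- window(Nile, end = 1919)    # flows up to and including 1919
post1920 <- window(Nile, start = 1920)  # flows from 1920 onwards
c(mean(pre1920), mean(post1920))  # the level appears to drop after 1920
c(sd(pre1920), sd(post1920))      # the spread appears to shrink as well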

data(UKgas)
frequency(UKgas)
[1] 4

plot(UKgas,ylab="Quarterly UK gas Consumption", xlab="year",main="Quarterly UK gas Consumption from 1960Q1 to 1986Q4")


plot(decompose(UKgas,type="additive"))
plot(decompose(UKgas,type="multiplicative"))

[Figure: Quarterly UK gas consumption from 1960Q1 to 1986Q4 — consumption (200-1200) against year]

[Figure: Decomposition of additive time series — observed, trend, seasonal and random panels, 1960-1985]

[Figure: Decomposition of multiplicative time series — observed, trend, seasonal and random panels, 1960-1985]

For the UKgas dataset, the frequency is 4, and therefore the period is quarterly, so a seasonal component is
present. There is an obvious upward trend in the data, but it is not easy to decide whether the components
combine additively or multiplicatively, so we have depicted both decompositions in the figures. They seem
quite similar, except that there is a big spike right after 1970 in the random part of the multiplicative case.
Because of the presence of trend and seasonality, both the mean and variance functions change with time,
and therefore we conclude that the underlying process is not stationary. After removing the seasonality
and trend, the random component of the series still shows two change points in terms of variability, the
first around 1970 and the second around 1978, so the random part is still not stationary.
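A standard way to choose between the two forms is a log transform: if the seasonal amplitude grows with the level (a multiplicative pattern), the logged series should decompose well additively. A hedged sketch:

# If log(UKgas) shows a roughly constant seasonal amplitude, the original
# series is better described as multiplicative.
plot(decompose(log(UKgas), type = "additive"))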
data(co2)
frequency(co2)
[1] 12

plot(co2,ylab="Atmospheric concentrations of CO2", xlab="year",main="Atmospheric concentrations of CO2 in parts per million")


plot(decompose(co2,type="additive"))

[Figure: Atmospheric concentrations of CO2 in parts per million — concentration (320-360) against year, 1960-1990]

[Figure: Decomposition of additive time series — observed, trend, seasonal and random panels, 1960-1990]

For the co2 dataset, the frequency is 12, and therefore the period is monthly. Obviously, a seasonal component
is present. There is an obvious upward, roughly linear trend in the data. The trend and seasonality seem to be
additive. Because of the presence of trend and seasonality, both the mean and variance functions change with
time, and therefore we conclude that the underlying process is not stationary. After removing the seasonality
and trend, the random component of the series shows at least two change points in terms of variability, the
first around 1975 and the second around 1980, so the random part is still not stationary.
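The additive seasonal effect can also be inspected numerically; decompose() stores one estimated effect per month in its $figure component:

# Estimated monthly seasonal effects (in ppm); they sum to roughly zero.
round(decompose(co2, type = "additive")$figure, 2)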
data(nhtemp)
frequency(nhtemp)
[1] 1

plot(nhtemp,ylab="Average annual temperature", xlab="year",main="Average annual temperature in degrees Fahrenheit in New Haven, from 1912 to 1971")
#plot(decompose(nhtemp,type="additive"))

[Figure: Average annual temperature in degrees Fahrenheit in New Haven, 1912 to 1971 — temperature (48-54) against year]
For the nhtemp time series, the frequency is 1, and therefore the period is yearly. There is no obvious seasonal
component, though there seems to be an upward trend. There is no obvious change point. Because of the
upward trend, the mean function is not constant, signalling that the process is not stationary.
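The apparent upward trend can be given a rough magnitude with a simple linear fit (an informal description only; we are not claiming the trend is truly linear):

trendfit <- lm(nhtemp ~ time(nhtemp))  # regress temperature on calendar year
coef(trendfit)                         # slope = approximate warming in degrees F per year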
data(JohnsonJohnson)
frequency(JohnsonJohnson)
[1] 4
plot(JohnsonJohnson,ylab="Quarterly earnings (dollars) per Johnson & Johnson", xlab="year",main="Quarterly earnings (dollars) per Johnson & Johnson")
plot(decompose(JohnsonJohnson,type="additive"))
plot(decompose(JohnsonJohnson,type="multiplicative"))

[Figure: Quarterly earnings (dollars) per Johnson & Johnson share — earnings (0-15) against year, 1960-1980]

[Figure: Decomposition of additive time series — observed, trend, seasonal and random panels, 1960-1980]

[Figure: Decomposition of multiplicative time series — observed, trend, seasonal and random panels, 1960-1980]
For the JohnsonJohnson dataset, the frequency is 4, and therefore the period is quarterly, so a seasonal
component is present. There is an obvious upward trend in the data, but it is not easy to decide whether
the components combine additively or multiplicatively, so we have depicted both decompositions in the
figures. They seem quite similar, except that there is a big spike right before 1980 in the random part of
the additive case. Because of the presence of trend and seasonality, both the mean and variance functions
change with time, and therefore we conclude that the underlying process is not stationary. After removing
the seasonality and trend, the random component of the series still shows two change points in terms of
variability, the first around 1972 and the second around 1978, so the random part is still not stationary.
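The JohnsonJohnson series is a classic example where the seasonal swing grows with the level. A hedged sketch of a common remedy, differencing the logged series to get approximate quarterly growth rates:

# diff(log(x)) stabilizes the variance and removes the trend at the same time.
plot(diff(log(JohnsonJohnson)), ylab = "Quarterly change in log earnings")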

Problem 2. A model for a non-stationary time series may be Xt = α + βt + Yt, where the Yt are i.i.d.
N(0, σ²). [10 marks]

a) How many parameters need to be estimated in this model? [2 marks]

Solution: Three parameters: α, β, and σ².

b) What might be a problem with using such a model to forecast far into the future? [8 marks]

Solution: It only says that the time series is generated as a linear function of time plus noise. It
does not account for possible seasonality, change points, heteroscedasticity, dependence on past values, etc.
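A small simulation illustrates the point (α = 1, β = 0.05, σ = 1 are arbitrary illustrative values): the fitted line tracks the sample well, but extrapolating it far beyond the observed range simply assumes the linear trend persists forever.

set.seed(443)                    # arbitrary seed for reproducibility
t <- 1:100
x <- 1 + 0.05 * t + rnorm(100)   # simulate Xt = alpha + beta*t + Yt
fit <- lm(x ~ t)
predict(fit, newdata = data.frame(t = c(101, 200, 500)))  # forecasts grow without bound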

Problem 3. Download the Global-mean monthly Land-Ocean Temperature dataset from D2L under the
name GLOTemp1880-2016.csv. Use R commands similar to those below to read the data into R. [25 marks]

glotemp<-read.csv("GLOTemp1880-2016.csv",header=TRUE)
glotemp<-glotemp[,-1]

a) Produce a time series plot of the data. Plot the aggregated annual mean series and a boxplot that
summarizes the observed values for each season, and comment on the plots. [5 marks]

glotemp<-read.csv("GLOTemp1880-2016.csv",header=TRUE)
glotemp<-glotemp[,-1]
glotemp

class(glotemp)

# converting to a time series object


glotemp.ts<-ts(c(t(glotemp)), start=c(1880,1),end=c(2016,11),freq=12)

class(glotemp.ts)

plot(glotemp.ts,ylab="Mean monthly temperature index",
     main="Global-mean monthly Land-Ocean Temperature Index")

[Figure: Global-mean monthly Land-Ocean Temperature Index — index (-0.5 to 1.0) against time, 1880-2020]

There is a clear upward trend starting around 1950, but no obvious seasonality present in the data.
The aggregated annual mean series, depicted below, indicates this upward trend more clearly.

Yrglotemp.ts<-aggregate(glotemp.ts,FUN=mean)
plot(Yrglotemp.ts,ylab="Mean yearly temperature index",main=
"Global-mean yearly temperature index: 1880 to 2016")

[Figure: Global-mean yearly temperature index, 1880 to 2016 — aggregated annual means against time]

In the following figure, we provide boxplots of the series by calendar month. These show that the global
mean temperature index is more or less the same for all months, but the variability changes: it appears
smaller in the months June through November than in the other months. In addition, there are large
outliers in the data, indicating skewness to the right for the global mean temperature. This observation
is consistent with the upward trend in the global temperature.

boxplot(glotemp.ts~cycle(glotemp.ts),xlab="Month",ylab="Temperature index",
        main="Monthly summary of the global-mean temperature index: 1880-2016")

[Figure: Monthly summary of the global-mean temperature index, 1880-2016 — boxplots of the index by month (1-12)]

b) Decompose the series into the components trend, seasonal effect, and residuals, and plot the
decomposed series. Produce a plot of the trend with a superimposed seasonal effect. [5 marks]

# Decomposition of the global mean temperature time series


gmtdecomp<-decompose(glotemp.ts)
plot(gmtdecomp)

[Figure: Decomposition of additive time series — observed, trend, seasonal and random panels, 1880-2020; the seasonal panel ranges only from about -0.02 to 0.02]

The decomposition clearly indicates that the global mean temperature has been rising since 1950.
There is no obvious seasonality in the dataset, as one would expect for a globally averaged temperature.
The random part of the decomposition seems to have a constant mean function, but its variability changes with time.

plot(gmtdecomp$trend,lwd=2,ylab="Mean yearly temperature index",
     main="Trend and trend + seasonality components superimposed")
points(gmtdecomp$trend+gmtdecomp$seasonal,col="red",type="l")

[Figure: Trend and trend + seasonality components superimposed — the red (trend + seasonal) curve is nearly indistinguishable from the trend, 1880-2020]
Superimposing the seasonality on the trend indicates that the seasonal effect is minimal and behaves
like noise.
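This can be backed up numerically by comparing magnitudes (a rough check, using the decomposition object created above):

range(gmtdecomp$seasonal)           # seasonal effect is only about +/- 0.02
sd(gmtdecomp$random, na.rm = TRUE)  # the random variation is considerably larger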

c) Fit an appropriate Holt-Winters model to the monthly data. Explain why you chose that particular
Holt-Winters model, and give the parameter estimates. [5 marks]

# Holt-Winters
gmtHW<-HoltWinters(glotemp.ts, seasonal="additive",start.periods=12)
plot(gmtHW)

The function HoltWinters finds the optimal values of α, β, and γ by minimizing the squared one-step
prediction error when these parameters are set to NULL, which is the default in HoltWinters.

> gmtHW
Holt-Winters exponential smoothing with trend and additive seasonal component.

Call:
HoltWinters(x = glotemp.ts, seasonal = "additive", start.periods = 12)

Smoothing parameters:
alpha: 0.4206628
beta : 8.667418e-05
gamma: 0.0926661

Coefficients:
[,1]
a 0.891751423
b -0.000804561
s1 -0.006871003
s2 0.039762988
s3 0.059108293
s4 0.101251553
s5 0.038239190
s6 0.013137417
s7 -0.018456018
s8 -0.011760986
s9 0.018610438
s10 0.029048824
s11 0.037910274
s12 0.031457408

[Figure: Holt-Winters filtering — observed and fitted series superimposed, 1880-2020]

d) Using the fitted model, forecast values for the years 2017 to 2020. Add these forecasts to a time series
plot of the original series. Under what circumstances would these forecasts be valid? What comments
of caution would you make to an economist or politician who wanted to use these forecasts to make
statements about the potential impact of global warming on the world economy? [10 marks]
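The listing above never shows how gmtHW.Forecast is created; presumably a predict() call along the following lines (49 steps ahead covers December 2016 through December 2020, since the observed series ends in November 2016):

gmtHW.Forecast <- predict(gmtHW, n.ahead = 49)  # 49 monthly forecasts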

> gmtHW.Forecast
           Jan       Feb       Mar       Apr       May       Jun       Jul       Aug       Sep       Oct       Nov       Dec
2016                                                                                                               0.8840759
2017 0.9299053 0.9484460 0.9897847 0.9259678 0.9000615 0.8676635 0.8735539 0.9031208 0.9127546 0.9208115 0.9135541 0.8744211
2018 0.9202506 0.9387913 0.9801300 0.9163131 0.8904067 0.8580087 0.8638992 0.8934661 0.9030999 0.9111568 0.9038994 0.8647664
2019 0.9105958 0.9291366 0.9704753 0.9066583 0.8807520 0.8483540 0.8542445 0.8838113 0.8934452 0.9015021 0.8942446 0.8551117
2020 0.9009411 0.9194818 0.9608205 0.8970036 0.8710973 0.8386993 0.8445898 0.8741566 0.8837904 0.8918473 0.8845899 0.8454569

plot(glotemp.ts,lwd=2,ylab="Mean monthly temperature index",main="Holt-Winters Forecasts")
lines(gmtHW.Forecast,col="red")

[Figure: Holt-Winters forecasts — observed series with the 2017-2020 forecasts (red) appended, 1880-2020]
In the above forecasts, we assume the additivity of the trend and seasonality components. As long as
the seasonality and trend remain the same, the short-term forecasts are expected to be reliable. The
caution for an economist or politician is that these are purely statistical extrapolations: the model knows
nothing about the physical drivers of warming, the uncertainty grows with the forecast horizon, and a
structural change (like the change points seen earlier in the series) would invalidate them, so they should
not be quoted for long-range policy claims without accompanying uncertainty intervals.
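That uncertainty can be made explicit with prediction intervals; a sketch using the built-in interval option of predict() for Holt-Winters fits:

gmtHW.PI <- predict(gmtHW, n.ahead = 49, prediction.interval = TRUE, level = 0.95)
plot(gmtHW, gmtHW.PI)  # overlays the fit, the forecasts and the 95% bounds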

Problem 4. (From the book James, Witten, Hastie and Tibshirani (2013)) In this problem you will create
some simulated data and fit simple linear regression models to it. Make sure to use set.seed(1) prior to
starting part (a) to ensure consistent results. [40 marks]

set.seed(1)

a) Using the rnorm() function, create a vector, x, containing 100 observations drawn from a N(0, 1)
distribution. This represents a feature, X. [2 marks]

# a)
x<-rnorm(n=100,mean=0,sd=1)

b) Using the rnorm() function, create a vector, eps, containing 100 observations drawn from a N(0, 0.25)
distribution, that is, a normal distribution with mean zero and variance 0.25. [3 marks]

# b)
eps<-rnorm(n=100,mean=0,sd=sqrt(0.25))

c) Using x and eps, generate a vector y according to the model

Y = -1 + 0.5X + ε    (1)

What is the length of the vector y? What are the values of β0 and β1 in this linear model? [3 marks]

# c)
y<--1+0.5*x+eps

length(y)
[1] 100

Note that β0 = -1 and β1 = 0.5.

d) Create a scatterplot displaying the relationship between x and y. Comment on what you observe.
[3 marks]

# d)
plot(x,y,main="Scatterplot of x vs y")

[Figure: Scatterplot of x vs y — a roughly linear cloud of points, x from -2 to 2, y from about -2.5 to 0.5]

Just looking at the scatterplot of x and y values, we may conclude that there is either a linear or
quadratic relationship between x and y.

e) Fit a least squares linear model to predict y using x. Comment on the model obtained. How do b0
and b1 compare to β0 and β1? [4 marks]

# e)
fitlin1<-lm(y~x)
summary(fitlin1)

Call:
lm(formula = y ~ x)

Residuals:
Min 1Q Median 3Q Max
-0.93842 -0.30688 -0.06975 0.26970 1.17309

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.01885 0.04849 -21.010 < 2e-16 ***
x 0.49947 0.05386 9.273 4.58e-15 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Residual standard error: 0.4814 on 98 degrees of freedom


Multiple R-squared: 0.4674,Adjusted R-squared: 0.4619
F-statistic: 85.99 on 1 and 98 DF, p-value: 4.583e-15

The estimates are b0 = -1.01885 and b1 = 0.49947, which are very close to the true values β0 = -1
and β1 = 0.5. The respective standard errors of the estimates are 0.04849 and 0.05386. The adjusted
R-squared of the fit is 0.4619.

f) Display the least squares line on the scatterplot obtained in (d). Draw the population regression line
on the plot, in a different color. Use the legend() command to create an appropriate legend. [5 marks]

# f)
plot(x,y,main="Scatterplot of x vs y along with the fitted and true regression lines")
abline(a=fitlin1$coefficient[1],b=fitlin1$coefficient[2],col="blue")
abline(a=-1,b=0.5,col="red")
legend(x=c(-2,-1),y=c(0.0,0.5),c("Fitted line","True line"), col=c("blue","red"),lty=c(1,1))

[Figure: Scatterplot of x vs y with the fitted (blue) and true (red) regression lines; the two lines nearly coincide]

As can be seen, the fitted line is very close to the true line, with almost the same slope and a slightly
smaller intercept estimate.

g) Now fit a polynomial regression model that predicts y using x and x². Is there evidence that the
quadratic term improves the model fit? Explain your answer. [5 marks]

# g)
fitpoly1<-lm(y~x+I(x^2))
summary(fitpoly1)

Call:
lm(formula = y ~ x + I(x^2))
Residuals:
Min 1Q Median 3Q Max
-0.98252 -0.31270 -0.06441 0.29014 1.13500
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.97164 0.05883 -16.517 < 2e-16 ***
x 0.50858 0.05399 9.420 2.4e-15 ***
I(x^2) -0.05946 0.04238 -1.403 0.164
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.479 on 97 degrees of freedom
Multiple R-squared: 0.4779,Adjusted R-squared: 0.4672
F-statistic: 44.4 on 2 and 97 DF, p-value: 2.038e-14

P<-fitpoly1$coef
plot(x,y,main="Scatterplot of x vs y along with the fitted and true regression")
t<-seq(-3,3,0.1)
Py<-P[1]+P[2]*t+P[3]*t^2
points(t,Py,"l",col="blue")
abline(a=-1,b=0.5,col="red")
legend(x=c(-2,-1),y=c(0.0,0.5),c("Fitted curve","True line"), col=c("blue","red"),lty=c(1,1))

[Figure: Scatterplot of x vs y with the fitted quadratic curve (blue) and true line (red)]

The fitted quadratic function has an adjusted R-squared of 0.4672. The estimated coefficients are
-0.97164, 0.50858 and -0.05946. Among the three coefficients, only the coefficient of the quadratic term is
not significantly different from 0, which is consistent with the true model. This tells us that the linear
model fits better.
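The same conclusion follows from a formal F-test comparing the two nested fits:

anova(fitlin1, fitpoly1)  # a large p-value means the quadratic term adds no real explanatory power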

h) Repeat (a) to (f) after modifying the data generation process in such a way that there is less noise in
the data. The model (1) should remain the same. You can do this by decreasing the variance of the
normal distribution used to generate the error term ε in (b). Describe your results. [5 marks]

# h)
x<-rnorm(n=100,mean=0,sd=1)
eps<-rnorm(n=100,mean=0,sd=sqrt(0.05))
y<--1+0.5*x+eps
length(y)
plot(x,y,main="Scatterplot of x vs y")
fitlin2<-lm(y~x)
summary(fitlin2)
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-0.61308 -0.12553 -0.00391 0.15199 0.41332

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.98917 0.02216 -44.64 <2e-16 ***
x 0.52375 0.02152 24.33 <2e-16 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.2215 on 98 degrees of freedom
Multiple R-squared: 0.858,Adjusted R-squared: 0.8565
F-statistic: 592.1 on 1 and 98 DF, p-value: < 2.2e-16


plot(x,y,main="Scatterplot of x vs y along with the fitted and true regression lines")
abline(a=fitlin2$coefficient[1],b=fitlin2$coefficient[2],col="blue")
abline(a=-1,b=0.5,col="red")
legend(x=c(-2,-1),y=c(0.0,0.5),c("Fitted line","True line"), col=c("blue","red"),lty=c(1,1))

[Figure: Scatterplot of x vs y (reduced noise) with the fitted (blue) and true (red) regression lines]

fitpoly2<-lm(y~x+I(x^2))
summary(fitpoly2)
Call:
lm(formula = y ~ x + I(x^2))

Residuals:
Min 1Q Median 3Q Max
-0.61297 -0.12369 -0.00475 0.14707 0.43183

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.98386 0.02677 -36.754 <2e-16 ***
x 0.52279 0.02179 23.995 <2e-16 ***
I(x^2) -0.00498 0.01396 -0.357 0.722
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Residual standard error: 0.2225 on 97 degrees of freedom


Multiple R-squared: 0.8582,Adjusted R-squared: 0.8553
F-statistic: 293.5 on 2 and 97 DF, p-value: < 2.2e-16

P<-fitpoly2$coef
plot(x,y,main="Scatterplot of x vs y along with the fitted and true regression")
t<-seq(-3,3,0.1)
Py<-P[1]+P[2]*t+P[3]*t^2
points(t,Py,"l",col="blue")
abline(a=-1,b=0.5,col="red")
legend(x=c(-2,-1),y=c(0.0,0.5),c("Fitted curve","True line"), col=c("blue","red"),lty=c(1,1))

[Figure: Scatterplot of x vs y (reduced noise) with the fitted quadratic curve (blue) and true line (red)]

By reducing the variance of the error distribution, all the results stayed essentially the same except that
the quality of the fits improved. For example, the adjusted R-squared of the fits has almost doubled.
This is also quite clear from the scatterplots and the fitted models.

i) Repeat (a) to (f) after modifying the data generation process in such a way that there is more noise
in the data. The model (1) should remain the same. You can do this by increasing the variance of the
normal distribution used to generate the error term ε in (b). Describe your results. [5 marks]

# i)
x<-rnorm(n=100,mean=0,sd=1)
eps<-rnorm(n=100,mean=0,sd=sqrt(1))
y<--1+0.5*x+eps
length(y)
plot(x,y,main="Scatterplot of x vs y")

fitlin3<-lm(y~x)
summary(fitlin3)
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-2.51014 -0.60549 0.02065 0.70483 2.08980

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.04745 0.09676 -10.825 < 2e-16 ***
x 0.42505 0.08310 5.115 1.56e-06 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.9671 on 98 degrees of freedom
Multiple R-squared: 0.2107,Adjusted R-squared: 0.2027
F-statistic: 26.16 on 1 and 98 DF, p-value: 1.56e-06

[Figure: Scatterplot of x vs y (increased noise) — a much more diffuse point cloud]

plot(x,y,main="Scatterplot of x vs y along with the fitted and true regression lines")


abline(a=fitlin3$coefficient[1],b=fitlin3$coefficient[2],col="blue")
abline(a=-1,b=0.5,col="red")
legend(x=c(-2,-0.7),y=c(-0.5,0.5),c("Fitted line","True line"), col=c("blue","red"),lty=c(1,1))

[Figure: Scatterplot of x vs y (increased noise) with the fitted (blue) and true (red) regression lines]

fitpoly3<-lm(y~x+I(x^2))
summary(fitpoly3)
Call:
lm(formula = y ~ x + I(x^2))
Residuals:
Min 1Q Median 3Q Max
-2.53612 -0.62004 0.00828 0.75138 2.05661
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.01481 0.11669 -8.697 8.68e-14 ***
x 0.43295 0.08487 5.101 1.68e-06 ***
I(x^2) -0.02385 0.04724 -0.505 0.615
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.9708 on 97 degrees of freedom
Multiple R-squared: 0.2128,Adjusted R-squared: 0.1965
F-statistic: 13.11 on 2 and 97 DF, p-value: 9.137e-06

P<-fitpoly3$coef
plot(x,y,main="Scatterplot of x vs y along with the fitted and true regression")
t<-seq(-3,4,0.1)
Py<-P[1]+P[2]*t+P[3]*t^2
points(t,Py,"l",col="blue")
abline(a=-1,b=0.5,col="red")
legend(x=c(-2,-0.7),y=c(-0.5,0.5),c("Fitted curve","True line"), col=c("blue","red"),lty=c(1,1))

[Figure: Scatterplot of x vs y (increased noise) with the fitted quadratic curve (blue) and true line (red)]

By increasing the variance of the error distribution, all the results stayed essentially the same except that
the quality of the fits got worse. For example, the adjusted R-squared of the fits dropped to almost half
of that of the original fits. This is also quite clear from the scatterplots and the fitted models.

j) What are the confidence intervals for β0 and β1 based on the original data set, the noisier data set,
and the less noisy data set? Comment on your results. [5 marks]

# First dataset, the 95% confidence intervals are


confint(fitlin1)
2.5 % 97.5 %
(Intercept) -1.1150804 -0.9226122 # |-1.1150804 -0.9226122|=0.1924682
x 0.3925794 0.6063602 # |0.3925794 - 0.6063602|=0.2137808

# Second dataset
confint(fitlin2)
2.5 % 97.5 %
(Intercept) -1.033141 -0.9451916 # |-1.033141 + 0.9451916|=0.0879494
x 0.481037 0.5664653 # | 0.481037 - 0.5664653|=0.0854283

# Third dataset
confint(fitlin3)
2.5 % 97.5 %
(Intercept) -1.2394772 -0.8554276 # |-1.2394772 +0.8554276|=0.3840496
x 0.2601391 0.5899632 # |0.2601391 - 0.5899632|=0.3298241

None of the intervals includes 0, indicating that neither β0 nor β1 is zero at the 95% confidence level.
In addition, as the variance of the error term increases the confidence intervals become wider, and
conversely, as the variance decreases, the confidence intervals become narrower.
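The widths can be put side by side for a direct comparison (ciwidth is a small helper defined here, not part of base R):

ciwidth <- function(fit) apply(confint(fit), 1, diff)  # width of each 95% interval
rbind(original   = ciwidth(fitlin1),
      less.noise = ciwidth(fitlin2),
      more.noise = ciwidth(fitlin3))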
