
MAST20005 Practice Class/Computing Laboratory 1.

Based on Sections 3.1 & 3.2 of the text. In the Tutorial we will review some sample characteristics that are commonly used to describe a set of data. We will then proceed to how these are calculated on a computer.

Consider the set of numbers 43.1, 48.9, 42.6, 43.7, 41.0. ($\sum_{i=1}^{5} x_i = 219.3$, $\sum_{i=1}^{5} x_i^2 = 9654.27$)

1. Simple numerical characteristics.

(a) Mean and standard deviation. The sample mean, $\bar{x} = n^{-1}\sum_{i=1}^{n} x_i$, is a measure of location. The sample variance, $s^2 = (n-1)^{-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$, is a measure of spread. The standard deviation is given by $s = \sqrt{s^2}$.

i. Show $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$ and $s^2 = (n-1)^{-1}\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)$.

ii. Compute the mean and standard deviation for the data set above.

(b) The 100pth sample percentile has approximately $np$ sample observations less than it. Write $(n+1)p = r + a/b$ so that $r$ is an integer and $0 \le a/b < 1$. Let $y_1, \ldots, y_n$ be the order statistics, and take $\tilde{\pi}_p = y_r + (a/b)(y_{r+1} - y_r)$. (Sample percentiles are not uniquely defined and different computer packages give different results. The R package gives 9 options!)

i. The median $q_2$ is the 50th percentile. Compute the median of the above data set.
ii. The first quartile $q_1$ is the 25th percentile and the third quartile $q_3$ is the 75th percentile. Compute the first and third quartiles of the above data set.
iii. The interquartile range (IQR) is $q_3 - q_1$. Compute the interquartile range of the above data set.
iv. The five number summary is the minimum, $q_1$, $q_2$, $q_3$, maximum. Compute the five number summary of the above data set.
v. A boxplot, or box and whisker diagram, is a graphical display of the five number summary. It consists of a rectangle with left and right sides drawn at $q_1$ and $q_3$ and a vertical line segment at $q_2$. Whiskers then extend from the centre of the left-hand side of the rectangle to the minimum and from the centre of the right-hand side to the maximum. Draw a boxplot for the above data set.

(c) Outliers are observations that don't seem to belong with the rest of the data. They can occur through data entry errors or problems with an experiment. They can be identified on a boxplot by constructing inner and outer fences at distances of 1.5 and 3 IQRs beyond the quartiles. Observations between the inner and outer fences are suspected outliers and those beyond the outer fences are called outliers. Add inner and outer fences to your boxplot. Are there any suspected outliers or outliers?

2. Graphical Methods.

(a) A histogram divides the range of the observations into intervals and counts the number of observations in each interval. It plots either the frequencies or relative frequencies. Use the intervals 40-42, 42-44, 44-46, 46-48 and 48-50 to construct a histogram of the above data set.

(b) Stem and leaf plots divide each observation into a stem and a leaf so that, for example, 6.2 has a stem of 6 and a leaf of 2. This allows plots of the form:

    -2 | 64420
    -1 | 7654444332221100
    -0 | 9999888877776555554222111111
     0 | 0111122223334445555777788999999999
     1 | 00111222336689
     2 | 24
     3 | 2

Give a stem and leaf plot of the above data.

3. Estimation.

(a) (6.1-1) Let $X_1, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$ where $-\infty < \mu < \infty$ and $\sigma^2 > 0$ is known. Show the maximum likelihood estimator of $\mu$ is $\hat{\mu} = \bar{X}$.

(b) (6.1-3) A random sample $X_1, \ldots, X_n$ of size $n$ is taken from a Poisson distribution with mean $\lambda > 0$.
i. Show the maximum likelihood estimator of $\lambda$ is $\hat{\lambda} = \bar{X}$.
ii. Suppose with $n = 40$ we observe 5 zeros, 7 ones, 12 twos, 9 threes, 5 fours, 1 five, and 1 six. What is the maximum likelihood estimate of $\lambda$?

(c) (6.1-5) Let $X_1, \ldots, X_n$ be random samples from the following probability density functions. In each case find the maximum likelihood estimator $\hat{\theta}$.
i. $f(x; \theta) = (1/\theta^2)\, x \exp(-x/\theta)$, $0 < x < \infty$, $0 < \theta < \infty$.
ii. $f(x; \theta) = (1/(2\theta^3))\, x^2 \exp(-x/\theta)$, $0 < x < \infty$, $0 < \theta < \infty$.
iii. $f(x; \theta) = (1/2) \exp(-|x - \theta|)$, $-\infty < x < \infty$, $0 < \theta < \infty$.
The last part involves minimizing $\sum_{i=1}^{n} |x_i - \theta|$, which is difficult. Try $n = 5$ and a sample 6.1, -1.1, 3.2, 0.7, 1.7. Then deduce the mle.

4. Double click on the R icon and enter the following commands:

    x=c(43.1,48.9,42.6,43.7,41.0);
    x;

    sum(x);
    ?sum;
    sum(x^2);
    mean(x);
    help(mean);
    var(x);
    sd(x);
    summary(x);
    fivenum(x);
    quantile(x,type=7);
    quantile(x,type=6); #should agree with the numbers you got by hand
    median(x);
    boxplot(x);
    hist(x);
    stem(x,scale=2);

and compare the outcomes with the results you previously obtained by hand. If you are not sure what a command does enter help(command) or ?command.

5. Complete these on the computer.

(a) Two saws X and Y are used to cut timber of a nominal 8ft length. Random samples of 9 lengths cut by each saw yielded:

    Saw X:  8.02  8.10  8.04  8.04  8.00  8.11  8.07  8.02  8.04
    Saw Y:  8.04  8.04  8.10  8.06  8.08  8.10  8.07  8.08  8.06

i. Find the mean and standard deviation of each set of measurements. To enter the data type

    X=c(8.02,8.10,8.04,8.04,8.00,8.11,8.07,8.02,8.04);
    Y=c(8.04,8.04,8.10,8.06,8.08,8.10,8.07,8.08,8.06);

ii. Find the five number summary of each set and construct box plots of each set of measurements on the same graph.
iii. How would you compare the two saws?

    Length=c(X,Y);
    Saw=c(rep("X",9),rep("Y",9));
    fivenum(Length[Saw=="X"]);
    fivenum(Length[Saw=="Y"]);
    boxplot(Length~Saw);
    mean(Length[Saw=="X"]);
    mean(Length[Saw=="Y"]);
    sd(Length[Saw=="X"]);
    sd(Length[Saw=="Y"]);

(b) (3.1-5 & 3.2-1) During the course of an internship at a company that manufactures diesel engine fuel injector pumps, a student had to measure the category of the plungers that force the fuel out of the pumps. The category is based on a relative scale measuring the difference in diameter (in microns or micrometers) of a plunger from that of an absolute minimum diameter. The results for 96 plungers randomly taken from the production line are in the file Exercise_3_1-05.txt stored on the server. Use the instructions below to load this data set.

i. Calculate the sample mean and standard deviation of these data.
ii. Give a histogram of these data.
iii. Construct a stem and leaf plot of the data.
iv. Find a five number summary and the sample mean and variance.
v. Give a boxplot of the data. Are there any outliers?
vi. Plot the empirical cumulative distribution function of the data, using plot(ecdf(X)) once the data are loaded.

To load the data set, enter:

    data.7 = read.delim("Exercise_3_1-05.txt",header=T);
    names(data.7);
    X=data.7$microns;

6. (6.1-1) Let $X_1, \ldots, X_5$ be a random sample from $N(\mu, \sigma^2)$ where $-\infty < \mu < \infty$ and $\sigma^2 > 0$ is known. Note that in computing the likelihood, Maple omits constant terms.
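To see why dropping constant terms is harmless, the log-likelihood can be written out by hand. The following is a sketch for a general sample size $n$ (the Maple session uses $n = 5$). Exactly which terms Maple drops is not stated in this sheet, so the only claim here is that any term free of $\mu$ has no effect on the maximiser.

```latex
\begin{align*}
\ell(\mu)
  &= \log \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\!\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right)
   = -\frac{n}{2}\log(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2, \\
\frac{d\ell}{d\mu}
  &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i-\mu)
   = \frac{n(\bar{x}-\mu)}{\sigma^2}, \qquad
\frac{d^2\ell}{d\mu^2} = -\frac{n}{\sigma^2} < 0 .
\end{align*}
```

The first term of $\ell(\mu)$ does not involve $\mu$, so it vanishes on differentiation. The score is zero only at $\mu = \bar{x}$, and the negative second derivative confirms that this is a maximum.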

(a) Start Maple 12 and click on [> on the bar to obtain the prompt.

(b) Enter

    with(Statistics):

(c) Enter

    Y := RandomVariable(Normal(mu, sigma));
    Mean(Y);
    Variance(Y);
    PDF(Y,y);
    Likelihood(Y, y, samplesize = 5);
    l:=LogLikelihood(Y, y, samplesize = 5);
    d1:=diff(l,mu);
    simplify(d1);
    s:=Score(Y, y, samplesize = 5);
    s[1];
    solve(s[1] = 0, mu);

Conclude the maximum likelihood estimator of $\mu$ is $\hat{\mu} = \bar{X}$.

(d) (6.1-3) A random sample $X_1, \ldots, X_n$ of size 40 is taken from a Poisson distribution with mean $\lambda > 0$.
i. Show the maximum likelihood estimator of $\lambda$ is $\hat{\lambda} = \bar{X}$.

    X := RandomVariable(Poisson(lambda));
    MaximumLikelihoodEstimate(X, x, samplesize = 10);

from which we can deduce the MLE in general is $\hat{\lambda} = \bar{X}$.

(e) (6.1-5) Let $X_1, \ldots, X_{10}$ be random samples from the following probability density functions. In each case find the maximum likelihood estimator $\hat{\theta}$.

i. $f(x; \theta) = (1/\theta^2)\, x \exp(-x/\theta)$, $0 < x < \infty$, $0 < \theta < \infty$.

    f := piecewise(x < 0, 0, x*exp(-x/theta)/theta^2);
    f := unapply(f, x);
    Z := RandomVariable(Distribution(PDF = f));
    l := LogLikelihood(Z, z, samplesize = 10);
    s := diff(l, theta);
    solve(s = 0, theta);

from which we can deduce $\hat{\theta} = \bar{X}/2$.

ii. $f(x; \theta) = (1/(2\theta^3))\, x^2 \exp(-x/\theta)$, $0 < x < \infty$, $0 < \theta < \infty$.

    f := piecewise(x < 0, 0, x^2*exp(-x/theta)/(2*theta^3));
    f := unapply(f, x);
    Z := RandomVariable(Distribution(PDF = f));
    l := LogLikelihood(Z, z, samplesize = 10);
    s := diff(l, theta);
    solve(s = 0, theta);

from which we can deduce $\hat{\theta} = \bar{X}/3$.
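The two results in part (e) can be checked by hand. This is a sketch for a general sample size $n$, using the densities exactly as stated in the question; the second line uses the same steps with the exponent 3 in place of 2.

```latex
\begin{align*}
\text{(i)}\quad
\ell(\theta) &= \sum_{i=1}^{n}\log x_i - 2n\log\theta - \frac{1}{\theta}\sum_{i=1}^{n}x_i,
&
\ell'(\theta) &= -\frac{2n}{\theta} + \frac{1}{\theta^2}\sum_{i=1}^{n}x_i = 0
\;\Longrightarrow\; \hat{\theta} = \frac{\bar{x}}{2}, \\
\text{(ii)}\quad
\ell(\theta) &= 2\sum_{i=1}^{n}\log x_i - n\log 2 - 3n\log\theta - \frac{1}{\theta}\sum_{i=1}^{n}x_i,
&
\ell'(\theta) &= -\frac{3n}{\theta} + \frac{1}{\theta^2}\sum_{i=1}^{n}x_i = 0
\;\Longrightarrow\; \hat{\theta} = \frac{\bar{x}}{3}.
\end{align*}
```

In both cases $\ell'(\theta)$ is positive for $\theta < \hat{\theta}$ and negative for $\theta > \hat{\theta}$, so the stationary point is the maximum.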

Using R and Maple

1. Start R by clicking on the R icon.

2. To access files from the text first click on the M & S Lab materials (Z) icon. Then go to the appropriate subdirectory. To copy the data for Question 3.1-5 scroll down the directories:

    620-202 & 620-205
    Chapter 03
    Section 3 1

Then copy the file Exercise_3_1-05.txt to your home directory (Student Data (D)). You can change default directories using the item in the File menu.

3. To read the file into R use

    data = read.delim("Exercise_3_1-05.txt",header=T);

This results in a data frame called data. To see what it contains and to extract columns, use commands like:

    names(data);
    X=data[,1];
    data$microns;

4. Start Maple from the programs menu. Click on the [> button on the menu bar to obtain the prompt.

5. Assignments in Maple use := and each line ends with a ;, so you need to enter commands such as

    [> y:=sin(x);
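The hand rule for sample percentiles in Question 1(b) can also be written as a short R function and compared with quantile(x, type = 6). This is only a sketch: the helper name sample_percentile is hypothetical (it is not part of R), and clamping to the minimum or maximum when r falls outside 1, ..., n-1 is an assumption the worksheet does not spell out.

```r
# Hypothetical helper (not part of base R): the hand rule from Question 1(b).
sample_percentile <- function(x, p) {
  y <- sort(x)                 # order statistics y_1 <= ... <= y_n
  n <- length(y)
  pos <- (n + 1) * p           # (n + 1) p = r + a/b
  r <- floor(pos)              # integer part r
  frac <- pos - r              # fractional part a/b
  if (r < 1) return(y[1])      # assumption: clamp to the minimum
  if (r >= n) return(y[n])     # assumption: clamp to the maximum
  y[r] + frac * (y[r + 1] - y[r])
}

x <- c(43.1, 48.9, 42.6, 43.7, 41.0)
p <- c(0.25, 0.50, 0.75)

# Quartiles by the hand rule and by R's type 6 definition
by_hand <- sapply(p, function(pp) sample_percentile(x, pp))
by_r <- unname(quantile(x, probs = p, type = 6))
print(rbind(by_hand, by_r))
```

R's default, type = 7, uses a different interpolation rule, so its quartiles need not agree with the hand calculation for a sample this small.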
