Ie352l1 Labmanual

Laboratory Exercise 1
Comparative Experiments Z-test (One-sample mean test)

Course Code:
Program:
Course Title:
Date Performed:
Section:
Date Submitted:
Members:
Instructor:
1. Objective(s):
The activity aims to introduce the z-test, a parametric test that assumes normality of the
distribution, and to perform it using Minitab.
2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 describe the use of the z-test in comparing a sample mean with a population mean,
2.2 solve for the z-value using Minitab,
2.3 interpret the result by comparing the computed z-value with the tabular (critical) value of the z-test.
3. Discussion:
Statistics is about understanding the role of chance in our measurements, and we often want to know
the chance of obtaining a given sample mean when the population mean has a certain value. The
standard error of the mean measures how much the sample mean varies from sample to sample (it is the
standard deviation of the sample mean for a particular sample size). The empirical rule tells us that
about 95% of the time the sample mean will fall within two standard errors of the population mean. We can
extend the principle of the empirical rule and use the normal curve to find the probabilities for a given
sample mean using a statistical test called the 1-sample z-test.
A Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis
can be approximated by a normal distribution. Because of the central limit theorem, many test statistics are
approximately normally distributed for large samples. For each significance level, the Z-test has a single
critical value (for example, 1.96 for 5% two-tailed), which makes it more convenient than the Student's
t-test, which has separate critical values for each sample size. Therefore, many statistical tests can be
conveniently performed as approximate Z-tests if the sample size is large or the population variance is
known.
The Z-test is typically used with standardized tests, to check whether the scores from a particular sample
are within or outside the standard test performance. The z-value indicates how many standard errors the
sample mean lies from the population mean. Note that the z-test is not the same as the z-score,
although they are closely related.
The tabular values of the z-test at the 0.01 and 0.05 levels of significance are shown below:

Test          Level of Significance
              0.01        0.05
One-tailed    ±2.33       ±1.645
Two-tailed    ±2.575      ±1.96
The formula is:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
Where:
$\bar{x}$ = sample mean
$\mu$ = hypothesized value of the population mean
$\sigma$ = population standard deviation
$n$ = sample size
4. Resources:
MiniTab Software/Manual
Textbooks
5. Procedure:
Practice Problem: A school principal claimed that the scores of their students in the reading
comprehension test should have an average of 75.00, with a standard deviation of 7.5. If 50 randomly
selected students have an average of 82.5, use the z-test to test the null hypothesis that μ = 75.00 against
the alternative hypothesis that μ ≠ 75.00 at the 0.05 level of significance.
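Before running the problem in Minitab, the z-value can be checked by hand. The sketch below is only an illustration, not part of the Minitab procedure: it plugs the figures from the practice problem into the z formula and assumes a two-tailed alternative at the 0.05 level. The function name `one_sample_z` is hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n):
    """Return (z, two-sided p-value) for a one-sample z-test."""
    standard_error = sigma / math.sqrt(n)       # sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error    # distance in standard errors
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Figures from the practice problem; the two-tailed alternative is an assumption.
z, p = one_sample_z(sample_mean=82.5, mu0=75.0, sigma=7.5, n=50)
critical = NormalDist().inv_cdf(1 - 0.05 / 2)   # two-tailed critical value, about 1.96

print(f"z = {z:.3f}, critical value = {critical:.3f}")
print("Reject H0" if abs(z) > critical else "Fail to reject H0")
```

The decision rule is the same one used with the tabular values: reject the null hypothesis when the absolute computed z-value exceeds the critical value.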
Procedure:
1. Open a blank worksheet in Minitab.
2. Choose the Stat option from the menu bar of the Minitab window.
3. Select Basic Statistics > 1-Sample Z.
4. Input the information given in the problem in the 1-Sample Z (Test and Confidence
Interval) dialog box.
5. Click on the Graphs button to select the type of graphical representation needed. Click the OK
button to continue.
6. Click on the Options button to define the Confidence Level and the Alternative. Click OK to
continue.
7. Click the OK button on the main window to run the analysis. The output will be displayed in the
Session window.
6. Data and Results:
7. Data Analysis and Conclusion:
8. Assessment (Rubric for Laboratory Performance):

TIP-VPAA054D
Revision Status/Date: 0/2009 September 09
TECHNOLOGICAL INSTITUTE OF THE PHILIPPINES
RUBRIC FOR LABORATORY PERFORMANCE

CRITERIA / BEGINNER (1) / ACCEPTABLE (2) / PROFICIENT (3) / SCORE

Laboratory Skills

Manipulative Skills
BEGINNER (1): Members do not demonstrate needed skills.
ACCEPTABLE (2): Members occasionally demonstrate needed skills.
PROFICIENT (3): Members always demonstrate needed skills.

Experimental Set-up
BEGINNER (1): Members are unable to set-up the materials.
ACCEPTABLE (2): Members are able to set-up the materials with supervision.
PROFICIENT (3): Members are able to set-up the material with minimum supervision.

Process Skills
BEGINNER (1): Members do not demonstrate targeted process skills.
ACCEPTABLE (2): Members occasionally demonstrate targeted process skills.
PROFICIENT (3): Members always demonstrate targeted process skills.

Safety Precautions
BEGINNER (1): Members do not follow safety precautions.
ACCEPTABLE (2): Members follow safety precautions most of the time.
PROFICIENT (3): Members follow safety precautions at all times.

Work Habits

Time Management/Conduct of Experiment
BEGINNER (1): Members do not finish on time with incomplete data.
ACCEPTABLE (2): Members finish on time with incomplete data.
PROFICIENT (3): Members finish ahead of time with complete data and time to revise data.

Cooperative and Teamwork
BEGINNER (1): Members do not know their tasks and have no defined responsibilities. Group conflicts
have to be settled by the teacher.
ACCEPTABLE (2): Members have defined responsibilities most of the time. Group conflicts are
cooperatively managed most of the time.
PROFICIENT (3): Members are on tasks and have responsibilities at all times. Group conflicts are
cooperatively managed at all times.

Neatness and Orderliness
BEGINNER (1): Messy workplace during and after the experiment.
ACCEPTABLE (2): Clean and orderly workplace with occasional mess during and after the experiment.
PROFICIENT (3): Clean and orderly workplace at all times during and after the experiment.

Ability to do independent work
BEGINNER (1): Members require supervision by the teacher.
ACCEPTABLE (2): Members require occasional supervision by the teacher.
PROFICIENT (3): Members do not need to be supervised by the teacher.

Other Comments/Observations:
TOTAL SCORE:
RATING = ______ x 100%
Laboratory Exercise No. 2

Hypothesis Testing and Confidence Intervals
Course Code:
Program:
Course Title:
Date Performed:
Section:
Date Submitted:
Members:
Instructor:
1. Objective(s):
The activity aims to introduce hypothesis testing and confidence intervals applied to 1-sample t-test.
2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 compare the mean data of a sample to a known value using the 1-sample t-test,
2.2 use a hypothesis test to make inferences about one or more populations when sample data are
available,
2.3 quantify the precision of the estimate using a confidence interval,
2.4 interpret results and draw conclusions about the output provided by Minitab.
3. Discussion:
A hypothesis test uses sample data to test a hypothesis about the population from which the sample was
taken. The 1-sample t-test is one of many procedures available for hypothesis testing in Minitab. For
example, to test a hypothesis about the mean length of rods, measure several rods and use the mean length
of this sample to estimate the mean length of the total rod population. Using the information from a sample
to make a conclusion about a population is known as statistical inference.
Use a 1-sample t-test to determine whether μ (the population mean) is equal to a hypothesized value (the
hypothesized mean). The test uses the standard deviation of the sample to estimate σ (the population
standard deviation). If the difference between the sample mean and the hypothesized mean is large
relative to the variability of the sample mean, then μ is unlikely to be equal to the hypothesized mean.
Use a 1-sample t-test with continuous data from a single random sample. The test assumes the population
is normally distributed. However, the test is robust to violations of this assumption, provided the
observations are collected randomly and the data are continuous, unimodal, and reasonably symmetric.
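The calculation behind the 1-sample t-test and its confidence interval can be sketched in a few lines. This is an illustration only: the six weights below are hypothetical values, not the contents of any training data set, and the critical value 2.571 is the two-sided 5% value of t for 5 degrees of freedom taken from a standard t table.

```python
import math
import statistics

# Hypothetical fill weights in grams (assumed for illustration only).
weights = [366.1, 367.0, 365.2, 364.8, 366.5, 365.9]
target = 365.0                          # hypothesized mean

n = len(weights)
mean = statistics.mean(weights)
s = statistics.stdev(weights)           # sample standard deviation, estimates sigma
se = s / math.sqrt(n)                   # standard error of the sample mean
t_stat = (mean - target) / se           # t-value with n - 1 degrees of freedom

# Two-sided critical value for alpha = 0.05 and 5 degrees of freedom (t table).
T_CRIT_DF5 = 2.571
ci = (mean - T_CRIT_DF5 * se, mean + T_CRIT_DF5 * se)   # 95% confidence interval

print(f"mean = {mean:.3f}, t = {t_stat:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print("Reject H0" if abs(t_stat) > T_CRIT_DF5 else "Fail to reject H0")
```

The test and the interval agree by construction: the null hypothesis is rejected at the 0.05 level exactly when the target value falls outside the 95% confidence interval.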
4. Resources:
Training Data Sets
Textbooks
5. Procedure:
Practice Problem: A cereal manufacturer wants to determine whether the box-filling process is on target.
The target fill weight for cereal boxes is 365 grams. Engineers choose six boxes of cereal at random, weigh
them, and use the sample data to estimate the mean of the population (the process mean). The
manufacturer needs to determine whether the mean weight for the packaging process differs significantly
from the target weight of 365 grams. In statistical terms, the process mean is the population mean, or
μ (mu).
Part 1:
1. Open CEREALBX.MPJ.
2. Choose Stat > Basic Statistics > 1-Sample t.
3. Complete the dialog box as shown below.
4. Click OK.
5. Interpret the results.
6. Make a decision.
7. Draw conclusions.
Part 2: Testing the assumption of normality
1. Choose Stat > Basic Statistics > Normality Test.
3. Click OK.
4. Interpret the result.
Part 3: Confidence Intervals
1. Choose Stat > Basic Statistics > 1-Sample t.
2. Click Graphs.
4. Click OK in each dialog box.
5. Interpret the results.


Laboratory Exercise No. 3

Power and Sample Size
Course Code:
Program:
Course Title:
Date Performed:
Section:
Date Submitted:
Members:
Instructor:
1. Objective(s):
The activity aims to introduce the basic information for sample size calculation and power analysis using
Minitab.
2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 determine the sample size,
2.2 evaluate the power to detect the difference of the collected data.
3. Discussion:
Power is the ability of a test to detect a difference when one exists. A hypothesis test has the following
possible outcomes:
                          Decision
Null hypothesis    Fail to reject          Reject
True               Correct decision        Type I error
                   p = 1 - α               p = α
False              Type II error           Correct decision
                   p = β                   p = 1 - β (power)
The power of the test is the probability that you will correctly reject the null hypothesis, given that the null
hypothesis is false. Use a power analysis to determine how much power a test has or to design a new test
with adequate power.
Values
To estimate power, you must specify values for any two of the following parameters of the test; Minitab
calculates the remaining parameter.
1. Sample sizes: the number of observations in the sample.
2. Differences: a meaningful shift away from the target that you are interested in detecting with high
probability.
3. Power values: the power (probability of rejecting H0 when it is false) that you would like the test to
have.
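The trade-off among sample size, difference, and power can be explored with a rough calculation. The sketch below is not the calculation Minitab performs for the 1-sample t-test: it assumes the standard deviation is known and uses the normal distribution, so its results are only approximate and tend to be slightly optimistic for small samples. The function names and the standard deviation of 2.5 used in the example call are hypothetical.

```python
import math
from statistics import NormalDist


def approx_power(difference, sigma, n, alpha=0.05):
    """Approximate power of a two-sided one-sample test (normal approximation)."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = abs(difference) / (sigma / math.sqrt(n))   # difference in standard errors
    # Probability that the test statistic falls in either rejection region.
    return (1 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)


def approx_sample_size(difference, sigma, power, alpha=0.05):
    """Smallest n whose approximate power reaches the requested value."""
    n = 2
    while approx_power(difference, sigma, n, alpha) < power:
        n += 1
    return n


# Hypothetical inputs: a 2.5-gram difference and an assumed sigma of 2.5 grams.
print(f"approximate power with n = 6: {approx_power(2.5, 2.5, 6):.3f}")
for target_power in (0.80, 0.85, 0.90, 0.95):
    print(target_power, approx_sample_size(2.5, 2.5, target_power))
```

Fixing any two of the three quantities determines the third, which is why Minitab asks for exactly two of them.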
4. Resources:
Textbooks
5. Procedure:
Practice Problem: The engineers are concerned about the results of the fill weight analysis (Laboratory
Exercise 2) because of its small sample size. They decide to conduct a power analysis to determine
whether they collected enough sample data to detect a difference. They want to be sure the process mean
fill weight does not differ from the target weight of 365 grams by more than 2.5 grams. The engineers base
the power analysis on the result of t-test from Laboratory exercise 2.
1. Choose File > New, select Minitab Project, and click OK.
2. Choose Stat > Power and Sample Size > 1-Sample t.
4. Click OK.
Part 2: Determining power: With 6 observations, the power of the test was only 0.5377. To have a better
chance of detecting a difference, increase the power of the test to at least 0.80 by increasing the sample
size. Calculate the sample sizes required to achieve power levels of 0.80, 0.85, 0.90, and 0.95.
6. Choose Stat > Power and Sample Size > 1-Sample t.
8. Click OK.


Laboratory Exercise No. 4

1-Sample t-Test
Course Code:
Program:
Course Title:
Date Performed:
Section:
Date Submitted:
Members:
Instructor:
1. Objective(s):
The activity aims to introduce the 1-sample t-test as a test of the difference between the mean of a
single sample and a target (hypothesized) value.
2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 evaluate the difference between a process (population) mean and a target value using a 1-sample
t-test.
3. Discussion:
A one sample t-test measures whether a sample value significantly differs from a hypothesized value.
For example, a Movielens researcher might hypothesize it takes 50 seconds for a new user to add a
friend to their buddy list. The researcher conducts an experiment and measures how long it takes
several new users to perform the task. The one sample t-test measures whether the mean amount of
time it took the experimental group to complete the task varies significantly from the hypothesized 50
second value.
The one sample t-test requires that the dependent variable follow a normal distribution. When the
number of subjects in the experimental group is 30 or more, the central limit theorem implies that the
sample mean is approximately normally distributed, so normality can be assumed. If the number of
subjects is less than 30, the researcher should plot the results and examine whether they appear to
follow a normal distribution. If the distribution appears to be non-normal, and/or if the number of test
cases is significantly less than 30, then a one sample median test, which does not require a normal
distribution, should be used to test the hypothesis. Values to report are the following: the mean of the
test group, degrees of freedom for the t-test, t-value, and p-value.
4. Resources:
Training Data Set
Textbooks
5. Procedure:
Practice Problem: The results of the first power analysis suggest that a larger sample would be useful in
evaluating the process. Six observations did not have enough power to detect a 2.5-gram difference.
Engineers randomly select 12 boxes of cereal and weigh them. Analyze the new sample to determine
whether the process mean is different from 365 grams.
1. Open CEREALBX.MPJ.
2. Choose Window > Worksheet 2.
3. Choose Stat > Basic Statistics > 1-Sample t.
5. Click Graphs.
6. Check Boxplot of Data.
7. Click OK in each dialog box.
Part 2: The 1-Sample t-test assumes the data are sampled from a normally distributed population. Use a
normality test to determine whether the assumption of normality is valid.
9. Click OK.


Laboratory Exercise No. 5

Power and Sample Size for 2-Sample t-Test
Course Code:
Program:
Course Title:
Date Performed:
Section:
Date Submitted:
Members:
Instructor:
1. Objective(s):
The activity aims to introduce basic ideas of power and sample size calculations for 2-sample t-Test.
2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 test for a difference between two population means using a 2-sample t-test,
2.2 determine the sample size required to detect an effect of a given size with a given degree of
confidence.
3. Discussion:
In a 2-sample t-test, power is the probability that you will detect a difference between the two means
when they actually differ, while the sample size is the number of observations per group that you need to
achieve a specified power. The analysis can be used either before collecting the data, to determine the
sample size, or after collecting the data, to evaluate the power to detect a difference between means.
Power and sample size can determine the following:
1. The sample size per group that you need to detect a difference between means with a specified
power
2. The power of a test to detect a difference between means based on a specified sample size
3. The size of a detectable difference with a specified power and sample size
Determining the sample size for 2-sample t-test:
Sample sizes: do not enter a sample size when you want to determine the sample size.
Values of difference and standard deviation: the power of a test depends on the difference you want to
detect relative to the standard deviation. To detect a 1-standard deviation (or 1-sigma) difference, enter
a difference of 1 and -1, and a standard deviation of 1.
Power values: enter the desired power value(s). Power values higher than 0.80 are typically
considered acceptable.
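The dependence of sample size on the ratio of difference to standard deviation can be seen in a textbook approximation. The sketch below uses the normal-approximation formula n = 2((z_{α/2} + z_β)σ/δ)² per group, which is an assumption made here for illustration and is not Minitab's calculation; an exact t-based calculation typically returns a slightly larger n. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_n_per_group(difference, sigma, power, alpha=0.05):
    """Approximate sample size per group for a two-sided 2-sample comparison."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)    # two-sided critical value
    z_beta = nd.inv_cdf(power)             # quantile matching the desired power
    n = 2 * ((z_alpha + z_beta) * sigma / abs(difference)) ** 2
    return math.ceil(n)                    # round up to a whole observation


# A 1-sigma difference: difference of 1 with a standard deviation of 1.
for target_power in (0.80, 0.90):
    print(target_power, approx_n_per_group(1, 1, target_power))
```

Only the ratio of the difference to the standard deviation matters, so entering a difference of 1 with a standard deviation of 1 describes any 1-sigma shift.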
4. Resources:
Textbooks
5. Procedure:
Practice Problem: A calculator manufacturer is selecting a plastic supplier. The quality team has a policy for
critical quality metrics that states: Assuming similar variability and costs, mean strengths more than one
standard deviation apart are an important difference. Determine the sample size needed to detect a
difference of one standard deviation between two suppliers with similar variability. (Minitab assumes equal
variability in the sample size calculation.) The power to detect this difference should be at least 80%.
1. Choose File > New, select Minitab Project, and click OK.
2. Choose Stat > Power and Sample Size > 2-Sample t.
4. Click OK.


Laboratory Exercise No. 6

2-Sample t-Test
Course Code:
Program:
Course Title:
Date Performed:
Section:
Date Submitted:
Members:
Instructor:
1. Objective(s):
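2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 test for a difference between two population means using a 2-sample t-test,
2.2 determine the sample size required to detect an effect of a given size with a given degree of
confidence.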
3. Discussion:
An independent 2-sample t-test helps determine whether two population means are different. The test
uses the sample standard deviations to estimate σ for each population. If the difference between the
sample means is large relative to the estimated variability of the sample means, then the population
means are unlikely to be the same. An independent 2-sample t-test can also be used to evaluate whether
the means of two populations differ by a specific amount.
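The comparison described above, the difference between sample means relative to its estimated variability, can be sketched as follows. The strength values are hypothetical and are not taken from any training data set. The sketch does not assume equal variances (the Welch form of the test), which is a choice made here for illustration; the function name is also hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, degrees of freedom) without assuming equal variances."""
    na, nb = len(sample_a), len(sample_b)
    va = statistics.variance(sample_a) / na    # squared standard error of mean A
    vb = statistics.variance(sample_b) / nb    # squared standard error of mean B
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical strength measurements for two suppliers.
supplier_a = [161.2, 158.9, 160.4, 162.0, 159.5]
supplier_b = [156.8, 157.9, 155.4, 158.2, 156.1]

t_stat, df = welch_t(supplier_a, supplier_b)
print(f"t = {t_stat:.3f} with about {df:.1f} degrees of freedom")
```

A large absolute t-value relative to the critical value for those degrees of freedom indicates that the two population means are unlikely to be the same.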
When to use an independent 2-sample t-test?
Use an independent 2-sample t-test with continuous data from two independent random samples.
Samples are independent if observations from one sample are not related to the observations from the
other sample. The test also assumes that the data come from normally distributed populations.
However, the test is robust to violations of this assumption, provided the observations are collected
randomly and the data are continuous, unimodal, and reasonably symmetric.
Why use an independent 2-sample t-test?
An independent 2-sample t-test answers questions such as:
1. Are the means of a product characteristic between two suppliers comparable?
2. Is one formulation of a product better on average than another?
4. Resources:
Training Data Set, Textbooks
5. Procedure:
Practice Problem: A calculator manufacturer is selecting a plastic supplier. Using a sample size of 20 plastic
pellets from each supplier, the manufacturer must compare samples from the two suppliers for strength.
1. Open PLASTIC.MPJ.
2. Choose Stat > Basic Statistics > 2-Sample t.
4. Click Graphs.
5. Check Individual value plot and Boxplots of data.

Part 2: Testing the normality assumption: The 2-sample t-test assumes the data are sampled from normally
distributed populations.
Choose Stat > Basic Statistics > Normality Test.
In Variable, enter SupplrA.
Click OK.
Choose Stat > Basic Statistics > Normality Test.
In Variable, enter SupplrB.
Click OK.
Interpret the results.
Draw conclusions.
Part 3: Comparing variances: The 2-sample t-test compares the means of two populations. Often it is of
interest to know whether the variances (or standard deviations) of two groups are different.
1. Choose Stat > Basic Statistics > 2 Variances.
3. Click OK.


Laboratory Exercise No. 7

Paired t-Test
Course Code:
Program:
Course Title:
Date Performed:
Section:
Date Submitted:
Members:
Instructor:
1. Objective(s):
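2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 test for a difference between two population means using a 2-sample t-test,
2.2 determine the sample size required to detect an effect of a given size with a given degree of
confidence.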
3. Discussion:
A paired t-test helps determine whether the mean difference between paired observations is significant.
Statistically, the paired t-test is equivalent to performing a 1-sample t-test on the differences. A paired
t-test also helps you to evaluate whether the mean difference is equal to a specific value.
Paired observations are related. Examples include:
1. Weights recorded for individuals before and after an exercise program.
2. Measurements of the same part taken with two different measuring devices.
Use a paired t-test with a random sample of paired observations. The test also assumes that the paired
differences come from a normally distributed population. However, the test is robust to violations of this
assumption, provided the observations are collected randomly and the data are continuous, unimodal,
and reasonably symmetric.
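The equivalence between the paired t-test and a 1-sample t-test on the differences can be made concrete with a short sketch. The paired measurements below are hypothetical and serve only to illustrate the arithmetic; the variable names are assumptions as well.

```python
import math
import statistics

# Hypothetical paired measurements: each position is the same subject.
first = [37.0, 25.8, 16.2, 24.2, 22.0, 33.4]
second = [17.8, 20.2, 16.8, 41.4, 21.4, 38.4]

# Step 1: reduce each pair to a single difference.
differences = [a - b for a, b in zip(first, second)]

# Step 2: run a 1-sample t-test on the differences with H0: mean difference = 0.
n = len(differences)
mean_diff = statistics.mean(differences)
se_diff = statistics.stdev(differences) / math.sqrt(n)
t_stat = mean_diff / se_diff
df = n - 1

print(f"mean difference = {mean_diff:.3f}, t = {t_stat:.3f}, df = {df}")
```

Because only the differences enter the test, the normality assumption applies to the differences rather than to the two original columns.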
Why use a paired t-test?
A paired t-test answers questions such as:
1. Does a new treatment result in a difference in the product?
2. Do two different instruments provide similar measurements for the same sample?
4. Resources:
Training Data Set
Textbooks
5. Procedure:
Practice Problem: A consumer group wants to determine whether drivers can park one car more quickly
than the other. Because the data are paired (each individual parked both cars), use a paired t-test to test
the following hypothesis:
H0: The mean difference between paired observations in the population is zero.
H1: The mean difference between paired observations in the population is not zero.
Use the default confidence level of 95%. Display individual value plots and boxplots to help visualize the
data.
1. Open CARCTL.MPJ.
2. Choose Stat > Basic Statistics > Paired t.
4. Click Graphs.
5. Check Individual value plot and Boxplots of differences.
Part 2: Testing the normality: The paired t-test

2. In Variable, enter SupplrA
3. Click OK.
5. In Variable, enter SupplrB
7. Click OK.
Part 3: Checking for Normality: The paired t-test is actually a 1-sample t-test on the pairwise differences.
Therefore, the pairwise differences must satisfy the 1-sample t-test assumptions, including normality.
Before checking for normality, store the pairwise differences in the worksheet.
1. Choose Stat > Basic Statistics > 2 Variances.
3. Click OK.

Laboratory Exercise No.8

Correlation
Course Code:
Program:
Course Title:
Date Performed:
Section:
Date Submitted:
Members:
Instructor:

2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 evaluate the linear relationship between two variables using scatterplot, correlation, and fitted line
plot,
2.2 analyze and interpret results and draw conclusions about the output provided by Minitab.
3. Discussion:
The sample correlation coefficient, r, measures the degree of linear association between two variables
(the degree to which one variable changes with another). A positive correlation indicates that both
variables tend to increase or decrease together. A negative correlation indicates that, as one variable
increases, the other tends to decrease.
Use correlation when you have data for two continuous variables and wish to determine whether a linear
relationship exists between them. The correlation does not tell you whether the variables are related in a
nonlinear fashion.
Correlation can help answer questions such as:
1. Are two variables related in a linear manner?
2. What is the strength of the relationship?
Example
A. Is there a linear relationship between dollars spent on training and customer satisfaction ratings?
B. What is the relationship between revenue and the number of sales calls made?
Additional Considerations
Correlation quantifies the degree of linear association between two variables.
A strong correlation does not imply a cause-and-effect relationship. For example, a strong correlation
between two variables may be due to the influence of a third variable not under consideration.
A correlation coefficient close to zero does not necessarily mean no association. The variables may have a
nonlinear association. Always plot the data so that you can identify nonlinear relationships when they are
present.
Some statisticians argue that correlation should not be used if one variable is a dependent response of
the other.
Correlation assumes that the values of both variables are free to vary. Correlation is not appropriate if you
fix the values of one variable to study changes in another.
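The points above can be checked numerically. The sketch below computes the sample correlation coefficient from its definition and applies it to two small hypothetical data sets: one perfectly linear, and one with a perfect but nonlinear (quadratic) relationship. The function name `pearson_r` and all data values are assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient r between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [-2, -1, 0, 1, 2]
linear = [2 * v + 1 for v in x]     # exact straight-line relationship
quadratic = [v ** 2 for v in x]     # exact relationship, but not linear

print(f"r for the linear data:    {pearson_r(x, linear):.3f}")
print(f"r for the quadratic data: {pearson_r(x, quadratic):.3f}")
```

The quadratic data are completely determined by x, yet r is essentially zero, which is why the data should always be plotted before relying on the coefficient.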
4. Resources:
Training Data Sets
Textbooks
5. Procedure:
Practice Problem: The sales department for a software company wants to determine whether a relationship
exists between the number of sales calls made and the revenue earned. Analysts record the number of
sales calls and the revenue earned each day for a period of 420 days.
Variable       Description
Revenue        Daily Revenue in thousands of dollars, rounded to the nearest dollar
Sales Calls    Number of sales calls made each day

Part 1:
1. Open SoftRev1.MPJ.
2. Choose Graph > Scatterplot.
3. Choose Simple, then click OK.
5. Click OK.
Part 2: Calculating the correlation
1. Choose Stat > Basic Statistics > Correlation.
3. Click OK.

Laboratory Exercise No.9

Simple Linear Regression
Course Code:
Program:
Course Title:
Date Performed:
Section:
Date Submitted:
Members:
Instructor:
1. Objective(s):
The activity aims to measure the degree of linear association between two variables using graphs and
correlation, and to model the relationship between a continuous response variable and one or more
predictor variables.
2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 evaluate the linear relationship between two variables using scatterplot, correlation, and fitted line plot.
3. Discussion:
Simple linear regression examines the relationship between a continuous response variable (y) and one
predictor variable (x). The general equation for a simple linear regression model is:
$$Y = \beta_0 + \beta_1 X + \varepsilon$$
Where Y is the response, X is the predictor, $\beta_0$ is the intercept (the value of Y when X equals zero),
$\beta_1$ is the slope, and $\varepsilon$ is random error.
Use simple linear regression when you have a continuous y and one predictor, x. The following conditions
should also be met:

1. X can be ordinal or continuous.
2. In theory, x should be fixed by the investigator. In practice, however, it is often allowed to vary.
3. Any random variation in the measurement of x is assumed to be negligible compared to the range
in which x is measured.
The y-values obtained in your sample differ from those predicted by the regression model (unless all points
happen to fall on a perfectly straight line). These differences are called residuals.
To confirm that the analysis is valid, verify all assumptions about the model error term. Use residual plots to
check that the errors have the following characteristics:
1. Normally distributed
2. Constant variance for all fitted values
3. Random over time
Simple linear regression can help answer questions such as:
1. How important is x in predicting y?
2. What value can you expect for y when x is 5?
3. How much does y change if x increases by one unit?
For example:
Is the number of mistakes made in processing loans related to cycle time?
What salary can you expect to make with five years of experience in a particular field?
How much does salary increase for every additional year of experience?
S
S is an estimate of the average variability about the regression line. S is the positive square root of the
mean square error (MSE). For a given problem, the better the equation predicts the response, the lower S
is.
$R^2$ (R-Sq)
$R^2$ is the proportion of variability in the response that is explained by the equation. Acceptable values for
$R^2$ vary depending on the study. For example, engineers studying chemical reactions may require an
$R^2$ of 90% or more. However, someone studying human behavior (which is more variable) may be
satisfied with much lower $R^2$ values.
$R^2$ adjusted (R-Sq(adj))
$R^2$ adjusted is sensitive to the number of terms in the model and is important when comparing models
with different numbers of terms.
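As a rough illustration of how these three statistics relate, the sketch below computes S, $R^2$, and $R^2$ adjusted from observed and fitted values. The function name `regression_summary` and the sample numbers are hypothetical and are not taken from the manual's data sets; Minitab reports the same quantities in its regression output.

```python
import math

def regression_summary(observed, fitted, n_predictors):
    """Return (S, R-Sq, R-Sq(adj)) for a fitted regression model."""
    n = len(observed)
    y_bar = sum(observed) / n
    # Sum of squared residuals (error) and total sum of squares
    sse = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    sst = sum((y - y_bar) ** 2 for y in observed)
    df_error = n - n_predictors - 1
    mse = sse / df_error
    s = math.sqrt(mse)                       # S is the square root of MSE
    r_sq = 1 - sse / sst                     # proportion of variability explained
    r_sq_adj = 1 - mse / (sst / (n - 1))     # penalizes additional terms
    return s, r_sq, r_sq_adj

# Hypothetical observed responses and fitted values from a one-predictor model
observed = [2, 4, 5, 4, 5]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
s, r_sq, r_sq_adj = regression_summary(observed, fitted, n_predictors=1)
print(f"S = {s:.4f}, R-Sq = {r_sq:.1%}, R-Sq(adj) = {r_sq_adj:.1%}")
```

Because $R^2$ adjusted divides each sum of squares by its degrees of freedom, it is never larger than $R^2$ for the same model.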
The Least Squares regression line
The coefficients for the regression equation are chosen to minimize the sum of the squared differences
between the response values observed in the sample and those predicted by the equation.
In other words, the sum of the squared vertical distances between the points and the line is minimized. The
result is called the least squares regression line.
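The least squares idea can be sketched in a few lines of Python. This is only an illustration under assumed data: the lists `calls` and `revenue` are made-up values, not the contents of any Minitab project file, and the helper names are hypothetical.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) minimizing the sum of squared vertical distances."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def sum_squared_errors(x, y, intercept, slope):
    """Sum of squared differences between observed and predicted responses."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data for illustration only
calls = [1, 2, 3, 4, 5]
revenue = [2, 4, 5, 4, 5]
b0, b1 = least_squares_line(calls, revenue)
print(f"fitted line: y = {b0:.2f} + {b1:.2f} x")
print("SSE at the fitted line:", sum_squared_errors(calls, revenue, b0, b1))
print("SSE with a different slope:", sum_squared_errors(calls, revenue, b0, b1 + 0.1))
```

Any other choice of intercept or slope gives a larger sum of squared errors, which is what the last two printed lines compare.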
Confidence and prediction bands
Confidence bands provide the estimated range in which the mean response for a given value of the
predictor is expected to fall.
Prediction bands provide the estimated range in which a single new observation for a given value of the
predictor is expected to fall.
Analysts want to be confident that the mean and the individual points of the y-variable, Revenue, fall within
certain limits of variability.
Use the default confidence level of 95%
Confidence Interval
The 95% confidence interval defines a likely range of values for the population mean of y. For any given
value of x, you can be 95% confident that the population mean for y is between the indicated lines.
Prediction interval
The 95% prediction interval defines a likely range of y values for future individual observations. For any
given value of x, you can be 95% confident that the corresponding value of y for a single future observation
is between the indicated lines.
Note: The prediction interval is always wider than the confidence interval because of the added uncertainty
involved in predicting a single response versus the mean response.
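The note above can be checked numerically. The sketch below computes the half-widths of both intervals at a chosen x for a simple linear regression. Two assumptions are made for the sake of a self-contained example: the data are made up, and the critical value comes from the standard normal distribution as a large-sample approximation, whereas the exact intervals use the t distribution with n - 2 degrees of freedom.

```python
import math
from statistics import NormalDist

def interval_half_widths(x, y, x0, level=0.95):
    """Half-widths of the confidence and prediction intervals at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    # Assumption: normal critical value instead of the exact t quantile
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    ci_half = crit * s * math.sqrt(leverage)        # mean response
    pi_half = crit * s * math.sqrt(1 + leverage)    # single new observation
    return ci_half, pi_half

# Hypothetical data for illustration only
ci_half, pi_half = interval_half_widths([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], x0=3)
print(f"confidence half-width: {ci_half:.3f}")
print(f"prediction half-width: {pi_half:.3f}")
```

The extra 1 under the square root is the variance of a single new observation, which is why the prediction band always lies outside the confidence band.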

Residuals
The residual for each observation is the difference between the observed value of the response and the
value predicted by the model (the fitted value). For example, if the observed response value is 12 and the
model predicts 10, the residual is 2.
Assumptions
To confirm that the analysis is valid, verify all assumptions about the model error term. Use residual
plots to check that the errors have the following characteristics:
1. Normally distributed
2. Constant variance for all fitted values
3. Random over time
Normal Probability Plot
The normal probability plot should roughly follow a straight line. Use this plot to verify that the residuals do
not deviate substantially from a normal distribution.
Histogram
Use the normal probability plot to make decisions about the normality of the residuals. With a reasonably
large sample size, the histogram displays information compatible with the normal probability plot.
The histogram of the residuals should appear approximately bell-shaped with no unusual values or outliers.
Use the histogram as an exploratory tool to learn about the following characteristics of the data.
-Typical values, spread or variation, and shape
-Unusual values in the data
Residual versus fits
Use the plot of the residuals versus fits to verify that the residuals are scattered randomly about zero.
This pattern... | Indicates...
Curvilinear | A quadratic term may be missing from the model
Fanning or uneven spread of residuals across the different fitted values | Nonconstant variance of the residuals
Points far away from zero relative to other data points | Outliers exist
Residual versus order

The plot of the residuals versus order displays the residuals in the order of data collection (provided the
data were entered in the same order in which they were collected.)
If the data collection order affects the results, residuals near each other may be correlated, and thus not
independent.
This pattern... | Indicates...
Residuals are not randomly scattered around zero | Residuals are not independent over time
Residuals are randomly scattered around zero | Residuals are independent
Points far away from zero | Outliers exist
1. Be careful when using regression analysis to assert that changes in the predictor cause changes in
the response, unless the predictor values were fixed at predetermined levels in a controlled
experiment. If the values of the predictors are allowed to vary randomly, other factors may
influence both the predictors and the response.
2. Do not apply regression results to values of x that are outside the sample range. The relationship
between Sales calls and Revenue may be very different for sales calls above 168.
3. Be alert for outliers when using regression procedures. Some outliers (called high leverage points)
have a large effect on the calculation of the least squares regression line. In such cases, the line
may no longer represent the rest of the data very well.
4. Time order trends in the data can violate the assumption of independence. A run chart or individual
chart is a useful tool for detecting such effects.
4. Resources:
Training Data Sets
Textbooks
5. Procedure:
Practice Problem: The sales department for a software company wants to determine whether a relationship
exists between the number of sales calls made and the revenue earned. Analysts record the number of
sales calls and the revenue earned each day for a period of 420 days. Determine the effect of Sales Calls on
Revenue. Use a fitted line plot to calculate and plot the regression equation.
Variable | Description
Revenue | Daily revenue in thousands of dollars, rounded to the nearest dollar
Sales Calls | Number of sales calls made each day

Part 1: Fitted Line Plot
1. Open SoftRev1.MPJ
2. Choose Stat > Regression > Fitted Line Plot
4. Click OK.

6. Use the ANOVA results to evaluate whether the simple regression model is useful for predicting
revenue. State the hypotheses.
7. Interpret the p-value (P) .
8. Make a conclusion.
Part 2: Adding confidence and prediction bands
1. Choose Stat > Regression > Fitted Line Plot, or press Ctrl+E
2. Click Options
4. Click OK
5. Click Graphs
6. Complete the dialog box shown below

8. Interpret Results
5. Normal Probability Plot
6. Histogram
7. Residual versus fits
8. Residual versus order
9. Make conclusions


Multiple Linear Regression
Course Code:
Program:
Course Title:
Date Performed:
Section:
Date Submitted:
Members:
Instructor:
1. Objective(s):
The activity aims to measure the degree of linear association between two variables using graphs and
correlation, and to model the relationship between a continuous response variable and one or more
predictor variables.
2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 evaluate the linear relationship between two variables using scatterplot, correlation, and fitted
line plot.
3. Discussion:
Multiple linear regression examines the relationship between a continuous response variable (Y) and
more than one predictor variable (X). The general equation for a multiple regression model is:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \varepsilon$
where Y is the response, $\beta_0$ is the intercept, each $X_i$ is a predictor variable with a slope of
$\beta_i$, and $\varepsilon$ is random error.
Use multiple linear regression when you have a continuous y and more than one x. The following
conditions should also be met:
1. x can be categorical, ordinal, or continuous.
2. Any random variation in the measurement of x is assumed to be negligible compared to the range
within which x is measured.
Before accepting the results of a regression analysis, verify that the following assumptions about the errors
are valid:
1. They must be independent
2. They must be normally distributed
3. They must have a constant variance across all values of x.
4. They are not correlated with a predictor.
Multiple linear regression can help answer questions such as:
1. How important are the x variables in predicting y?
2. What value is expected for y when x1 is 20 and x2 is 3?
3. How much will y change if x3 increases by one unit (when x1 and x2 are fixed)?
For example:
1. How do flight-delay length and the number of empty seats relate to customer satisfaction rating?
2. How is satisfaction affected by flight delays and lost luggage?
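To make the multiple regression equation concrete, the following sketch estimates the coefficients by solving the normal equations with plain Gaussian elimination. It is a minimal illustration, not Minitab's algorithm: the function names are hypothetical, the data are made up so that the true coefficients are known (y = 1 + 2 x1 + 3 x2 with no error), and no checks for collinear predictors are included.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_multiple_regression(rows, y):
    """Return [b0, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical data generated from y = 1 + 2*x1 + 3*x2
rows = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 5)]
y = [6, 8, 13, 18, 26]
b0, b1, b2 = fit_multiple_regression(rows, y)
print(f"y = {b0:.2f} + {b1:.2f} x1 + {b2:.2f} x2")
```

Each slope is interpreted as the change in y for a one-unit increase in that predictor while the other predictors are held fixed.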
4. Resources:
Training Data Sets
Textbooks
5. Procedure:
Practice Problem: You are selling your house, and want to establish a fair sale price.
Data Collection
The following data were collected for a random sample of houses sold in 1991:
1. Sale Price
2. Size of the House
3. Number of Bedrooms
4. Age of the house
5. Area in which the house was built

6. Real estate agent
Variable | Description
Price | Sale price of the house in thousands of dollars
Bedrooms | Number of bedrooms in the house
Size | Size of the house in square feet
Age | Age of the house in years
Area | Area in which the house was built (Dallas, Fort Worth, or Suburbs)
Agency | Selling agent (ClientFirst or Other)
Part 1: Use a matrix plot to examine potential relationships between sale price, size of house, number of
bedrooms, and age of house.
1. Open HouseSale.MPJ
2. Choose Graph > Matrix Plot
3. Under Matrix of Plots, choose Simple, then click OK
4. In Graph Variables, enter Price, Bedrooms, Size, and Age
5. Click Matrix Options.
6. Under Matrix Options, choose Lower left.

Part 2: Calculate the correlation coefficient for each pair of variables.
1. Choose Stat > Basic Statistics > Correlation
2. In Variables, enter Price, Bedrooms, Size, and Age, then click OK
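For readers who want to see what the correlation step computes, here is a small sketch of the Pearson correlation coefficient for every pair of columns. The column values are invented for illustration and are not the contents of HouseSale.MPJ; the function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def pairwise_correlations(columns):
    """Return {(name_a, name_b): r} for each distinct pair of columns."""
    names = list(columns)
    return {
        (a, b): pearson_r(columns[a], columns[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Hypothetical values for illustration only
houses = {
    "Price": [120, 150, 180, 210, 260],
    "Size": [1100, 1400, 1600, 2000, 2500],
    "Age": [30, 22, 25, 10, 5],
}
for pair, r in pairwise_correlations(houses).items():
    print(pair, round(r, 3))
```

Values near +1 or -1 indicate a strong linear association between the pair, while values near 0 indicate little linear association.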

Part 3: Use General Regression to identify an appropriate model for the data
1. Choose Stat > Regression > General Regression
2. In Response, enter Price
3. In Model, enter Bedrooms, Size, Age, Area, Agency.
4. In Categorical predictors, enter Area, Agency.
5. Click OK
Part 4: Refit the model excluding the variable Age.

2. In Model, remove Age
3. Click OK
Part 5: Refit the model excluding the variable Agency.
2. In Model, remove Agency
3. In Categorical Predictors, remove Agency
4. Click OK
Part 6: Refit the model excluding the variable Bedrooms.
2. In Model, remove Bedrooms
3. Click Graphs
4. Under Residual Plots, choose Four in one


One way Analysis of Variance
Course Code:
Program:
Course Title:
Date Performed:
Section:
Date Submitted:
Members:
Instructor:
1. Objective(s):
The activity aims to introduce one-way analysis of variance by comparing means of samples collected at
different levels using a one-way model, and to interpret the main effects plot and multiple comparisons.
2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 evaluate differences between group means for a single factor using one-way ANOVA,
2.2 interpret results and draw conclusions about the output provided by Minitab.
3. Discussion:
Analysis of variance (ANOVA) tests the hypothesis that the means of two or more populations are equal. ANOVAs evaluate the
importance of one or more factors by comparing the response variable means at the different factor levels.
The null hypothesis states that all population means (factor level means) are equal while the alternative
hypothesis states that at least one is different.
To run an ANOVA, you must have a continuous response variable and at least one categorical factor with
two or more levels. ANOVAs require data from normally distributed populations with roughly equal
variances between factor levels.
For example, you design an experiment to assess the durability of four experimental carpet products. You
place a sample of each carpet type in ten homes and you measure durability after 60 days. Because you
are examining one factor (carpet type) you use a one-way ANOVA.
If the p-value is less than your alpha, then you conclude that at least one durability mean is different. To
further explore the differences between specific means, use a multiple comparison method such as
Tukey's.
The name "analysis of variance" is based on the manner in which the procedure uses variances to
determine whether the means are different. The procedure works by comparing the variance between
group means versus the variance within groups as a method of determining whether the groups are all part
of one larger population or separate populations with different characteristics.
Minitab has different types of ANOVAs to allow for additional factors, types of factors, and different designs
to suit your specific needs.
ANOVA type | Model and design properties
One-way | One fixed factor (levels set by investigator), which can have either an unequal (unbalanced) or equal (balanced) number of observations per treatment combination.
Two-way | Two fixed factors; requires a balanced design.
Balanced | Model may contain any number of fixed and random factors (levels are randomly selected), and crossed and nested factors, but requires a balanced design.
General Linear Model | Expands on Balanced ANOVAs by allowing unbalanced designs and covariates (continuous variables).
One-way ANOVA

The one-way ANOVA (analysis of variance) procedure is a generalization of the independent samples
t-test. Unlike the t-test, however, you can use one-way ANOVA to analyze the means of more than two
groups (samples) at once.

Use one-way ANOVA (also called single-factor ANOVA) when you have continuous response data for
two or more fixed levels of a single factor.
Before accepting the results of an ANOVA, you must verify that the following assumptions about the errors
are valid for your data. The errors must:
1. Be independent (and thus random)
2. Not deviate substantially from a normal distribution
3. Have constant variance across all factor levels
One way ANOVA can help answer questions such as:
1. Are all branches of your company achieving comparable customer satisfaction ratings?
2. Do treatment group means differ?
For example:
1. Do mean customer satisfaction ratings differ between a company's branches in New Hampshire,
Maine, and Vermont?
2. Which of the three training courses is the most successful in decreasing mean application
processing errors?
Dot plot
A dot plot gives a first look at the data to graphically compare the central tendencies and spreads for the 3
commission types. This graph can also reveal whether outlying data points are present and need to be
investigated.
Degrees of Freedom
The degrees of freedom (DF) statistic measures how much independent information is available to
calculate each sum of squares (SS):
1. $DF_{\text{factor}} = k - 1$, where k is the number of factor levels
2. $DF_{\text{error}} = n - k$, where n is the total number of observations
3. $DF_{\text{total}} = n - 1$
Sum of Squares
The sum of squares (SS) measures the amount of variability each source contributes to the data. Notice
that
$SS_{\text{total}} = SS_{\text{between}} + SS_{\text{error}}$
Mean Square
The mean square (MS) for each source is equal to the SS divided by the DF.
F statistic
F is the ratio of the variability contributed by the factor to the variability contributed by error:
$F = \dfrac{MS_{\text{factor}}}{MS_{\text{error}}}$
1. If between-group variability is similar to within-group variability, F is close to 1, indicating that the
factor does not affect the response variable.
2. If between-group variability is larger than within-group variability, F is greater than 1.
P value
A large F suggests that the factor level means differ more than would be expected by chance, so the
p-value is small.
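The quantities in the ANOVA table (DF, SS, MS, and F) can be traced by hand with a short sketch like the one below. The three groups are made-up numbers, not the Commission data, and the function name is hypothetical. The p-value is left to Minitab because it requires the F distribution, which this sketch does not implement.

```python
def one_way_anova(groups):
    """Return the DF, SS, MS, and F entries of a one-way ANOVA table."""
    k = len(groups)                              # number of factor levels
    n = sum(len(g) for g in groups)              # total observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group (factor) and within-group (error) sums of squares
    ss_factor = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_error = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_factor, df_error = k - 1, n - k
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error
    return {
        "df_factor": df_factor, "df_error": df_error, "df_total": n - 1,
        "ss_factor": ss_factor, "ss_error": ss_error,
        "ss_total": ss_factor + ss_error,
        "ms_factor": ms_factor, "ms_error": ms_error,
        "f": ms_factor / ms_error,
    }

# Hypothetical responses for three factor levels
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in table.items():
    print(name, value)
```

In this example the between-group mean square is much larger than the within-group mean square, so F is well above 1.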
Individual Confidence Interval
When the p-value in the analysis of variance table indicates a difference among the factor level means, the
table of individual confidence intervals is sometimes used to assess the differences.
4. Resources:
Training Data Sets
Textbooks
5. Procedure:
Practice Problem: Sales representatives at a software company are offered one of three types of salaries:
commission, fixed, and a combination of fixed and commission (mixed). The manager of the sales
department wants to compare the revenue earned for different salary types.
Data Collection
The manager records the salary type and revenue earned by each sales representative in a four-month
period.
Variable | Description
Revenue | Revenue earned in dollars by each sales representative
Salary Type | Type of salary received by each sales representative (Commission, Fixed, Mixed)
Part 1: Compare Distributions using Dotplot

1. Open Commission.MPJ
2. Choose Graph > Dotplot
3. Under One Y, choose With Groups, then click OK.
4. Complete the dialog box shown below

5. Click OK
Part 2 : Perform the one-way ANOVA
1. Choose Stat > ANOVA > One-Way
3. Click Graphs.
4. Under Residual Plots, choose Four in one.
6. Interpret the results. To ensure that the results are valid, determine whether all the assumptions about
the residuals have been met.


Analysis of Variance ( General Linear Model using Tukey-Kramer Method)
Course Code:
Program:
Course Title:
Date Performed:
Section:
Date Submitted:
Members:
Instructor:
1. Objective(s):
2.1 Evaluate differences between group means for a single factor using one-way ANOVA and General
Linear Model
3. Discussion:
Tukey's method
Tukey's method is used in ANOVA to create confidence intervals for all pairwise differences between factor
level means while controlling the family error rate to a level you specify. It is important to consider the
family error rate when making multiple comparisons because your chance of making a Type I error for a
series of comparisons is greater than the error rate for any one comparison alone. To counter this higher
error rate, Tukey's method adjusts the confidence level for each individual interval so that the resulting
simultaneous confidence level is equal to the value you specify.
For example, you are measuring the response times for memory chips. You sampled 25 chips from five
different manufacturers. The ANOVA resulted in a p-value of 0.01, leading you to conclude that at least one
of the manufacturer means is different from the others.
You decide to look at all 10 comparisons between the five manufacturers to determine specifically which means are
different. Using Tukey's method, you specify that the entire set of comparisons should have a family error
rate of 0.05 (equivalent to a 95% joint confidence level). Minitab calculates that the 10 individual
confidence levels need to be 99.35% in order to obtain the 95% joint confidence level. These wider Tukey
confidence intervals provide less precise estimates of the population parameter but limit the probability that
one or more of the confidence intervals does not contain the true difference to a maximum of 5%.
Understanding this context, you can then look at the confidence intervals to see if any do not include zero,
suggesting a significant difference.
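The count of 10 comparisons and the reason for adjusting the individual confidence level can be illustrated with a quick calculation. The sketch below is a simplification: it treats the comparisons as independent, which pairwise comparisons are not, so the printed family error rate is only a rough indication of how quickly the error rate grows. It does not reproduce Tukey's adjustment or the 99.35% figure, which Minitab derives from the studentized range distribution.

```python
import math

def pairwise_comparison_count(levels):
    """Number of distinct pairs among the factor levels."""
    return math.comb(levels, 2)

def unadjusted_family_error_rate(comparisons, individual_alpha):
    """Chance of at least one Type I error, assuming independent comparisons."""
    return 1 - (1 - individual_alpha) ** comparisons

pairs = pairwise_comparison_count(5)
rate = unadjusted_family_error_rate(pairs, 0.05)
print(f"{pairs} comparisons, family error rate about {rate:.1%} without adjustment")
```

With five levels and 95% individual intervals, the chance of at least one false difference is far above 5%, which is why the individual confidence levels must be raised.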
[Figure: Comparison of 95% individual confidence intervals (left) to the wider 99.35% individual confidence
intervals used by Tukey's method in the above example to obtain a 95% joint confidence level (right). The
reference line at 0 illustrates how the wider Tukey confidence intervals can change your conclusions.
Confidence intervals that contain zero suggest no difference. Only 5 of the 10 comparisons are shown due
to space considerations.]
1. Comparing multiple factor levels with a single ANOVA is preferable to comparing two levels at a
time with separate two-sample t-tests. Extra tests would increase the chances of Type I error
(rejecting Ho when Ho is actually true.)
2. The assumption of independence for ANOVA is critical. If observations are systematically affected
by factors other than the one you are studying (including time order effects), the results of one-way
ANOVA may be meaningless.
3. The assumption of normality for ANOVA is generally not crucial, especially if the sample sizes are
large.
4. Resources:
Training Data Sets
Textbooks
5. Procedure:
Practice Problem: Sales representatives at a software company are offered one of three types of salaries:
commission, fixed, and a combination of fixed and commission (mixed). The manager of the sales
department wants to compare the revenue earned for different salary types.
Data Collection
The manager records the salary type and revenue earned by each sales representative in a four-month
period.
Variable | Description
Revenue | Revenue earned in dollars by each sales representative
Salary Type | Type of salary received by each sales representative (Commission, Fixed, Mixed)
Part 1: Understanding the Effects

1. Open Commission.MPJ
2. Choose Stat > ANOVA > General Linear Model
3. In Responses, enter Revenue

4. Click Factor Plots
5. Complete the dialog box as shown below
6. Click OK
7. Click Comparisons
9. Click OK in each dialog box.

11. Make a decision.


Analysis of Variance
( General Linear Model application to conduct a one-way ANOVA)
Course Code:
Program:
Course Title:
Date Performed:
Section:
Date Submitted:
Members:
Instructor:
1. Objective(s):
2.1 Evaluate differences between group means for a single factor using General Linear Model to
conduct a one-way ANOVA
3. Discussion:
Use General Linear Model (GLM) to perform univariate analysis of variance with balanced and unbalanced
designs, analysis of covariance, and regression, for each response variable.
Calculations are done using a regression approach. A "full rank " design matrix is formed from the factors
and covariates and each response variable is regressed on the columns of the design matrix.
You must specify a hierarchical model. In a hierarchical model, if an interaction term is included, all lower
order interactions and main effects that comprise the interaction term must appear in the model.
Factors may be crossed or nested, fixed or random. Covariates may be crossed with each other or with
factors, or nested within factors. You can analyze up to 50 response variables with up to 31 factors and 50
covariates at one time.

Balanced ANOVA and general linear model (GLM) are ANOVA procedures for analyzing data collected with
many different experimental designs. Your choice between these procedures depends upon the
experimental design and the available options. The experimental design refers to the selection of units or
subjects to measure, the assignment of treatments to these units or subjects, and the sequence of
measurements taken on the units or subjects. Both procedures can fit univariate models to balanced data
with up to 31 factors. Here are some of the other options:
Option | Balanced ANOVA | GLM
Can fit unbalanced data | no | yes
Can specify factors as random and obtain expected mean squares | yes | yes
Fits covariates | no | yes
Performs multiple comparisons | no | yes
Fits restricted/unrestricted forms of mixed model | yes | unrestricted only
You can use balanced ANOVA to analyze data from balanced designs
Your design must be balanced to use balanced ANOVA, with the exception of a one-way design. A
balanced design is one with equal numbers of observations at each combination of your treatment levels. A
quick test to see whether or not you have a balanced design is to use Stat > Tables > Cross Tabulation and
Chi-Square. Enter your classification variables and see if you have equal numbers of observations in each
cell, indicating balanced data.
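The same balance check can be sketched outside Minitab by counting observations per treatment combination. The function name and the sample labels below are hypothetical; for designs with more than one factor, each entry would be a tuple of factor levels such as ("Mon", "Shift A").

```python
from collections import Counter

def is_balanced(cells):
    """Return (balanced?, counts) where cells lists the treatment combination of each observation."""
    counts = Counter(cells)
    # Balanced means every treatment combination has the same number of observations
    balanced = len(set(counts.values())) == 1
    return balanced, dict(counts)

# Hypothetical one-factor example: one entry per observation
weekday = ["Mon", "Tue", "Wed", "Mon", "Tue", "Wed"]
print(is_balanced(weekday))

# Dropping one observation makes the design unbalanced
print(is_balanced(weekday[:-1]))
```

If the check reports an unbalanced design with more than one factor, the table above suggests using GLM rather than balanced ANOVA.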
4. Resources:
Training Data Sets
Textbooks
5. Procedure:
Practice Problem: The manager of a call center for a software firm wants to know whether the center needs
the same number of people answering the phones each day of the week.
Data Collection
The number of customer calls to the technical support department is recorded for 205 business days (Mon-Fri).
Variable | Description
Date | Business date on which data were recorded
Weekday | Day of the week (Mon-Fri)
Calls | Number of calls to technical support
Part 1: Creating Dotplot to show the distribution for the five days.
1. Open SupCalls.MPJ
2. Choose Graph > Dotplot. Under One Y, choose With Groups, then click OK.
3. In Graph variables, enter Calls

4. In Categorical variables for grouping, enter Weekday
5. Click OK.

7. Draw Conclusions.
Part 2: Fit a general linear model to the data
2. In Response, enter Calls. In Model, enter Weekday
3. Click Graphs, then choose Four in One

Part 3: Create time series plots for each business day
4. Choose Graph > Time Series Plot
5. Choose Simple , then click OK
6. In Series, enter Calls

7. Click Multiple Graphs, then choose the By Variables tab.
8. In By Variables with groups in separate panels, enter Weekday.

Part 4: Create a main effects plot for the days of the week and conduct Tukey's pairwise comparisons to
determine which weekdays have significantly different means from each other.
7. Click Factor Plots.

8. Under Main Effects Plot, enter Weekday, then click OK.
9. Click Comparisons.
10. In Terms, enter Weekday, Check Test

12. Interpret the results (Examine Tukey comparisons.)
13. Draw Conclusions
Part 5: Conduct a test for equal variances to determine if week to week variability is different for different
weekdays.
3. Choose Stat > ANOVA > Test for Equal Variances
4. In Response, enter Calls.
5. In Factors, enter Weekday


