
Laboratory Exercise No. 7
Paired t-Test
Course Code: IE 301
Program: BSIE
Course Title: Advanced Statistics for IE
Section: IE31FB3
Date Performed: Sep. 12, 2015
Date Submitted: Sep. 12, 2015
Members: Madrigal, Dominic R.; Cruz II, Robert D.
Instructor: Engr. Rica Navarro

1. Objective(s):
The activity aims to introduce basic ideas of power and sample size calculations for 2-sample t-Test.
2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 Test for a difference between two population means using a 2-sample t-test.
2.2 Determine the sample size required to detect an effect of a given size with a given degree of confidence.

3. Discussion:
A paired t-test helps determine whether the mean difference between paired observations is significant.
Statistically, the paired t-test is equivalent to performing a 1-sample t-test on the differences. A paired t-test also helps you evaluate whether the mean difference is equal to a specific value.
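The equivalence stated above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `one_sample_t` and `paired_t` and the five pairs of values are hypothetical and are not taken from the lab worksheets. The paired statistic is obtained by forming the differences and passing them to the 1-sample calculation.

```python
import math
import statistics

def one_sample_t(values, hypothesized_mean=0.0):
    """Return (sample mean, t statistic, degrees of freedom) for a 1-sample t-test."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    t = (mean - hypothesized_mean) / (sd / math.sqrt(n))
    return mean, t, n - 1

def paired_t(first, second, hypothesized_diff=0.0):
    """Paired t-test statistic: a 1-sample t-test on the pairwise differences."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    differences = [b - a for a, b in zip(first, second)]
    return one_sample_t(differences, hypothesized_diff)

# Hypothetical paired measurements (same five subjects measured twice).
first = [10.0, 12.0, 9.0, 11.0, 13.0]
second = [12.0, 13.0, 9.0, 14.0, 15.0]

mean_diff, t_stat, df = paired_t(first, second)
print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.3f}, df = {df}")
```

The t statistic would then be compared with a t distribution on n - 1 degrees of freedom, which is what Minitab reports as the p-value.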
Paired observations are related. Examples include:
1. Weights recorded for individuals before and after an exercise program
2. Measurements of the same part taken with two different measuring devices.
Use a paired t-test with a random sample of paired observations. The test also assumes that the paired
differences come from a normally distributed population. However, the test is robust to violations of this
assumption, provided the observations are collected randomly and the data are continuous, unimodal,
and reasonably symmetric.
Why use a paired t-test?
A paired t-test answers questions such as:
1. Does a new treatment result in a difference in the product?
2. Do two different instruments provide similar measurements for the same sample?
4. Resources:
MiniTab Software/Manual
Training Data Set
Textbooks
5. Procedure:
Practice Problem: A consumer group wants to determine whether drivers can park one car more quickly
than the other. Because the data are paired (each individual parked both cars), use a paired t-test to test
the following hypotheses:
H0: The mean difference between paired observations in the population is zero.
H1: The mean difference between paired observations in the population is not zero.
Use the default confidence level of 95%. Display individual value plots and boxplots to help visualize the
data.
1. Open CARCTL.MPJ
2. Choose Stat > Basic Statistics > Paired t.
3. Complete the dialog box as shown below.

4. Click Graphs
5. Check Individual value plot and Boxplots of differences.
6. Click OK in each dialog box
7. Interpret the results
8. Draw conclusions.

Part 2: Testing the normality: The paired t-test


1. Choose Stat > Basic Statistics > Normality Test
2. In Variable, enter SupplrA
3. Click OK.
4. Choose Stat > Basic Statistics > Normality Test
5. In Variable, enter SupplrB
6. Click OK.
7. Interpret the results
8. Draw conclusions.
Part 3: Checking for Normality: The paired t-test is actually a 1-sample t-test on the pairwise differences.
Therefore, the pairwise differences must satisfy the 1-sample t-test assumptions, including normality.
Before checking for normality, store the pairwise differences in the worksheet.
1. Choose Stat > Basic Statistics > 2 Variances
2. Complete the dialog box as shown below.

3. Click OK.
4. Interpret the results.
5. Draw conclusions.
6. Data and Results:
Part 1:

The boxplot of differences shows that the mean lies within the hinges of the box, and the individual value plot of differences
shows that most of the points lie close to the mean; therefore, there is no mean difference between paired observations in the
population.

7. Data Analysis and Conclusion:


Part 2:

The two graphs show that the data points approximately follow the straight line; therefore, the two products have no
difference.
Part 3:

The individual value plot shows that the mean of SupplrA is higher than that of SupplrB. In the histogram, SupplrA has a normal shape
while SupplrB is left-skewed. The individual value plots of SupplrA and SupplrB do not overlap and the boxplot shows that the medians are
close; therefore, the mean probability is significant.

8. Assessment (Rubric for Laboratory Performance):


TIP-VPAA054D
Revision Status/Date: 0/2009 September 09

TECHNOLOGICAL INSTITUTE OF THE PHILIPPINES
RUBRIC FOR LABORATORY PERFORMANCE

Columns: CRITERIA, BEGINNER (1), ACCEPTABLE (2), PROFICIENT (3), SCORE

Laboratory Skills

Manipulative Skills
  BEGINNER (1): Members do not demonstrate needed skills.
  ACCEPTABLE (2): Members occasionally demonstrate needed skills.
  PROFICIENT (3): Members always demonstrate needed skills.
Experimental Set-up
  BEGINNER (1): Members are unable to set up the materials.
  ACCEPTABLE (2): Members are able to set up the materials with supervision.
  PROFICIENT (3): Members are able to set up the materials with minimum supervision.
Process Skills
  BEGINNER (1): Members do not demonstrate targeted process skills.
  ACCEPTABLE (2): Members occasionally demonstrate targeted process skills.
  PROFICIENT (3): Members always demonstrate targeted process skills.
Safety Precautions
  BEGINNER (1): Members do not follow safety precautions.
  ACCEPTABLE (2): Members follow safety precautions most of the time.
  PROFICIENT (3): Members follow safety precautions at all times.

Work Habits

Time Management/Conduct of Experiment
  BEGINNER (1): Members do not finish on time with incomplete data.
  ACCEPTABLE (2): Members finish on time with incomplete data.
  PROFICIENT (3): Members finish ahead of time with complete data and time to revise data.
Cooperative and Teamwork
  BEGINNER (1): Members do not know their tasks and have no defined responsibilities. Group conflicts have to be settled by the teacher.
  ACCEPTABLE (2): Members have defined responsibilities most of the time. Group conflicts are cooperatively managed most of the time.
  PROFICIENT (3): Members are on tasks and have responsibilities at all times. Group conflicts are cooperatively managed at all times.
Neatness and Orderliness
  BEGINNER (1): Messy workplace during and after the experiment.
  ACCEPTABLE (2): Clean and orderly workplace with occasional mess during and after the experiment.
  PROFICIENT (3): Clean and orderly workplace at all times during and after the experiment.
Ability to do independent work
  BEGINNER (1): Members require supervision by the teacher.
  ACCEPTABLE (2): Members require occasional supervision by the teacher.
  PROFICIENT (3): Members do not need to be supervised by the teacher.

Other Comments/Observations:
TOTAL SCORE
RATING= x 100%

Laboratory Exercise No. 8
Correlation
Course Code: IE 301
Program: BSIE
Course Title: Advanced Statistics for IE
Section: IE31FB3
Date Performed: Sep. 12, 2015
Date Submitted: Sep. 12, 2015
Members: Madrigal, Dominic R.; Cruz II, Robert D.
Instructor: Engr. Rica Navarro

2. Intended Learning Outcomes (ILOs):


The students shall be able to:
2.1 Evaluate the linear relationship between two variables using scatterplot, correlation, and fitted line
plot.
2.2 Analyze and interpret results and draw conclusions about the output provided by Minitab.
3. Discussion:
The sample correlation coefficient, r, measures the degree of linear association between two variables
(the degree to which one variable changes with another). A positive correlation indicates that both
variables tend to increase or decrease together. A negative correlation indicates that, as one variable
increases, the other tends to decrease.
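For reference, the usual textbook definition of the sample correlation coefficient is sketched below. The formula is not written out in the lab manual; the symbols x_i, y_i (the paired observations), their sample means, and n (the number of pairs) are introduced here only for illustration.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
```

The numerator is positive when the two variables tend to lie on the same side of their means together, and negative when they tend to lie on opposite sides, which matches the description of positive and negative correlation above.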
Use correlation when you have data for two continuous variables and wish to determine whether a linear
relationship exists between them. The correlation does not tell you whether the variables are related in a
nonlinear fashion.
Some statisticians argue that correlation should not be used if one variable is a dependent response of the
other.
Correlation can help answer questions such as:
1. Are two variables related in a linear manner?
2. What is the strength of the relationship?
Example
A. Is there a linear relationship between dollars spent on training and customer satisfaction ratings?
B. What is the relationship between revenue and the number of sales calls made?
Additional Considerations
Correlation quantifies the degree of linear association between two variables.
A strong correlation does not imply a cause-and-effect relationship. For example, a strong correlation
between two variables may be due to the influence of a third variable not under consideration.
A correlation coefficient close to zero does not necessarily mean no association. The variables may have a
nonlinear association. Always plot the data so that you can identify nonlinear relationships when they are
present.
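The point about nonlinear association can be illustrated with a small Python sketch. The helper `pearson_r` and the data are hypothetical and are not part of the Minitab exercise. The same x values are paired once with an exact straight line and once with an exact parabola that is symmetric about x = 0.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient r for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, chosen only to make the contrast obvious.
x = [-2, -1, 0, 1, 2]
linear_y = [2 * v + 1 for v in x]  # exact straight line
curved_y = [v ** 2 for v in x]     # exact parabola, symmetric about x = 0

r_linear = pearson_r(x, linear_y)
r_curved = pearson_r(x, curved_y)
print(f"linear relationship: r = {r_linear:.3f}")
print(f"curved relationship: r = {r_curved:.3f}")
```

In the second case y is completely determined by x, yet r is zero because the relationship is not linear. A scatterplot would reveal the curve immediately, which is why the text recommends plotting the data first.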
Some statisticians argue that correlation should not be used if one variable is a dependent response of
the other.
Correlation assumes that the values of both variables are free to vary. Correlation is not appropriate if you
fix the values of one variable to study changes in another.
4. Resources:
MiniTab Software/Manual
Training Data Sets
Textbooks
5. Procedure:
Practice Problem: The sales department for a software company wants to determine whether a relationship
exists between the number of sales calls made and the revenue earned. Analysts record the number of
sales calls and the revenue earned each day for a period of 420 days.
Variable      Description
Revenue       Daily revenue in thousands of dollars, rounded to the nearest dollar
Sales Calls   Number of sales calls made each day


Part 1:
1. Open SoftRev1.MPJ
2. Choose Graph > Scatterplot
3. Choose Simple, then click OK
4. Complete the dialog box as shown below.

5. Click OK.
6. Interpret the results
Part 2: Calculating the correlation
1. Choose Stat > Basic Statistics > Correlation
2. Complete the dialog box as shown below.

3. Click OK
4. Interpret the results
5. Draw conclusions.
6. Data and Results:

The plot shows the data values of the variables x and y. As the number of sales calls increases, the
revenue also tends to increase.
Therefore, the two variables are positively related and the correlation is positive.

7. Data Analysis and Conclusion:

The correlation is equal to 0.802. The relationship between revenue and sales calls is positive, but the amount of
increase is not consistent.

8. Assessment (Rubric for Laboratory Performance):


TIP-VPAA054D
Revision Status/Date: 0/2009 September 09

TECHNOLOGICAL INSTITUTE OF THE PHILIPPINES
RUBRIC FOR LABORATORY PERFORMANCE

Columns: CRITERIA, BEGINNER (1), ACCEPTABLE (2), PROFICIENT (3), SCORE

Laboratory Skills

Manipulative Skills
  BEGINNER (1): Members do not demonstrate needed skills.
  ACCEPTABLE (2): Members occasionally demonstrate needed skills.
  PROFICIENT (3): Members always demonstrate needed skills.
Experimental Set-up
  BEGINNER (1): Members are unable to set up the materials.
  ACCEPTABLE (2): Members are able to set up the materials with supervision.
  PROFICIENT (3): Members are able to set up the materials with minimum supervision.
Process Skills
  BEGINNER (1): Members do not demonstrate targeted process skills.
  ACCEPTABLE (2): Members occasionally demonstrate targeted process skills.
  PROFICIENT (3): Members always demonstrate targeted process skills.
Safety Precautions
  BEGINNER (1): Members do not follow safety precautions.
  ACCEPTABLE (2): Members follow safety precautions most of the time.
  PROFICIENT (3): Members follow safety precautions at all times.

Work Habits

Time Management/Conduct of Experiment
  BEGINNER (1): Members do not finish on time with incomplete data.
  ACCEPTABLE (2): Members finish on time with incomplete data.
  PROFICIENT (3): Members finish ahead of time with complete data and time to revise data.
Cooperative and Teamwork
  BEGINNER (1): Members do not know their tasks and have no defined responsibilities. Group conflicts have to be settled by the teacher.
  ACCEPTABLE (2): Members have defined responsibilities most of the time. Group conflicts are cooperatively managed most of the time.
  PROFICIENT (3): Members are on tasks and have responsibilities at all times. Group conflicts are cooperatively managed at all times.
Neatness and Orderliness
  BEGINNER (1): Messy workplace during and after the experiment.
  ACCEPTABLE (2): Clean and orderly workplace with occasional mess during and after the experiment.
  PROFICIENT (3): Clean and orderly workplace at all times during and after the experiment.
Ability to do independent work
  BEGINNER (1): Members require supervision by the teacher.
  ACCEPTABLE (2): Members require occasional supervision by the teacher.
  PROFICIENT (3): Members do not need to be supervised by the teacher.

Other Comments/Observations:
TOTAL SCORE
RATING= x 100%

Laboratory Exercise No. 9
Simple Linear Regression
Course Code: IE 301
Program: BSIE
Course Title: Advanced Statistics for IE
Section: IE31FB3
Date Performed: Sep. 12, 2015
Date Submitted: Sep. 12, 2015
Members: Madrigal, Dominic R.; Cruz II, Robert D.
Instructor: Engr. Rica Navarro

1. Objective(s):
The activity aims to measure the degree of linear association between two variables using graphs and
correlation, and to model the relationship between a continuous response variable and one or more
predictor variables.
2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 Evaluate the linear relationship between two variables using scatterplot, correlation, and fitted line plot.
2.2 Analyze and interpret results and draw conclusions about the output provided by Minitab.
3. Discussion:
Simple linear regression examines the relationship between a continuous response variable (y) and one
predictor variable (x). The general equation for a simple linear regression model is:

$Y = \beta_0 + \beta_1 X + \varepsilon$

where Y is the response, X is the predictor, $\beta_0$ is the intercept (the value of Y when X equals zero), $\beta_1$ is
the slope, and $\varepsilon$ is the random error.
Use simple linear regression when you have a continuous response, y, and one predictor, x. The following conditions
should also be met:
1. x can be ordinal or continuous.
2. In theory, x should be fixed by the investigator. In practice, however, it is often allowed to vary.
3. Any random variation in the measurement of x is assumed to be negligible compared to the range
in which x is measured.
The y-values obtained in your sample differ from those predicted by the regression model (unless all points
happen to fall on a perfectly straight line). These differences are called residuals.
To confirm that the analysis is valid, verify all assumptions about the model error term. Use residual plots to
check that the errors have the following characteristics:
1. Normally distributed
2. Constant variance for all fitted values
3. Random over time
Simple linear regression can help answer questions such as:
1. How important is x in predicting y?
2. What value can you expect for y when x is 5?
3. How much does y change if x increases by one unit?
For example,
Is the number of mistakes made in processing loans related to cycle time?
What salary can you expect to make with five years of experience in a particular field?
How much does salary increase for every additional year of experience?
S
S is an estimate of the average variability about the regression line. S is the positive square root of the
mean square error (MSE). For a given problem, the better the equation predicts the response, the lower S
is.
$R^2$ (R-Sq)
$R^2$ is the proportion of variability in the response that is explained by the equation. Acceptable values for
$R^2$ vary depending on the study. For example, engineers studying chemical reactions may require an
$R^2$ of 90% or more. However, someone studying human behavior (which is more variable) may be
satisfied with much lower $R^2$ values.
$R^2$ adjusted (R-Sq(adj))
$R^2$ adjusted is sensitive to the number of terms in the model and is important when comparing models
with different numbers of terms.
The Least Squares regression line
The coefficients for the regression equation are chosen to minimize the sum of the squared differences
between the response values observed in the sample and those predicted by the equation.
In other words the squared vertical distances between the points and line are minimized. The result is
called the Least squares regression line.
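The quantities described above (the least squares coefficients, S, R-Sq, and R-Sq(adj)) can be sketched in plain Python as follows. This is an illustrative calculation under stated assumptions, not Minitab's implementation: the function name `fit_simple_regression` and the five data points are hypothetical, and the adjusted R-Sq uses the standard one-predictor form 1 - MSE / (SST / (n - 1)).

```python
import math

def fit_simple_regression(x, y):
    """Least squares fit of y = intercept + slope * x with basic summary statistics."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # minimizes the sum of squared residuals
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)   # error (residual) sum of squares
    sst = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares
    mse = sse / (n - 2)                    # mean square error
    return {
        "intercept": intercept,
        "slope": slope,
        "S": math.sqrt(mse),
        "R-Sq": 1 - sse / sst,
        "R-Sq(adj)": 1 - mse / (sst / (n - 1)),
        "residuals": residuals,
    }

# Hypothetical data; the SoftRev1 worksheet values are not reproduced here.
sales_calls = [1, 2, 3, 4, 5]
revenue = [2, 4, 5, 4, 5]

fit = fit_simple_regression(sales_calls, revenue)
for name in ("intercept", "slope", "S", "R-Sq", "R-Sq(adj)"):
    print(f"{name}: {fit[name]:.4f}")
```

Because R-Sq(adj) divides each sum of squares by its degrees of freedom, it is lower than R-Sq whenever the fit is not perfect, which is why it is the fairer measure when models with different numbers of terms are compared.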
Confidence and prediction bands
Confidence bands provide the estimated range in which the mean response for a given value of the
predictor is expected to fall.
Prediction bands provide the estimated range in which a single new observation for a given value of the
predictor is expected to fall.
Analysts want to be confident that the mean and the individual points of the y-variable, Revenue, fall within
certain limits of variability.
Use the default confidence level of 95%
Confidence Interval
The 95% confidence interval defines a likely range of values for the population mean of y. For any given
value of x, you can be 95% confident that the population mean for y is between the indicated lines.
Prediction interval
The 95% prediction interval defines a likely range of y values for future individual observations. For any
given value of x, you can be 95% confident that the corresponding value of y for a single future observation
is between the indicated lines.
Note: The prediction interval is always wider than the confidence interval because of the added uncertainty
involved in predicting a single response versus the mean response.
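The note above can be seen directly in the standard textbook formulas for the two intervals, sketched here for reference. The formulas are not given in the lab manual; the symbols x_0 (the predictor value of interest), the fitted value at x_0, and the t critical value on n - 2 degrees of freedom are introduced as assumptions, and S is the estimate described earlier.

```latex
\begin{align*}
\text{Confidence interval (mean response):}\quad
  & \hat{y}_0 \pm t_{\alpha/2,\,n-2}\; S
    \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}} \\
\text{Prediction interval (single new observation):}\quad
  & \hat{y}_0 \pm t_{\alpha/2,\,n-2}\; S
    \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}
\end{align*}
```

The only difference is the extra 1 under the square root, which accounts for the random error of the single new observation. Both intervals are narrowest at the mean of x and widen as x_0 moves away from it.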


Residuals
The residual for each observation is the difference between the observed value of the response and the
value predicted by the model (the fitted value). For example, if the observed response value is 12 and the
model predicts 10, the residual is 2.
Assumptions
To confirm that the analysis is valid, verify all assumptions about the model error term. Use residual
plots to check that the errors have the following characteristics:
1. Normally distributed
2. Constant variance for all fitted values
3. Random over time
Normal Probability Plot
The normal probability plot should roughly follow a straight line. Use this plot to verify that the residuals do
not deviate substantially from a normal distribution.
Histogram
Use the normal probability plot to make decisions about the normality of the residuals. With a reasonably
large sample size, the histogram displays information compatible with the normal probability plot.
The histogram of the residuals should appear approximately bell-shaped with no unusual values or outliers.
Use the histogram as an exploratory tool to learn about the following characteristics of the data.
-Typical values, spread or variation, and shape
-Unusual values in the data
Residual versus fits
Use the plot of the residuals versus fits to verify that the residuals are scattered randomly about zero.
This pattern...                                                          Indicates...
Curvilinear                                                              A quadratic term may be missing from the model
Fanning or uneven spread of residuals across the different fitted values   Nonconstant variance of the residuals
Points far away from zero relative to other data points                  Outliers exist


Residual versus order


The plot of the residuals versus order displays the residuals in the order of data collection (provided the
data were entered in the same order in which they were collected.)
If the data collection order affects the results, residuals near each other may be correlated, and thus not
independent.
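One numerical companion to the residuals versus order plot is the Durbin-Watson statistic, sketched below in Python. This technique is not part of the lab procedure and is introduced here only as an illustration; the function name `durbin_watson` and both residual sequences are hypothetical. Values near 2 suggest no correlation between neighboring residuals, values toward 0 suggest positive correlation, and values toward 4 suggest negative correlation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in data-collection order."""
    if len(residuals) < 2:
        raise ValueError("at least two residuals are required")
    # Sum of squared differences between each residual and the one before it.
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residual sequences, in time order.
drifting = [-2.0, -1.0, 0.0, 1.0, 2.0]   # steady trend: neighbors are alike
alternating = [1.0, -1.0, 1.0, -1.0]     # sign flips: neighbors are opposite

dw_drifting = durbin_watson(drifting)
dw_alternating = durbin_watson(alternating)
print(f"drifting residuals:    DW = {dw_drifting:.2f}")
print(f"alternating residuals: DW = {dw_alternating:.2f}")
```

Neither sequence looks like random scatter around zero, and the statistic moves away from 2 in both cases, in opposite directions.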
This pattern...                                      Indicates...
Residuals are not randomly scattered around zero     Residuals are not independent over time
Residuals are randomly scattered around zero         Residuals are independent
Points far away from zero                            Outliers exist

Additional Considerations
1. Be careful when using regression analysis to assert that changes in the predictor cause changes in
the response, unless the predictor values were fixed at predetermined levels in a controlled
experiment. If the values of the predictors are allowed to vary randomly, other factors may
influence both the predictors and the response.
2. Do not apply regression results to values of x that are outside the sample range. The relationship
between Sales calls and Revenue may be very different for sales calls above 168.
3. Be alert for outliers when using regression procedures. Some outliers (called high leverage points)
have a large effect on the calculation of the least squares regression line. In such cases, the line
may no longer represent the rest of the data very well.
4. Time order trends in the data can violate the assumption of independence. A run chart or individuals
chart is a useful tool for detecting such effects.
4. Resources:
MiniTab Software/Manual
Training Data Sets
Textbooks
5. Procedure:
Practice Problem: The sales department for a software company wants to determine whether a relationship
exists between the number of sales calls made and the revenue earned. Analysts record the number of
sales calls and the revenue earned each day for a period of 420 days. Determine the effect of Sales calls on
Revenue. Use a fitted line plot to calculate and plot the regression equation.
Variable      Description
Revenue       Daily revenue in thousands of dollars, rounded to the nearest dollar
Sales Calls   Number of sales calls made each day


Part 1: Fitted Line Plot
1. Open SoftRev1.MPJ
2. Choose Stat > Regression > Fitted Line Plot
3. Complete the dialog box as shown below.

4. Click OK.
5. Interpret the results.
6. Use the ANOVA results to evaluate whether the simple regression model is useful for predicting
revenue. State the hypotheses.
7. Interpret the p-value (P).
8. Make a conclusion.
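For reference, the hypotheses usually tested by the ANOVA table of a simple linear regression are sketched below. This is the standard textbook formulation rather than something stated in the lab manual. Here the symbol beta_1 denotes the slope of the regression line, SSR and SSE are the regression and error sums of squares, and n is the number of observations.

```latex
\begin{align*}
H_0 &: \beta_1 = 0 \quad \text{(the predictor is not useful for predicting the response)} \\
H_1 &: \beta_1 \neq 0 \\
F &= \frac{MSR}{MSE} = \frac{SSR / 1}{SSE / (n - 2)}
\end{align*}
```

Under H0 the statistic follows an F distribution with 1 and n - 2 degrees of freedom. A p-value below the chosen significance level (0.05 for 95% confidence) leads to rejecting H0, which supports the conclusion that the slope differs from zero.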
Part 2: Adding confidence and prediction bands
1. Choose Stat > Regression > Fitted Line Plot or press Ctrl+E
2. Click Options
3. Complete the dialog box as shown below.

4. Click OK
5. Click Graphs
6. Complete the dialog box shown below

7. Click OK in each dialog box.
8. Interpret the results for each of the following:
-Normal probability plot
-Histogram
-Residuals versus fits
-Residuals versus order
9. Make conclusions
6. Data and Results:

7. Data Analysis and Conclusion:


Part 1:

H0: μ1 = μ2 = ... = μ420
Ha: Some means are different

The p-value is equal to 0.000, which is less than 0.05, so we reject H0. Therefore, the means of the number of sales calls
and the revenue earned are different.

Part 2:
Fitted Line Plot: As the number of sales calls increases, so does the revenue.
Normal Probability Plot: The data points follow the straight line and the p-value is less than 0.05; therefore, the normal
distribution appears to fit the sample data.
Histogram: The shape is normal, so the data appear to be normally distributed.
Versus Fits: The data have a constant variance.
Versus Order: The data are correlated with each other.

8. Assessment (Rubric for Laboratory Performance):


TIP-VPAA054D
Revision Status/Date: 0/2009 September 09

TECHNOLOGICAL INSTITUTE OF THE PHILIPPINES
RUBRIC FOR LABORATORY PERFORMANCE

Columns: CRITERIA, BEGINNER (1), ACCEPTABLE (2), PROFICIENT (3), SCORE

Laboratory Skills

Manipulative Skills
  BEGINNER (1): Members do not demonstrate needed skills.
  ACCEPTABLE (2): Members occasionally demonstrate needed skills.
  PROFICIENT (3): Members always demonstrate needed skills.
Experimental Set-up
  BEGINNER (1): Members are unable to set up the materials.
  ACCEPTABLE (2): Members are able to set up the materials with supervision.
  PROFICIENT (3): Members are able to set up the materials with minimum supervision.
Process Skills
  BEGINNER (1): Members do not demonstrate targeted process skills.
  ACCEPTABLE (2): Members occasionally demonstrate targeted process skills.
  PROFICIENT (3): Members always demonstrate targeted process skills.
Safety Precautions
  BEGINNER (1): Members do not follow safety precautions.
  ACCEPTABLE (2): Members follow safety precautions most of the time.
  PROFICIENT (3): Members follow safety precautions at all times.

Work Habits

Time Management/Conduct of Experiment
  BEGINNER (1): Members do not finish on time with incomplete data.
  ACCEPTABLE (2): Members finish on time with incomplete data.
  PROFICIENT (3): Members finish ahead of time with complete data and time to revise data.
Cooperative and Teamwork
  BEGINNER (1): Members do not know their tasks and have no defined responsibilities. Group conflicts have to be settled by the teacher.
  ACCEPTABLE (2): Members have defined responsibilities most of the time. Group conflicts are cooperatively managed most of the time.
  PROFICIENT (3): Members are on tasks and have responsibilities at all times. Group conflicts are cooperatively managed at all times.
Neatness and Orderliness
  BEGINNER (1): Messy workplace during and after the experiment.
  ACCEPTABLE (2): Clean and orderly workplace with occasional mess during and after the experiment.
  PROFICIENT (3): Clean and orderly workplace at all times during and after the experiment.
Ability to do independent work
  BEGINNER (1): Members require supervision by the teacher.
  ACCEPTABLE (2): Members require occasional supervision by the teacher.
  PROFICIENT (3): Members do not need to be supervised by the teacher.

Other Comments/Observations:
TOTAL SCORE
RATING= x 100%
