
GE Automation Services

Minitab
Handbook
Instructions:
Do not print this out. Use the table of
contents (pages 2 and 3) to reference the
particular tool you need in Minitab.

Kris Kolda Black Belt


1
Scott Ganschow Master Black Belt
Table of Contents
Section Pages
Descriptive Stats 4-6
Sorting Data 7
Probability and Area Under the curve 8
Normality test 9
Gage R&R 10-13
Process Capability for Continuous Data 14-17
Process Capability for Discrete Data 18-22
Histogram 23-25
Dot Plot 26-28
Stacking Data 29-31
Box Plot 32-35
Run Chart 36-40
Multi-Vari Chart 41-42
Binary Logistic Regression 43-45
Scatter plot 46-47
Simple Regression 48-52
Fitted Line Plot 53-54
Inspection of Data 55-59
Chi-Square 60-63
Back of Analyze Roadmap 85-101
Stability 64-68
Normality 69-76
Spreads 77-84
Centering 85-101
2-Sample T-test w/ equal spread 86-92
2-Sample T-test w/ unequal spread 93-96
Mann-Whitney test 97-101
2
Table of Contents - cont
Section Pages
DOE define and analyze 102-107
Control Chart X-Bar & R 108-109
Control Chart I&MR 110-112
Control Chart C-Chart 113
Control Chart U-Chart 114-115
Control Chart P-Chart 116-117
Control Chart NP-Chart 118-119
Control Chart 2R-Chart 120-121

3
Descriptive Stats
Exercise
Minitab can easily calculate the Mean and Median

1. Open up Minitab
2. Open file: Distskew.mtw
3. Perform The Following
Stat>
Basic Statistics>
Descriptive Statistics>
4. Enter The Variables Names (the column of interest goes here)
5. Evaluate Results

4
Descriptive Statistics For 3 Distributions

TABULAR FORM

Variable N Mean Median TrMean StDev


Normal 500 70.000 69.977 70.014 10.000
Pos Skew 500 70.000 65.695 68.554 10.000
Neg Skew 500 70.000 73.783 71.368 10.000

Look For This In Your Session Window !
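
For readers who want to check the session-window columns outside Minitab, here is a minimal Python sketch of the same summary. It assumes TrMean is a trimmed mean that drops a fixed fraction of observations from each end (Minitab's TrMean is usually described as a 5% trim). The `describe` helper and the sample values are hypothetical and are not taken from Distskew.mtw.

```python
import statistics

def describe(values, trim=0.05):
    """Return N, Mean, Median, TrMean and StDev for one column of data."""
    data = sorted(values)
    n = len(data)
    k = round(n * trim)  # observations dropped from each end (assumed trimming rule)
    trimmed = data[k:n - k] if k else data
    return {
        "N": n,
        "Mean": statistics.mean(data),
        "Median": statistics.median(data),
        "TrMean": statistics.mean(trimmed),
        "StDev": statistics.stdev(data),
    }

sample = [2, 3, 3, 4, 4, 4, 5, 5, 6, 30]  # hypothetical, positively skewed column
summary = describe(sample, trim=0.10)     # 10% trim so that one value is dropped per end
```

As in the Pos Skew row above, the single large value pulls the Mean above the Median, and the trimmed mean lands between the two.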


5
Percentiles
Suppose your child told you she scored in the 10th percentile on her exam.
What does that mean? _____________________________________

Can you calculate these?

5th percentile=______

25th percentile=______

50th percentile=______

75th percentile=______

95th percentile=______

Q1=______

Q2=______

Q3=______

6
Percentiles
Sorting Data
One method to find percentiles...
We could SORT the data from smallest to largest and then simply COUNT up to
the number which is just larger than ____% of the data in our column.
Example: The 5th percentile of 500 rows of data is (0.05)x(500)=25th row from
the minimum.
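
The sort-and-count rule above can be sketched in Python as follows. This follows the slide's counting rule only; Minitab's own percentile output may interpolate between rows, so small differences are possible. The function name and the sample column are hypothetical.

```python
import math

def percentile_by_sorting(values, pct):
    """Sort the column and count up to the row that covers pct% of the data."""
    data = sorted(values)
    # 1-based row counted from the minimum, e.g. 5% of 500 rows -> row 25
    row = max(math.ceil(pct * len(data) / 100), 1)
    return row, data[row - 1]

column = list(range(1, 501))  # hypothetical column: 500 rows holding 1..500
row, value = percentile_by_sorting(column, 5)
```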

7
Probability and Area Under the Curve
(Via Minitab)

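What percentage of the individual batteries would we expect to last 80 hours or less?

CALC>
PROBABILITY DISTRIBUTIONS>
NORMAL>
CUMULATIVE>
Enter Mean = 85.36
Standard deviation = 3.77
Input Constant = 80
Area = 0.0775 = 7.75 %

[Figure: normal curve centered at 85.36 hours with the area below 80 hours shaded; on the Z scale, 80 hours corresponds to -1.42 and the mean to 0]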
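
The same area can be cross-checked in Python with the standard library's normal distribution. This is only a sketch of the calculation behind the menu path; the variable names are illustrative.

```python
from statistics import NormalDist

mean_hours = 85.36
stdev_hours = 3.77
limit = 80

# Z value of the 80-hour limit and the cumulative area to its left
z = (limit - mean_hours) / stdev_hours
area = NormalDist(mean_hours, stdev_hours).cdf(limit)
print(f"z = {z:.2f}, area = {area:.4f}")
```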
8
Normal Probability Plots
We can test whether a given data set can be described as normal
with a test called a Normal Probability Plot
If a distribution is close to normal, the normal probability plot will be a
straight line.
Minitab makes the normal probability plot easy. Using Distskew.mtw,
choose: Stat>Basic Stats>Normality Tests
Produce a normal plot of each of the first 3 columns. Which appear to
be normal?

9
How To Enter GR&R Data into Minitab (for this Macro)

Drop Number (Like Parts in other GR&R Studies): 1 2 3 4 5 6 7 8 9 10

Operator A: Drop 1 = 2.11, 2.22; Drop 2 = 2.33, 2.44
Operator B: Drop 1 = 2.55, 2.66; Drop 2 = 2.77, 2.88
Operator C: (values not shown)

EXAMPLE: You would enter these 8 values into MINITAB in 3 columns as follows:

Drop Operator Response
1 A 2.11
1 A 2.22
2 A 2.33
2 A 2.44
1 B 2.55
1 B 2.66
2 B 2.77
2 B 2.88

NOTE: Data in this column format are called Stacked Data

10
How To Perform a GR&R In Minitab ...

Variable: Gage R&R Minitab Commands


11
GAGE Info
Who / What /When . . .

GAGE Options
ONLY if there are both USL & LSL for the process (not for the Gage):
USL = 2.5
LSL = 1.5
TOL = 2.5 - 1.5 = 1
Put the difference here. This will calculate the %Tolerance in the session window.
12
Minitab Macro Gives Us:
ANOVA Method Output
(1) ANOVA Table
(2) Variance Components Table
(3) %Comparisons Table
(4) Discrimination Index
(5) Graphical Analysis
Initially, we focus upon the three indices in the Total Gage R&R row of the %Comparisons table (each evaluated as Red, Yellow, Green)

(1) ANOVA Table
Source DF SS MS F p-value
Parts 9 2.058 0.2287 39.7179 0.00000
Operators 2 0.048 0.0240 4.1672 0.03256
Operators*Part 18 0.104 0.00576 4.4588 0.00016
Repeatability 30 0.039 0.001292
Total 59 2.249

(2) Variance Components Table
Source VarComp StdDev 95% Conf Int 5.15*Sigma
Total Gage R&R 0.004437 0.066615 (0.0597, 0.2250) 0.34306
Repeatability 0.001292 0.035940 (0.0287, 0.0480) 0.18509
Reproducibility 0.003146 0.056088 ( *, *) 0.28885
Operator 0.000912 0.030200 (0.0000, 0.2169) 0.15553
Operator*Part 0.002234 0.047263 (0.0254, 0.0800) 0.24340
Part-To-Part 0.037164 0.192781 (0.1098, 0.3345) 0.99282
Total Variation 0.041602 0.203965 -- 1.05042

(3) %Comparisons Table
Source %Contribution %Study Var %Tolerance
Total Gage R&R 10.67 32.66 34.31
Repeatability 3.10 17.62 18.51
Reproducibility 7.56 27.50 28.89
Operator 2.19 14.81 15.55
Operator*Part 5.37 23.17 24.34
Part-To-Part 89.33 94.52 99.28
Total Variation 100.00 100.00 105.04

(4) Discrimination Index
Number of Distinct Categories = 4

(5) Graphical Analysis
[Figure: Gage R&R Study for Measure - ANOVA Method. Panels: Xbar Chart by Operator (UCL=0.8796, X=0.8075, LCL=0.7354); R Chart by Operator (UCL=0.1252, R=0.03833, LCL=0.000); Operator*Part Interaction; By Operator; By Part; Components of Variation (%Total Var, %Study Var, %Tolerance). Operators: Bill, Fred, Sam]
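
The percentage columns can be reproduced from the variance components, which helps when judging the Red/Yellow/Green indices. The sketch below uses the VarComp values reported above and a tolerance of 1 (USL - LSL = 2.5 - 1.5). The distinct-categories line uses the commonly quoted 1.41 x (part StdDev / gage StdDev) rule, which is an assumption about how the macro computes it.

```python
import math

# Variance components taken from the session-window output above
var_comp = {
    "Total Gage R&R": 0.004437,
    "Repeatability": 0.001292,
    "Reproducibility": 0.003146,
    "Part-To-Part": 0.037164,
}
total_var = 0.041602
tolerance = 1.0  # USL - LSL = 2.5 - 1.5

def gage_percentages(variance):
    """Percent of total variance, of total StdDev, and of the tolerance (5.15 sigma)."""
    sd = math.sqrt(variance)
    return {
        "%Contribution": 100 * variance / total_var,
        "%Study Var": 100 * sd / math.sqrt(total_var),
        "%Tolerance": 100 * 5.15 * sd / tolerance,
    }

table = {source: gage_percentages(v) for source, v in var_comp.items()}

# Distinct categories: 1.41 * (part StdDev / gage StdDev), truncated (assumed rule)
ndc = int(1.41 * math.sqrt(var_comp["Part-To-Part"] / var_comp["Total Gage R&R"]))
```

Because %Study Var is a ratio of standard deviations rather than variances, its rows do not add up to 100, which matches the table above.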
13
Process Capability for Continuous Data

Open file Delivery2.mtw, containing data from 100 deliveries over the last 2 months.

What type of data is this? _________ (Continuous or Discrete)

If the delivery is too early (LSL = 5 days), the customer will either make the trucks wait or ship it back to us at our cost. If it is too late (USL = 13 days), there are other serious consequences for this customer's process.

Characterize the capability of our delivery process to meet the customer's requirements.

Z(Short Term) = _____ ? (Potential)
Z(Long Term) = _____ ? (Actual)
14
...then click OK.

15
Report 1: Executive Summary

Process Performance
[Figure: Actual (LT) and Potential (ST) distributions plotted against LSL and USL, axis 5 to 15; PPM for Actual (LT) and Potential (ST) on a log scale from 1 to 1000000 over observations 0 to 100]

Pretend Imaginary Performance
Z(ST) = 3.98
Short-Term Sigma Level
Potential Process Capability
Capability using the POTENTIAL Average

Real / Actual Performance
Z(LT) = -1.03
Long-Term Sigma Level
Actual Process Capability
Capability using the ACTUAL Average

Process/Demographics
Date:
Reported by:
Project:
Department:
Process:
Characteristic:
Units:
Upper Spec: 13
Lower Spec: 5
Nominal:
Opportunity:

Process Benchmarks
Actual (LT) Potential (ST)
Sigma (Z.Bench) -1.03 3.98
PPM 849134 34.9258

16
Report 2: Process Capability for Delivery Tim

I and MR Chart
[Figure: Individuals chart, UCL=17.02, Mean=14.00, LCL=10.97; Moving Range chart, UCL=3.713, R=1.136, LCL=0; observations 0 to 100]

Capability Indices
ST LT
Mean 9.00000 13.9970
StDev 0.96648 0.9654
Z.USL 4.13871 -1.0327
Z.LSL 4.13871 9.3194
Z.Bench 3.97679 -1.0327
Z.Shift 5.00952 5.0095
P.USL 0.000017 0.849134
P.LSL 0.000017 0.000000
P.Total 0.000035 0.849134
Yield 99.9965 15.0866
PPM 34.9258 849134
Cp 1.37
Cpk -0.34
Pp 1.38
Ppk -0.34

Potential (ST) Capability: Process Tolerance 6.0893 to 11.9107, Specifications 5 to 13
Actual (LT) Capability: Process Tolerance 11.0896 to 16.9044, Specifications 5 to 13

Data Source:
Time Span:
Data Trace:
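
The Z rows of the Capability Indices table follow from the mean, standard deviation, and spec limits under a normal model. A minimal sketch, using the ST and LT values reported above; the `capability` helper is hypothetical, and the results differ from Minitab's only by rounding of the inputs.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def capability(mean, stdev, lsl, usl):
    """Z values, total defect probability, Z.Bench and PPM under a normal model."""
    z_usl = (usl - mean) / stdev
    z_lsl = (mean - lsl) / stdev
    p_usl = STD_NORMAL.cdf(-z_usl)  # fraction beyond the upper spec
    p_lsl = STD_NORMAL.cdf(-z_lsl)  # fraction below the lower spec
    p_total = p_usl + p_lsl
    return {
        "Z.USL": z_usl,
        "Z.LSL": z_lsl,
        "P.Total": p_total,
        "Z.Bench": -STD_NORMAL.inv_cdf(p_total),  # Z that leaves p_total in one tail
        "PPM": 1_000_000 * p_total,
    }

short_term = capability(mean=9.0, stdev=0.96648, lsl=5, usl=13)
long_term = capability(mean=13.9970, stdev=0.9654, lsl=5, usl=13)
z_shift = short_term["Z.Bench"] - long_term["Z.Bench"]
```

The short-term case places the mean at the midpoint of the specs (9 days), which is why it reads as potential capability, while the long-term case uses the actual mean of about 14 days.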

17
Process Capability for Discrete Data
Unfortunately, another business had been merely recording deliveries as
Late or On-Time, rather than Days Early / Late.
Over the past eight weeks, these data revealed
a total of 400 late out of 4000 shipments.
Characterize process capability.

What type of data is this? ________ (Continuous or Discrete)

Z(Short Term)= _____? Z(Long Term)= _____?

Example
Step 1:
Type the data into 3 Minitab columns as shown below:
1 Defect = 1 late shipment: 400
1 Unit = 1 shipment: 4000
1 opportunity for defect per unit: 1

18
Step 2:
Select Six Sigma >
Product Report
and fill-in the boxes as shown.

Step 3:
To get Z(LT) reported, type in
0 in the Shift factors box.
To get Z(ST) reported, type in
1.5 in the Shift factors box.
...then click OK.

19
Report 7: Product Performance (Shift factor = 1.5)

Characteristic Defs Units Opps TotOpps DPU DPO PPM ZShift ZBench
1 400 4000 1 4000 0.100 0.100000 100000 1.500 2.782
Total 400 4000 0.100000 100000 1.500 2.782

Z(ST)=2.78

Note: Zshift is always assumed 1.5 for discrete data.
Zlt will be 1.5 worse than Zst.
Thus, Zlt = 2.782 - 1.5 = 1.282

Report 7: Product Performance (Shift factor = 0)

Characteristic Defs Units Opps TotOpps DPU DPO PPM ZShift ZBench
1 400 4000 1 4000 0.100 0.100000 100000 0.000 1.282

Z(LT)=1.28
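
The Product Report columns can be reproduced from the three inputs. A minimal sketch, assuming a normal model for converting DPO to Z and the slide's 1.5 shift convention; the variable names are illustrative.

```python
from statistics import NormalDist

defects, units, opps_per_unit = 400, 4000, 1
assumed_shift = 1.5  # the slide's convention for discrete data

total_opps = units * opps_per_unit
dpu = defects / units          # defects per unit
dpo = defects / total_opps     # defects per opportunity
ppm = 1_000_000 * dpo
z_lt = NormalDist().inv_cdf(1 - dpo)  # long-term Z: the observed defect rate
z_st = z_lt + assumed_shift           # short-term Z: long-term plus the shift
```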

20
Six Sigma Product Report
In Minitab, create the following table:

Run the Six Sigma Product Report:

21
Discrete Data Examples
1. Select Defects, Units, & Opportunities

2. Select OK.
22
Histogram
Purpose: To display variation in a process. Converts an unorganized set of data or group
of measurements into a coherent picture.
When: To determine if process is on target meeting customer requirements. To
determine if variation in process is normal or if something has caused it to vary in
an unusual way.
How: Count the number of data points
Determine the range (R) for entire set
Divide range value into classes (K)
Determine the class width (H) where H = R/K
Determine the end points
Construct a frequency table based on values computed in previous step
Construct a Histogram based on frequency table
[Figure: example histogram of Test Grades (55 to 100) against # of Students]
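
The manual "How" steps can be sketched in Python as follows. The grades are hypothetical, and the choice of K is left to the caller; the slide does not prescribe a rule for it. The sketch assumes the data are not all identical (R > 0).

```python
def frequency_table(data, k):
    """Split the range R into K classes of width H = R / K and count each class."""
    low, high = min(data), max(data)
    r = high - low   # range R of the entire set
    h = r / k        # class width H
    counts = [0] * k
    for x in data:
        # the maximum value is placed in the last class
        idx = min(int((x - low) / h), k - 1)
        counts[idx] += 1
    edges = [low + i * h for i in range(k + 1)]  # class end points
    return edges, counts

grades = [55, 62, 68, 70, 71, 74, 75, 78, 80, 82, 85, 88, 90, 95, 100]  # hypothetical
edges, counts = frequency_table(grades, k=9)
```

Plotting `counts` as bars between consecutive `edges` gives the histogram.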
23
Histogram Example
MINITAB FILE: Catapult.mtw

24
Histogram Output
1. Double click anywhere on the C3 line to select Oper 1 as a variable

2. Click OK

25
Dot Plot
Purpose: To display variation in a process. Quick graphical comparison of two or more processes.
When: First stages of data analysis.
How: Create an X axis. Scale the axis per the range in the data. Place a dot for each value along the X axis. Stack repeat dots.
[Figure: character dot plots of patrn24 and patrn60 on a shared axis from 0.0000 to 0.0250]

26
Dot Plot Example
MINITAB FILE: Catapult.mtw

27
Dot Plot Output

1. Double click here and here

2. Check Same scale for all variables

3. Click OK

:
. .: . : .
-----+---------+---------+---------+---------+---------+-Oper 1
.
. :: . : .
-----+---------+---------+---------+---------+---------+-Oper 2
47.0 48.0 49.0 50.0 51.0 52.0
28
Stacking Data
Depending on what you want Minitab to do, you may need to organize
your data in different ways. To create a box plot (the next tool we will
demonstrate) you need stacked data.
To take your five catapult operator columns of data and stack them on top
of each other, use the Stack command.
MINITAB FILE: Your file
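
What the Stack command produces can be sketched in Python: one column of values plus one column of subscripts telling which original column each value came from. The `stack` helper is hypothetical; the values are the first three Oper 1 and Oper 2 rows from the catapult table on the Stacking - Output slide.

```python
def stack(columns):
    """Stack several columns into one value column plus a subscript column."""
    values, subscripts = [], []
    for subscript, column in enumerate(columns.values(), start=1):
        values.extend(column)
        subscripts.extend([subscript] * len(column))
    return values, subscripts

unstacked = {
    "Oper 1": [50.50, 50.50, 49.75],
    "Oper 2": [50.50, 49.00, 51.50],
}
dist, oper = stack(unstacked)
```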

29
Stacking - Input

1. Double click on each column label you want to stack

2. In these boxes enter the column numbers of the next available column in your data window.
Example: C8 or C9

3. Click on OK

30
Stacking - Output
Target Angle Oper 1 Oper 2 Oper 3 Oper 4 Oper 5 Dist 50 Oper 50
50 162 50.50 50.50 46.50 49.00 50.00 50.50 1
50 162 50.50 49.00 50.00 50.25 49.75 50.50 1
50 162 49.75 51.50 49.25 50.50 49.75 49.75 1
50 162 49.50 50.50 48.75 49.75 50.00 49.50 1
50 162 49.50 47.00 49.00 50.00 48.75 49.50 1
50 162 48.25 48.75 49.75 50.25 50.00 48.25 1
50 162 50.75 49.00 50.00 50.00 49.75 50.75 1
50 162 49.25 49.00 49.75 50.50 50.25 49.25 1
50 162 49.50 48.75 48.00 50.00 50.75 49.50 1
50 162 49.50 49.75 50.25 49.75 50.25 49.50 1
50.50 2
49.00 2
51.50 2
50.50 2
47.00 2
48.75 2
49.00 2
49.00 2
48.75 2
49.75 2
46.50 3
50.00 3
49.25 3
48.75 3
49.00 3
49.75 3
50.00 3
49.75 3
48.00 3
50.25 3
49.00 4
50.25 4
50.50 4
49.75 4
50.00 4
50.25 4
50.00 4
50.50 4
31
Box Plot

Purpose:
To begin an understanding of the distribution of the
data
To get a quick, graphical comparison of two or more
processes

When: First stages of data analysis.

How: Let Minitab do it.


32
Box and Whisker Plot
* Outlier
any point outside the lower or
upper limit.

Maximum Observation
that falls within the upper limit
= Q3 + 1.5 (Q3 - Q1)
75th Percentile (Q3)
Median (50th Percentile)

25th Percentile (Q1)

Minimum Observation
that falls within the lower limit
= Q1 - 1.5 (Q3 - Q1)
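
The limits defined above can be sketched in Python as follows. The quartiles use the standard library's "exclusive" method, which places Q1 at position (n+1)/4; treating that as equivalent to Minitab's quartile rule is an assumption. The data are the ten Oper 2 distances from the catapult table.

```python
import statistics

def box_plot_summary(values):
    """Quartiles, whisker ends, and outliers using the 1.5 x (Q3 - Q1) limits."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    inside = [x for x in values if lower_limit <= x <= upper_limit]
    return {
        "Q1": q1, "Median": median, "Q3": q3,
        "lower whisker": min(inside),  # minimum observation within the lower limit
        "upper whisker": max(inside),  # maximum observation within the upper limit
        "outliers": [x for x in values if x < lower_limit or x > upper_limit],
    }

oper_2 = [50.50, 49.00, 51.50, 50.50, 47.00, 48.75, 49.00, 49.00, 48.75, 49.75]
summary = box_plot_summary(oper_2)
```

For this operator no point falls outside the limits, so the whiskers simply run to the smallest and largest distances.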
33
Box and Whisker Plot Example
MINITAB FILE: Catapult.mtw

1. Double clicking chooses the variables to graph

2. Click on OK
34
Box and Whisker Plot Output
For your catapult data, make a Box and Whisker Plot by
operator.

What are your observations?


35
Run Chart
Purpose: To track process over time in order to display trends and
focus attention on changes in the process

When: 1. To establish a baseline of performance for improvement


2. To uncover changes in your process
3. To brainstorm possible causes for trends
4. To compare the historical performance of a process with
the improved process

36
Run Chart
How: 1. Determine what you want to measure
2. Determine period of time to measure and in what time increments
3. Create a graph (vertical axis = occurrences, horizontal axis = time)
4. Collect data and plot
5. Connect data points with solid line
6. Calculate average of measurements, draw solid horizontal line on
run chart
7. Analyze results
8. Indicate with a dashed vertical line when a change was introduced
to the process
[Figure: example run chart; vertical axis from 5 to 70, horizontal axis in months J through O]
37
Run Chart Example
MINITAB FILE: Catapult.mtw

38
Run Chart Output - Subgroup Size 1
1. Double click on C8

2. Enter a 1 in Subgroup size

3. Click OK

39
Run Chart Output - Subgroup Size 10
1. Double click on Distance 50.

2. Enter a 10 in Subgroup size

3. Click OK

40
Multi-Vari Chart
Purpose: To identify the most important types or families of variation
To make an initial screen of process output for potential Xs

[Figure: 2x2 grid of Y type (cont., disc.) by X type (disc., cont.)]
[Figure: multi-vari chart of Length (58 to 64) by Round (1, 2), one symbol per Oper No (1 to 5), annotated with Within Variation, Between Variation, and Time-to-Time Variation]


41
Multi-Vari Chart - Minitab Commands
1
2 Enter Length for Response
3 Enter OperNo for Factor 1
4 Enter Rounds for Factor 2
5
6 Check Display individual... and Connect means... boxes
42
3 Types of Logistic Regression

The Y is Binary: USE Binary Logistic Regression
EXAMPLES ARE
Pass / Fail, Go / No Go
Win / Lose, On / Off

The Y has categories of Order (3 or more levels with Natural Ordering): USE Ordinal Logistic Regression
EXAMPLES ARE
None, Low, Med, High

The Y has 3 or more levels but NO ORDER: USE Nominal Logistic Regression
EXAMPLES ARE
West, East, Central, North
First Pass, Retest, Return
Account 1, 2, 3, 4, 5
43
Binary Logistic Regression
BinaryLogistic.mtw: C1, C2, EPRO1

Binary Logistic Regression Minitab Commands...

1. Stat
2. Regression
3. Binary Logistic Regression
4. Response: C2
5. Model: C1
6. Storage
7. Event Probability
8. OK twice
44
Binary Logistic Regression
BinaryLogistic.mtw: C1, EPRO1

Binary Logistic Regression Minitab Commands...
Creating the Graph

1. Graph Plot: EPRO1 vs. C1
2. Data Display: Connect
(options listed: Area, Connect, Lowess, Project, Symbol; choose CONNECT or Lowess)
3. OK
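
The stored event probability (EPRO1) is the fitted logistic curve evaluated at each value of the predictor. A minimal sketch of that curve follows; the coefficients are hypothetical placeholders, not the fit from BinaryLogistic.mtw, and the function name is illustrative.

```python
import math

def event_probability(x, b0, b1):
    """Logistic model: P(event) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

b0, b1 = -4.0, 0.8  # hypothetical coefficients, not from BinaryLogistic.mtw
curve = [(x, event_probability(x, b0, b1)) for x in range(0, 11)]
```

Plotting the second element of each pair against the first gives the S-shaped curve that the EPRO1 vs. C1 graph shows.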

45
Scatter Plot Example
Is there a relationship between age and years with
GE?
MINITAB FILE: Age_yrge.mtw

46
Scatter Plot Input & Output
1. Double click

2. Click on OK

47
Regression Analysis Example

MINITAB FILE: Age_yrge.mtw

48
Regression Analysis Example
1. Double click

2. Click OK

49
Correlation & Regression - Residual Plots
Use Residual Plots to test the assumptions of the analysis
When setting up Regression Analysis:
1. Click Storage
2. Select Residuals and Fits

A Fit is the predicted value of response variable for a given value of the predictor variable.
A Residual is the difference between an actual observation and the fitted value (the
difference between an individual data point and the predicted value).
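
The two definitions above can be illustrated with a least-squares line fitted by hand. The data are hypothetical (not from Age_yrge.mtw), and `fit_line` is an illustrative helper rather than Minitab's routine.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

age = [25, 30, 35, 40, 45]   # hypothetical predictor values
years = [2, 6, 9, 16, 17]    # hypothetical responses
intercept, slope = fit_line(age, years)

fits = [intercept + slope * a for a in age]                # predicted value per row
residuals = [obs - fit for obs, fit in zip(years, fits)]   # actual minus fitted
```

With an intercept in the model the residuals sum to zero, so the residual plots are about their pattern, not their average.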
50
Correlation & Regression - Constructing Residual Plots

51
Correlation & Regression - Interpreting the Output

How normal are the residuals?
Trends over time in residuals?
Histogram - bell curve? Ignore for small data sets (<30)
Constant Variance? Outliers?

Look for Gross Violations of Assumptions. If You Find Problems, You May Need To Add Additional Terms (e.g. Interactions), Or Transform One Of The Variables
52
Regression Analysis Example:
Confidence and Prediction Bands
MINITAB FILE: Age_yrge.mtw

1. Double
Click

2. Click Options

53
Regression Analysis Input & Output

3. Check both
boxes

4. Click OK

54
Step 6: Run the Analysis
Stat>Regression>Regression
Enter the response
Enter the X factors
Check Display Variance Inflation Factor under options
Select the residuals plot options

Regression will usually be an iterative process.
Select all the X factors believed to be important and enter into the first analysis. Eliminate the insignificant terms one at a time and re-run the analysis. Of course, you'll need to complete the other analysis steps outlined on the next page as well.
55
Example: Engine Performance Data
File messy1.mtw contains engine performance data collected from past
records. It is historical data without any designed structure to it.

BMSN: Bell Mouth Serial Number
Cell: Test Cell
COWLSN: Cowl Serial Number
FN: Engine Thrust

56
Inspection Of The Data
Use the cross tabulation command to view the structure of the data.

57
Inspection Of The Data
We will analyze the effects of bell mouth and cowl on fuelburn. Load in
the variables for bell mouth and cowl.

For now just check the counts under Display.

58
Inspection Of The Data
Are the Xs correlated? To find out, run a correlation analysis on the
continuous Xs.

59
Chi-Square
How to get MINITAB to perform
the Chi-Square Test For Independence

Type this into Minitab:

Chi Square Minitab Commands


NOTE: What if the Female data was, instead: 6 44
60
How to get MINITAB to perform
the Chi-Square Test For Independence

Chi Square Minitab Commands


61
How to get MINITAB to perform
the Chi-Square Test For Independence

Chi Square Minitab Commands


62
Chi Square Minitab Output

Chi-Square Calculated = 0.064
df = 1, P-Value = 0.800
0.8 is not a Small P-value: no X&Y relationship
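
The link between the calculated Chi-Square and its P-Value can be checked by hand for a 2x2 table (df = 1). In the sketch below the first call reproduces the slide's P-Value from its statistic. The counts in the second part are an illustration only: the second row reuses the 6 and 44 from the earlier NOTE, while the first row is hypothetical because the slide's table is not reproduced in the text.

```python
import math

def chi_square_2x2(table):
    """Chi-square statistic for independence: sum of (observed - expected)^2 / expected."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = rows[i] * cols[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def p_value_df1(chi2):
    """Upper-tail area of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

p_slide = p_value_df1(0.064)   # statistic reported on the slide

counts = [[10, 40], [6, 44]]   # first row hypothetical
chi2 = chi_square_2x2(counts)
p = p_value_df1(chi2)
```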

63
Branch #2
Step 1 of 6

Study Stability Over Time


(but only if the data is in time order sequence)

64
Minitab Commands

Stat>ControlChart
>I&MR

Stat>QualityTools
>RunChart

We're now on Branch#2...

Study STABILITY (Each Group) (If time ordered)
Study SHAPE (Each Group)
Compare SPREADS
Compare CENTERING (pick a quadrant):
Normal, Spreads Are Equal: 2-Sample T-Test with Assume Equal Var=YES; 1-Way ANOVA (Ho: mA = mB)
Normal, Spreads Not Equal: 2-Sample T-Test with Assume Equal Var=NO (Ho: mA = mB)
Non-Normal, Spreads Are Equal: (if each n>20) 2-Samp T-Test with Assume Equal Var=YES; (if each n>20) 1-Way ANOVA; Non-Parametric Test: Mann-Whitney (Wilcoxon Rank Sum) (Ho: MedianA = MedianB)
Non-Normal, Spreads Not Equal: (if each n>20) 2-Samp T-Test with Assume Equal Var=NO; Non-Parametric Test: Mann-Whitney (Wilcoxon Rank Sum) (Ho: MedianA = MedianB)

Step1:
For each group separately, create a Control Chart.
Why?
Look for any major runs, trends, or patterns within a group -- evidence that each group's data are not from just one group or process? We don't want to blindly mix different processes, here, and pretend it's just Group A.

65
Stat>ControlChart
>I&MR

Stat>QualityTools
>RunChart
Study STABILITY
(Each Group)
(If time ordered)

Step1:
For each group separately, create a Control Chart.
Why?
Look for any major runs, trends, or patterns within a group -- evidence that each group's data are not from just one group or process? We don't want to blindly mix different processes, here, and pretend it's just Group A.

66
I and MR Chart for Brand A
[Figure: Individual Value chart, UCL=37.96, Mean=30.38, LCL=22.81; Moving Range chart, UCL=9.303, R=2.847, LCL=0; subgroups 0 to 20]

We are just looking for big, obvious stuff here -- not worried about one or two outliers.
This process looks reasonably stable.
Don't forget to do this again for Brand B!

Just ask yourself: could this pattern of points have been produced by just one process (hose) that was reasonably consistent/stable over time (RE: Centering, Spread)? Do I see any obvious shifts in process behavior that would make me feel uncomfortable smashing all the points against the wall to get one pile (Histogram) that reliably represents the hose output shape?
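
The control limits on an I and MR chart come from the mean and the average moving range, using the standard constants 2.66 and 3.267 for moving ranges of two points. The sketch below checks them against the Brand A chart values (Mean and R as reported, so agreement is to rounding); the helper names and the short demo series are hypothetical.

```python
def limits_from_summary(mean, mr_bar):
    """(LCL, center, UCL) for the Individuals chart and the Moving Range chart."""
    return {
        "I": (mean - 2.66 * mr_bar, mean, mean + 2.66 * mr_bar),
        "MR": (0.0, mr_bar, 3.267 * mr_bar),
    }

def imr_limits(values):
    """Compute the limits from time-ordered individual observations."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    return limits_from_summary(sum(values) / len(values),
                               sum(moving_ranges) / len(moving_ranges))

brand_a = limits_from_summary(mean=30.385, mr_bar=2.847)  # values read off the chart
demo = imr_limits([10, 12, 11, 13, 12])                   # hypothetical series
```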

67
Common Cause Highway
Although we don't know exactly where the next point will land, we are reasonably confident it will fall somewhere on the Common Cause Highway. For this reason, a process that has only common cause variation is said to be...
STABLE and PREDICTABLE.
[Figure: points scattered within a steady band, axis 10 to 30]

Special Cause Highway
A process with Special Causes might better be represented by a highway through an earthquake zone: likely to suddenly shift to the side by several feet or more. For this reason, a process that has Special Causes of variation is said to be...
UNSTABLE and UNPREDICTABLE.
[Figure: points in a band that shifts abruptly, axis 10 to 30]
68
Branch #2
Step 2 of 6
Check the Shape of
Each Group
(Normality)

69
Minitab Commands
Stat>BasicStats>
DescriptiveStats>
Graphs>
GraphicalSummary

Stat>BasicStats>
NormalityTest

We're now on Branch#2...

Study STABILITY (Each Group) (If time ordered): Answ=Both Stable
Study SHAPE (Each Group)
Compare SPREADS
Compare CENTERING (pick a quadrant):
Normal, Spreads Are Equal: 2-Sample T-Test with Assume Equal Var=YES; 1-Way ANOVA (Ho: mA = mB)
Normal, Spreads Not Equal: 2-Sample T-Test with Assume Equal Var=NO (Ho: mA = mB)
Non-Normal, Spreads Are Equal: (if each n>20) 2-Samp T-Test with Assume Equal Var=YES; (if each n>20) 1-Way ANOVA; Non-Parametric Test: Mann-Whitney (Wilcoxon Rank Sum) (Ho: MedianA = MedianB)
Non-Normal, Spreads Not Equal: (if each n>20) 2-Samp T-Test with Assume Equal Var=NO; Non-Parametric Test: Mann-Whitney (Wilcoxon Rank Sum) (Ho: MedianA = MedianB)

Step2:
For each group separately, create a Histogram, Normal Probability Plot, and Anderson-Darling Normality Test (P-Value).
If P<0.05, then not Normal Distribution
Why?
Just to pick the right tools later in Step5.
70
Once we feel reasonably secure about the general STABILITY of Group A over time (Run
Chart), we can give the bulldozer operator permission to smash the data points against a
wall -- creating a pile of individual observations (called a histogram). So next, we should
check the SHAPE of this pile of Group A data as shown here... (and then check Group B
separately).
Stat>BasicStats>
DescriptiveStats>
Graphs>
GraphicalSummary

Stat>BasicStats>
NormalityTest
Study SHAPE
(Each Group)

Step2:
For each group
separately, create a
Histogram, Normal
Probability Plot, and
Anderson-Darling
Normality Test
(P-Value).
If P<0.05, then not
Normal Distribution
Why?
Just to pick the right
tools later in Step5.
71
Descriptive Statistics
Variable: Brand A

Anderson-Darling Normality Test
A-Squared: 0.395
P-Value: 0.340

Mean 30.3850
StDev 2.3225
Variance 5.39397
Skewness -4.8E-01
Kurtosis 1.56132
N 20

Minimum 24.4000
1st Quartile 29.3500
Median 30.3000
3rd Quartile 31.8000
Maximum 34.6000

95% Confidence Interval for Mu: 29.2980 31.4720
95% Confidence Interval for Sigma: 1.7662 3.3922
95% Confidence Interval for Median: 29.6176 31.4059

[Figure: histogram of Brand A from 24 to 34, with interval plots for the 95% confidence intervals for Mu and Median]

P-Value > 0.05, so the data CAN be represented well enough with a Normal Distribution

72
Recall from Basic Statistics .....

If the Normality Test shows a P-value that is less than 0.05, then the data is NOT represented well by a normal distribution.

Distribution Column Average Std Dev N of data A-Squared p-value
Normal Distribution C1 (Normal) 70 10 500 0.418 0.328
Positive Skewed Distribution C2 (Pos Skew) 70 10 500 46.447 0.000
Negative Skewed Distribution C3 (Neg Skew) 70 10 500 43.953 0.000

[Figure: for each of the three distributions, a histogram (Frequency) and a Normal Probability Plot (Probability from .001 to .999) with its Anderson-Darling Normality Test result]
73
How to get a...
(1) Normal Probability Plot
and
(2) Statistical Normality Test

Stat>BasicStats>
DescriptiveStats>
Graphs>
GraphicalSummary

Stat>BasicStats>
NormalityTest

Study SHAPE
(Each Group)

Step2:
For each group
separately, create a
Histogram, Normal
Probability Plot, and
Anderson-Darling
Normality Test
(P-Value).
If P<0.05, then not
Normal Distribution
Why?
Just to pick the right
tools later in Step5.

74
Normal Probability Plot
Subjective Judgment:
Normal, if data appear to form a fairly straight line

[Figure: normal probability plot of Brand A, Probability (.001 to .999) against Brand A (25 to 35)]

Average: 30.385
Std Dev: 2.32249
N of data: 20
Anderson-Darling Normality Test
A-Squared: 0.395
p-value: 0.340

P-Value > 0.05, so the data CAN be represented well enough with a Normal Distribution

75
What if my sample size is small?
Can I still test for Normality (shape)?

- If n is very small, forget it. When selecting other analysis tools, just assume Non-Normal.
- Rely primarily on Normal Probability Plot alone
- Ignore all Statistical Normality Tests (P-Values).

Actually, Minitab should not even offer any Statistical Normality Test output (P-
Values) when very small samples are involved -- but it does.
A good way to drive this point home to a class might be to ask what a
histogram of n=10 data points looks like. Just two or maybe three bars/cells!
Of what value is that? Actually, some seasoned statisticians won't trust a
histogram (for shape information) until nearly n=50 or n=80!!
...And to further illustrate, you might tear a piece of paper into small pieces,
and then drop n=10 of them onto a table. Then ask, "Is this pile coming from a
process/population which is bell-shaped or something else?" Of course, no
one can tell -- which is exactly the point!!
You need quite a bit of data before you can reliably determine what sort of
"data pile shape" you are dealing with. However, you don't need nearly as
much data to be able to reliably talk about the spread (Std.Dev.) of a
group/process/population -- and even less for centering (Average, Median).
76
Branch #2
Step 3 of 6
Stack the Data
(if not done already)

77
We're now on Branch#2...

Study STABILITY (Each Group) (If time ordered): Answ=Both Stable
Study SHAPE (Each Group): Answer = Both Normal
Compare SPREADS
Compare CENTERING (pick a quadrant):
Normal, Spreads Are Equal: 2-Sample T-Test with Assume Equal Var=YES; 1-Way ANOVA (Ho: mA = mB)
Normal, Spreads Not Equal: 2-Sample T-Test with Assume Equal Var=NO (Ho: mA = mB)
Non-Normal, Spreads Are Equal: (if each n>20) 2-Samp T-Test with Assume Equal Var=YES; (if each n>20) 1-Way ANOVA; Non-Parametric Test: Mann-Whitney (Wilcoxon Rank Sum) (Ho: MedianA = MedianB)
Non-Normal, Spreads Not Equal: (if each n>20) 2-Samp T-Test with Assume Equal Var=NO; Non-Parametric Test: Mann-Whitney (Wilcoxon Rank Sum) (Ho: MedianA = MedianB)

Step3:
If not done already, arrange the data into the Stacked format (i.e., Y data in one column, X data in another column)

Minitab will need data in the Stacked format in order to test for Spread Differences
78
When two groups are arranged as two separate columns of data, Minitab calls this Unstacked data

Brand A Brand B
34.00 33.75
34.25 29.25
32.75 29.75
33.75 29.25
29.50 34.25
29.25 34.00
etc. etc.

When all the Y data is in one column, and the X data is in another column, Minitab calls this Stacked data

A 34.00
A 34.25
A 32.75
A 33.75
A 29.50
B 33.75
B 29.25
B 29.75
B 29.25
etc. etc.
79
Minitab will need data in the Stacked format in order to test for Spread Differences

Here's how to Stack data:

Column names: Group, A&B-mpg

You should type these column names at the top of the data table, after you've created these two new columns of stacked data
80
Branch #2
Step 4 of 6
Use Minitab to Test for
Spread Differences

81
We're now on Branch#2...

Stat>ANOVA>
Homogeneity of
Variance

Study STABILITY (Each Group) (If time ordered): Answ=Both Stable
Study SHAPE (Each Group): Answer = Both Normal
Compare SPREADS
Compare CENTERING (pick a quadrant):
Normal, Spreads Are Equal: 2-Sample T-Test with Assume Equal Var=YES; 1-Way ANOVA (Ho: mA = mB)
Normal, Spreads Not Equal: 2-Sample T-Test with Assume Equal Var=NO (Ho: mA = mB)
Non-Normal, Spreads Are Equal: (if each n>20) 2-Samp T-Test with Assume Equal Var=YES; (if each n>20) 1-Way ANOVA; Non-Parametric Test: Mann-Whitney (Wilcoxon Rank Sum) (Ho: MedianA = MedianB)
Non-Normal, Spreads Not Equal: (if each n>20) 2-Samp T-Test with Assume Equal Var=NO; Non-Parametric Test: Mann-Whitney (Wilcoxon Rank Sum) (Ho: MedianA = MedianB)

Step3:
If not done already, arrange the data into the Stacked format (i.e., Y data in one column, X data in another column)

Step4:
Run Minitab test, above. Find P-value from only Levene's test in session window, etc.
Why? If P<0.05, then the X is important -- proves X has relationship or impact on the Spread of the Y. The StdDevs are significantly different!
82
To Test for Equal Variances (Spreads):

Stat>ANOVA>
Homogeneity of
Variance

Compare
SPREADS

Step4:
Run Minitab test, above. Find P-value from only Levene's test in session window, etc.
Why? If P<0.05, then the X is important -- proves X has relationship or impact on the Spread of the Y. The StdDevs are significantly different!

Response =(Y): A&B-mpg
Factors =(X)

83
Homogeneity of Variance Test for: Milage
[Figure: 95% Confidence Intervals for Sigmas by Factor Levels (A&B-mpg); axis = Standard Deviations, 1.5 to 3.5]

Bartlett's Test
Test Statistic: 1.594
p value : 0.207

Levene's Test
Test Statistic: 0.767
p value : 0.387

Best Practice:
Always rely on Levene's Test only!
It turns out that Levene's Test is good whether you have Normal or Non-Normal Data.
The F-Test (Bartlett's Test) is very misleading when there are even slight departures from normality.
NOTE: When there are only two groups, Bartlett's Test is equivalent to a popular tool called the F-Test.

Since P=0.387 is greater than 0.05, there is NOT sufficient evidence to prove that X (Brand) is important.
This data does not suggest that X (Brand) has relationship or impact on the Spread of the Y.
The StdDevs are NOT significantly different!
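
To show what Levene's Test measures, the sketch below computes the test statistic from absolute deviations around each group's median. Using the median (rather than the mean) is an assumption about the variant Minitab reports. The sketch stops at the statistic: the P-value also needs an F distribution with (k-1, N-k) degrees of freedom, which is not computed here. The two groups are hypothetical.

```python
import statistics

def levene_statistic(groups):
    """Levene-type statistic based on absolute deviations from each group median."""
    deviations = [[abs(x - statistics.median(g)) for x in g] for g in groups]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(d) for d in deviations) / n_total
    # spread of the group-average deviations around the overall average deviation
    between = sum(len(d) * (statistics.mean(d) - grand_mean) ** 2 for d in deviations)
    # spread of the deviations inside each group
    within = sum((z - statistics.mean(d)) ** 2 for d in deviations for z in d)
    return (n_total - k) / (k - 1) * between / within

group_a = [1, 2, 3, 4, 5]     # hypothetical
group_b = [-1, 1, 3, 5, 7]    # hypothetical, same center but twice the spread
w = levene_statistic([group_a, group_b])
```

Two groups with the same spread give a statistic of zero; the larger the difference in typical deviation, the larger the statistic and the smaller the P-value.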
84
Branch #2
Step 5 of 6
Pick the Best Centering
Comparison Tool
(i.e., Quadrant)

85
We're now on Branch#2...

Study STABILITY (Each Group) (If time ordered): Answ=Both Stable
Study SHAPE (Each Group): Answer = Both Normal
Compare SPREADS: Answer = Spreads are NOT different (Are Equal)
Compare CENTERING (pick a quadrant):
Normal, Spreads Are Equal: 2-Sample T-Test with Assume Equal Var=YES; 1-Way ANOVA (Ho: mA = mB)
Normal, Spreads Not Equal: 2-Sample T-Test with Assume Equal Var=NO (Ho: mA = mB)
Non-Normal, Spreads Are Equal: (if each n>20) 2-Samp T-Test with Assume Equal Var=YES; (if each n>20) 1-Way ANOVA; Non-Parametric Test: Mann-Whitney (Wilcoxon Rank Sum) (Ho: MedianA = MedianB)
Non-Normal, Spreads Not Equal: (if each n>20) 2-Samp T-Test with Assume Equal Var=NO; Non-Parametric Test: Mann-Whitney (Wilcoxon Rank Sum) (Ho: MedianA = MedianB)

Step5:
Based on prior info from Step2 & 4, pick the best quadrant of centering comparison tools. Within a quadrant, start at the top of the list and if you can't use it, move down to the next.

Answer = The 2-Sample T-Test

WHY? Because we have Normal Data, and Equal Spreads. It is the first tool listed in the top-right quadrant.
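
The quadrant lookup in Step 5 can be written as a small decision function. This is only a restatement of the roadmap as I read it from the slides; the function name and argument names are hypothetical.

```python
def centering_tools(normal, equal_spreads, each_n_over_20):
    """Ordered tools for one roadmap quadrant: try the first, then move down the list."""
    equal_var = "YES" if equal_spreads else "NO"
    t_test = f"2-Sample T-Test with Assume Equal Var={equal_var}"
    if normal:
        return [t_test, "1-Way ANOVA"] if equal_spreads else [t_test]
    tools = []
    if each_n_over_20:
        # with more than 20 points per group the roadmap still allows the T-test
        tools.append(t_test)
        if equal_spreads:
            tools.append("1-Way ANOVA")
    tools.append("Mann-Whitney (Wilcoxon Rank Sum)")
    return tools

# Brand A vs. Brand B: both Normal, spreads not different
choice = centering_tools(normal=True, equal_spreads=True, each_n_over_20=False)
```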

86
Branch #2
Step 6 of 6
Run the Centering
Comparison in Minitab

87
We're now on Branch #2...

Minitab Commands:
Stat> BasicStats> 2Sample-T
Stat> ANOVA> OneWay
Stat> NonParametrics> Mann-Whitney

[Roadmap diagram repeated: Study STABILITY > Study SHAPE > Compare SPREADS > Compare CENTERING, with the four centering quadrants (Non-Normal or Normal, Spreads Are Equal or Not Equal).]


Step 6:
Run the right Minitab test, above. Find the P-value in the session window.
Why? If P<0.05, then the X is important: the data show that X has a relationship with or impact on the Centering of the Y. The Avgs are significantly different!
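
For readers who want to see what sits behind that P-value, here is a rough sketch of a 2-sample t-test with Assume Equal Var=YES, using only the Python standard library. The brand readings are made-up illustration values, the function names are hypothetical, and the P-value is found by simple numerical integration, so treat it as an approximation rather than Minitab's own calculation.

```python
import math
from statistics import mean, variance


def t_two_sided_p(t, df, steps=2000):
    """Two-sided P-value for Student's t, via Simpson's rule on the t density."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    area = pdf(0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1 - 2 * area * h / 3)


def pooled_t_test(a, b):
    """2-sample t-test assuming equal variances (pooled standard deviation)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, df, t_two_sided_p(t, df)


# Hypothetical readings for two brands (not from the handbook's data file).
brand_a = [30.1, 29.8, 30.5, 30.9, 29.6, 30.3]
brand_b = [20.2, 20.6, 19.9, 20.4, 20.8, 20.1]
t, df, p = pooled_t_test(brand_a, brand_b)
print(f"T = {t:.2f}, DF = {df}, P < 0.05: {p < 0.05}")
```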
88
MINITAB COMMANDS: 2-Sample T-Test
Stat>Basic Stats>2-Sample T-Test

This is for Stacked Data:
Samples = Y
Subscripts = X

This is for Unstacked Data (each group in its own column)

When would we have permission to click this box (Assume equal variances)?
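
If the difference between stacked and unstacked layouts is unclear, the sketch below shows the same data both ways in plain Python. The helper names and the brand values are hypothetical, chosen only to mirror the Samples (Y) and Subscripts (X) columns in the dialog.

```python
def stack(columns):
    """Turn {group: [values]} (unstacked) into parallel Samples/Subscripts lists."""
    samples, subscripts = [], []
    for group, values in columns.items():
        samples.extend(values)
        subscripts.extend([group] * len(values))
    return samples, subscripts


def unstack(samples, subscripts):
    """Turn Samples/Subscripts columns back into one column per group."""
    columns = {}
    for value, group in zip(samples, subscripts):
        columns.setdefault(group, []).append(value)
    return columns


# Unstacked: one column per brand (hypothetical values).
unstacked = {"A": [30.1, 29.8, 30.5], "B": [20.2, 20.6, 19.9]}

# Stacked: all Y values in one column, the X label in a second column.
samples, subscripts = stack(unstacked)
print(samples)
print(subscripts)
```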

89
MINITAB COMMANDS 2-Sample T-Test

We can also choose a graph to visualize the data (DotPlot or Box Plot)

90
Since P<0.05 for this centering test, the X (Brand) is important.
It shows that X does have a relationship with or impact on the Centering of the Y.
The Averages are significantly different!

91
DotPlot to help visualize the difference

92
Branch #2
Quiz Question 1

93
Quiz Question #1:
What Minitab tool should we use to test for centering differences between Brand A & B, when we have
(1) Normal Data for both groups, and
(2) Unequal Variances (spreads)?

Minitab Commands:
Stat> BasicStats> 2Sample-T
Stat> ANOVA> OneWay
Stat> NonParametrics> Mann-Whitney

[Roadmap diagram repeated: Study STABILITY > Study SHAPE > Compare SPREADS > Compare CENTERING, with the four centering quadrants (Non-Normal or Normal, Spreads Are Equal or Not Equal).]


94
Quiz Answer #1:
What Minitab tool should we use to test for centering differences between Brand A & B, when we have
(1) Normal Data for both groups, and
(2) Unequal Variances (spreads)?

Answer:
Use the 2-Sample T-test, but without clicking the Assume Equal Variances button.

This is the first tool listed in the Normal / Spreads Not Equal quadrant: 2-Sample T-Test with Assume Equal Var=NO (Ho: mA = mB).

[Roadmap diagram repeated, with the Minitab commands: Stat> BasicStats> 2Sample-T, Stat> ANOVA> OneWay, Stat> NonParametrics> Mann-Whitney.]


95
Quiz Answer #1 (cont.): MINITAB COMMANDS 2-Sample T-Test

In this case, we should NOT click this box (Assume equal variances)!
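
To see what leaving the box unchecked changes, the sketch below compares the pooled calculation with the unequal-variance (Welch) calculation on made-up data where one group is much more spread out. The function names and values are hypothetical; the degrees-of-freedom formula is the standard Welch-Satterthwaite approximation, and Minitab's displayed DF may be rounded differently.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """T statistic and DF when equal variances are assumed."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    return (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb)), df


def welch_t(a, b):
    """T statistic and approximate DF when equal variances are NOT assumed."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical data: similar centers, very different spreads.
tight = [20.1, 20.3, 19.9, 20.2, 20.0, 20.1]
wide = [18.0, 23.5, 21.0, 25.2, 19.4, 22.7]

t_pooled, df_pooled = pooled_t(tight, wide)
t_welch, df_welch = welch_t(tight, wide)
print(f"pooled: T = {t_pooled:.3f}, DF = {df_pooled}")
print(f"welch:  T = {t_welch:.3f}, DF = {df_welch:.2f}")
```

With equal sample sizes the T statistic is the same either way; the unequal-variance version has fewer degrees of freedom, so the same T gives a larger (more cautious) P-value.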

96
Branch #2
Quiz Question 2

97
Quiz Question #2:
What Minitab tool should we use to test for centering differences between Brand A & B, when we have
(1) Non-Normal Data for one of the groups, and
(2) Equal Variances (spreads)
(3) Only n=7 data points for each brand of car?

Minitab Commands:
Stat> BasicStats> 2Sample-T
Stat> ANOVA> OneWay
Stat> NonParametrics> Mann-Whitney

[Roadmap diagram repeated: Study STABILITY > Study SHAPE > Compare SPREADS > Compare CENTERING, with the four centering quadrants (Non-Normal or Normal, Spreads Are Equal or Not Equal).]


98
Quiz Answer #2:
What Minitab tool should we use to test for centering differences between Brand A & B, when we have
(1) Non-Normal Data for one of the groups, and
(2) Equal Variances (spreads)
(3) Only n=7 data points for each brand of car?

Answer:
Since each group's sample size is so small (n=7 each), we should NOT use...
(#1) the 2-Sample T-test with the Assume Equal Variances box clicked, or equivalently,
(#2) the One-Way ANOVA.
Thus, moving down, we SHOULD use...
(#3) the Non-Parametric procedure called the Mann-Whitney Test.

[Roadmap diagram repeated, with the Minitab commands: Stat> BasicStats> 2Sample-T, Stat> ANOVA> OneWay, Stat> NonParametrics> Mann-Whitney.]


99
Quiz Answer #2: MINITAB COMMANDS Mann-Whitney Test

Stat >Nonparametrics >Mann-Whitney

This requires Unstacked Data (in separate columns)

100
Quiz Answer #2: MINITAB OUTPUT Mann - Whitney Test

Mann-Whitney Confidence Interval and Test

Brand A   N = 20   Median = 30.300
Brand B   N = 20   Median = 20.350
Point estimate for ETA1-ETA2 is 10.050
95.0 Percent CI for ETA1-ETA2 is (8.900,11.200)
W = 609.0
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.0000
The test is significant at 0.0000 (adjusted for ties)

The P-Value (poorly labeled) is the "significant at" value, 0.0000.

Since P<0.05 for this centering test, the X (Brand) is important.
It shows that X (Brand) does have a relationship with or impact on the Centering of the Y.
The Averages (or Medians, in this test) are significantly different!
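
As a rough illustration of where W comes from, the sketch below ranks the pooled data and sums the ranks of the first sample, then gets an exact two-sided P-value by listing every possible split of the ranks. The function name and the n=7 readings are hypothetical, the enumeration is only practical for small samples, and Minitab's own P-value calculation may differ (for example, by using an approximation).

```python
from itertools import combinations


def mann_whitney_exact(a, b):
    """Return (W, exact two-sided P). W is the rank sum of the first sample."""
    pooled = sorted(a + b)
    # Midranks: tied values share the average of the ranks they occupy.
    rank_of = {}
    for value in set(pooled):
        first = pooled.index(value) + 1
        count = pooled.count(value)
        rank_of[value] = first + (count - 1) / 2
    ranks = [rank_of[v] for v in a + b]
    w = sum(ranks[:len(a)])
    expected = len(a) * (len(a) + len(b) + 1) / 2  # W expected under Ho
    observed = abs(w - expected)
    sums = [sum(c) for c in combinations(ranks, len(a))]
    extreme = sum(1 for s in sums if abs(s - expected) >= observed - 1e-9)
    return w, extreme / len(sums)


# Hypothetical n=7 readings per brand (not the handbook's data).
brand_a = [30.3, 29.1, 31.0, 30.8, 29.7, 31.4, 30.0]
brand_b = [20.4, 21.0, 19.8, 20.1, 20.9, 19.5, 20.6]
w, p = mann_whitney_exact(brand_a, brand_b)
print(f"W = {w}, P = {p:.4f}")
```

In the session output above, N = 20 for each brand, so the largest possible W is 21 + 22 + ... + 40 = 610. W = 609.0 therefore means nearly every Brand A value outranks every Brand B value.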

101
102
Put all factors here

Click OK

103
Columns C7-C10 are created;
don't do anything with these

104
105
Insert Response

Click Pareto

Click Graph

106
Items to the right of the line are the effects that matter.

DE coeff also: -0.5243

Y = 19.7743 + 1.1424(B) - 0.3785(D) + 0.6840(E) + 0.5660(BE) - 0.5243(DE)
107
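
Once the model is written down, it can be used to predict Y at any factor setting. The sketch below assumes the factors are in coded units (-1 for low, +1 for high) and that the signs missing from the printed equation in front of the D and DE terms are minus signs (the slide shows the DE coefficient as -0.5243). Both are assumptions; check them against the coefficient table in your own session window.

```python
def predict_y(b, d, e):
    """Predicted Y for coded factor settings (assumed -1 = low, +1 = high)."""
    return (19.7743
            + 1.1424 * b
            - 0.3785 * d       # sign assumed; it is missing on the slide
            + 0.6840 * e
            + 0.5660 * b * e   # BE interaction
            - 0.5243 * d * e)  # DE interaction, coefficient -0.5243


# Try every corner of the design and find the setting with the largest Y.
levels = (-1, 1)
corners = [(b, d, e) for b in levels for d in levels for e in levels]
best = max(corners, key=lambda c: predict_y(*c))
print(best, round(predict_y(*best), 4))
```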
X Bar & R Chart Example (cont.)
MINITAB FILE: Xbar_r.mtw

108
X Bar & R Chart - Output
1. Double Click C1.
2. Type in a 3 for Subgroup size.
3. Click OK.

Note that 3.0SL denotes a 3 sigma limit = Control Limit. Do not confuse this with specification limits.

[Chart: Xbar/R Chart for NC_LATHE, Subgroups 0-25]
Sample Mean: 3.0SL=0.06435, X=0.05952, -3.0SL=0.05469
Sample Range: 3.0SL=0.01215, R=0.004720, -3.0SL=0.000
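
The 3.0SL values on this chart can be checked by hand. The sketch below uses the standard Shewhart table constants for a subgroup size of 3 (A2, D3, D4) together with the X and R values read off the chart; it is a sanity check under those assumptions, not a description of Minitab's internal method.

```python
# Standard Shewhart control chart constants for subgroup size n = 3.
A2, D3, D4 = 1.023, 0.0, 2.574

xbarbar = 0.05952  # center line of the Sample Mean chart (X on the slide)
rbar = 0.004720    # center line of the Sample Range chart (R on the slide)

xbar_ucl = xbarbar + A2 * rbar  # upper control limit for subgroup means
xbar_lcl = xbarbar - A2 * rbar  # lower control limit for subgroup means
r_ucl = D4 * rbar               # upper control limit for subgroup ranges
r_lcl = D3 * rbar               # lower control limit for subgroup ranges

print(f"Xbar chart: {xbar_lcl:.5f} to {xbar_ucl:.5f}")
print(f"R chart:    {r_lcl:.3f} to {r_ucl:.5f}")
```

These limits come only from the process data. Specification limits come from the customer and do not enter the calculation.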

109
Example:
Individuals & Moving Range
Chart
MINITAB FILE: Imr.mtw

110
Input:
Individuals & Moving Range Chart
1. Double click on Shaft_OD.
2. Click Tests.
3. Click on Perform all eight tests.

111
Output:
Individuals & Moving Range Chart

[Chart: I and MR Chart for Shaft_OD, Subgroups 0-25; flagged points are marked 1, 2, and 5]
Individual Value: 3.0SL=0.2522, X=0.2509, -3.0SL=0.2496
Moving Range: 3.0SL=0.001566, R=4.79E-04, -3.0SL=0.000
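
The I and MR limits follow the same pattern as the X Bar & R chart, using the standard constants for moving ranges of two consecutive points (2.66 and 3.267). The sketch below is a hand check using the rounded X and R values read off the chart, so expect small differences in the last digit; the function name is made up for this example.

```python
def imr_limits(xbar, mrbar):
    """Control limits for an I-MR chart from the two center lines."""
    E2, D4 = 2.66, 3.267  # standard constants for a moving range of span 2
    return {
        "i_ucl": xbar + E2 * mrbar,
        "i_lcl": xbar - E2 * mrbar,
        "mr_ucl": D4 * mrbar,
        "mr_lcl": 0.0,
    }


# Center lines read off the Shaft_OD chart.
limits = imr_limits(xbar=0.2509, mrbar=4.79e-04)
print(f"Individuals:  {limits['i_lcl']:.4f} to {limits['i_ucl']:.4f}")
print(f"Moving Range: {limits['mr_lcl']:.3f} to {limits['mr_ucl']:.6f}")
```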

112
C-Chart Example - Input

[Chart: C Chart for Weld_I, Sample Number 0-25]
Sample Count: 3.0SL=13.02, C=5.800, -3.0SL=0.000
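
For a C chart the limits depend only on the average count: the center line plus or minus three times its square root, with the lower limit set to zero when the result would be negative. The sketch below checks the chart values with that textbook formula; the function name is made up for this example.

```python
import math


def c_chart_limits(cbar):
    """Return (LCL, UCL) for a C chart with average count cbar."""
    spread = 3 * math.sqrt(cbar)
    return max(0.0, cbar - spread), cbar + spread


lcl, ucl = c_chart_limits(5.800)  # C=5.800 read off the chart
print(f"LCL = {lcl:.3f}, UCL = {ucl:.2f}")
```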

113
U-Chart Example
Minitab Menu Commands
MINITAB FILE: U_Chart.mtw

114
U-Chart Minitab Input & Output

[Chart: U Chart for errors, Sample Number 0-30]
Sample Count: 3.0SL=2.114, U=1.764, -3.0SL=1.415
115
P-Chart Example
Minitab Menu Commands
MINITAB FILE: P chart.mtw

116
P-Chart Minitab Input & Output

[Chart: P Chart for Voids, Sample Number 0-25]
Proportion: 3.0SL=0.02137, P=0.01192, -3.0SL=0.002472

Nonconstant control limits due to variable subgroup size.
Out of control points: determine cause and adjust.
Very good, determine cause.
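
The reason the limits move is that the subgroup size n sits in the denominator of the limit formula. The sketch below uses the textbook P chart formula with the P value from the chart and a few made-up subgroup sizes; the sizes and the function name are hypothetical, since the slide does not list the actual subgroup sizes.

```python
import math


def p_chart_limits(pbar, n):
    """Return (LCL, UCL) for one subgroup of size n on a P chart."""
    spread = 3 * math.sqrt(pbar * (1 - pbar) / n)
    return max(0.0, pbar - spread), min(1.0, pbar + spread)


pbar = 0.01192  # P read off the chart
for n in (500, 1000, 2000):  # hypothetical subgroup sizes
    lcl, ucl = p_chart_limits(pbar, n)
    print(f"n = {n}: LCL = {lcl:.5f}, UCL = {ucl:.5f}")
```

Larger subgroups give tighter limits, and for small subgroups the lower limit is cut off at zero.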


117
NP-Chart Example
Minitab Menu Commands

MINITAB FILE: Np chart.mtw

118
NP-Chart
Minitab Input & Output

[Chart: NP Chart for switches, Sample Number 0-25; two points are marked 1]
Sample Count: 3.0SL=20.11, NP=10.80, -3.0SL=1.489
119
2-R Chart Example, X - Minitab
Menu Commands
MINITAB FILE: 2r_chart.mtw

120
2-R Chart Example,
X, Rp, Rw - Minitab Input & Output
1. Double Click on max and min.

[Chart: WB (within and between) Chart for max...min, Subgroups 0-25]
Individuals Chart of Subgroup Means (Individual Value): 3.0SL=69.67, X=59.70, -3.0SL=49.73
Moving Range Chart of Subgroup Means (Moving Range): 3.0SL=12.25, R=3.750, -3.0SL=0.000
Range Chart of All Data (Sample Range): 3.0SL=16.08, R=4.920, -3.0SL=0.000
121
