
KXGX 6101: Research Methodology

If we knew what we were doing, it wouldn't be called research, would it? - Albert Einstein

Science & Statistics


If the facts don't fit the theory, change the facts - Albert Einstein

Objective of statistical methods: to make the process as efficient as possible!


[Figure: the iterative learning cycle. Data (facts/phenomena) lead by induction to a Hypothesis H1 (conjecture/model/theory); deduction gives the Consequence of H1, which is compared with new data; induction then leads to a Modified Hypothesis H2, and the cycle of deduction and induction repeats.]

Introduction to Design of Experiment (DoE)


All life is an experiment. The more experiments you make the better - Ralph Waldo Emerson

What is it? - DoE is an efficient way to quantitatively determine how numerous input variables (Xi) affect the outcome (Y)

DoE is best used if:

Multiple variables affect the outcome
Interactions between inputs exist
You want to sort out, using data, which variables are significant
You are unsure how the variables affect the outcome
You want to verify what you think you know
You want to quantify how a process works

It is not Magic!

DoE Factorial Strategy


The probability of anything happening is in inverse ratio to its desirability - John W. Hazard

All Possibilities are Considered from Main Effects to Interaction Effects

[Figure: design space with axes Variable 1, Variable 2 and Variable 3]

2^3 = (Two Levels)^(Three Factors)

DoE for Mileage Example


Problem: Gas mileage for Car is 20 mpg

Run   Speed (A)   Octane (B)   Tyre Pressure (C)   Mileage (Y)
1     55 (-)      87 (-)       30 (-)              Y1
2     65 (+)      87 (-)       30 (-)              Y2
3     55 (-)      92 (+)       30 (-)              Y3
4     65 (+)      92 (+)       30 (-)              Y4
5     55 (-)      87 (-)       35 (+)              Y5
6     65 (+)      87 (-)       35 (+)              Y6
7     55 (-)      92 (+)       35 (+)              Y7
8     65 (+)      92 (+)       35 (+)              Y8

How many runs? 2^3 = 8
How many observations for each level?
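
The run list above can be generated rather than typed by hand. The following is a minimal sketch in Python; the dictionary keys and the helper name `full_factorial` are illustrative choices and not part of the slides, while the levels are those of the mileage example.

```python
from itertools import product

# Factor levels from the mileage example (low, high).
factors = {
    "Speed (A)": (55, 65),
    "Octane (B)": (87, 92),
    "Tyre Pressure (C)": (30, 35),
}

def full_factorial(levels):
    """Return every combination of levels in standard order,
    i.e. with the first factor alternating fastest."""
    names = list(levels)
    # product() varies its LAST argument fastest, so pass the factors
    # in reverse and flip each combination back afterwards.
    combos = product(*(levels[n] for n in reversed(names)))
    return [dict(zip(names, reversed(c))) for c in combos]

runs = full_factorial(factors)
for i, run in enumerate(runs, start=1):
    print(f"Y{i}", run)
```

With two levels and three factors this yields 2^3 = 8 runs, in the same order as the table.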

What DoE Tools to Use


Golden rule of an experiment: the duration of an experiment should not exceed the lifetime of the experimentalist - Unknown Physicist

Current State of Problem Knowledge: Low -> High

Type of Design         Usual Number of Factors   Purpose: Identify                        Purpose: Estimate
Screening              > 10                      Most important factors (the vital few)   Crude direction for improvement (linear effects)
Fractional Factorial   5-10                      Some interactions                        Some interpolation
Factorial              1-5                       Relationships among factors              All main effects and interactions

Analyzing a Full Factorial Design


Everything should be made as simple as possible, but not one bit simpler - Albert Einstein

Step 1: Set-up Table of Contrast


Example: This example relates two quantitative input variables (Temperature and Concentration) and one qualitative input (Catalyst) to Yield. The factors and levels:
- Temperature: 160 °C (-1), 180 °C (+1)
- Concentration (%): 20 (-1), 40 (+1)
- Catalyst: Brand A (-1), Brand B (+1)

Run   Temp   Concentration   Catalyst   Yield
1     -1     -1              -1         ?
2      1     -1              -1         ?
3     -1      1              -1         ?
4      1      1              -1         ?
5     -1     -1               1         ?
6      1     -1               1         ?
7     -1      1               1         ?
8      1      1               1         ?

Step 2: Calculating Main Effects


We will now calculate the effects of the experiment. First we look at Temperature. We average the Yields associated with (+1) and the Yields associated with (-1), then take the difference between the two averages.

Run   Temp   Concentration   Catalyst   Yield
1     -1     -1              -1         60
2      1     -1              -1         72
3     -1      1              -1         54
4      1      1              -1         68
5     -1     -1               1         52
6      1     -1               1         83
7     -1      1               1         45
8      1      1               1         80

            Temp   Concentration   Catalyst
Total (-)   211    267             254
Total (+)   303    247             260
Diff         92    -20               6
Mean Eff     23     -5             1.5

Temperature Effect = (72 + 68 + 83 + 80)/4 - (60 + 54 + 52 + 45)/4 = 75.75 - 52.75 = 23

This can be interpreted as the Yield going up by an average of 23 points as Temperature moves from low to high.
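
The same arithmetic can be checked in a few lines of Python. This is a small sketch, not the slides' own tooling; the function name `main_effect` is an assumption, and the data are the eight runs tabulated above.

```python
# Contrast columns and yields from the table above.
temp = [-1, 1, -1, 1, -1, 1, -1, 1]
conc = [-1, -1, 1, 1, -1, -1, 1, 1]
cat = [-1, -1, -1, -1, 1, 1, 1, 1]
yields = [60, 72, 54, 68, 52, 83, 45, 80]

def main_effect(contrast, response):
    """Average response at the (+1) level minus average at the (-1) level."""
    high = [y for c, y in zip(contrast, response) if c > 0]
    low = [y for c, y in zip(contrast, response) if c < 0]
    return sum(high) / len(high) - sum(low) / len(low)

for name, column in (("Temp", temp), ("Conc", conc), ("Cat", cat)):
    print(name, main_effect(column, yields))
```

The three values reproduce the "Mean Eff" row: 23 for Temperature, -5 for Concentration and 1.5 for Catalyst.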

Step 3: Calculating Interaction Effects (cont.)


It is always better to be approximately right than precisely wrong - Unknown Engineer

An interaction effect is represented by multiplying the contrast columns of the factors involved. For the 2x2 example, the Temperature x Concentration interaction contrast is created by multiplying the Temp contrast by the Concentration contrast.

Temp   Concentration   TxC
-1     -1               1
 1     -1              -1
-1      1              -1

Step 3: Calculating Interaction Effects


Calculate the interaction effects for the entire matrix

Run   Temp (T)   Conc (C)   Cat (K)   T*C   T*K   C*K   T*C*K   Yield
1     -1         -1         -1         1     1     1    -1      60
2      1         -1         -1        -1    -1     1     1      72
3     -1          1         -1        -1     1    -1     1      54
4      1          1         -1         1    -1    -1    -1      68
5     -1         -1          1         1    -1    -1     1      52
6      1         -1          1        -1     1    -1    -1      83
7     -1          1          1        -1    -1     1    -1      45
8      1          1          1         1     1     1     1      80

            T     C     K     T*C   T*K   C*K   T*C*K
Total (-)   211   267   254   254   237   257   256
Total (+)   303   247   260   260   277   257   258
Diff         92   -20     6     6    40     0     2
Mean Eff     23    -5   1.5   1.5    10     0   0.5
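
As a cross-check on the table, the interaction contrasts can be built by multiplying the factor columns element by element, and each effect taken as the signed sum of the yields divided by half the number of runs. The sketch below is illustrative only; the names `columns`, `contrast` and `effect` are assumptions, and `math.prod` requires Python 3.8 or later.

```python
from itertools import combinations
from math import prod

columns = {
    "T": [-1, 1, -1, 1, -1, 1, -1, 1],
    "C": [-1, -1, 1, 1, -1, -1, 1, 1],
    "K": [-1, -1, -1, -1, 1, 1, 1, 1],
}
yields = [60, 72, 54, 68, 52, 83, 45, 80]

def contrast(names):
    """Multiply the named factor columns row by row."""
    return [prod(signs) for signs in zip(*(columns[n] for n in names))]

def effect(signs, response):
    """(Total at +1 minus total at -1) divided by half the number of runs."""
    diff = sum(s * y for s, y in zip(signs, response))
    return diff / (len(response) / 2)

effects = {}
for order in (1, 2, 3):  # main effects, two-way and three-way interactions
    for names in combinations(columns, order):
        effects["*".join(names)] = effect(contrast(names), yields)

for term, value in effects.items():
    print(f"{term:6s} {value:5.1f}")
```

The largest interaction is T*K at 10, which is why the Temperature and Catalyst effects should be read together rather than separately.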

Step 4: Graph Main Effects Plot

[Figure: main effects plot. Vertical axis: average Yield (about 50 to 75). For each of Temp, Conc and Cat, the average Yield at the (-1) level is joined to the average Yield at the (+1) level.]

Step 5: Graph Interaction Plot


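[Figure: Interaction Plots (T*K). Mean Yield (about 50 to 80) against Catalyst (-1 to 1); annotated points include the average Yield at (+1) Cat and (+1) Temp and the average Yield at (-1) Cat and (-1) Temp.]

[Figure: Interaction Plots (T*C). Mean Yield (about 50 to 80) against Concentration (-1 to 1).]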

Distribution Plot

Process Capability Analysis for C1

Process Data
USL               10.0000
Target            *
LSL               5.0000
Mean              10.7039
Sample N          20
StDev (Within)    4.73941
StDev (Overall)   5.12602

Potential (Within) Capability
Cp     0.18
CPU   -0.05
CPL    0.40
Cpk   -0.05
Cpm    *

Overall Capability
Pp     0.16
PPU   -0.05
PPL    0.37
Ppk   -0.05

Performance   Observed    Exp. "Within"   Exp. "Overall"
PPM < LSL     150000.00   114392.51       132912.93
PPM > USL     750000.00   559030.23       554607.20
PPM Total     900000.00   673422.73       687520.13

[Figure: histogram of C1 on an axis from -5 to 25, with LSL and USL marked and the Within and Overall fitted curves.]

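Cpk = min[ (USL - X̄) / (3σ), (X̄ - LSL) / (3σ) ], where X̄ is the process mean and σ the standard deviation. With only a lower specification limit this reduces to Cpk = (X̄ - LSL) / (3σ).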
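
To make the capability indices concrete, here is a short Python sketch that recomputes the "Within" figures of the report from its Process Data. The function names are hypothetical, and a normal distribution is assumed for the expected PPM values, so small rounding differences from the report are possible.

```python
from math import erf, sqrt

def capability(mean, sigma, lsl, usl):
    """Cp, CPU, CPL and Cpk for a two-sided specification."""
    cp = (usl - lsl) / (6 * sigma)
    cpu = (usl - mean) / (3 * sigma)
    cpl = (mean - lsl) / (3 * sigma)
    return {"Cp": cp, "CPU": cpu, "CPL": cpl, "Cpk": min(cpu, cpl)}

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def expected_ppm(mean, sigma, lsl, usl):
    """Expected parts per million below LSL and above USL (normal model)."""
    below = norm_cdf((lsl - mean) / sigma) * 1e6
    above = (1 - norm_cdf((usl - mean) / sigma)) * 1e6
    return below, above

# Process Data from the report, using StDev (Within).
indices = capability(mean=10.7039, sigma=4.73941, lsl=5.0, usl=10.0)
below, above = expected_ppm(mean=10.7039, sigma=4.73941, lsl=5.0, usl=10.0)

print({name: round(value, 2) for name, value in indices.items()})
print(round(below), round(above), round(below + above))
```

Because the mean (10.7039) lies above the USL (10.0), CPU is negative and Cpk takes that value: the process is off-centre as well as too variable.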

Importance of Statistics in Industry

Organizations around the world are constantly searching for more effective methodologies to achieve improvement (breakthrough improvement)

Financial Performance
Customer Satisfaction

The improvement methodology has evolved from common sense, PDCA, Kaizen, Just-in-Time, Lean, SPC, TQM and Business Process Reengineering to Six Sigma today.

If your result needs a statistician then you should design a better experiment. - Ernest Rutherford

Statistical Definition of Six Sigma


[Figure: Six Sigma (with 1.5 sigma mean shifts). A short-term normal distribution on an axis from -6σ to +6σ between LSL and USL, also shown shifted by 1.5σ. Centred process: 99.9999998% or 0.002 DPMO. With the 1.5σ shift: 99.99966% or 3.4 DPMO.]

Six sigma commonly refers to a statistically derived performance target of 3.4 defects for every 1 million opportunities (3.4 DPMO).
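
The 3.4 DPMO figure follows from the normal tail area once the mean is allowed to drift 1.5 sigma towards one specification limit. A brief Python sketch, assuming a normal distribution and limits at plus and minus six sigma (the function names are illustrative):

```python
from math import erfc, sqrt

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * erfc(z / sqrt(2))

def dpmo(sigma_level, shift=1.5):
    """Defects per million opportunities for limits at +/- sigma_level,
    with the process mean shifted by `shift` sigma towards one limit."""
    near_limit = upper_tail(sigma_level - shift)
    far_limit = upper_tail(sigma_level + shift)
    return (near_limit + far_limit) * 1e6

print(round(dpmo(6), 1))           # shifted process
print(round(dpmo(6, shift=0), 4))  # centred process
```

With the shift, almost all defects come from the nearer limit, 4.5 sigma away; without it, both tails at 6 sigma contribute about 0.001 DPMO each.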

Practical Meaning of Six Sigma


Why 99% Good is often not Good Enough

99% Good (3-Sigma)
54,000 lost articles of mail per year
Five short or long landings per day at most major airports
More than 40,500 newborn babies dropped by doctors/nurses each year
Unsafe drinking water for about two hours each month
20,000 lost bags per day (Baggage Handling System, Houston Airport)

99.99966% Good (6-Sigma)
35 lost articles of mail per year
One short or long landing every 10 years at most major airports
Three newborn babies dropped by doctors/nurses in 100 years
Unsafe drinking water for 1 second every 16 years
Fewer than 5 lost bags per day

Six Sigma DMAIC Approach


Be thankful for problems. If they were less difficult, someone with less ability might have your job - Reliability Engineer

There are five major steps involved in applying the Six Sigma approach to achieve breakthrough quality and performance: Define, Measure, Analyze, Improve and Control (D-M-A-I-C).

DMAIC - Systematic Problem Solving Tool


"Statistics: The only science that enables different experts using the same figures to draw different conclusions. - A frustrated Statistician

In the Define phase, the team:
Defines the Project
Defines the Problem & Goal Statement
Defines Project Benefits (Financial Analysis)
Defines the Project Charter & Project Scope
Obtains support from Management

A SMART Goal statement is: Specific, Measurable, Attainable, Relevant, Time-Bound

Classic American and Russian approach for a problem during space mission!

Measure Phase

In the Measure phase, the team:
Measures the baseline performance
Identifies the input & output variables (Xs and Ys)
Establishes a data collection plan for the Xs and Ys
  Sampling techniques: where to collect data or samples, when, and at what frequency or sample size?
  Establishes operational definitions
Determines the current performance and performance standards
Verifies the measurement method

Measure Phase cont.

Measurement System Analysis: four characteristics to examine in a gauge system

1) Sensitivity

The gauge should be sensitive enough to detect differences in measurement as slight as one-tenth of the total tolerance specification, e.g. for a 200 ± 0.1 mm dimension the tool should be able to measure to 0.01 mm.

2) Reproducibility

The reliability of the gauge system to reproduce measurements. Customarily checked by comparing the results of different operators taken at different times. This affects both accuracy and precision.

Measure Phase cont.


3) Accuracy

An unbiased true value. Normally reported as the difference between the average of a number of measurements and the true value, e.g. checking a micrometer with a gauge block.

4) Repeatability/Precision

The ability to repeat the same measurement by the same operator at the same time. To improve the accuracy and precision of a measurement process, it must have a defined test method and must be statistically stable.

[Figure: three panels labelled "Precise but not accurate", "Accurate but not precise" and "Accurate and precise".]
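
As a rough illustration of the two definitions, accuracy can be reported as the bias (average reading minus true value) and repeatability as the standard deviation of repeated readings. The readings and the gauge block size below are invented for the example and do not come from the slides.

```python
from statistics import mean, stdev

# Hypothetical data: one operator measures a 25.000 mm gauge block five times.
TRUE_VALUE_MM = 25.000
readings_mm = [25.012, 25.009, 25.011, 25.010, 25.013]

bias_mm = mean(readings_mm) - TRUE_VALUE_MM  # accuracy: offset from the truth
repeatability_mm = stdev(readings_mm)        # precision: spread of the readings

print(f"bias          = {bias_mm:+.4f} mm")
print(f"repeatability = {repeatability_mm:.4f} mm")
```

Here the spread is small compared with the offset, so this made-up micrometer would be "precise but not accurate".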

Measurement Error
Repeatability & Reproducibility
Any equation longer than three inches is most likely wrong - Unknown Physicist

Analysis of variance (ANOVA) is the most accurate method for quantifying repeatability and reproducibility. It considers error by appraiser and by the system.

How to do an ANOVA test:
1) Calculate the variance between systems/appraisers
2) Calculate the variance within each system/appraiser
3) Calculate the F ratio (between-group variance divided by within-group variance)
4) If the F ratio is greater than the F critical value, reject the null hypothesis that the systems/appraisers give the same mean; otherwise do not reject it
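
The four ANOVA steps can be sketched in plain Python as follows. The appraiser readings are hypothetical, the function name `one_way_anova` is an assumption, and the critical value 5.14 is the tabulated F value for a 0.05 significance level with 2 and 6 degrees of freedom.

```python
def one_way_anova(groups):
    """Return (F ratio, df between, df within) for a one-way layout."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Step 1: variance between groups (mean square between).
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    df_between = len(groups) - 1

    # Step 2: variance within groups (mean square within).
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_within = n_total - len(groups)

    # Step 3: F ratio.
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical data: three appraisers measure the same part three times each.
appraisers = [
    [10.1, 10.2, 10.0],
    [10.4, 10.5, 10.3],
    [10.1, 10.0, 10.2],
]
F_CRITICAL = 5.14  # from an F table: alpha = 0.05, df = (2, 6)

f_ratio, df_b, df_w = one_way_anova(appraisers)

# Step 4: compare with the critical value.
reject_null = f_ratio > F_CRITICAL
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}, reject H0: {reject_null}")
```

For this made-up data the F ratio exceeds the critical value, which would point to a reproducibility problem between appraisers.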

Example : A Complex DoE Model (using JMP)


Shall I refuse my dinner because I do not fully understand the process of digestion? - Oliver Heaviside, English physicist

[Figure: Actual by Predicted Plot. % Stress Actual against % Stress Predicted, both axes from 0 to 3.5. P<.0001 RSq=0.98 RMSE=0.1931]

Scaled Estimates (nominal factors expanded to all levels)

Term                                       Scaled Estimate   Std Error   t Ratio   Prob>|t|
Intercept                                   0.880625         0.048269     18.24    <.0001
Machine[MC-15 (Bad)]                        0.520625         0.048269     10.79    <.0001
Machine[MC-3 (Good)]                       -0.520625         0.048269    -10.79    <.0001
1st Bond Power(30,40)                       0.224375         0.048269      4.65    0.0012
Frame[New Possehl]                         -0.613125         0.048269    -12.70    <.0001
Frame[Old]                                  0.613125         0.048269     12.70    <.0001
Loop Mode[QSQ-QSQ]                         -0.076875         0.048269     -1.59    0.1457
Loop Mode[SSS-QSQ]                          0.076875         0.048269      1.59    0.1457
Machine[MC-15 (Bad)]*Frame[New Possehl]    -0.410625         0.048269     -8.51    <.0001
Machine[MC-15 (Bad)]*Frame[Old]             0.410625         0.048269      8.51    <.0001
Machine[MC-3 (Good)]*Frame[New Possehl]     0.410625         0.048269      8.51    <.0001
Machine[MC-3 (Good)]*Frame[Old]            -0.410625         0.048269     -8.51    <.0001
1st Bond Power*Frame[New Possehl]          -0.196875         0.048269     -4.08    0.0028
1st Bond Power*Frame[Old]                   0.196875         0.048269      4.08    0.0028

Summary of Fit

RSquare                      0.97749
RSquare Adj                  0.962484
Root Mean Square Error       0.193076
Mean of Response             0.880625
Observations (or Sum Wgts)   16

Good Model

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model       6   14.569588        2.42826       65.1385   <.0001
Error       9    0.335506        0.03728
C. Total   15   14.905094

Model is significant

Parameter Estimates

Term                                       Estimate    Std Error   t Ratio   Prob>|t|
Intercept                                   0.880625   0.048269     18.24    <.0001
Machine[MC-15 (Bad)]                        0.520625   0.048269     10.79    <.0001
1st Bond Power(30,40)                       0.224375   0.048269      4.65    0.0012
Frame[New Possehl]                         -0.613125   0.048269    -12.70    <.0001
Loop Mode[QSQ-QSQ]                         -0.076875   0.048269     -1.59    0.1457
Machine[MC-15 (Bad)]*Frame[New Possehl]    -0.410625   0.048269     -8.51    <.0001
1st Bond Power*Frame[New Possehl]          -0.196875   0.048269     -4.08    0.0028

Oneway Analysis of FAB Size By EFO Time

[Figure: FAB Size (about 45 to 60) against EFO Time (0.55 ms, 0.565 ms, 0.575 ms, 0.585 ms, 0.6 ms, 0.75 ms). Comparison: All Pairs Tukey-Kramer, 0.05. Report: Oneway Anova.]
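
The summary statistics in the JMP report are linked by simple arithmetic, which the following Python sketch reproduces from the Analysis of Variance table. The variable names are illustrative; the last line relies on the assumption that this is a balanced two-level design with -1/+1 coding, where every coefficient has standard error RMSE / sqrt(n).

```python
from math import sqrt

# Values copied from the Analysis of Variance table.
ss_model, df_model = 14.569588, 6
ss_error, df_error = 0.335506, 9
n_obs = df_model + df_error + 1  # 16 observations

ms_model = ss_model / df_model
ms_error = ss_error / df_error

f_ratio = ms_model / ms_error                    # F Ratio
r_square = ss_model / (ss_model + ss_error)      # RSquare
r_square_adj = 1 - (1 - r_square) * (n_obs - 1) / df_error  # RSquare Adj
rmse = sqrt(ms_error)                            # Root Mean Square Error

# Assumed relation for a balanced, orthogonally coded design.
std_error = rmse / sqrt(n_obs)
t_intercept = 0.880625 / std_error               # t Ratio of the intercept

print(f"F = {f_ratio:.2f}, R2 = {r_square:.5f}, R2 adj = {r_square_adj:.6f}")
print(f"RMSE = {rmse:.6f}, SE = {std_error:.6f}, t(intercept) = {t_intercept:.2f}")
```

Loop Mode is the only term with a t ratio small enough (|t| = 1.59, p = 0.1457) to be judged not significant at the 0.05 level.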

Analyze Phase
Well done is better than well said - Benjamin Franklin

In the Analyze phase, the team:
Brainstorms potential root causes
Uses the data collected to determine root causes and opportunities for improvement
Verifies the hypotheses established
Establishes the priority for action regarding the Xs

Two common techniques:
i) Fish bone diagram
ii) Why-why analysis

Avoid solutions that don't solve the real problem!

Separate what we think is happening from what is really happening!

Cause-And-Effect Diagrams (Fish Bone)

[Figure: fish bone diagram. Causes, grouped as Measurement, Machine, Man, Mother Nature, Material and Method, feed into the Problem (Effect).]

Why-Why Analysis
It is easy to see, it is hard to foresee - Benjamin Franklin, American Scientist and Statesman

It is a technique to determine the root causes of a phenomenon by repeatedly asking "Why?". It is a variant of the 5 Why Analysis used at Toyota Motor Company for discovering true causes by repeating the question "Why?" five times.

Why?...Why?...Why?...Why?...Why?

Stop!

Improve Phase
If you bet on a horse, that's gambling. If you bet you can make three spades, that's entertainment. If you bet the device will survive for twenty years, that's engineering. See the difference? - Unknown Engineer

After investing much time in the Define-Measure-Analyze phases, the team needs to change gear from being detail-minded (in process analysis and data analysis) to being creative and innovative in developing solutions and changing processes.

Pilot whenever possible, before the full implementation.

Control Phase
To err is human, to forgive is divine, but to include errors in your design is statistical - Leslie Kish

This is the last phase in the DMAIC improvement process.

Without control efforts, the improved process may revert to its previous state.

What About Human Error ???


"Be more careful" is not effective
The old way of dealing with human error was to scold people, retrain them, and tell them to be more careful. We can't do much to change human nature, and people are going to make mistakes (often the same mistakes too). If we can't tolerate them, we should remove the opportunities for error.

Poka-Yoke Error Proofing


Reliability is when the customer comes back, not the product - Unknown Reliability Manager


Key Take Away

1. Plan the DoE matrix using 2^3 = (Two Levels)^(Three Factors). You should have at least 8 runs for your simple DoE matrix.

2. Calculate the Mean, Std Deviation (sigma, σ) and Cpk. Microsoft Excel can easily be used for this. Calculate Cpk using the formula below:
   Cpk = min[ (USL - X̄) / (3σ), (X̄ - LSL) / (3σ) ]; with only a lower limit, Cpk = (X̄ - LSL) / (3σ)

3. Plot interaction charts to understand the interaction of the various input factors and identify the most significant factor(s)

4. Plot a distribution curve for better visualization of your data if needed

5. Consider measurement errors in your data

Thanks for your attention


Statistics is like modern art, the more complicated it is the higher the value - Unknown Engineer
