
Statistical Quality Control

Table of contents

Chapter 1. Data & Analysis of Measurements
Chapter 2. Probability & Distribution
Chapter 3. Concept of Control Chart
Chapter 4. Control Chart for Variable
Chapter 5. Control Chart for Attribute
Chapter 6. Process Capability Analysis
CHAPTER 1
1. Data concept
- Population and sample
2. Types of Data
3. Description of data
- Central Tendency
- Dispersion
4. Display data
- Histogram
- Stem and leaf
- Box plot
5. Normality test
Data

Data represent facts obtained through observation.

[Figure: a sample (n) is drawn from the population (N) that we want to observe; each observation n1, n2, n3 represents the result of an observation.]
Data Collection

1) Objective should be understood (clarified)
• Identify the problem
• Report the problem
• Verify the problem
• Analyze the problem
• Correct the problem
2) True data
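Basic Concept

Population (N = 5,000), described by parameters:
  mean: µ
  standard deviation: σ
Sample (n = 50), described by statistics:
  mean: x̄
  standard deviation: s

Descriptive statistics: summarizing the sample data
Inferential statistics: using the sample statistics to infer the population parameters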
Let's think about the following

(Exercise)

1) "Based on my 20 years of experience, this is not a machine problem."
<point> .
2) "Why measure this temperature? We are in good shape."
<point> .
3) "Let's pass them, since measurement cost…"
<point> .
Types of Data
Quantitative Data
  Continuous Data: temperature, length, weight
  Discrete Data: number of claims, number of defectives
Qualitative Data: categorized according to similarities or differences in kind; not numeric values (ex: occupation, high school attended)
Quantifying
<Exercise>
Write down three or more continuous and discrete data which are related to your job.
Continuous data : , , .
Discrete data : , , .
Types of Data (Continued)

                   Variable (Continuous)        Attribute (Discrete)
Characteristics    measurable                   countable
                   continuous                   discrete units or occurrences
                   may derive from counting     good/bad
Types of Data      length                       no. of defects
                   volume                       no. of defectives
                   time                         no. of scrap items
Examples           width of door gap            audit points lost
                   lug nut torque               paint chips per unit
                   fan belt tension             defective lamps
Data Examples      1.7 inches                   10 scratches
                   32.06 psi                    6 rejected parts
                   10.542 seconds               25 paint runs
<Table> A Comparison of Variable and Attribute Data
Four Levels of Measurement*
The following table details the four levels of measurement in increasing order of statistical desirability.

LEVEL: Nominal
DESCRIPTION: Data consists of names or categories only. No ordering scheme is possible.
EXAMPLE: A bag of candy contained the following colors: Brown 17, Yellow 11, Red 10, Tan 6, Orange 5, Green 7.

LEVEL: Ordinal
DESCRIPTION: Data is arranged in some order, but differences between values cannot be determined or are meaningless.
EXAMPLE: Product defects are tabulated as follows: A 16, B 32, C 42, D 30, where A defects are more critical than D defects.

LEVEL: Interval
DESCRIPTION: Data is arranged in order and differences can be found. However, there is no inherent starting point and ratios are meaningless. (Difference is meaningful)
EXAMPLE: The temperatures of three aluminum ingots were 200°F, 400°F and 600°F. Note that three times 200°F is not the same as 600°F as a measurement of warmth.

LEVEL: Ratio
DESCRIPTION: An extension of the interval level that includes an inherent zero starting point. Both differences and ratios are meaningful.
EXAMPLE: Product A costs $300 and product B costs $600. Note that $600 is twice as much as $300.

<Table> A Description of the Four Measurement Levels


Four Levels of Measurement (Continued)

LEVEL      Central Location              Dispersion                       Significance Tests
Nominal    Mode                          Information only                 Chi-square
Ordinal    Median                        Percentages                      Sign or run test
Interval   Arithmetic mean               Standard or average deviation    t test, F test, correlation analysis
Ratio      Geometric or harmonic mean    Percent variation                *

<Table> Statistical Measures for Measurement Levels

*Note: Many of the interval measures may be useful for ratio data as well.
Data Exercises
<Examples> Is the following data continuous or discrete?
1. A station wagon weighs 3478.6 lbs
2. On the last CQE exam, 800 people failed
3. Out of 300 steel rods, 12 proved to be defective
4. A heat treatment lasted 6 hours and 18 minutes
5. Of 240 potential customers, 185 use our product

<Examples> What measurement level is most appropriate for the following?
6. Defects are categorized as critical, major A, major B and minor
7. A print-out of all shipping codes for last week's orders
8. The individual weights of a sample of widgets
9. The temperatures of steel rods (°F) after one hour of cooling
10. The number of low, middle and high income consumers using our product
Answers: 1. Continuous, 2. Discrete, 3. Discrete, 4. Continuous, 5. Discrete, 6. Ordinal, 7. Nominal, 8. Ratio, 9. Interval, 10. Ordinal.
Numerical Description of Data
• Measure of Central Tendency
1) Mean
2) Mode
3) Median
4) Midrange
5) Harmonic Mean
• Measure of Dispersion
1) Variance
2) Standard Deviation
3) Range
4) Coefficient of Variation (Measure of relative variation)
5) Quartile
Measure of Central Tendency
(1) Mean : arithmetic mean
(A) sample mean : $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
<ex> 2, 9, 11, 5, 6 : $\bar{x} = \frac{2 + 9 + 11 + 5 + 6}{5} = 6.6$
(B) population mean : $\mu$

(2) Median : middle position value
• If n is odd, the median is the value with rank (n+1)/2
• If n is even, the median is halfway between the values with ranks n/2 and (n/2)+1
<ex> 2, 9, 11, 5, 6, 27 : $\tilde{x}$ = median = (6 + 9)/2 = 7.5
Measure of Central Tendency
(3) Mode : most frequently occurring value
<ex> 2, 9, 9, 11, 5, 6 ; Mode = 9

[Figure: relative positions of the mean, median, and mode on distribution curves]

Midrange (M) : mid point
$M = \frac{x_{max} + x_{min}}{2}$

Harmonic mean
$H = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}$
used to calculate average speed
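
As a quick check of the definitions above, here is a minimal sketch using Python's standard `statistics` module on the small data sets from these slides. The `midrange` helper is a name introduced here for illustration only; it is not a library function.

```python
import statistics

def midrange(values):
    """Midpoint between the largest and smallest value: (x_max + x_min) / 2."""
    return (max(values) + min(values)) / 2

mean_data = [2, 9, 11, 5, 6]        # data from the mean example
median_data = [2, 9, 11, 5, 6, 27]  # data from the median example
mode_data = [2, 9, 9, 11, 5, 6]     # data from the mode example

sample_mean = statistics.mean(mean_data)        # (2 + 9 + 11 + 5 + 6) / 5
sample_median = statistics.median(median_data)  # halfway between ranks 3 and 4
sample_mode = statistics.mode(mode_data)        # most frequently occurring value
sample_midrange = midrange(mean_data)           # (11 + 2) / 2
sample_hmean = statistics.harmonic_mean(mean_data)

print("mean     :", sample_mean)
print("median   :", sample_median)
print("mode     :", sample_mode)
print("midrange :", sample_midrange)
print("harmonic :", round(sample_hmean, 3))
```

For positive data that are not all equal, the harmonic mean comes out below the arithmetic mean, which is why it suits averages of rates such as speed.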
(Exercise) Diameter of the bolts (5 data)
7.08 7.00 7.04 7.02 6.96
1) Sample mean ($\bar{x}$) :
2) Median ($\tilde{x}$) :
3) Mode ($M_0$) :
4) Midrange ($M$) :
5) Harmonic mean ($H$) :
Measure of Dispersion
Central tendency does not necessarily provide enough information to describe data adequately. Consider the bursting strengths obtained from two samples of six bottles each (unit : psi):
● Sample 1 : 230 250 245 258 265 240
○ Sample 2 : 190 228 305 240 265 260
The mean of both samples is 248 psi.

[Fig. Bursting-strength data: dot plot of both samples on an axis from 180 to 320 psi, with the sample mean 248 marked]
Variance and Standard Deviation
Consider the sample 2 data:

Observation        $x_i - \bar{x}$    $(x_i - \bar{x})^2$
x1 = 190           -58                3364
x2 = 228           -20                400
x3 = 305           57                 3249
x4 = 240           -8                 64
x5 = 265           17                 289
x6 = 260           12                 144
$\bar{x}$ = 248    Sum = 0            Sum = 7510

From the equation, $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1} = \frac{7510}{5} = 1502\ (\text{psi})^2$

Calculation of $s^2$ for the 1st sample: $s^2 = 158\ (\text{psi})^2$
Because $s^2$ is expressed in the square of the original units, it is not easy to interpret.
Variance and Standard Deviation

$s = \sqrt{s^2}$

1st sample : $s = \sqrt{158} = 12.57$ psi
2nd sample : $s = \sqrt{1502} = 38.76$ psi
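
The bursting-strength figures can be reproduced with a short sketch like the one below. It assumes Python's standard `statistics` module, whose `variance` and `stdev` functions divide by n − 1 as in the slide's sample formula; the `describe` helper is a name chosen here for illustration.

```python
import statistics

sample_1 = [230, 250, 245, 258, 265, 240]  # bursting strength, psi
sample_2 = [190, 228, 305, 240, 265, 260]  # bursting strength, psi

def describe(sample):
    """Return (mean, sample variance, sample standard deviation)."""
    x_bar = statistics.mean(sample)
    s2 = statistics.variance(sample)  # sum of squared deviations / (n - 1)
    s = statistics.stdev(sample)      # square root of the sample variance
    return x_bar, s2, s

summary_1 = describe(sample_1)
summary_2 = describe(sample_2)

for name, (x_bar, s2, s) in (("Sample 1", summary_1), ("Sample 2", summary_2)):
    print(f"{name}: mean = {x_bar} psi, s^2 = {s2} psi^2, s = {s:.2f} psi")
```

Both samples share the mean of 248 psi, so only the dispersion measures separate them.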
Measure of Dispersion

[Figure: two relative-frequency histograms, each on an X axis from 0 to 8]

■ Variance : the larger the value, the greater the variability of the data set

(A) Variance of population : $\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}$

(B) Variance of sample : $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$
Measure of Dispersion
■ Variance
<Example> Compute $s^2$ for the following measurement data :
Data : 5, 7, 1, 2, 4
<Sol>
$x_i$    $x_i^2$
5        25
7        49
1        1
2        4
4        16
$\sum x_i = 19$, $\sum x_i^2 = 95$, n = 5
$s^2 = \frac{95 - 19^2/5}{5 - 1} = \frac{22.8}{4} = 5.7$

<Example> Compute $s^2$ for the following data :
Data : 6, 4, 3, 1, 5
<Sol> $\sum x_i$ = .   $\sum x_i^2$ = .   n = .
$s^2$ = .
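
The sample variance can be computed either from the squared deviations or from the sums $\sum x_i$ and $\sum x_i^2$. The sketch below checks that both routes agree on the first example's data (5, 7, 1, 2, 4). It uses exact fractions to avoid rounding noise; the function names are illustrative only.

```python
from fractions import Fraction

def variance_definition(data):
    """s^2 = sum((x_i - x_bar)^2) / (n - 1), computed exactly."""
    n = len(data)
    x_bar = Fraction(sum(data), n)
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)

def variance_shortcut(data):
    """s^2 = (sum(x_i^2) - (sum(x_i))^2 / n) / (n - 1), computed exactly."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return (sum_x2 - Fraction(sum_x ** 2, n)) / (n - 1)

data = [5, 7, 1, 2, 4]  # integer measurements from the first example
s2_def = variance_definition(data)
s2_short = variance_shortcut(data)

print("sum x   =", sum(data))
print("sum x^2 =", sum(x * x for x in data))
print("s^2 (definition) =", float(s2_def))
print("s^2 (shortcut)   =", float(s2_short))
```

The shortcut form needs only two running sums, which is convenient for hand calculation.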
Measure of Dispersion
■ Standard deviation
(A) Standard deviation of population : $\sigma = \sqrt{\sigma^2}$
(B) Standard deviation of sample : $s = \sqrt{s^2}$
■ Range : difference between the largest and smallest measurements.
Range = Xmax − Xmin
<Example> 5, 7, 9, 3, 1
Range = .
■ Coefficient of variation (cov) : measure of relative variation
$\text{cov} = \frac{s}{\bar{x}} \times 100\ (\%)$
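Measure of Dispersion
■ Coefficient of variation
<Example> (unit : kg)
Data 1 (newborn baby)    No.      1     2     3     4     5       $\bar{x}$ = 3.5, s = 0.36
                         Weight   4.0   3.0   3.5   3.4   3.6
Data 2 (father)          No.      1     2     3     4     5       $\bar{x}$ = 65.4, s = 4.39
                         Weight   71.0  64.0  67.0  66.0  59.0

<Q> Does data 2 have more variation than data 1?

cov(data 1) = $\frac{s}{\bar{x}} = \frac{0.36}{3.5} = 0.103$
cov(data 2) = $\frac{s}{\bar{x}} = \frac{4.39}{65.4} = 0.067$
Relative to its mean, data 1 varies more, even though its standard deviation is smaller.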
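
A small sketch of the coefficient-of-variation comparison, assuming the sample standard deviation (divisor n − 1) as in the slides. The function name `coefficient_of_variation` is introduced here for illustration.

```python
import statistics

def coefficient_of_variation(data):
    """Relative variation s / x_bar (multiply by 100 for a percentage)."""
    return statistics.stdev(data) / statistics.mean(data)

baby_weights = [4.0, 3.0, 3.5, 3.4, 3.6]         # Data 1, kg
father_weights = [71.0, 64.0, 67.0, 66.0, 59.0]  # Data 2, kg

cov_baby = coefficient_of_variation(baby_weights)
cov_father = coefficient_of_variation(father_weights)

print(f"Data 1: s = {statistics.stdev(baby_weights):.2f} kg, cov = {cov_baby:.3f}")
print(f"Data 2: s = {statistics.stdev(father_weights):.2f} kg, cov = {cov_father:.3f}")
```

Dividing by the mean removes the scale, so the two groups can be compared even though their weights differ by an order of magnitude.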
Dispersion
■ Skewness : measure of symmetry
$\alpha_3 = \frac{\mu_3}{\sigma^3} = \frac{E[(X - \mu)^3]}{\sigma^3}$
; $\alpha_3 = 0$ (mean = median = mode)
; $\alpha_3 > 0$ (positively skewed : mean > median > mode)

■ Kurtosis : measure of peakedness
$\alpha_4 = \frac{\mu_4}{\sigma^4} = \frac{E[(X - \mu)^4]}{\sigma^4}$
; the greater $\alpha_4$, the sharper the peak
; (Normal distribution : $\alpha_4 = 3$)
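
To make the moment ratios concrete, the sketch below evaluates $\alpha_3$ and $\alpha_4$ for two small data sets, treating each data set as the whole population (divisor n). Both data sets and the function name are hypothetical, chosen only to show a symmetric and a positively skewed case.

```python
def standardized_moment(data, order):
    """Moment ratio mean((x - mu)^order) / sigma^order, using divisor n."""
    n = len(data)
    mu = sum(data) / n
    sigma2 = sum((x - mu) ** 2 for x in data) / n
    central = sum((x - mu) ** order for x in data) / n
    return central / sigma2 ** (order / 2)

symmetric = [1, 2, 3, 4, 5]      # hypothetical symmetric data
right_skewed = [1, 1, 1, 2, 10]  # hypothetical data with a long right tail

alpha3_sym = standardized_moment(symmetric, 3)
alpha4_sym = standardized_moment(symmetric, 4)
alpha3_right = standardized_moment(right_skewed, 3)

print("symmetric   : alpha3 =", alpha3_sym, " alpha4 =", round(alpha4_sym, 3))
print("right-skewed: alpha3 =", round(alpha3_right, 3))
```

The symmetric set gives $\alpha_3 = 0$, while the long right tail pushes $\alpha_3$ above zero.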
Degrees of Freedom
The concept results from the fact that the n deviations $x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_n - \bar{x}$ always sum to zero, so specifying the values of any (n−1) of these quantities automatically determines the remaining one.
Thus only (n−1) of the n deviations $x_i - \bar{x}$ are independent.
Display Measurement Data
Histogram

■ Relative frequency histogram

A bar graph in which the height of the bar represents the proportion or relative frequency of occurrence for a particular class. The classes are plotted along the horizontal axis.

* Relative frequency = Frequency / n

<ex> [Figure: relative-frequency histogram; vertical axis marked 4/30 to 7/30, class marks 1.85, 2.85, … on the X axis]
Display Measurement Data
Histogram
1) Collect data (50~200 data)
2) Find Xmax and Xmin : R = Xmax − Xmin
3) Determine the number of classes (k)

$k = \sqrt{n}$   <ex> $k = \sqrt{100} = 10$ classes

* Recommendation
Num. of data    Num. of classes
50~100          6~10
100~250         7~12
n>250           10~20

4) Calculate the width of class (h) : $h = \frac{x_{max} - x_{min}}{k} = \frac{R}{k}$
• Select the closest measurement-unit value
<ex> 47.2, 50.8*, …, 46.8, 45.5 : $h = \frac{50.8 - 45.5}{10} = 0.53 \approx 0.5$
Display Measurement Data
Histogram

5) Determine the class boundaries (should include the Xmax and Xmin values)
① The first class low boundary : $C_{1l} = x_{min} - \frac{\text{measurement unit}}{2}$
② The first class high boundary = $C_{1l}$ + width of class (h)
6) Find the midpoint : (class low boundary + class high boundary) / 2
7) Frequency count
8) Relative frequency
9) Draw a picture
Display Measurement Data
Histogram

<Example> n = 100 data (unit : micron, µ) ; * : Xmax, ■ : Xmin

47.2 47.9 50.3 ‥ 47.9
47.8 46.4 49.1 50.8*
47.6 47.0 48.1 47.8
⋮
45.6 45.5■ 47.8 49.2
47.0 48.6 47.0 ‥ 47.3

1) R = Xmax − Xmin = 50.8 − 45.5 = 5.3
2) $k = \sqrt{100} = 10$
3) $h = \frac{R}{k} = \frac{5.3}{10} = 0.53 \approx 0.5$
4) Class boundaries :
① 1st class low boundary = $x_{min} - \frac{\text{unit}}{2} = 45.5 - \frac{0.1}{2} = 45.45$
② 1st class high boundary = 45.45 + 0.5 = 45.95
5) Find midpoint : $\frac{45.45 + 45.95}{2} = 45.70$
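
The class-construction steps can be sketched in code as follows, using only the summary values given in the example (Xmin = 45.5, Xmax = 50.8, n = 100, measurement unit 0.1). The function name and its return layout are assumptions made for illustration. One design choice is also an assumption here: with h rounded to 0.5, ten classes stop at 50.45, so the sketch keeps adding classes until Xmax is covered.

```python
import math

def histogram_classes(x_min, x_max, n, unit):
    """Build (low, high, midpoint) classes following the slide's steps."""
    data_range = x_max - x_min                # step 2: R = Xmax - Xmin
    k = round(math.sqrt(n))                   # step 3: k = sqrt(n)
    h = round(data_range / k / unit) * unit   # step 4: R / k, rounded to the unit
    low = x_min - unit / 2                    # step 5: first class low boundary
    n_classes = math.ceil((x_max - low) / h)  # classes needed to reach Xmax
    classes = []
    for i in range(n_classes):
        lower = round(low + i * h, 2)         # two decimals suit a 0.1 unit
        upper = round(lower + h, 2)
        midpoint = round((lower + upper) / 2, 2)
        classes.append((lower, upper, midpoint))
    return k, h, classes

k, h, classes = histogram_classes(x_min=45.5, x_max=50.8, n=100, unit=0.1)

print("k =", k, " h =", h)
for number, (lower, upper, midpoint) in enumerate(classes, start=1):
    print(f"{number:2d}  {lower:.2f}~{upper:.2f}  midpoint {midpoint:.2f}")
```

The first class comes out as 45.45~45.95 with midpoint 45.70, matching the worked example.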
Histogram
<Example>
Class   Class boundary   Midpoint   Frequency   Rel. Freq.
1       45.45~45.95      45.70      2           0.02
2       45.95~46.45      46.20      3           0.03
3       46.45~46.95      46.70      6           0.06
4       ㆍ               ㆍ         ㆍ          ㆍ
5       ㆍ               ㆍ         ㆍ          ㆍ
6       ㆍ               ㆍ         ㆍ          ㆍ
7       ㆍ               ㆍ         ㆍ          ㆍ
8       ㆍ               ㆍ         ㆍ          ㆍ
9       49.45~49.95      49.70      7           0.07
10      49.95~50.45      50.20      3           0.03

<Histogram> [Figure: relative-frequency (R.F.) histogram of the classes with lower and upper spec. lines; labels shown on the plot include 45.70, 50.95, 44.2, and 50.0]
Histogram
(Exercise)
1) Measurement data (unit : cm)
163 172 178 174 174 167 182 176 159 158
169 171 174 182 178 180 164 163 154 181
175 183 173 164 165 172 170 169 177 172
172 156 158 168 164 178 173 175 176 171
163 164 170 172 170 169 168 164 186 174

2) Range : Max =      Min =      Range =
3) Number of classes (k) : k =  .
4) Width of class (h) : $h = \frac{R}{k}$ =  .
5) Determine the class boundaries
① $C_{1l} = x_{min} - \frac{\text{measurement unit}}{2}$ =
② $C_{1l}$ + width of class (h) =
Histogram
[Figure: <Stratification> frequency histograms of length, each marked with the target T, the mean X̄, and the USL: (a) All data, (b) Machine 1 data, (c) Machine 2 data, (d) Machine 3 data. In panels (a) and (c) the plot shows T = X̄.]
6)mFind mid point of each class
7) Frequency count
8) Relative frequency Relative frequency
Class Class boundary Midpoint Frequency Rel.
freq.

1
2
3
4
5
6
7
8
9
10
11
12

cm
Display Measurement Data
Histogram

[Figure: typical histogram shapes, each labeled with possible causes]
• Measurement has a certain behavior
• Adjust data after 100% inspection ; wrong in constructing histogram
• Physical restriction exists in one direction
• Measurement failure ; misuse of raw material ; mistake in adjustment of machines
• Mixed data from two (one~two) different materials, operators
Display Measurement Data
Stem and Leaf Displays
■ A method for describing a set of data.
It retains the actual observed value of each data point.
<ex> 7.15 can be split as Stem 7 | Leaf 15, or as Stem 71 | Leaf 5

<ex> n = 40 data
2.88 2.94
7.99 2.74
7.15 3.03
6.27 4.40
⋮    ⋮
5.91 2.89

Stem   Leaf          Frequency
2      74 88 89 94   4
3      03 07 81      3
4      •••           •
5                    •
6                    •
7      03 15         2
Quartile
For n = 100 observations:
The first quartile (Q1) : observation with rank (0.25)(100) + 0.5 = 25.5
(halfway between the 25th and 26th observations)
The third quartile (Q3) : observation with rank (0.75)(100) + 0.5 = 75.5
(halfway between the 75th and 76th observations)
IQR = Q3 − Q1 (interquartile range)
Display Measurement Data

■ Stem-and-leaf

N = 40 (data)
48 76
53 75
49 91
…  73

Stem   Leaf         Frequency
4      57899        5
5      1223349      7
6      0023445589   10
7      33556789     8
8      1123568      7
9      122          3

■ Easy to find percentiles
1) Fiftieth percentile : the sample median $\tilde{x}$ is halfway between ranks $\frac{n}{2}$ and $\frac{n}{2} + 1$, here ranks $\frac{40}{2} = 20$ and $\frac{40}{2} + 1 = 21$
$\tilde{x} = \frac{65 + 68}{2} = 66.5$
Display Measurement Data
2) 1st quartile : rank $\frac{1}{4}(40) + 0.5 = 10.5$ (halfway between ranks 10 and 11)
Q1 = $\frac{53 + 54}{2} = 53.5$
3) 3rd quartile : rank $\frac{3}{4}(40) + 0.5 = 30.5$
Q3 = $\frac{79 + 81}{2} = 80$

4) IQR = Q3 − Q1 = 80 − 53.5 = 26.5
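
The rank rule used above (rank = p·n + 0.5, taking the halfway value when the rank ends in .5) can be sketched as follows. The data are rebuilt from the stem-and-leaf display; `value_at_rank` and `quartile_rank` are helper names introduced here for illustration.

```python
# Leaves per stem, copied from the stem-and-leaf display (stem = tens digit).
stem_and_leaf = {
    4: "57899",
    5: "1223349",
    6: "0023445589",
    7: "33556789",
    8: "1123568",
    9: "122",
}
data = sorted(10 * stem + int(leaf)
              for stem, leaves in stem_and_leaf.items()
              for leaf in leaves)

def quartile_rank(p, n):
    """Rank of the p-th quantile under the slide's rule: p * n + 0.5."""
    return p * n + 0.5

def value_at_rank(sorted_data, rank):
    """Value at a 1-based rank; fractional ranks interpolate between neighbours."""
    lower = sorted_data[int(rank) - 1]
    if rank == int(rank):
        return lower
    upper = sorted_data[int(rank)]
    return lower + (rank - int(rank)) * (upper - lower)

n = len(data)
median = value_at_rank(data, quartile_rank(0.50, n))
q1 = value_at_rank(data, quartile_rank(0.25, n))
q3 = value_at_rank(data, quartile_rank(0.75, n))
iqr = q3 - q1

print(f"n = {n}, Q1 = {q1}, median = {median}, Q3 = {q3}, IQR = {iqr}")
```

Other software may use different quantile conventions, so results can differ slightly from this rank rule on small samples.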


The Box-plot

• Displays the three quartiles, the minimum, and the maximum of the data on a rectangular box, aligned either horizontally or vertically.
• Useful in comparing two or more samples.
Table. Viscosity Measurement for two mixtures
Mixture 1 Mixture 2
22.02 20.33
23.83 21.67
26.67 24.67
25.38 22.45
25.49 22.28
23.50 21.95
25.90 20.49
24.98 21.81
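The Box-plot

[Figure: side-by-side box plots of viscosity (axis from 20.0 to 27.0) for mixtures 1 and 2, with these values marked]
            Mixture 1   Mixture 2
Maximum     26.67       24.67
Q3          25.70       22.37
Median      25.19       21.88
Q1          23.68       21.08
Minimum     22.02       20.33

• Mixture 1 has higher viscosity than mixture 2
• Distribution is not symmetric
• The maximum viscosity value in mixture 2 seems unusually large.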
The Box-plot
■ Graphical display that simultaneously displays several important features of the data
(location, central tendency, spread, departure from symmetry, outliers)

<Table> Hole Diameter (n = 12)

120.5 120.4 120.7
120.9 120.2 121.1
120.3 120.1 120.9
121.3 120.5 120.8

• Median $\tilde{x}$ : ranks 6 and 7 : (120.5 + 120.7)/2 = 120.6
• Q1 : rank (1/4)(12) + 0.5 = 3.5 ; (120.3 + 120.4)/2 = 120.35
• Q3 = 120.9
• Outlier : a point lying more than 1.5(Q3 − Q1) beyond the box

[Figure: box plot with minimum 120.1, Q1 120.35, Q2 120.6, Q3 120.9, maximum 121.3]
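
A sketch of the box-plot numbers for the hole-diameter data, plus a check of the "unusually large" maximum in mixture 2. It assumes the entry printed as 12.01 is 120.1 (the minimum shown on the plot) and reads the outlier rule as "more than 1.5·IQR beyond Q1 or Q3". The function names are illustrative only.

```python
def value_at_rank(sorted_data, rank):
    """Value at a 1-based rank; fractional ranks interpolate between neighbours."""
    lower = sorted_data[int(rank) - 1]
    if rank == int(rank):
        return lower
    upper = sorted_data[int(rank)]
    return lower + (rank - int(rank)) * (upper - lower)

def box_plot_summary(values):
    """Five-number summary plus points beyond the 1.5 * IQR fences."""
    data = sorted(values)
    n = len(data)
    q1, q2, q3 = (value_at_rank(data, p * n + 0.5) for p in (0.25, 0.50, 0.75))
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {"min": data[0], "q1": q1, "median": q2, "q3": q3,
            "max": data[-1], "outliers": outliers}

hole_diameters = [120.5, 120.4, 120.7, 120.9, 120.2, 121.1,
                  120.3, 120.1, 120.9, 121.3, 120.5, 120.8]  # 12.01 read as 120.1
mixture_2 = [20.33, 21.67, 24.67, 22.45, 22.28, 21.95, 20.49, 21.81]

holes = box_plot_summary(hole_diameters)
mix2 = box_plot_summary(mixture_2)

print("hole diameters:", holes)
print("mixture 2     :", mix2)
```

Under this rule the hole-diameter data show no outliers, while the 24.67 reading in mixture 2 falls above the upper fence.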
Tchebysheff's Theorem
■ Given a number k greater than or equal to 1 and a set of n measurements X1, X2, …, Xn, at least $\left(1 - \frac{1}{k^2}\right)$ of the measurements will lie within k standard deviations of their mean.

k      At least $\left(1 - \frac{1}{k^2}\right) \times 100$ (%)
1      0
1.5    55.6
2      75
2.5    84
3      88.9

[Figure: relative-frequency distribution with the interval from µ − kσ to µ + kσ marked on the X axis]
Tchebysheff's Theorem
<Example> The mean and variance of a sample of n = 25 measurements are 75 and 100, respectively.
Use Tchebysheff's Theorem to describe the distribution of measurements.
<Sol> We are given $\bar{x} = 75$ and s = 10 ($s^2 = 100$).
① At least 3/4 of the 25 measurements lie in the interval $\bar{x} \pm 2s = 75 \pm 2(10)$, that is, [55, 95].
At least $(25)\left(\frac{3}{4}\right) = 18.75$, so at least 19, of the data lie in this interval.

[Figure: distribution with the interval from 55 to 95 marked on the X axis]
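
The bound and the worked example can be reproduced with a few lines. This is a minimal sketch; `tchebysheff_bound` is a name chosen here for illustration.

```python
def tchebysheff_bound(k):
    """Minimum fraction of measurements within k standard deviations of the mean."""
    if k < 1:
        raise ValueError("the bound is only informative for k >= 1")
    return 1 - 1 / k ** 2

# Reproduce the table of k against the guaranteed percentage.
table = {k: round(100 * tchebysheff_bound(k), 1) for k in (1, 1.5, 2, 2.5, 3)}
for k, percent in table.items():
    print(f"k = {k:<4} at least {percent}%")

# Worked example: x_bar = 75, s = 10, n = 25, k = 2.
x_bar, s, n, k = 75, 10, 25, 2
interval = (x_bar - k * s, x_bar + k * s)
min_count = n * tchebysheff_bound(k)
print(f"at least {min_count} of {n} measurements lie in {interval}")
```

The bound holds for any distribution shape, which is why it is much looser than the percentages for bell-shaped data.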
Empirical Rule (Rule of Thumb)
■ Given a distribution of measurements that is approximately bell-shaped, the interval
① (µ ± σ) contains approximately 68% of the measurements
② (µ ± 2σ) contains approximately 95% of the measurements
③ (µ ± 3σ) contains more than 99% of the measurements

[Figure: bell-shaped relative-frequency curve with 68% of the area between µ − σ and µ + σ]
Empirical Rule
<Example> A time study is conducted to determine the length of time necessary to perform a specified operation in a manufacturing plant. The length of time necessary to complete the operation is measured for each of n = 40 workers. The mean and standard deviation are found to be 12.8 and 1.7, respectively. Describe the sample data by using the Empirical Rule.
<Sol>
$(\bar{x} \pm s) = 12.8 \pm 1.7$ : we expect approximately 68% of the measurements to fall into the interval from 11.1 to 14.5, that is, [11.1, 14.5]
$(\bar{x} \pm 2s) = 12.8 \pm 2(1.7)$ : we expect approximately 95% of the measurements to fall into the interval [9.4, 16.2]
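
The intervals in this example, together with the coverage an exactly normal distribution would give, can be checked with the sketch below. It assumes Python 3.8+ for `statistics.NormalDist`; `normal_coverage` is a helper name introduced here.

```python
from statistics import NormalDist

def normal_coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

x_bar, s = 12.8, 1.7  # time-study mean and standard deviation

intervals = {k: (round(x_bar - k * s, 1), round(x_bar + k * s, 1)) for k in (1, 2, 3)}
coverage = {k: round(normal_coverage(k), 4) for k in (1, 2, 3)}

for k in (1, 2, 3):
    print(f"x_bar ± {k}s = {intervals[k]}, normal coverage ≈ {coverage[k]:.2%}")
```

Real data are only approximately bell-shaped, so the observed fractions in a sample of 40 workers would be expected to come close to, not equal, these percentages.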
Normal Distribution

• What is normal?
• How can we predict from the distribution?
• How to make a normal distribution?
• Where can we use it?

[Figure: symmetric, bell-shaped curve centered at µ]
• Can predict what will happen
• Can transform a random variable X to be normal
Distribution
µ 1 ≠ µ 2 , σ 1=
σ σ
σ2
1 2

68.3
% σ µ µ
µ 1 = µ 2 , σ 11 ≠ σ 2

σ2 1
95.5%
σ
99.73%
2
-4 -3 -2 -1 0 1 2 3 4 µ 1=
µ µ 1 ≠ µ 2 , σ 1 ≠µ
σ2 σ1
1
σ
2

µ µ
1 2
Normal probability plot
■ Estimate process yields and fallouts, % to fail, etc.
• slope = standard deviation
• $\hat{\sigma}$ = 84th percentile − 50th percentile
■ Test normality

<Normal probability plot>
[Figure: cumulative (%) on a probability scale from 0 to 99.9 against bottle strength from 190 to 350 psi]
If the spec. on bottle strength is LSL = 210 psi, we can estimate that about 25% of the bottles manufactured by this process (equipment) would be below this limit.
Normal probability plot
(Ex) Use probability plotting to determine the parameters of the normal distribution given the following data set.

<Bottle strength data> (unit : psi)
193 198 219 197 …
218 231 294 318 …

X axis : data
Y axis : cumulative (%)
• Mean rank : $F(x) = \frac{i}{n + 1} \times 100\ (\%)$
• Median rank : $F(x) = \frac{i - 0.3}{n + 0.4} \times 100\ (\%)$
Normal probability plot
<Table> Time to fail data (xx model) ; unit : month
t     F(t) (%)     t     F(t) (%)
31    4.8          48    52.4
35    9.5          48    57.1
36    14.3         51    61.9
41    19.0         52    66.7
42    23.8         54    71.4
44    28.6         54    76.2
45    33.3         55    81.0
46    38.1         56    85.7
46    42.9         57    90.5
47    47.6         59    95.2
* use $F(t) = \frac{i}{n + 1}$
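
The plotting positions in this table, and a rough numerical stand-in for reading µ and σ off the plot, can be sketched as below. Fitting a least-squares line of t against the normal quantile z is a swapped-in technique for the graphical read-off, not the slides' own procedure; the function names are illustrative, and `statistics.NormalDist` assumes Python 3.8+.

```python
from statistics import NormalDist

# Time-to-fail data (months), already sorted in ascending order.
times = [31, 35, 36, 41, 42, 44, 45, 46, 46, 47,
         48, 48, 51, 52, 54, 54, 55, 56, 57, 59]

def mean_rank(i, n):
    """Plotting position F = i / (n + 1) for the i-th smallest value."""
    return i / (n + 1)

def median_rank(i, n):
    """Alternative plotting position F = (i - 0.3) / (n + 0.4)."""
    return (i - 0.3) / (n + 0.4)

n = len(times)
f = [mean_rank(i, n) for i in range(1, n + 1)]
z = [NormalDist().inv_cdf(p) for p in f]  # normal quantile of each position

# Least-squares line t = mu_hat + sigma_hat * z (the straight line on the plot).
z_bar = sum(z) / n
t_bar = sum(times) / n
sigma_hat = (sum((zi - z_bar) * (ti - t_bar) for zi, ti in zip(z, times))
             / sum((zi - z_bar) ** 2 for zi in z))
mu_hat = t_bar - sigma_hat * z_bar

for t, p in zip(times, f):
    print(f"t = {t:2d}  F(t) = {100 * p:4.1f}%")
print(f"mu_hat ≈ {mu_hat:.2f} months, sigma_hat ≈ {sigma_hat:.2f} months")
```

On the plot, µ corresponds to the 50% point of the fitted line and σ to the horizontal distance between the 50% and 84% points.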
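Normal probability plot
[Figure: normal probability plot of the time-to-fail data. Vertical axis: F(x) (%) on a probability scale from 0.01 to 99.99, with the levels µ − 3σ, µ − 2σ, µ − σ, µ, µ + σ, µ + 2σ, µ + 3σ marked. Horizontal axis: time to fail, 30 to 60.]
Read from the plot:
① µ = ?
② σ = (µ + σ) − (µ)
③ the time at which F(x) = 10%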
Normal probability plot
<Exercise> Diode failure data (unit : hours)
i     t_i     F(t_i) %
1     1500    6.7
2     2400    16.3
3     3000    26.0
4     3200    35.6
5     3700    45.2
6     4000    -
7     4000    -
8     4700    -
9     5100
10    5900
* use $F(x) = \frac{i - 0.3}{n + 0.4}$
[Plot axis: t in 10³ hours]
• Normal?
• µ =
• σ =
• When will 20% of the diodes fail?
