STATGRAPHICS – Rev. 8/31/2009

DOE Wizard – Single Factor Categorical Designs

Summary
The DOE Wizard can construct designs for studying the effects of quantitative or categorical
(non-quantitative) factors. A simple yet extremely important example arises when there is a
single categorical factor. In such a case, the wizard will generate runs at each level of that factor.
In addition, one or more blocking factors may be included in the design. The wizard is capable of
generating:

1. completely randomized designs
2. randomized block designs
3. balanced incomplete block (BIB) designs

Example
The example described in this documentation comes from Box, Hunter and Hunter (1978). They
describe an experiment performed to compare the effect of 7 treatments (A, B, C, D, E, F, and G)
on the wearing quality of a particular material. Unfortunately, the machine used to measure wear
could only accommodate 4 samples during any one run. Expecting potential differences between
runs of the machine, they wished to treat the runs as a blocking variable in order to reduce any
possible confounding of run-to-run differences with differences between the treatments.

Sample StatFolio: doewiz singlecat.sgp

© 2009 by StatPoint Technologies, Inc.

Design Creation
To begin the design creation process, start with an empty StatFolio. Select DOE – Experimental
Design Wizard to load the DOE Wizard’s main window. Then push each button in sequence to
create the design.

Step #1 – Define Responses

The first step of the design creation process displays a dialog box used to specify the response
variables. For the current example, there is a single response variable:

• Name: The name for the variable is wear.

• Units: There are no units in this example.

• Analyze: The parameter of interest is the mean wear.

• Goal: The goal of the experiment is to minimize the mean wear.

• Impact: The relative importance of each response (not relevant if only one response).

• Sensitivity: The importance of being close to the best desired value (in this case, the
Minimum). Setting Sensitivity to Medium implies that the desirability attributed to the
response decreases linearly between the Minimum and Maximum values indicated.

• Minimum and Maximum: Range of desirable values for the response (200 - 400).
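The linear fall-off described for Medium sensitivity can be illustrated with a short Python sketch. It assumes a "smaller is better" goal (the Minimum is described as the best value), a desirability of 1 at or below 200, and a desirability of 0 at or above 400. The function name desirability_minimize is hypothetical and is not part of STATGRAPHICS.

```python
def desirability_minimize(y, low=200.0, high=400.0):
    """Assumed linear smaller-is-better desirability; not STATGRAPHICS code."""
    if y <= low:       # at or below the Minimum: fully desirable
        return 1.0
    if y >= high:      # at or above the Maximum: not desirable at all
        return 0.0
    return (high - y) / (high - low)   # linear decrease in between


for wear in (150, 200, 300, 400, 450):
    print(wear, desirability_minimize(wear))
```

Under these assumptions a predicted wear of 300 sits halfway through the range and receives a desirability of 0.5.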
Step #2 – Define Experimental Factors

The second step displays a dialog box on which to specify the factors that will be varied. In the
current example, there is only one factor:

• Name – Each factor must be assigned a unique name.

• Units – Units are optional.

• Type – Set the type of factor to Categorical, since there is a discrete set of possible values
for treatment.

• Role – Set the role of the factor to Controllable.

• Levels – Identify the levels of the factor, separating each level by a comma.

Step #3 – Select Design


The third step begins by displaying the dialog box shown below:


Since all of the factors are controllable process factors, only one Options button is enabled.
Pressing that button displays a second dialog box:

• Design Type: The following types of designs may be available, depending upon the number
of levels of the experimental factor:

1. Completely randomized design - a design in which a random sample of measurements is
taken from each of the q levels, with no attempt to account for the effects of any other
factor.

2. Randomized block design - a design in which an equal number of observations is taken
from each treatment at two or more levels of a blocking or nuisance factor. Block effects
are included in the model to reduce the magnitude of the experimental error.

3. Combinatoric BIB - a Balanced Incomplete Block design involving a single blocking
variable where the number of treatments in each block is less than q. If k treatments can
be run in each block, the design requires $\binom{q}{k}$ blocks, which represents the number of
ways of choosing k items out of q.

4. Small BIB - a Balanced Incomplete Block design in which the number of blocks is less
than that required by a full combinatoric BIB. These designs are only available for certain
combinations of the number of factor levels and the block size.

• Runs at each factor level - the number of runs to be performed at each level of the factor.

• Randomize - whether or not to randomly order the runs in the experiment.

• Block size - for BIB designs, the number of treatments that can be tested in each block.

Based on the selected design, the dialog box calculates and displays the total number of runs
(tests) to be performed, the number of blocks, and the degrees of freedom that will be available
to estimate the experimental error. Note that the degrees of freedom are calculated assuming that
the blocking factors do not interact with the main experimental factor, which is the usual
assumption.

The dialog box above requests a small BIB design capable of testing four treatments in a single
block. As indicated, a design is available in 7 blocks (runs of the machine). A total of 28 tests
will be performed, meaning that each of the 7 treatments will be included in 4 blocks.
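The arithmetic behind these counts can be checked with a few lines of Python. This is only a sketch of the standard counting identities for a balanced incomplete block design, with q = 7 treatments and k = 4 treatments per block taken from the example. The relation lam = r(k - 1)/(q - 1) is the usual BIB identity and is an addition here, not something the dialog box reports.

```python
from math import comb

q, k = 7, 4          # treatments and block size from the example
b_small = 7          # blocks in the small BIB offered by the dialog box

b_full = comb(q, k)              # blocks a full combinatoric BIB would need
n_runs = b_small * k             # total number of tests
r = n_runs // q                  # number of blocks containing each treatment
lam = r * (k - 1) // (q - 1)     # number of blocks shared by each pair

print(b_full, n_runs, r, lam)    # 35 28 4 2
```

The small BIB therefore needs 7 runs of the machine instead of the 35 required by the combinatoric design.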

When OK is pressed, the tentatively selected design is displayed in the Select Design dialog box:


Note that the runs are divided into blocks of size 4, with 4 different treatments appearing in each
block. Each treatment appears a total of 4 times. Each pair of treatments appears together in the
same block twice.

If the design is acceptable, press OK to save it to the STATGRAPHICS DataBook and return to
the DOE Wizard’s main window, which should now contain a summary of the design:


Step #4: Specify Model

Before evaluating the properties of the design, a tentative model must be specified. Pressing the
fourth button on the DOE Wizard’s toolbar displays a dialog box to make that choice:


For designs with a single categorical factor, the only useful model includes effects due to
differences between levels of that factor.

Step #5: Select Runs

Since we intend to run all of the runs in the base design, this step can be omitted.


Design Properties

Step #6: Evaluate Design

Several of the selections presented when pressing button #6 are helpful in evaluating the selected
design:

Design Worksheet

The design worksheet shows the 28 runs that have been created, in the order they are to be run:

Worksheet for <untitled> - One factor categorical design


run block treatment wear
weight loss
1 1 C
2 1 D
3 1 F
4 1 G
5 2 A
6 2 B
7 2 F
8 2 G
9 3 B
10 3 D
11 3 E
12 3 G
13 4 A
14 4 C
15 4 E
16 4 G
17 5 B
18 5 C
19 5 E
20 5 F
21 6 A
22 6 D
23 6 E
24 6 F
25 7 A
26 7 B
27 7 C
28 7 D

Note that each block contains 4 of the 7 treatments.
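The balance properties of the worksheet can be confirmed directly from the block listing. The Python sketch below copies the treatments in each block from the worksheet and counts how often each treatment, and each pair of treatments, occurs. The variable names are illustrative only.

```python
from collections import Counter
from itertools import combinations

# Treatments in blocks 1 through 7, copied from the worksheet above.
blocks = ["CDFG", "ABFG", "BDEG", "ACEG", "BCEF", "ADEF", "ABCD"]

# How many blocks contain each treatment.
replications = Counter(t for blk in blocks for t in blk)

# How many blocks contain each unordered pair of treatments.
pair_counts = Counter(
    pair for blk in blocks for pair in combinations(sorted(blk), 2)
)

print(dict(replications))
print(len(pair_counts), set(pair_counts.values()))
```

Every treatment is counted 4 times and each of the 21 possible pairs is counted twice, which is what makes the incomplete block design balanced.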

ANOVA Table
The ANOVA table shows the breakdown of the degrees of freedom in the design:

ANOVA Table

Source D.F.
Blocks 6
Model 6
Total Error 15
Lack-of-fit 0
Pure error 15
Total (corr.) 27

6 of the 27 total degrees of freedom are used to estimate the differences between treatments,
while another 6 are used to account for block effects. 15 degrees of freedom are left to estimate
the experimental error.
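The degrees of freedom in the table follow from simple counting, assuming (as noted earlier) that blocks do not interact with treatments. A minimal Python sketch:

```python
n_runs, n_blocks, n_treatments = 28, 7, 7

df_total = n_runs - 1                          # total (corrected)
df_blocks = n_blocks - 1                       # block effects
df_model = n_treatments - 1                    # treatment effects
df_error = df_total - df_blocks - df_model     # left for experimental error

print(df_total, df_blocks, df_model, df_error)   # 27 6 6 15
```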

Model Coefficients

The table of model coefficients is shown below:

Model Coefficients

Coefficient  Standard Error  VIF      Ri-Squared  Power at SN = 0.5  Power at SN = 1.0  Power at SN = 2.0
A            0.494872        1.71429  0.416667    7.61%              15.73%             47.26%
A            0.494872        1.71429  0.416667    7.61%              15.73%             47.26%
A            0.494872        1.71429  0.416667    7.61%              15.73%             47.26%
A            0.494872        1.71429  0.416667    7.61%              15.73%             47.26%
A            0.494872        1.71429  0.416667    7.61%              15.73%             47.26%
A            0.494872        1.71429  0.416667    7.61%              15.73%             47.26%
alpha = 5.0%, sigma estimated from total error with 15 d.f.

Since there are 7 levels of factor A, 6 indicator variables are used in the underlying regression
model to represent differences between treatments. Those 6 variables are defined as:

X1 = -1 for treatment A, 1 for treatment B, and 0 for all other treatments

X2 = -1 for treatment A, 1 for treatment C, and 0 for all other treatments

X3 = -1 for treatment A, 1 for treatment D, and 0 for all other treatments

X4 = -1 for treatment A, 1 for treatment E, and 0 for all other treatments

X5 = -1 for treatment A, 1 for treatment F, and 0 for all other treatments

X6 = -1 for treatment A, 1 for treatment G, and 0 for all other treatments

This coding is convenient since the sum of each variable across the 28 runs equals 0, which sets
the constant term in the model to the grand mean of all the treatments.
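The zero-sum property of this coding can be checked against the 28 runs of the design. The Python sketch below builds the six indicator variables as defined above; the helper name indicator is hypothetical.

```python
treatments = "ABCDEFG"

# Treatments in blocks 1 through 7, copied from the design worksheet.
blocks = ["CDFG", "ABFG", "BDEG", "ACEG", "BCEF", "ADEF", "ABCD"]
runs = [t for blk in blocks for t in blk]   # treatment used in each of the 28 runs


def indicator(j, t):
    """Value of X_j (j = 1..6) for treatment t under the coding defined above."""
    if t == "A":
        return -1
    return 1 if t == treatments[j] else 0


column_sums = [sum(indicator(j, t) for t in runs) for j in range(1, 7)]
print(column_sums)   # [0, 0, 0, 0, 0, 0]
```

Each column sums to zero because treatment A contributes -1 in four runs and the treatment matched to that column contributes +1 in four runs.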

Design Points

The graph of the design points shows that each treatment is run the same number of times (4):

[Figure: One factor categorical design. Number of runs plotted for each treatment, A through G.]

Saving the Design File

Step #7: Save experiment

Once the experiment has been created and any additional runs entered, it must be saved on disk.
Press the button labeled Step 7 and select a name for the experiment file:


Design files are extended data files and have the extension .sgx. They include the data together
with other information that was entered on the input dialog boxes.

To reopen an experiment file, select Open Data File from the File menu. The data will be loaded
into the datasheet, and the Experimental Design Wizard window will be displayed.


Analyzing the Results


After the design file has been created and saved, the experiments would be performed. At a later
date, once the results have been collected, the experimenter would return to STATGRAPHICS
and reopen the saved design file using the Open Data Source selection on the main File menu.
The results can then be typed into the response columns. The results for the example are
displayed below:

run block treatment wear


weight loss
1 1 C 627
2 1 D 248
3 1 F 563
4 1 G 252
5 2 A 344
6 2 B 233
7 2 F 442
8 2 G 226
9 3 B 251
10 3 D 211
11 3 E 160
12 3 G 297
13 4 A 337
14 4 C 537
15 4 E 195
16 4 G 300
17 5 B 278
18 5 C 520
19 5 E 199
20 5 F 595
21 6 A 369
22 6 D 196
23 6 E 185
24 6 F 606
25 7 A 396
26 7 B 240
27 7 C 602
28 7 D 273

Step #8: Analyze data

Once the data have been entered, press the button labeled Step #8 on the Experimental Design
Wizard toolbar. This will display a dialog box listing each of the response variables:


• Response: column containing the response variable to be analyzed.

• Transformation: the desired transformation to be applied before the model is fit.

• Power and addend: the transformation parameters if a Power or Box-Cox transformation is
selected.

If more than one response has been measured, you should repeat this step once for each response.

When OK is pressed, the program will invoke one of two procedures:

1. The Oneway ANOVA procedure for a completely randomized design with no blocking
variables.

2. The Multifactor ANOVA procedure for the designs containing one or more blocking
variables.

Full details of those analyses are contained in the corresponding documentation.

Of particular interest in the current example are several tables and graphs:

ANOVA Table
This table is used to judge whether or not there are statistically significant differences between
the levels of the experimental factor:

Analysis of Variance for wear - Type III Sums of Squares


Source Sum of Squares Df Mean Square F-Ratio P-Value
MAIN EFFECTS
A:BLOCK 14570.1 6 2428.35 1.65 0.2015
B:treatment 506799. 6 84466.4 57.40 0.0000
RESIDUAL 22071.4 15 1471.43
TOTAL (CORRECTED) 626265. 27
All F-ratios are based on the residual mean square error.

A small P-value for treatment (less than 0.05 if operating at the 5% significance level) indicates
that there are significant differences between treatments. In the current example, the differences
are highly significant. Of secondary interest is the P-value for BLOCK. Since the P-value in the
above table is greater than 0.05, the block effects are not statistically significant, meaning that
there were not large differences between runs of the wear testing machine.
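The sums of squares and F-ratios in the table can be reproduced from the raw data with the standard intra-block formulas for a balanced incomplete block design. The Python sketch below is a hand calculation under that assumption. Whether STATGRAPHICS computes its Type III sums of squares this way internally is not stated here, and the P-values are not computed because they require an F distribution function.

```python
# wear[block][treatment], copied from the results table.
wear = {
    1: {"C": 627, "D": 248, "F": 563, "G": 252},
    2: {"A": 344, "B": 233, "F": 442, "G": 226},
    3: {"B": 251, "D": 211, "E": 160, "G": 297},
    4: {"A": 337, "C": 537, "E": 195, "G": 300},
    5: {"B": 278, "C": 520, "E": 199, "F": 595},
    6: {"A": 369, "D": 196, "E": 185, "F": 606},
    7: {"A": 396, "B": 240, "C": 602, "D": 273},
}
names = "ABCDEFG"
v, b, k, r, lam = 7, 7, 4, 4, 2   # treatments, blocks, block size, replicates, pair count

values = [y for blk in wear.values() for y in blk.values()]
n = len(values)
correction = sum(values) ** 2 / n
ss_total = sum(y * y for y in values) - correction

block_totals = {j: sum(blk.values()) for j, blk in wear.items()}
treat_totals = {t: sum(blk[t] for blk in wear.values() if t in blk) for t in names}

# Adjusted totals: Q_t = T_t - (sum of totals of the blocks containing t) / k
q_adj = {
    t: treat_totals[t] - sum(block_totals[j] for j, blk in wear.items() if t in blk) / k
    for t in names
}

ss_treat_adj = k * sum(q * q for q in q_adj.values()) / (lam * v)
ss_block_unadj = sum(bt * bt for bt in block_totals.values()) / k - correction
ss_treat_unadj = sum(tt * tt for tt in treat_totals.values()) / r - correction
ss_error = ss_total - ss_block_unadj - ss_treat_adj
ss_block_adj = ss_total - ss_treat_unadj - ss_error

df_error = (n - 1) - (b - 1) - (v - 1)
ms_error = ss_error / df_error
f_block = (ss_block_adj / (b - 1)) / ms_error
f_treat = (ss_treat_adj / (v - 1)) / ms_error

print(round(ss_block_adj, 1), round(ss_treat_adj, 1), round(ss_error, 1))
print(round(ms_error, 2), round(f_block, 2), round(f_treat, 2))
```

Each factor's sum of squares is adjusted for the other, which is why the block, treatment and residual sums of squares do not add up to the corrected total.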

Graphical ANOVA
A new method for illustrating the differences between blocks and treatments, from Hunter
(2005), is shown below:

[Figure: Graphical ANOVA for wear. Dot plots of the scaled treatment deviations (P = 0.0000), the
scaled BLOCK deviations (P = 0.2015) and the residuals, on a common axis running from -260 to 340.]

The plot shows the scaled deviations of the block and treatment averages from the grand mean,
together with the model residuals. Scaling is such that, if a factor has no effect, the variation
observed should be comparable to that of the residuals. Note that the variation among blocks is
well within that observed for the residuals, with the possible exception of block #2. Note also
that treatment E shows the least wear, although some other treatments are relatively close.

Means Plot
The Means Plot can be used to determine which treatments are significantly different from which
others. Because of the large number of treatments, the plot below shows the treatment means
with Tukey HSD intervals, which allow the experimenter to compare all pairs of treatments
with an experiment-wide error rate of 5%:

[Figure: Means and 95.0 Percent Tukey HSD Intervals. Mean wear (vertical axis, 0 to 800) plotted
for treatments A through G.]

Treatment E showed the least wear on average. However, since its interval overlaps those of
treatments B, D and G, it cannot be declared to be significantly better than any of those 3 other
treatments.

Table of Means

The Table of Means shows the least squares means for each treatment and block:

Table of Least Squares Means for wear with 95.0% Confidence Intervals
Level Count Mean Stnd. Error Lower Limit Upper Limit
GRAND MEAN 28 345.786
BLOCK
1 4 364.714 20.32 321.403 408.025
2 4 292.286 20.32 248.975 335.597
3 4 340.929 20.32 297.618 384.24
4 4 340.786 20.32 297.475 384.097
5 4 355.429 20.32 312.118 398.74
6 4 353.286 20.32 309.975 396.597
7 4 373.071 20.32 329.76 416.382
treatment
A 4 367.429 20.32 324.118 410.74
B 4 255.857 20.32 212.546 299.168
C 4 558.786 20.32 515.475 602.097
D 4 219.786 20.32 176.475 263.097
E 4 182.929 20.32 139.618 226.24
F 4 555.857 20.32 512.546 599.168
G 4 279.857 20.32 236.546 323.168

The least squares treatment means equal the estimated mean response for each treatment,
evaluated for an average block. Since not every treatment was run in every block, the least squares
means are NOT the same as the observed means of the 4 runs for each treatment. Instead, the
means have been adjusted to compensate for the blocks in which each treatment did not appear.

Each mean is shown together with its standard error and 95% confidence limits.
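The block adjustment can be illustrated by recomputing the treatment rows of the table by hand. The Python sketch below uses the standard intra-block estimate for a balanced incomplete block design: grand mean plus k * Q / (lam * v), where Q is the treatment total minus the average share of the totals of the blocks containing that treatment. The residual mean square (1471.43) is taken from the ANOVA table, and the Student t quantile for 15 degrees of freedom (2.131) is entered as a constant. It illustrates the adjustment and is not the STATGRAPHICS implementation.

```python
from math import sqrt

# wear[block][treatment], copied from the results table.
wear = {
    1: {"C": 627, "D": 248, "F": 563, "G": 252},
    2: {"A": 344, "B": 233, "F": 442, "G": 226},
    3: {"B": 251, "D": 211, "E": 160, "G": 297},
    4: {"A": 337, "C": 537, "E": 195, "G": 300},
    5: {"B": 278, "C": 520, "E": 199, "F": 595},
    6: {"A": 369, "D": 196, "E": 185, "F": 606},
    7: {"A": 396, "B": 240, "C": 602, "D": 273},
}
v, k, lam = 7, 4, 2      # treatments, block size, blocks shared by each pair
ms_error = 1471.43       # residual mean square from the ANOVA table
t_crit = 2.131           # two-sided 95% Student t quantile, 15 d.f.

n = sum(len(blk) for blk in wear.values())
grand_mean = sum(sum(blk.values()) for blk in wear.values()) / n
block_totals = {j: sum(blk.values()) for j, blk in wear.items()}

ls_means = {}
for t in "ABCDEFG":
    in_blocks = [j for j, blk in wear.items() if t in blk]
    total_t = sum(wear[j][t] for j in in_blocks)
    q_t = total_t - sum(block_totals[j] for j in in_blocks) / k   # adjusted total
    ls_means[t] = grand_mean + k * q_t / (lam * v)

# Standard error of an adjusted mean: variance of the grand mean plus
# variance of the adjusted treatment effect.
std_err = sqrt(ms_error * (1 / n + k * (v - 1) / (lam * v * v)))

for t, m in ls_means.items():
    print(t, round(m, 3), round(std_err, 2),
          round(m - t_crit * std_err, 1), round(m + t_crit * std_err, 1))
```

For treatment E, the raw average of its four runs is 184.75, while the adjusted mean is 182.929, reflecting the particular blocks in which E happened to be tested.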

Optimization

Step #9: Optimize responses

Once a statistical model has been developed for each response, the analyst may now determine
what combination of factors will yield the best results. Pressing the button labeled Step #9 on the
Experimental Design Wizard toolbar instructs the program to examine each treatment and find
the treatment that maximizes the joint desirability of the estimated responses. When the
optimization is complete, a message similar to that shown below will be displayed:

The dialog box indicates the “Desirability” of the final result, based on a metric designed to
balance competing requirements of multiple responses (see the document titled DOE Wizard for
full details). The value displayed in this case indicates that the predicted wear for the best
treatment is below 200, which was the desired minimum specified when the design was created.

If you press OK, additional information will be added to the main DOE Wizard window:

Step 9: Optimize the responses


Response Values at Optimum
Response Prediction Lower 95.0% Limit Upper 95.0% Limit Desirability
wear 182.929 139.618 226.24 1.0

Factor Settings at Optimum


Factor Setting
treatment E

The table shows that the estimated wear for the best treatment (treatment E) equals 182.9, with a
95% confidence interval for the mean that ranges between 139.6 and 226.2.
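With a single response, the search amounts to scoring each treatment's least squares mean and picking the highest score. The Python sketch below assumes a linear "smaller is better" desirability over the 200 to 400 range entered in Step #1. That form is an assumption, and the code is not the wizard's own optimizer.

```python
# Least squares means for wear, copied from the Table of Means.
ls_means = {"A": 367.429, "B": 255.857, "C": 558.786, "D": 219.786,
            "E": 182.929, "F": 555.857, "G": 279.857}
low, high = 200.0, 400.0   # desirable range entered in Step #1


def desirability(y):
    # Assumed form: 1 at or below `low`, 0 at or above `high`, linear in between.
    return min(1.0, max(0.0, (high - y) / (high - low)))


scores = {t: desirability(m) for t, m in ls_means.items()}
best = max(scores, key=scores.get)

for t, d in scores.items():
    print(t, round(d, 3))
print("best:", best, scores[best])
```

Treatment E is the only treatment whose predicted wear falls below 200, so it is the only one to reach a desirability of 1.0.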

If you push the Tables and Graphs button on the analysis toolbar, you can display the estimated
desirability for each treatment by selecting the Desirability Plot:

[Figure: Desirability Plot. Estimated desirability plotted for each treatment, A through G.]

Step 10: Save results

The button labeled Step 10 allows you to save the results in a StatFolio:

Note that the StatFolio can be saved at any point and reloaded at a later date.

IMPORTANT: When using the Experimental Design Wizard, two files are created:

1. An experiment file with the extension .sgx which stores information about the
experimental data.

2. A StatFolio with the extension .sgp that stores the results of the analysis.

If you move the experiment to another computer, be sure to transfer both files.

Step 11: Augment Design

This option is not available for this design.

Step 12: Extrapolate

This step is not applicable to designs with a single categorical factor.
