
COMPLETELY RANDOMIZED DESIGN

A completely randomized design (CRD) is a design in which the treatments are assigned
completely at random to the experimental units, or vice versa. That is, it imposes no restrictions,
such as blocking, on the allocation of the treatments to the experimental units.

In a completely randomized design, there is only one primary factor under consideration
in the experiment. The test subjects are assigned to treatment levels of the primary factor at
random. A completely randomized design is probably the simplest experimental design, in terms
of data analysis and convenience. With this design, subjects are randomly assigned to
treatments.

Use of this design is advisable only in those cases in which HOMOGENEOUS experimental units are available. The CRD is best suited for experiments with a small number of treatments.

In a completely randomized design, objects or subjects are assigned to groups completely at random. One standard method for assigning subjects to treatment groups is to label each subject, then use a table of random numbers to select from the labelled subjects. This may also be accomplished using a computer. In MINITAB, the "SAMPLE" command will select a random sample of a specified size from a list of objects or numbers.
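As a sketch of this computer-based approach, the following Python snippet (a hypothetical stand-in for MINITAB's SAMPLE command, using only the standard library) randomly assigns 12 labelled subjects to three treatment groups:

```python
import random

# Hypothetical example: assign 12 labelled subjects to 3 treatment
# groups of equal size, completely at random.
subjects = [f"S{i:02d}" for i in range(1, 13)]

random.seed(42)  # fix the seed so the assignment is reproducible
shuffled = random.sample(subjects, k=len(subjects))  # a random permutation

# Slice the shuffled list into three equal treatment groups.
groups = {
    "Treatment A": shuffled[0:4],
    "Treatment B": shuffled[4:8],
    "Treatment C": shuffled[8:12],
}

for name, members in groups.items():
    print(name, members)
```

Because the permutation is random, every subject has the same probability of landing in any group, which is exactly the CRD requirement.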

A completely randomized design relies on randomization to control for the effects of extraneous variables. The experimenter assumes that, on average, extraneous factors will affect treatment conditions equally, so any significant differences between conditions can fairly be attributed to the independent variable.

RANDOMIZATION PROCEDURE:
Treatments are assigned to experimental units completely at random. Every experimental unit has the same probability of receiving any treatment. Randomization is performed using a random number table, a computer program, etc.

SIX ELEMENTS OF A VALID EXPERIMENT


1. Treatment
 In experiments, a treatment is something that researchers administer to experimental units. Treatments are administered to experimental units by 'level', where level implies amount or magnitude.
2. Replication
 It is the repetition of a treatment on multiple experimental units; the number of times a treatment is repeated in an experiment.
3. Experimental Units
 The units (subjects, plots, objects, etc.) to which treatments are applied.
4. Randomization
 Objects or individuals are assigned by chance to an experimental group. Randomization is the most reliable method of creating homogeneous treatment groups, without involving any potential biases or judgments.
5. Experimental Error
 The variation among experimental units treated alike; it can be due to inherent characteristics of the experimental units or to human error, and it should be minimized.
6. Local Control
 Techniques, such as grouping experimental units with similar inherent characteristics, used to reduce experimental error.

ANALYSIS OF VARIANCE OR ANOVA TABLE

It is a powerful statistical technique that involves partitioning the observed variance into
different components to conduct various significance tests.
Why do we use analysis of variance (ANOVA) when we are interested in the differences among means?
ANOVA is used to compare differences of means among more than 2 groups. It does this by looking
at variation in the data and where that variation is found (hence its name). Specifically, ANOVA compares
the amount of variation between groups with the amount of variation within groups. It can be used for both
observational and experimental studies.
 The ANOVA model
Mathematically, ANOVA can be written as:

x_ij = μ_i + ε_ij

where x_ij are the individual data points (i and j denote the group and the individual observation), ε_ij is the unexplained variation, and the parameters of the model (μ_i) are the population means of each group. Thus, each data point (x_ij) is its group mean plus error.
 Hypothesis testing
Like other classical statistical tests, we use ANOVA to calculate a test statistic (the F-ratio) with
which we can obtain the probability (the P-value) of obtaining the data assuming the null hypothesis. A
significant P-value (usually taken as P<0.05) suggests that at least one group mean is significantly different
from the others.
Null hypothesis: all population means are equal
Alternative hypothesis: at least one population mean is different from the rest.
 Calculation of the F ratio
ANOVA separates the variation in the dataset into 2 parts: between-group and within-group. These
variations are called the sums of squares, which can be seen in the equations below.
Step 1) Variation between groups
The between-group variation (or between-group sum of squares, SSB) is calculated by comparing the mean of each group with the overall mean of the data:

SSB = n1 (x̄1 − x̄)² + n2 (x̄2 − x̄)² + n3 (x̄3 − x̄)²

that is, by adding up the square of the differences between each group mean x̄i and the overall mean x̄, each multiplied by that group's sample size ni, assuming we are comparing three groups (i = 1, 2 or 3). We then divide SSB by the between-group degrees of freedom [this is like a sample size, except it is the number of groups minus 1, because the deviations must sum to zero, so once all but one are known, the last is also known] to get our estimate of the mean variation between groups (the between-group mean square, MSB).
Step 2) Variation within groups
The within-group variation (or within-group sum of squares, SSW) is the variation of each observation from its group mean:

SSW = s²group1 (ngroup1 − 1) + s²group2 (ngroup2 − 1) + s²group3 (ngroup3 − 1)

that is, by adding up the variance of each group multiplied by that group's degrees of freedom. Note, you might also come across the total SS (the sum of squared deviations of every observation from the overall mean); the within SS is then the total SS minus the between SS. As before, we then divide by the within-group degrees of freedom (the total number of observations minus the number of groups) to get the mean variation within groups (the within-group mean square, MSW).
Step 3) The F ratio
The F ratio is then calculated as the between-group mean square divided by the within-group mean square:

F = MSB / MSW

If the average difference between groups is similar to that within groups, the F ratio is about 1. As the average difference between groups becomes greater than that within groups, the F ratio becomes larger than 1.
To obtain a P-value, the F ratio is tested against the F-distribution of a random variable with the degrees of freedom associated with the numerator and denominator of the ratio. The P-value is the probability of getting that F ratio or a greater one. Larger F ratios give smaller P-values.
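Steps 1 to 3 above can be sketched in Python with a small made-up data set of three groups; the group values and the resulting F ratio are illustrative only:

```python
# A minimal sketch of Steps 1-3: computing the one-way ANOVA F ratio
# by hand for three small made-up groups.
groups = {
    "g1": [4, 5, 6],
    "g2": [6, 7, 8],
    "g3": [8, 9, 10],
}

all_obs = [x for g in groups.values() for x in g]
grand_mean = sum(all_obs) / len(all_obs)

# Step 1: between-group sum of squares and mean square
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
df_between = len(groups) - 1          # number of groups minus 1
msb = ssb / df_between

# Step 2: within-group sum of squares and mean square
ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups.values())
df_within = len(all_obs) - len(groups)  # total observations minus groups
msw = ssw / df_within

# Step 3: the F ratio
f_ratio = msb / msw
print(ssb, ssw, f_ratio)  # SSB = 24.0, SSW = 6.0, F = 12.0
```

Here the group means (5, 7, 9) differ much more than the spread within each group, so the F ratio comes out well above 1.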
 Assumptions of ANOVA
 The response is normally distributed
 Variance is similar within different groups
 The data points are independent

Sum of Squares and Mean Squares


The total variance of an observed data set can be estimated using the following relationship:

s² = Σ (yi − ȳ)² / (n − 1)

Where:
 s is the standard deviation.
 yi is the ith observation.
 n is the number of observations.
 ȳ is the mean of the n observations.
Total sum of squares: SST = Σ (yi − ȳ)²
Total mean square: MST = SST / (n − 1)
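A minimal Python sketch of these relationships, using a small made-up sample:

```python
import math

# Illustrative example: total sum of squares, total mean square, and
# standard deviation for a tiny made-up sample.
y = [2, 4, 6, 8]
n = len(y)
y_bar = sum(y) / n

sst = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares
mst = sst / (n - 1)                       # total mean square = sample variance
s = math.sqrt(mst)                        # sample standard deviation

print(sst, mst, s)
```

Note that the total mean square is exactly the sample variance s², so the standard deviation is its square root.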

When you attempt to fit a model to the observations, you are trying to explain some of the
variation of the observations using this model. For the case of simple linear regression, this model
is a line. In other words, you would be trying to see if the relationship between the independent
variable and the dependent variable is a straight line. If the model is such that the resulting line
passes through all of the observations, then you would have a "perfect" model, as shown in Figure
1.

Figure 1: Perfect Model Passing Through All Observed Data Points


The model explains all of the variability of the observations. Therefore, in this case, the model sum of squares (abbreviated SSR) equals the total sum of squares:

SSR = SST

For the perfect model, the model sum of squares, SSR, equals the total sum of squares, SST, because all estimated values obtained using the model, ŷi, will equal the corresponding observations, yi. The model sum of squares, SSR, can be calculated using a relationship similar to the one used to obtain SST. For SSR, we simply replace the yi in the relationship of SST with ŷi:

SSR = Σ (ŷi − ȳ)²

The number of degrees of freedom associated with SSR, dof(SSR), is 1. Therefore, the model mean square, MSR, is:

MSR = SSR / dof(SSR) = SSR / 1
Figure 2 shows a case where the model is not a perfect model.

Figure 2: Most Models Do Not Fit All Data Points Perfectly


You can see that a number of observed data points do not follow the fitted line. This indicates that a part of the total variability of the observed data still remains unexplained. This portion of the total variability, or the total sum of squares that is not explained by the model, is called the residual sum of squares or the error sum of squares (abbreviated SSE). The deviation for this sum of squares is obtained at each observation in the form of the residuals, ei:

ei = yi − ŷi

The error sum of squares can be obtained as the sum of squares of these deviations:

SSE = Σ ei² = Σ (yi − ŷi)²

The number of degrees of freedom associated with SSE, dof(SSE), is (n − 2). Therefore, the residual or error mean square, MSE, is:

MSE = SSE / (n − 2)

 Analysis of Variance Identity


The total variability of the observed data (i.e., the total sum of squares, SST) can be written using the portion of the variability explained by the model, SSR, and the portion unexplained by the model, SSE, as:

SST = SSR + SSE

The above equation is referred to as the analysis of variance identity.

 F Test
To test if a relationship exists between the dependent and independent variable, a statistic based on the F distribution is used. The statistic is a ratio of the model mean square and the residual mean square:

F = MSR / MSE

For simple linear regression, the statistic follows the F distribution with 1 degree of freedom in the numerator and (n − 2) degrees of freedom in the denominator.
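As an illustrative sketch with made-up data, the following Python snippet fits a least-squares line, computes SST, SSR and SSE, and checks the analysis of variance identity and the F statistic:

```python
# Minimal sketch, with made-up data, of the ANOVA decomposition for
# simple linear regression: SST = SSR + SSE, and F = MSR / MSE.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares fit of the line y_hat = b0 + b1 * x
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total SS
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # model SS
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # error SS

msr = ssr / 1        # dof(SSR) = 1
mse = sse / (n - 2)  # dof(SSE) = n - 2
f_stat = msr / mse

print(round(sst, 4), round(ssr + sse, 4), round(f_stat, 1))
```

Because the data lie very close to a straight line, SSE is tiny relative to SSR, the identity SST = SSR + SSE holds numerically, and the F statistic is very large, indicating a clear linear relationship.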
