
225-1

Chapter 225

Repeated Measures ANOVA

Introduction
This chapter describes how to obtain false discovery rate or experiment-wise error rate
(Bonferroni) adjusted P-values (Probability Levels) for a repeated measures (within-subject)
experiment using the GESS: Repeated Measures ANOVA procedure. The general linear models
approach for repeated measures is used in this procedure. Up to three between factor variables
and three within factor variables, as well as interactions, may be specified in the Repeated
Measures ANOVA procedure. Geisser-Greenhouse, Box, and Huynh-Feldt corrections on the
within-subject F tests are available in this procedure. A detailed discussion of Repeated Measures
Analysis of Variance is found in the NCSS: Repeated Measure Analysis of Variance chapter.
Before running this procedure, output (.ges) files containing a single expression value for each
gene on each array must be obtained using the appropriate pre-processing procedure in GESS.

Repeated Measures ANOVA


The Repeated Measures ANOVA procedure produces an F-Test for each gene for each term of
the model used. Geisser-Greenhouse, Box, or Huynh-Feldt corrections to the unadjusted
P-values (Probability Levels) may be made prior to multiple testing correction. A discussion of
repeated measures, disadvantages of within-subjects designs, assumptions, and P-value
corrections is found in Chapter 214, Repeated Measures Analysis of Variance, of the NCSS
Manual.

Multiple Testing Adjustment


When a repeated measures analysis of variance is run, the result is a P-value (Probability Level)
for each fixed factor that reflects the evidence of difference in expression for at least one level of
the factor. When hundreds or thousands of genes are investigated at the same time, many ‘small’
P-values will occur by chance, due to the natural variability of the process. It is therefore necessary
to adjust the P-value (Probability Level) appropriately, so that the likelihood of a
false conclusion is controlled.

Benjamini and Hochberg’s (1995) False Discovery Rate Table


The following table (adapted to the subject of microarray data) is found in Benjamini and
Hochberg’s (1995) false discovery rate article. In the table, m is the total number of tests, m0 is
the number of tests for which there is no difference in expression, R is the number of tests for
which a difference is declared, and U, V, T, and S are defined by the combination of the
declaration of the test and whether or not a difference exists, in truth.

                               Declared         Declared
                               Not Different    Different    Total
A true difference in
expression does not exist      U                V            m0
There exists a true
difference in expression       T                S            m – m0

Total                          m – R            R            m

In the table, m is also the total number of genes and is assumed to be known in advance. The
random variables U, V, T, and S are unobservable.
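
The bookkeeping in the table can be illustrated with a short sketch. It assumes the truth is known for every gene, which is only possible with simulated or spike-in data; the function name and the toy data are hypothetical and are not part of GESS.

```python
def confusion_counts(truly_different, declared_different):
    """Return (U, V, T, S) for paired boolean sequences, one entry per gene."""
    U = V = T = S = 0
    for truth, declared in zip(truly_different, declared_different):
        if not truth and not declared:
            U += 1  # no true difference, declared not different
        elif not truth and declared:
            V += 1  # no true difference, declared different (false discovery)
        elif truth and not declared:
            T += 1  # true difference, declared not different (missed)
        else:
            S += 1  # true difference, declared different (correct discovery)
    return U, V, T, S


# Toy data for six genes (made up for illustration only).
truth = [False, False, False, True, True, False]
declared = [False, True, False, True, False, False]

U, V, T, S = confusion_counts(truth, declared)
m = len(truth)   # total number of tests
m0 = U + V       # tests with no true difference
R = V + S        # tests declared different
print(U, V, T, S, m, m0, R)
```

The row and column totals of the table follow directly: U + V = m0, V + S = R, and U + T = m – R.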

Need for Multiple Testing Adjustment


Following the calculation of a raw P-value for each test, P-value adjustments need to be made to
account for the multiplicity of tests. It is desirable that these adjustments minimize the
number of genes for which factors are falsely declared different (V) while maximizing the number
of genes that are correctly declared different (S). To address this issue, the researcher must weigh
the value of finding a gene against the cost of a false positive. If a false positive is very
expensive, a method that focuses on minimizing V should be employed. If the value of finding a
gene is much higher than the cost of additional false positives, a method that focuses on
maximizing S should be used.

Error Rates – P-Value Adjustment Techniques


Below is a brief description of three common error rates that are used for control of false positive
declarations. The commonly used P-value adjustment technique for controlling each error rate is
also described.

Per-Comparison Error Rate (PCER) – No Multiple Testing Adjustment


The per-comparison error rate (PCER) is defined as
$$\mathrm{PCER} = E(V)/m,$$
where E(V) is the expected number of genes that are falsely declared different, and m is the total
number of tests. Preserving the PCER is tantamount to ignoring multiple testing altogether. If a
method is used which controls a PCER of 0.05 for 1,000 tests, and no true differences exist,
approximately 50 of the 1,000 tests will falsely be declared significant. Using a method that
controls the PCER will produce a
list of genes that includes most of the genes for which there exists a true difference in expression
(i.e., maximizes S), but it will also include a very large number of genes which are falsely
declared to have a true difference in expression (i.e., does not appropriately minimize V).
Controlling the PCER should be viewed as overly weak control of Type I error.
To obtain P-values (Probability Levels) that control the PCER, no adjustment is made to the P-
value. To determine significance, the P-value is simply compared to the designated alpha.
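
The following minimal simulation illustrates why unadjusted testing produces many false declarations. It assumes that no gene has a true difference, so that every raw P-value is uniformly distributed on (0, 1); the function name, seed, and settings are illustrative assumptions only.

```python
import random


def count_false_declarations(m, alpha, seed=0):
    """Simulate m true null hypotheses and count raw P-values below alpha."""
    rng = random.Random(seed)
    # Under a true null hypothesis, the P-value is uniform on (0, 1).
    p_values = [rng.random() for _ in range(m)]
    return sum(p < alpha for p in p_values)


v = count_false_declarations(m=1000, alpha=0.05)
print("falsely declared different:", v, "expected about", 1000 * 0.05)
```

The simulated count varies with the seed but stays near m times alpha, which is the expected number of false declarations E(V) when all m null hypotheses are true.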

Family-Wise Error Rate (FWER) – Bonferroni Adjustment


The family-wise error rate (FWER) is defined as
$$\mathrm{FWER} = \Pr(V > 0),$$
where V is the number of genes that are falsely declared different. Controlling FWER is
controlling the probability that at least one null hypothesis is falsely rejected. If a method is used
which controls a FWER of 0.05 for 1,000 tests, the probability that any of the 1,000 tests
(collectively) is falsely rejected is 0.05. Using a method that controls the FWER will produce a
list of genes that includes a small (depending also on sample size) number of the genes for which
there exists a true difference in expression (i.e., limits S, unless the sample size is very large).
However, the list of genes will include very few or no genes that are falsely declared to have a
true difference in expression (i.e., stringently minimizes V). Controlling the FWER should be
considered very strong control of Type I error.
Assuming the tests are independent, the well-known Bonferroni P-value adjustment produces
adjusted P-values (Probability Levels) for which the FWER is controlled. The Bonferroni
adjustment is applied to all m unadjusted P-values ($p_j$) as

$$\tilde{p}_j = \min(m\,p_j,\ 1).$$
That is, each P-value (Probability Level) is multiplied by the number of tests, and if the result is
greater than one, it is set to the maximum possible P-value of one.
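
As a minimal sketch of the adjustment just described (not the GESS implementation), the Bonferroni adjusted P-values can be computed as follows. The function name and the example P-values are made up for illustration.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw P-value by the number of tests and cap the result at 1."""
    m = len(p_values)
    return [min(m * p, 1.0) for p in p_values]


raw = [0.0001, 0.004, 0.03, 0.5]
adjusted = bonferroni_adjust(raw)
for p, p_adj in zip(raw, adjusted):
    print(f"raw = {p:<8} adjusted = {p_adj:.4f}")
```

Each adjusted value is then compared directly to the designated alpha.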

False Discovery Rate (FDR) – Benjamini and Hochberg Adjustment


The false discovery rate (FDR) (Benjamini and Hochberg, 1995) is defined as
$$\mathrm{FDR} = E\left(\frac{V}{R}\,1_{\{R>0\}}\right) = E\left(\frac{V}{R} \mid R>0\right)\Pr(R>0),$$
where R is the number of genes that are declared significantly different, and V is the number of
genes that are falsely declared different. Controlling FDR is controlling the expected proportion
of falsely declared differences (false discoveries) to declared differences (true and false
discoveries, together). If a method is used which controls a FDR of 0.05 for 1,000 tests, and 40
genes are declared different, it is expected that 40*0.05 = 2 of the 40 declarations are false
declarations (false discoveries). Using a method that controls the FDR will produce a list of genes
that includes an intermediate (depending also on sample size) number of genes for which there
exists a true difference in expression (i.e., moderate to large S). However, the list of genes will
include a small number of genes that are falsely declared to have a true difference in expression
(i.e., moderately minimizes V). Controlling the FDR should be considered intermediate control of
Type I error.
Assuming the tests are independent, the Benjamini and Hochberg P-value adjustment produces
adjusted P-values (Probability Levels) for which the FDR is controlled. These adjusted P-values
are found as
$$\tilde{p}_{r_i} = \min_{k=i,\ldots,m}\left\{\min\left(\frac{m}{k}\,p_{r_k},\ 1\right)\right\},$$
where $p_{r_1} \le p_{r_2} \le \cdots \le p_{r_m}$ are the observed ordered unadjusted P-values. The procedure is
defined in Benjamini and Hochberg (1995). The corresponding adjusted P-value definition given
here is found in Dudoit, Shaffer, and Boldrick (2003).
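
The adjusted P-value definition above can be sketched in a few lines. This is an illustration of the formula only, not the GESS code; the function name and the example P-values are assumptions.

```python
def benjamini_hochberg_adjust(p_values):
    """Return Benjamini and Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # order[k - 1] is the index of the k-th smallest raw P-value (r_k).
    order = sorted(range(m), key=lambda j: p_values[j])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest raw P-value down, so that running_min holds the
    # minimum of min((m / k) * p, 1) over k = i, ..., m.
    for rank in range(m, 0, -1):
        j = order[rank - 1]
        running_min = min(running_min, min(m / rank * p_values[j], 1.0))
        adjusted[j] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg_adjust(raw)
print([round(p, 4) for p in adjusted])
```

Taking the running minimum from the largest rank downward keeps the adjusted values in the same order as the raw values, so a gene with a smaller raw P-value never receives a larger adjusted P-value.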

Multiple Testing Adjustment Comparison


The following table gives a summary of the multiple testing adjustment procedures and error rate
control. The power to detect differences also depends heavily on sample size.

Adjustment                Error Rate    Control of      Power to
Technique                 Controlled    Type I Error    Detect Differences
None                      PCER          Minimal         High
Bonferroni                FWER          Strict          Low
Benjamini and Hochberg    FDR           Moderate        Moderate/High

Type I Error: Rejection of a null hypothesis that is true.
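
The ordering in the comparison table can be seen on a small set of P-values. The sketch below applies no adjustment, the Bonferroni adjustment, and the Benjamini and Hochberg adjustment to the same ten made-up raw P-values and counts how many fall below a cutoff of 0.05. The helper names and the data are illustrative assumptions.

```python
def bonferroni(p):
    m = len(p)
    return [min(m * x, 1.0) for x in p]


def benjamini_hochberg(p):
    m = len(p)
    order = sorted(range(m), key=lambda j: p[j])
    adjusted, running_min = [0.0] * m, 1.0
    for rank in range(m, 0, -1):  # largest raw P-value first
        j = order[rank - 1]
        running_min = min(running_min, m / rank * p[j])
        adjusted[j] = running_min
    return adjusted


def n_declared(adjusted, cutoff=0.05):
    """Count the adjusted P-values that fall below the cutoff."""
    return sum(x < cutoff for x in adjusted)


# Ten made-up raw P-values (for illustration only).
raw = [0.001, 0.008, 0.012, 0.015, 0.03, 0.2, 0.4, 0.6, 0.8, 0.9]
counts = {
    "None": n_declared(raw),
    "Bonferroni": n_declared(bonferroni(raw)),
    "Benjamini and Hochberg": n_declared(benjamini_hochberg(raw)),
}
for technique, count in counts.items():
    print(f"{technique:<24} {count}")
```

On any set of P-values, the number of declared differences with the Benjamini and Hochberg adjustment lies between the Bonferroni count and the unadjusted count.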

Analysis Steps
Following are the recommended steps for running a Repeated Measures ANOVA on microarray
data.

Step 1 – Pre-Processing
Run the appropriate pre-processing procedure (e.g., GenePix Pre-processing or Affymetrix Pre-
processing) to prepare data (.ges) files for statistical analysis. The .ges files are created when a
variable name is entered in the Output File Names Variable box on the variables tab of the pre-
processing procedure window.

Step 2 – Spreadsheet Setup


Because the analysis for hundreds or thousands of genes may be time-consuming, it is
recommended that an initial run be made on fictitious data to ensure the spreadsheet is set up
properly. Perhaps the most important part of this initial run is careful specification of the model.
Correct specification may be verified by confirming that the expected sums of squares are shown
in the output for the run on fictitious data. The importance of this step increases as the complexity
of the statistical analysis increases. This step is also useful for getting ideas for follow-up
statistical analyses of specific genes.

Step 3 – Run the Analysis


Carefully specify the model and the Prob Level Cutoff. If follow-up experiments are to be run,
the False Discovery Rate Control adjustment is recommended. If there will be no follow-up
experiments, the Bonferroni adjustment is recommended. The pre-processed data for the most
significant genes should be stored in the spreadsheet for detailed follow-up analysis.
Examine the output to determine if the number of hypothesis tests conducted is as expected, and
to see if the appropriate number of replicates was used. It may also help to look at the Prob Level
histogram to understand the distribution of statistics across the entire experiment.

Step 4 – Follow-Up Analysis


Run individual follow-up statistical analyses on the genes for which pre-processed data was
stored using the Repeated Measures Analysis of Variance procedure in NCSS. These individual
analyses are useful for examining test assumptions and specific trends in greater detail. Note,
however, that statistical tests are not adjusted for multiple testing across genes in the NCSS
procedures.

Procedure Options
This section describes the options available in this procedure.

Variables Tab
These options specify the variables that will be used in the analysis.

GES Files Specifications


These variables are used to identify the .ges files for the Repeated Measures ANOVA.
Response GES Files Variable
Specify the variable containing the column of input files on the spreadsheet. These input files will
usually be those files that were output as a result of a pre-processing procedure. The files of this
column contain the intensity summaries that will be the responses in the model.

Factor Specification
These variables are used to identify all the factors to be used in the model.
Between Variables (1-3)
These three variables specify between-subject factors. A between-subject factor specifies groups
into which the subjects are divided. For example, gender, age group, and treatment group are all
between factors.
The values of these variables indicate which group the subject belongs in. Values may be text or
numeric.
Subject Variable
Specify the variable containing the subject identification number or phrase.
Note that this variable is treated as a 'nested' factor.
Within Variables (1-3)
These three variables define within-subject factors. A within-subject factor is one whose levels
represent different points in time or space. It is the 'repeated measurement'.
Examples of within-subject factors are time, pre-post, and body organ.
Type
This option specifies the type of each factor. The options are

• Fixed
A fixed factor includes all possible values across the range of interest. Usually, hypotheses
are tested about fixed factors. For example, gender, dose-level, and treatment-group are
examples of fixed factors.

• Random
A random factor includes a sample from the population of possible values. Examples of
random factors are hospitals, cities, and randomly-selected sites.

Model Specification
These options determine the model that will be analyzed.
Which Model Terms
A design in which all main effect and interaction terms are included is called a saturated model.
Occasionally, it is useful to omit various interaction terms from the model, usually because some
data values are missing. This option lets you specify which interactions to keep.
The options included here are:

• Full Model. Use all terms.


The complete, saturated model is analyzed. All reports will be generated when this option is
selected.

• Full model except subject interactions combined with error.


Some authors recommend pooling the interactions involving the subject factor into one error
term to achieve more error degrees of freedom and thus more power in the F-tests. This
option lets you do this. Note that the Geisser-Greenhouse corrections are not made in this
case.

• Use the Custom Model given below.


This option indicates that you want the Custom Model (given in the next box) to be used.
Custom Model
When 'Custom Model' is selected in the Which Model Terms above, the actual analysis of
variance model is entered here.
For complicated designs, it is usually easier to check the option 'Write Only', and run the
procedure. A model containing the listed factors will be generated and placed in this box. You
can then edit it as you desire.
The model is entered using letters (in alphabetical order) separated by the plus sign. For example,
a three-factor factorial in which only two-way interactions are needed would be entered as
follows:
A+B+AB+C+AC+BC
A simple repeated-measures design would look like this:
A+B(A)+C+AC+BC(A)
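
For larger factorial designs, a model string in this letter notation can be drafted with a small helper before it is pasted into the Custom Model box. The sketch below is a hypothetical convenience function, not part of GESS. It handles crossed factors only; nested terms such as B(A) still have to be edited by hand.

```python
from itertools import combinations


def factorial_model(letters, max_order):
    """Build a crossed-factor model string such as 'A+B+AB+C+AC+BC'.

    letters   -- factor letters in alphabetical order, e.g. "ABC"
    max_order -- highest interaction order to keep (2 = two-way interactions)
    """
    terms = []
    for i, letter in enumerate(letters):
        earlier = letters[:i]
        # Add the main effect (n_earlier = 0), then interactions that combine
        # this factor with n_earlier of the factors that precede it.
        for n_earlier in range(max_order):
            for combo in combinations(earlier, n_earlier):
                terms.append("".join(combo) + letter)
    return "+".join(terms)


model = factorial_model("ABC", max_order=2)
print(model)
```

Raising max_order to 3 in this example appends the three-way term ABC.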
Write Model in ‘Custom Model’ Field. Do Not Process Data.
When this option is checked, no data analysis is performed when the procedure is run. Instead, a
copy of the full model is stored in the Custom Model box. You can then edit the model as desired.

Correction Option
Geisser-Greenhouse Correction
In a repeated measures ANOVA, the regular F-Tests of the within factors may not meet all of the
necessary assumptions. Geisser and Greenhouse proposed an adjustment to make the probability
levels more accurate. Box made a popular refinement. Huynh and Feldt made a further refinement
that made the probability levels even more accurate.
Select here the type of adjustment you want to use.
RECOMMENDATION:
We recommend the Huynh-Feldt adjustment.

Adjustment for Multiple Testing


Multiple Test Correction
When several tests are performed on the same set of data, the probability levels of the individual
tests should be corrected. This option lets you specify the type of multiple test correction.

• None
No correction is done.

• Bonferroni
The Bonferroni correction preserves the experiment-wise error rate.

• False Discovery Rate Control


False Discovery Rate Control controls the proportion of falsely declared significant
differences.

• Recommendation
If you will be doing follow-up testing, False Discovery Rate Control should be used. If not,
the Bonferroni correction should be used.

Reports Tab
The options on this panel control which reports and plots are generated.

Select Reports
The following options are used to determine the reports that will be displayed.
Expected Mean Square Report
Check this box to obtain the expected mean square for each model term.
Test Detail Sorted by Prob Level
Check this box to obtain a list of the most significant F tests, sorted by the probability level.
Associated names or IDs, unadjusted probability levels, standard deviations of means, standard
errors, degrees of freedom, and test statistics are also shown.
A separate report is produced for each term in the model.
Prob Level Cutoff
Specify the cutoff for the multiple test corrected probability levels. When the Test Detail Sorted
by Prob Level box is checked, all adjusted probability levels below this value will be reported.
Test Detail Sorted by Gene Within Subset
Check this box to obtain a list of all genes that are in subset lists. Associated probability levels,
standard deviations of means, standard errors, degrees of freedom, and test statistics are also
shown.
A separate list is produced for each subset, sorted alphabetically. A separate report is produced
for each term in the model.

Report Options
These options determine the format of the reports.
Precision
Specifies whether unformatted numbers are displayed as single (7-digit) or double (13-digit)
precision numbers.

• Single
Unformatted numbers are displayed with 7-digits. This is the default setting. All reports have
been formatted for single precision.

• Double
Unformatted numbers are displayed with 13-digits. This option is most often used when
extremely accurate results are needed for further calculation. Double precision numbers will
require more space than allotted, potentially resulting in unaligned output. This option is
provided for those instances when accuracy is more important than format alignment.
COMMENTS:
This option does not affect formatted numbers such as probability levels.
This option only influences the format of the numbers as they are output. All calculations are
performed in double precision regardless of selection.
Prob Decimals
Specify the number of decimal places to be used for displaying probability levels on the reports.
The number chosen here does not affect the internal precision of the data.
sqrt(MS) Decimals
Specify the number of decimal places to be used for displaying square root transformed mean
squares on the reports. The number chosen here does not affect the internal precision of the data.
F Value Decimals
Specify the number of decimal places to be used for displaying F-statistics on the reports. The
number chosen here does not affect the internal precision of the data.

Select Histograms
The following options are used to determine which histograms will be displayed.
Histogram of Prob Level
Check this box to obtain a histogram of the unadjusted (raw) probability levels.
Histogram of Corrected Prob Level
Check this box to obtain a histogram of all corrected probability levels.
Histogram of Log10(Prob Level)
Check this box to obtain a histogram of the Log(base 10) transformed, unadjusted (raw)
probability levels. When the mean square denominator is zero, the Log10(Prob Level) is put in
the bin at -5.
Histogram of Log10(Corrected Prob Level)
Check this box to obtain a histogram of all Log(base 10) transformed, corrected probability
levels. Occasionally, a mean square denominator of zero occurs, producing an undefined Prob
Level. When the mean square denominator is zero, the Log10(Corrected Prob Level) is put in the
bin at -5.
Histogram of Z(Prob Level)
Check this box to obtain a histogram of Z-transformed unadjusted (raw) probability levels. The
Z-transformation converts the probability level into the corresponding standard normal
distribution value using the probability integral transform. Values less than -9 are binned at -9.
Values greater than 9 are binned at 9.
Histogram of Z(Corrected Prob Level)
Check this box to obtain a histogram of Z-transformed corrected probability levels. The Z-
transformation converts the probability level into the corresponding standard normal distribution
value using the probability integral transform. Values less than -9 are binned at -9. Values greater
than 9 are binned at 9.
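
The Z-transformation and its binning limits can be sketched with the Python standard library. The direction of the transform is an assumption here: the sketch uses z = Φ⁻¹(p), so that small probability levels map to large negative values. The manual does not state the sign convention, and the function name is hypothetical.

```python
from statistics import NormalDist


def z_transform(prob_level, limit=9.0):
    """Map a probability level to a standard normal value, binned at +/- limit."""
    # A probability level of exactly 0 or 1 has no finite normal quantile,
    # so it goes straight to the outermost bin.
    if prob_level <= 0.0:
        return -limit
    if prob_level >= 1.0:
        return limit
    z = NormalDist().inv_cdf(prob_level)
    return max(-limit, min(limit, z))


z_values = [z_transform(p) for p in (0.5, 0.025, 1e-30, 0.0)]
print(z_values)
```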
Histogram of SQRT(Mean Square Numerator)
Check this box to obtain a histogram of all square root transformed numerator mean squares for
each factor.
Histogram of SQRT(Mean Square Denominator)
Check this box to obtain a histogram of all square root transformed denominator mean squares for
each factor.
Histogram of F Value
Check this box to obtain a histogram of all F values for each factor. When the mean square
denominator is zero, the value 100 is used in the histogram.

Computational Option
Genes Per Batch
To optimize the use of computer memory, the genes are processed in groups or batches. This
parameter specifies the number of genes processed per batch.
The basic rule is that the number of genes per batch times the number of arrays should be less
than 500,000.
If you choose 'Automatic', the program will select a reasonable value.
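
The basic rule can be turned into a quick calculation when a manual value is preferred. This sketch only applies the stated rule (genes per batch times arrays below 500,000). The function name is hypothetical, and the value chosen by the 'Automatic' setting may differ.

```python
def max_genes_per_batch(n_arrays, limit=500_000):
    """Largest batch size for which genes_per_batch * n_arrays stays below limit."""
    if n_arrays < 1:
        raise ValueError("n_arrays must be at least 1")
    return (limit - 1) // n_arrays


batch = max_genes_per_batch(27)  # e.g., an experiment with 27 arrays
print(batch, batch * 27)
```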

Histograms Tab
The options on this panel control the appearance of the histograms.

Vertical and Horizontal Axes


These options are used to format the histogram axes.
Label
Enter text here for the designated label.
REPLACEMENT CODES:
The following code is replaced by the appropriate name when the plot is generated.
{X} is replaced by the statistic that is reported in the histogram.
Minimum
Specify the value to be displayed as the minimum on this axis. Data values less than this amount
will be ignored.
If this value is left blank, the minimum will be determined from the data.
Maximum
Specify the value to be displayed as the maximum on this axis. Data values greater than this
amount will be ignored.
If this value is left blank, the maximum will be determined from the data.
Tick Label Settings…
This option specifies the characteristics of the reference numbers. It displays a window that edits
the font size and color of the reference numbers that appear next to the text along the axis of the
plot. It also allows you to set the number of digits in the reference numbers as well as their
vertical/horizontal orientation.
Note that in some cases, the format specified here is overridden by the variable's format as
specified on the database in the Variable Info Sheet.
Major Ticks
Specify the number of large tickmarks and optional grid lines along this axis. A set of minor
tickmarks will be generated between each pair of major tickmarks. A reference number is
displayed adjacent to each major tickmark.
Minor Ticks
Select the number of small tickmarks to be displayed between each pair of major (large)
tickmarks along this axis.
Show Grid Lines
Check this option to display grid lines at the major tickmarks along this axis.
NOTE: Since the grid lines are drawn out from the tickmarks, they appear perpendicular to the
axis. Thus, checking the Y Grid Lines will actually cause horizontal grid lines to appear.

Histogram Settings
These options are used to specify the appearance of the histograms.
Style File
Designate a histogram style file. This file sets all histogram options that are not set directly on
this panel. Unless you choose otherwise, the HistoBox style file is used. Histogram style files are
created in the Histograms procedure.
Number of Bars
Specify the number of bars (bins) to be displayed. Select '0 - Automatic' to direct the program to
select an appropriate number based on the number of values.
Interior Color
Specify the histogram interior color.
Background Color
Specify the histogram background color.
Bar Fill Color
Specify the color of the inside of the bars.
Bar Border Color
Specify the color of the lines around the bars.

Horizontal Axis Minimums and Maximums
Horizontal Axis Maximum
Specify the value to be displayed as the maximum on this axis. Data values greater than this
amount will be ignored.
If this value is left blank, the maximum will be determined from the data.
Horizontal Axis Minimum
Specify the value to be displayed as the minimum on this axis. Data values less than this
amount will be ignored.
If this value is left blank, the minimum will be determined from the data.

Histogram Title
Title
Enter text here for the histogram title.
REPLACEMENT CODES:
The following code is replaced by the appropriate name when the plot is generated.
{X} is replaced by the statistic that is reported in the histogram.

Storage Tab
The options on this panel control the storage of pre-processed data values on the spreadsheet for
further analysis.

Model Term Used for Storage


Term
Specify the term for which the pre-processed gene data are stored on the database. The
pre-processed values for all rows are stored, beginning with the most significant gene
(smallest probability level) for this term.

Spreadsheet Storage of the NAMES of Significant Genes
These options determine whether the names of significant genes will be stored and where.
Store the names of the most significant genes on the spreadsheet
Check this box to store a list of names of the most significant genes into the variable (column)
specified under Store Gene Names in Variable.
Store Gene Names in Variable
If the box immediately below is checked, the names of the most significant genes will be stored in
the column associated with this variable.
Any data that is already in this variable will be overwritten.

Spreadsheet Storage of the EXPRESSION VALUES of Significant Genes
These options determine whether the expression values of significant genes will be stored and
where.
Store the data values of the most significant genes on the spreadsheet
Check this box to store the pre-processed data values of all genes for which the corrected
probability level is below the cutoff value.
This allows the user to utilize other procedures to obtain follow-up analyses and graphics for the
significant genes.
Store Expression Values Beginning with Variable
The values of the most significant gene will be stored in this variable. The values for each
additional significant gene are stored in the variables immediately to the right of this variable.
Leave this value blank if you want the data storage to begin in the first blank column on the right-
hand side of the data.
WARNING: Use caution when selecting this variable, since existing data is automatically
replaced when the storage variables are created.
Maximum Storage Variables Used
Specify the maximum number of variables (columns) for which you want the gene intensity data
stored on the spreadsheet. This choice may be particularly important when the number of
significant genes is large.
Note that NCSS spreadsheets are limited to 255 variables, so if you want to store more values,
you will have to add more sheets.

Subsets 1 - 9 Tabs
The options on this panel control the names and lists of subsets.

Subset 1 – 9
Name
The name of the gene subset is entered here.
Separate reports may be generated to show all genes of a subset (see Reports tab). This may be
useful for examining probability levels of specific genes you are interested in that do not make
the cutoff.
Genes in this Subset
Enter a list of genes that are to be in this subset. The genes may be entered directly, or the *
character may be used to specify all genes with a particular beginning. The gene names or IDs
entered in this list must be in the column specified in the Gene Name From box on the Variables tab.
EXAMPLES:
Blank
spike1
spike3
spike5
spike* (all names beginning with spike)
AA44719
NM_00582
NM_04762
NM_27564
cntrl* (all names beginning with cntrl)
file(C:\Microarray\genelist.txt) (all names in the genelist.txt file)
var(OutputGenes) (all names in the spreadsheet variable with the variable name OutputGenes)
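
The matching rule for direct entries and entries ending in * can be sketched as follows. This is an illustration only: it assumes case-sensitive matching, does not handle the file(...) and var(...) forms, and uses hypothetical function names and gene names.

```python
def entry_matches(name, entry):
    """True if a gene name matches a literal entry or a trailing-* prefix entry."""
    if entry.endswith("*"):
        return name.startswith(entry[:-1])  # e.g., spike* matches spike1
    return name == entry


def genes_in_subset(gene_names, entries):
    """Return the gene names selected by any entry, keeping their original order."""
    return [name for name in gene_names
            if any(entry_matches(name, entry) for entry in entries)]


names = ["spike1", "spike2", "cntrl_a", "AA44719", "NM_00582", "Blank"]
subset = genes_in_subset(names, ["spike*", "AA44719", "Blank"])
print(subset)
```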
These Genes are
Specify here whether the genes of this subset are to be included or excluded from the list of genes
that are analyzed. Probability levels will not be calculated for the genes of this subset when
'Excluded' is entered here.

Non-Subset (Ungrouped) Genes


Name of Ungrouped Set
Enter the subset name to be used for all genes that are not included in any of the nine subsets.
Ungrouped Genes are
Specify here whether the genes not listed in any other subset are to be included or excluded from
the list of genes that are analyzed. Probability levels will not be calculated for these genes when
'Excluded' is entered here.
Excluding the genes of the ungrouped subset may be useful when analyzing only a small subset
of the genes of the array is desired.

Template Tab
The options on this panel allow various sets of options to be loaded (File menu: Load Template)
or stored (File menu: Save Template). A template file contains all the settings for this procedure.

Specify the Template File Name


File Name
Designate the name of the template file either to be loaded or stored.

Select a Template to Load or Save


Template Files
A list of previously stored template files for this procedure.
Template Id’s
A list of the Template Id’s of the corresponding files. This id value is loaded in the box at the
bottom of the panel.

Example 1 – Repeated Measures ANOVA


The effects on gene expression of a control and two treatments are to be monitored over time. The
gene expression measurement times of interest are 0 hours, 12 hours, and 24 hours. Nine rats are
randomly assigned to the three treatment groups such that 3 rats are in each group. Blood samples
are taken from each rat immediately after treatment, 12 hours after treatment, and 24 hours after
treatment.
                          Time
              Rat   0 Hours      12 Hours     24 Hours
Control       1     Sample 1,1   Sample 1,2   Sample 1,3
              2     Sample 2,1   Sample 2,2   Sample 2,3
              3     Sample 3,1   Sample 3,2   Sample 3,3
Treatment 1   4     Sample 4,1   Sample 4,2   Sample 4,3
              5     Sample 5,1   Sample 5,2   Sample 5,3
              6     Sample 6,1   Sample 6,2   Sample 6,3
Treatment 2   7     Sample 7,1   Sample 7,2   Sample 7,3
              8     Sample 8,1   Sample 8,2   Sample 8,3
              9     Sample 9,1   Sample 9,2   Sample 9,3

The result is 27 samples. Each sample is processed and exposed to a single microarray, resulting in a
single expression value for each gene for each rat of each treatment group at each time period.
The goal is to determine for each gene whether there is evidence that the expression is different
across treatment, time, and/or if there is a treatment by time interaction.
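
For a balanced design of this kind (one between factor, one within factor, subjects nested in groups), the usual degrees of freedom can be checked with a short calculation. The sketch below uses the standard textbook formulas; the function name and term labels are hypothetical and may not match the labels printed in the GESS reports.

```python
def repeated_measures_df(n_groups, subjects_per_group, n_times):
    """Degrees of freedom for a balanced one-between, one-within design."""
    a, n, c = n_groups, subjects_per_group, n_times
    return {
        "Treatment": a - 1,
        "Rat(Treatment)": a * (n - 1),                   # between-subject error
        "Time": c - 1,
        "Treatment x Time": (a - 1) * (c - 1),
        "Time x Rat(Treatment)": a * (n - 1) * (c - 1),  # within-subject error
    }


df = repeated_measures_df(n_groups=3, subjects_per_group=3, n_times=3)
for term, value in df.items():
    print(f"{term:<24} {value}")
print("Total", sum(df.values()))
```

The degrees of freedom sum to 26, one less than the 27 samples in this example.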
In the pre-processing procedure, 27 files are created. The format of the spreadsheet is shown
below.
RM1_RM dataset
Rat Time Treatment OutputFile
1 0 C %p%\data\gess\rm\rm\RM1_RM_1.ges
1 12 C %p%\data\gess\rm\rm\RM1_RM_2.ges
1 24 C %p%\data\gess\rm\rm\RM1_RM_3.ges
2 0 C %p%\data\gess\rm\rm\RM1_RM_4.ges
2 12 C %p%\data\gess\rm\rm\RM1_RM_5.ges
2 24 C %p%\data\gess\rm\rm\RM1_RM_6.ges
3 0 C %p%\data\gess\rm\rm\RM1_RM_7.ges
3 12 C %p%\data\gess\rm\rm\RM1_RM_8.ges
3 24 C %p%\data\gess\rm\rm\RM1_RM_9.ges
4 0 Trt1 %p%\data\gess\rm\rm\RM1_RM_10.ges
4 12 Trt1 %p%\data\gess\rm\rm\RM1_RM_11.ges
4 24 Trt1 %p%\data\gess\rm\rm\RM1_RM_12.ges
5 0 Trt1 %p%\data\gess\rm\rm\RM1_RM_13.ges
5 12 Trt1 %p%\data\gess\rm\rm\RM1_RM_14.ges
5 24 Trt1 %p%\data\gess\rm\rm\RM1_RM_15.ges
6 0 Trt1 %p%\data\gess\rm\rm\RM1_RM_16.ges
6 12 Trt1 %p%\data\gess\rm\rm\RM1_RM_17.ges
6 24 Trt1 %p%\data\gess\rm\rm\RM1_RM_18.ges
7 0 Trt2 %p%\data\gess\rm\rm\RM1_RM_19.ges
7 12 Trt2 %p%\data\gess\rm\rm\RM1_RM_20.ges
7 24 Trt2 %p%\data\gess\rm\rm\RM1_RM_21.ges
8 0 Trt2 %p%\data\gess\rm\rm\RM1_RM_22.ges
8 12 Trt2 %p%\data\gess\rm\rm\RM1_RM_23.ges
8 24 Trt2 %p%\data\gess\rm\rm\RM1_RM_24.ges
9 0 Trt2 %p%\data\gess\rm\rm\RM1_RM_25.ges
9 12 Trt2 %p%\data\gess\rm\rm\RM1_RM_26.ges
9 24 Trt2 %p%\data\gess\rm\rm\RM1_RM_27.ges

The spreadsheet data used are recorded in the RM1_RM dataset.
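
The layout of the RM1_RM spreadsheet can be sketched in a few lines of Python. This is only an illustration of the row order shown above (rats within treatment groups, times within rats, files numbered in row order); the names GROUPS, TIMES, and design_rows are hypothetical and are not part of GESS or NCSS.

```python
from itertools import product

# Group membership taken from the RM1_RM listing: rats 1-3 are controls,
# rats 4-6 received Treatment 1, and rats 7-9 received Treatment 2.
GROUPS = {"C": (1, 2, 3), "Trt1": (4, 5, 6), "Trt2": (7, 8, 9)}
TIMES = (0, 12, 24)  # hours after treatment


def design_rows():
    """Return (rat, time, treatment, output_file) tuples in spreadsheet order."""
    rows = []
    for treatment, rats in GROUPS.items():
        for rat, time in product(rats, TIMES):
            index = len(rows) + 1  # files are numbered 1..27 in row order
            path = rf"%p%\data\gess\rm\rm\RM1_RM_{index}.ges"
            rows.append((rat, time, treatment, path))
    return rows


if __name__ == "__main__":
    for row in design_rows():
        print(*row, sep="\t")
```

Each rat appears on three rows (one per time point), which is what makes Time a within-subject factor and Treatment a between-subject factor.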


You may follow along here by making the appropriate entries or load the completed template
Example 1 from the Template tab of the GESS Repeated Measures ANOVA window.

1 Open the RM1_RM dataset.


• From the File menu of the NCSS Data window, select Open.
• Select the Data subdirectory of your NCSS directory.
• Open the GESS folder.
• Click on the file RM1_RM.S0.
• Click Open.

2 Open the GESS Repeated Measures ANOVA window.


• On the menus, select GESS, then Analysis of Variance Routines, then Repeated
Measures ANOVA. The GESS Repeated Measures ANOVA procedure will be
displayed.
• On the menus, select File, then New Template. This will fill the procedure with the
default template. Alternatively, load the Example 1 Template, which generates the
specifications described below.

3 Specify the variables and hypothesis test details.


• On the GESS Repeated Measures ANOVA window, select the Variables tab.
• Set the Response GES Files Variable to OutputFile.
• Set the first variable box beneath Between Variables to Treatment.
• Set the Type for Treatment to Fixed.
• Set the Subject Variable to Rat.
• Set the first variable box beneath Within Variables to Time.
• Set the Type for Time to Fixed.
• Set Which Model Terms to Full model. Use all terms.
• Set Geisser-Greenhouse Correction to Huynh-Feldt.
• Set the Multiple Test Correction to Bonferroni.

4 Specify the reports.


• Select the Reports tab.
• Check the box next to Expected Mean Square Report.
• Check the box next to Test Detail Sorted by Prob Level.
• Set the Prob Level Cutoff to 0.05.
• Check all other boxes except Test Detail Sorted by Gene Within Subset.

5 Run the procedure.


• From the Run menu, select Run Procedure. Alternatively, just click the Run button (the
left-most button on the button bar at the top).

Expected Mean Squares Section


Expected Mean Squares Section
Source Term Denominator Expected
Term DF Fixed? Term Mean Square
A: Treatment 2 Yes B(A) S+csB+bcsA
B(A): Rat 6 No S(ABC) S+csB
C: Time 2 Yes BC(A) S+sBC+absC
AC 4 Yes BC(A) S+sBC+bsAC
BC(A) 12 No S(ABC) S+sBC
S(ABC) 0 No S
Note: Expected Mean Squares are for the balanced cell-frequency case.

This report displays the expected mean squares for each term in the model.
Source Term
The source of variation or term in the model.
DF
The degrees of freedom, which is the number of observations used by this term.
Term Fixed?
Indicates whether the term is fixed or random.
Denominator Term
Indicates the term used as the denominator in the F-ratio.
Expected Mean Square
This expression represents the expected value of the corresponding mean square if the design
were completely balanced. S represents the expected value of the mean square error (the error
variance, sigma squared). The
uppercase letters represent either the adjusted sum of squared treatment means if the factor is
fixed, or the variance component if the factor is random. Each lowercase letter represents the
number of levels of the corresponding factor, and s represents the number of replications of the
experimental layout.
These EMS expressions are provided to determine the appropriate error term for each factor. The
correct error term for a factor is that term whose EMS is identical except for the factor being
tested.
In this example, the appropriate error term for treatment is B(A): its EMS, S+csB, is identical to
the EMS of A, S+csB+bcsA, except for the bcsA term being tested.
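
The DF column of the report follows the usual degrees-of-freedom rules for a balanced design with one between factor, subjects nested within it, and one within factor. The sketch below states those rules; the function name split_plot_df and its arguments are hypothetical, and the formulas assume equal numbers of subjects per group.

```python
def split_plot_df(a, b, c, s=1):
    """Degrees of freedom for a balanced design with one between factor A
    (a levels), b subjects nested in each level of A, one within factor C
    (c levels), and s replicate observations per subject-by-level cell."""
    return {
        "A": a - 1,
        "B(A)": a * (b - 1),
        "C": c - 1,
        "AC": (a - 1) * (c - 1),
        "BC(A)": a * (b - 1) * (c - 1),
        "S(ABC)": a * b * c * (s - 1),
    }


# Example 1: 3 treatment groups, 3 rats per group, 3 time points.
print(split_plot_df(a=3, b=3, c=3))
```

With a = 3, b = 3, and c = 3 this gives 2, 6, 2, 4, 12, and 0, matching the report. The zero for S(ABC) reflects the single observation per rat at each time, which is why B(A) and BC(A) serve as the error terms.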

F-Test Detail for A: Treatment Sorted in Probability Level Order


F-Test Detail for A: Treatment Sorted in Probability Level Order
Bonferroni
Adjusted
Multiple Single SQRT MS SQRT MS
Gene Subset Tests Test DF1/ Num- Denom-
Name Name Prob Level Prob Level F Value DF2 erator inator
93822_at Other 0.0026532 0.0000077 148.986 2/6 3.1713 0.2598
AFFX-Ss_Angioten_3_s_at Other 0.0080476 0.0000233 101.996 2/6 1.1045 0.1094
37029_at Other 0.0402469 0.0001167 58.397 2/6 3.3276 0.4355

Total number of hypothesis tests conducted = 345


Geisser-Greenhouse Correction: Huynh-Feldt

This report displays the genes for which the Bonferroni adjusted Prob Level is less than 0.05.
Gene Name
This is the name or ID of the genes for which the Bonferroni adjusted Prob Level is less than
0.05.
Subset Name
This is the name of the specified subset to which this gene belongs. If the gene is not a member
of a subset list, the default subset name is Other.
Bonferroni Adjusted Multiple Tests Prob Level
This is the Prob Level for the specified hypothesis test following a Bonferroni correction.
Single Test Prob Level
This is the Prob Level of the individual test, before multiple test correction is done.
F Value
This is the value of the F Statistic used to conduct the hypothesis test of interest.
DF1/DF2
DF1 is the number of degrees of freedom for the numerator. DF2 is the number of degrees of
freedom for the denominator.
SQRT MS Numerator
This is the square root of the numerator of the F Statistic. It gives an idea of the variation among
means.
SQRT MS Denominator
This is the square root of the denominator of the F Statistic.
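
The columns of this report are related by simple arithmetic, which the following sketch checks for gene 93822_at. It is an illustration only, not a description of GESS internals. The function names are hypothetical, and the closed-form P-value is a general property of the F distribution that applies only when the numerator degrees of freedom equal 2, as they do for Treatment here.

```python
def f_from_sqrt_ms(sqrt_ms_num, sqrt_ms_den):
    """F Value = MS numerator / MS denominator, from their square roots."""
    return (sqrt_ms_num / sqrt_ms_den) ** 2


def p_value_df1_equals_2(f_value, df2):
    """Upper-tail probability of an F(2, df2) variate.

    This closed form holds only when the numerator DF is exactly 2.
    """
    return (1.0 + 2.0 * f_value / df2) ** (-df2 / 2.0)


def bonferroni(p_single, n_tests):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    return min(1.0, p_single * n_tests)


# Gene 93822_at in the Treatment report: F = 148.986 with DF 2/6, 345 tests.
p_single = p_value_df1_equals_2(148.986, 6)
print(f"F from SQRT MS columns: {f_from_sqrt_ms(3.1713, 0.2598):.1f}")
print(f"single-test p:          {p_single:.7f}")
print(f"Bonferroni p:           {bonferroni(p_single, 345):.7f}")
```

The printed values should land close to the reported F Value, Single Test Prob Level, and Bonferroni Adjusted Prob Level. Small differences are expected because the report rounds the SQRT MS columns to four decimals.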

F-Test Detail for C: Time Sorted in Probability Level Order


F-Test Detail for C: Time Sorted in Probability Level Order
Bonferroni
Adjusted
Multiple Single SQRT MS SQRT MS
Gene Subset Tests Test DF1/ Num- Denom-
Name Name Prob Level Prob Level F Value DF2 erator inator
37189_at Other 0.0000369 0.0000001 81.071 2/12 2.9567 0.3284
37725_at Other 0.0030582 0.0000089 35.707 2/12 1.8401 0.3079
100084_at Other 0.0035796 0.0000104 34.627 2/12 3.1345 0.5327
37001_at Other 0.0343659 0.0000996 21.868 2/12 2.5077 0.5363

Total number of hypothesis tests conducted = 345


Geisser-Greenhouse Correction: Huynh-Feldt

This report displays the genes for which the Bonferroni adjusted Prob Level is less than 0.05.

F-Test Detail for AC Sorted in Probability Level Order


F-Test Detail for AC Sorted in Probability Level Order
Bonferroni
Adjusted
Multiple Single SQRT MS SQRT MS
Gene Subset Tests Test DF1/ Num- Denom-
Name Name Prob Level Prob Level F Value DF2 erator inator
101482_at Other 0.0000030 0.0000000 88.070 4/12 3.4083 0.3632

Total number of hypothesis tests conducted = 345


Geisser-Greenhouse Correction: Huynh-Feldt

This report displays the gene for which the Bonferroni adjusted Prob Level is less than 0.05.

Histograms and Plots Section (for Treatment)

[Six histograms for Term = A (vertical axis: Count), shown in pairs before and after the multiple
testing correction: Prob Level and Corrected Prob Level; Log10(Prob Level) and Log10(Corrected
Prob Level); Z(Prob Level) and Z(Corrected Prob Level).]

These six plots are used to examine the distribution of the P-Values (Prob Levels) of all genes in
the experiment, before and after the multiple testing correction. The Log (Base 10) and Z
(Normal) transformations aid in examining the distribution of the P-Values (Prob Levels) that are
extremely close to zero.
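
The effect of the two transformations can be sketched as follows. The manual does not define Z(Prob Level) in this section, so the sketch assumes it is the standard normal quantile of the P-value; that definition, the clipping constants, and the function names are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def log10_prob(p, floor=1e-300):
    """Base-10 logarithm of a P-value, with a floor so that p = 0 is defined."""
    return math.log10(max(p, floor))


def z_prob(p, eps=1e-12):
    """Standard normal quantile of a P-value (assumed meaning of Z).

    The P-value is clipped to (eps, 1 - eps) because the quantile is
    infinite at exactly 0 and 1.
    """
    clipped = min(max(p, eps), 1.0 - eps)
    return NormalDist().inv_cdf(clipped)


for p in (0.5, 0.05, 0.0001167, 0.0000077):
    print(f"p={p:<10} log10={log10_prob(p):7.3f} z={z_prob(p):7.3f}")
```

On the raw scale all of the small P-values fall into the first histogram bar. On either transformed scale they are spread over a wide range, which is what makes the tail of the distribution visible.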

[Two histograms for Term = A (vertical axis: Count): SQRT(MS Numerator) and SQRT(MS Denominator).]

The distributions of the sqrt(mean square numerator) and sqrt(mean square denominator) give a feel
for the components of the calculated F Values. Often these plots will be omitted.

[Histogram of F Value for Term = A (vertical axis: Count; horizontal axis: F Value, 0 to 150).]

The distribution of the F Statistics can show the position of extreme F Values. Often this plot will
be omitted.

Example 2 – Split Plot Design – Analysis Steps


In a study, two factors are expected to influence gene expression in humans: gender and a
treatment factor. Blood samples are taken from 6 males and 6 females. Each sample is divided
into two parts. One part receives Treatment 1, while the other part receives Treatment 2. A single
cDNA sample is obtained from each part following treatment, resulting in a total of 24 samples.
Each sample is exposed to a single microarray. The goal is to determine for each gene whether
there is evidence that the expression is different between males and females, across treatments,
and/or if there are interactive effects of gender and treatment on gene expression.

Step 1 – Pre-Processing
The 24 arrays used in the example have already been pre-processed using one of the
pre-processing procedures. The spreadsheet containing the file paths for these files is the RM2_Split
dataset. To open the RM2_Split dataset, use the following steps.

1 Open the RM2_Split dataset.


• From the File menu of the NCSS Data window, select Open.
• Select the Data subdirectory of your NCSS directory.
• Open the GESS folder.
• Click on the file RM2_Split.S0.
• Click Open.

Step 2 – Spreadsheet Setup


The RM2_Split dataset should appear as
RM2_Split dataset
Subject Gender Treatment OutputFile
1 M Trt1 %p%\data\gess\rm\split\RM2_Split_1.ges
1 M Trt2 %p%\data\gess\rm\split\RM2_Split_2.ges
2 M Trt1 %p%\data\gess\rm\split\RM2_Split_3.ges
2 M Trt2 %p%\data\gess\rm\split\RM2_Split_4.ges
3 M Trt1 %p%\data\gess\rm\split\RM2_Split_5.ges
3 M Trt2 %p%\data\gess\rm\split\RM2_Split_6.ges
4 M Trt1 %p%\data\gess\rm\split\RM2_Split_7.ges
4 M Trt2 %p%\data\gess\rm\split\RM2_Split_8.ges
5 M Trt1 %p%\data\gess\rm\split\RM2_Split_9.ges
5 M Trt2 %p%\data\gess\rm\split\RM2_Split_10.ges
6 M Trt1 %p%\data\gess\rm\split\RM2_Split_11.ges
6 M Trt2 %p%\data\gess\rm\split\RM2_Split_12.ges
7 F Trt1 %p%\data\gess\rm\split\RM2_Split_13.ges
7 F Trt2 %p%\data\gess\rm\split\RM2_Split_14.ges
8 F Trt1 %p%\data\gess\rm\split\RM2_Split_15.ges
8 F Trt2 %p%\data\gess\rm\split\RM2_Split_16.ges
9 F Trt1 %p%\data\gess\rm\split\RM2_Split_17.ges
9 F Trt2 %p%\data\gess\rm\split\RM2_Split_18.ges
10 F Trt1 %p%\data\gess\rm\split\RM2_Split_19.ges
10 F Trt2 %p%\data\gess\rm\split\RM2_Split_20.ges
11 F Trt1 %p%\data\gess\rm\split\RM2_Split_21.ges
11 F Trt2 %p%\data\gess\rm\split\RM2_Split_22.ges
12 F Trt1 %p%\data\gess\rm\split\RM2_Split_23.ges
12 F Trt2 %p%\data\gess\rm\split\RM2_Split_24.ges

Random numbers may be entered into a vacant column to verify that the setup is correct. The
column may be titled Random. The spreadsheet should now look like the following.
RM2_Split_a dataset
Subject Gender Treatment OutputFile Random
1 M Trt1 %p%\data\gess\rm\split\RM2_Split_1.ges 6
1 M Trt2 %p%\data\gess\rm\split\RM2_Split_2.ges 5
2 M Trt1 %p%\data\gess\rm\split\RM2_Split_3.ges 8
2 M Trt2 %p%\data\gess\rm\split\RM2_Split_4.ges 9
3 M Trt1 %p%\data\gess\rm\split\RM2_Split_5.ges 7
3 M Trt2 %p%\data\gess\rm\split\RM2_Split_6.ges 6
4 M Trt1 %p%\data\gess\rm\split\RM2_Split_7.ges 4
4 M Trt2 %p%\data\gess\rm\split\RM2_Split_8.ges 6
5 M Trt1 %p%\data\gess\rm\split\RM2_Split_9.ges 5
5 M Trt2 %p%\data\gess\rm\split\RM2_Split_10.ges 3
6 M Trt1 %p%\data\gess\rm\split\RM2_Split_11.ges 8
6 M Trt2 %p%\data\gess\rm\split\RM2_Split_12.ges 9
7 F Trt1 %p%\data\gess\rm\split\RM2_Split_13.ges 9
7 F Trt2 %p%\data\gess\rm\split\RM2_Split_14.ges 4
8 F Trt1 %p%\data\gess\rm\split\RM2_Split_15.ges 5
8 F Trt2 %p%\data\gess\rm\split\RM2_Split_16.ges 4
9 F Trt1 %p%\data\gess\rm\split\RM2_Split_17.ges 6
9 F Trt2 %p%\data\gess\rm\split\RM2_Split_18.ges 8
10 F Trt1 %p%\data\gess\rm\split\RM2_Split_19.ges 5
10 F Trt2 %p%\data\gess\rm\split\RM2_Split_20.ges 6
11 F Trt1 %p%\data\gess\rm\split\RM2_Split_21.ges 7
11 F Trt2 %p%\data\gess\rm\split\RM2_Split_22.ges 3
12 F Trt1 %p%\data\gess\rm\split\RM2_Split_23.ges 2
12 F Trt2 %p%\data\gess\rm\split\RM2_Split_24.ges 5

Alternatively, open the RM2_Split_a dataset.

1 Open the RM2_Split_a dataset.


• From the File menu of the NCSS Data window, select Open.
• Select the Data subdirectory of your NCSS directory.
• Open the GESS folder.
• Click on the file RM2_Split_a.S0.
• Click Open.

To analyze the random column using the NCSS: Repeated Measures Analysis of Variance
procedure, take the following steps.

2 Open the NCSS: Repeated Measures Analysis of Variance window.


• On the menus, select Analysis, then Analysis of Variance (ANOVA), then Repeated
Measures Analysis of Variance. The NCSS: Repeated Measures Analysis of Variance
procedure will be displayed.
• On the menus, select File, then New Template. This will fill the procedure with the
default template.

3 Specify the variables.


• Select the Variables tab.
• Set Response Variable(s) to Random.
• Set the Between Factor 1 to Gender.
• Set the Subject Variable to Subject.
• Set the Within Factor 1 to Treatment.

4 Specify the model.


• Select the Model tab.
• Under Which Model Terms, select Full Model. Use all terms.

5 Run the procedure.


• From the Run menu, select Run Procedure. Alternatively, just click the Run button (the
left-most button on the button bar at the top).

Repeated Measures ANOVA Output


The Expected Mean Squares Section and Analysis of Variance Table should appear as follows.

Expected Mean Squares Section


Source Term Denominator Expected
Term DF Fixed? Term Mean Square
A: Gender 1 Yes B(A) S+csB+bcsA
B(A): Subject 10 No S(ABC) S+csB
C: Treatment 1 Yes BC(A) S+sBC+absC
AC 1 Yes BC(A) S+sBC+bsAC
BC(A) 10 No S(ABC) S+sBC
S(ABC) 0 No S
Note: Expected Mean Squares are for the balanced cell-frequency case.

Analysis of Variance Table


Source Sum of Mean Prob Power
Term DF Squares Square F-Ratio Level (Alpha=0.05)
A: Gender 1 6 6 1.17 0.305024 0.165173
B(A): Subject 10 51.33333 5.133333
C: Treatment 1 0.6666667 0.6666667 0.20 0.661086 0.069474
AC 1 0.6666667 0.6666667 0.20 0.661086 0.069474
BC(A) 10 32.66667 3.266667
S 0
Total (Adjusted) 23 91.33334
Total 24
* Term significant at alpha = 0.05

The Denominator Term and Expected Mean Squares are correct. The three F-Tests are those
desired in the gene expression analysis. The appropriateness of the setup has been verified.
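
The sums of squares in the Analysis of Variance Table can be reproduced by hand from the Random column. The sketch below does so in plain Python as a cross-check on the setup. It assumes a balanced design (equal subjects per gender, one value per subject and treatment); the names VALUES, group_means, and split_plot_anova are hypothetical and have nothing to do with how NCSS is implemented.

```python
from collections import defaultdict

# subject -> (gender, Random value under Trt1, Random value under Trt2),
# copied from the RM2_Split_a listing.
VALUES = {
    1: ("M", 6, 5), 2: ("M", 8, 9), 3: ("M", 7, 6), 4: ("M", 4, 6),
    5: ("M", 5, 3), 6: ("M", 8, 9), 7: ("F", 9, 4), 8: ("F", 5, 4),
    9: ("F", 6, 8), 10: ("F", 5, 6), 11: ("F", 7, 3), 12: ("F", 2, 5),
}


def group_means(rows, key):
    """Map each group label to (mean, count) of the response."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[3])
    return {k: (sum(v) / len(v), len(v)) for k, v in groups.items()}


def split_plot_anova(values):
    """Return (sums of squares, mean squares, F ratios) for a balanced design."""
    rows = [(s, g, t, y) for s, (g, y1, y2) in values.items()
            for t, y in (("Trt1", y1), ("Trt2", y2))]
    grand = sum(r[3] for r in rows) / len(rows)

    def ss_between(key):
        # Sum over groups of count * (group mean - grand mean)^2.
        return sum(n * (m - grand) ** 2
                   for m, n in group_means(rows, key).values())

    ss = {"A": ss_between(lambda r: r[1]),      # Gender
          "C": ss_between(lambda r: r[2])}      # Treatment
    ss["B(A)"] = ss_between(lambda r: r[0]) - ss["A"]   # subjects within gender
    ss["AC"] = ss_between(lambda r: (r[1], r[2])) - ss["A"] - ss["C"]
    total = sum((r[3] - grand) ** 2 for r in rows)
    ss["BC(A)"] = total - sum(ss.values())      # remainder after the other terms

    # DF for 2 genders, 6 subjects per gender, 2 treatments.
    df = {"A": 1, "B(A)": 10, "C": 1, "AC": 1, "BC(A)": 10}
    ms = {term: ss[term] / df[term] for term in ss}
    f = {"A": ms["A"] / ms["B(A)"],
         "C": ms["C"] / ms["BC(A)"],
         "AC": ms["AC"] / ms["BC(A)"]}
    return ss, ms, f


ss, ms, f = split_plot_anova(VALUES)
for term in ("A", "B(A)", "C", "AC", "BC(A)"):
    print(f"{term:6s} SS={ss[term]:9.5f} MS={ms[term]:9.5f}")
print({term: round(value, 2) for term, value in f.items()})
```

Gender is tested against B(A) and both Treatment and the interaction are tested against BC(A), which is the same pairing shown in the Denominator Term column.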

Step 3 – Run the Analysis


The following steps should be taken to run the analysis. You may follow along here by making
the appropriate entries or load the completed template Example 2 from the Template tab of the
GESS Repeated Measures ANOVA window.

1 Open the GESS Repeated Measures ANOVA window.


• On the menus, select GESS, then Analysis of Variance Routines, then Repeated
Measures ANOVA. The GESS Repeated Measures ANOVA procedure will be
displayed.
• On the menus, select File, then New Template. This will fill the procedure with the
default template. Alternatively, load the Example 2 Template, which generates the
specifications described below.

2 Specify the variables and hypothesis test details.


• On the GESS Repeated Measures ANOVA window, select the Variables tab.
• Set the Response GES Files Variable to OutputFile.
• Set the first variable box beneath Between Variables to Gender.
• Set the Type for Gender to Fixed.
• Set the Subject Variable to Subject.
• Set the first variable box beneath Within Variables to Treatment.
• Set the Type for Treatment to Fixed.
• Set Which Model Terms to Full model. Use all terms.
• Set Geisser-Greenhouse Correction to None (Regular F).
• Set the Multiple Test Correction to False Discovery Rate Control.

3 Specify the storage options.


• Select the Storage tab.
• Set Term used for Determining Storage to C (corresponding to Treatment).
• Check the box next to Store the data values of the most significant genes on the
spreadsheet.
• Set Store Expression Values Beginning with Variable to C6.
• Set Maximum Storage Variables used to 2.

4 Specify the reports.


• Select the Reports tab.
• Check the box next to Expected Mean Square Report.
• Check the box next to Test Detail Sorted by Prob Level.
• Set the Prob Level Cutoff to 0.05.
• Uncheck all other boxes.

5 Run the procedure.


• From the Run menu, select Run Procedure. Alternatively, just click the Run button (the
left-most button on the button bar at the top).

Expected Mean Squares Section


Expected Mean Squares Section
Source Term Denominator Expected
Term DF Fixed? Term Mean Square
A: Gender 1 Yes B(A) S+csB+bcsA
B(A): Subject 10 No S(ABC) S+csB
C: Treatment 1 Yes BC(A) S+sBC+absC
AC 1 Yes BC(A) S+sBC+bsAC
BC(A) 10 No S(ABC) S+sBC
S(ABC) 0 No S
Note: Expected Mean Squares are for the balanced cell-frequency case.

This report displays the expected mean squares for each term in the model. The columns are
described in Example 1.

F-Test Detail for A: Gender Sorted in Probability Level Order


F-Test Detail for A: Gender Sorted in Probability Level Order
FDR
Adjusted
Multiple Single SQRT MS SQRT MS
Gene Subset Tests Test DF1/ Num- Denom-
Name Name Prob Level Prob Level F Value DF2 erator inator
37046_at Other 0.0002119 0.0000006 122.859 1/10 3.4262 0.3091
37189_at Other 0.0004410 0.0000026 90.109 1/10 3.5115 0.3699
37029_at Other 0.0020054 0.0000174 58.482 1/10 3.0374 0.3972

Total number of hypothesis tests conducted = 345


Geisser-Greenhouse Correction: None (Regular F)

This report displays the genes for which the False Discovery Rate Adjusted Prob Level is less
than 0.05 for Gender. The columns are described in Example 1.

F-Test Detail for C: Treatment Sorted in Probability Level Order


F-Test Detail for C: Treatment Sorted in Probability Level Order
FDR
Adjusted
Multiple Single SQRT MS SQRT MS
Gene Subset Tests Test DF1/ Num- Denom-
Name Name Prob Level Prob Level F Value DF2 erator inator
31962_at Other 0.0004585 0.0000013 103.977 1/10 3.2343 0.3172
94766_at Other 0.0090688 0.0000526 45.108 1/10 3.7749 0.5620
101482_at Other 0.0228121 0.0002036 32.273 1/10 2.6180 0.4608
93822_at Other 0.0228121 0.0002645 30.170 1/10 2.8094 0.5115
40515_at Other 0.0255393 0.0003701 27.626 1/10 2.5226 0.4799

Total number of hypothesis tests conducted = 345


Geisser-Greenhouse Correction: None (Regular F)

This report displays the genes for which the False Discovery Rate Adjusted Prob Level is less
than 0.05 for Treatment. The columns are described in Example 1.
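
The FDR adjusted values in the Treatment report appear consistent with the Benjamini-Hochberg step-up adjustment: each single-test Prob Level is multiplied by the number of tests and divided by its rank, and the results are then made non-decreasing. That reading is inferred from the reported numbers rather than taken from a statement of the GESS algorithm, and the function name bh_adjust is hypothetical.

```python
def bh_adjust(sorted_p, n_tests):
    """Benjamini-Hochberg step-up adjusted P-values.

    sorted_p holds the smallest P-values in ascending order; n_tests is the
    total number of tests, which may exceed len(sorted_p).
    """
    adjusted = [min(1.0, p * n_tests / rank)
                for rank, p in enumerate(sorted_p, start=1)]
    # Enforce monotonicity: each value is the minimum of itself and all
    # later (larger-rank) values.
    for i in range(len(adjusted) - 2, -1, -1):
        adjusted[i] = min(adjusted[i], adjusted[i + 1])
    return adjusted


# Single Test Prob Levels from the Treatment report, 345 tests in total.
single = [0.0000013, 0.0000526, 0.0002036, 0.0002645, 0.0003701]
for p, q in zip(single, bh_adjust(single, 345)):
    print(f"{p:.7f} -> {q:.7f}")
```

The monotonicity step explains why 101482_at and 93822_at share the same adjusted value. Only the five listed P-values are used here, so the results agree with the report only up to the rounding of the Single Test Prob Level column; in a full calculation the unlisted P-values could also lower the last adjusted value.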

F-Test Detail for AC Sorted in Probability Level Order


F-Test Detail for AC Sorted in Probability Level Order
FDR
Adjusted
Multiple Single SQRT MS SQRT MS
Gene Subset Tests Test DF1/ Num- Denom-
Name Name Prob Level Prob Level F Value DF2 erator inator
38437_at Other 0.0000191 0.0000001 204.432 1/10 5.3333 0.3730
41237_at Other 0.0001335 0.0000008 116.895 1/10 4.3791 0.4050

Total number of hypothesis tests conducted = 345


Geisser-Greenhouse Correction: None (Regular F)

This report displays the genes for which the False Discovery Rate Adjusted Prob Level is less
than 0.05 for the interaction. The columns are described in Example 1.

Storage Data
The pre-processed data for the 2 most significant genes are output into the spreadsheet.
RM2_Split dataset after data storage
Factor1 Factor2 Treatment OutputFile Random X31962_at X94766_at
1 M Trt1 ...1.ges 6 4.347998793 3.60229846
1 M Trt2 ...2.ges 5 4.871123991 2.547631461
2 M Trt1 ...3.ges 8 4.111716766 5.001783623
2 M Trt2 ...4.ges 9 5.771551162 2.84931956
3 M Trt1 ...5.ges 7 4.206340193 4.490210641
3 M Trt2 ...6.ges 6 5.639026008 2.438519502
4 M Trt1 ...7.ges 4 3.75557322 4.313683288
4 M Trt2 ...8.ges 6 5.018803914 4.540379736
5 M Trt1 ...9.ges 5 4.788529126 4.302717763
5 M Trt2 ...10.ges 3 6.887367824 2.716635127
6 M Trt1 ...11.ges 8 3.738071745 4.61868161
6 M Trt2 ...12.ges 9 5.237574131 2.834071854
7 F Trt1 ...13.ges 9 4.395934876 4.179028364
7 F Trt2 ...14.ges 4 5.157022636 2.884850643
8 F Trt1 ...15.ges 5 4.212181686 4.963690929
8 F Trt2 ...16.ges 4 5.528088451 2.665368144
9 F Trt1 ...17.ges 6 3.917985941 4.061706345
9 F Trt2 ...18.ges 8 5.014277601 2.794416622
10 F Trt1 ...19.ges 5 3.815579149 3.463388104
10 F Trt2 ...20.ges 6 4.94345766 2.770238937
11 F Trt1 ...21.ges 7 4.384939058 5.074048944
11 F Trt2 ...22.ges 3 5.586884161 2.732024731
12 F Trt1 ...23.ges 2 4.085517525 5.117822837
12 F Trt2 ...24.ges 5 5.949754288 2.922561482

An X is added at the beginning of the variable names to avoid a variable name beginning with a
number. This data can be analyzed further using the NCSS: Repeated Measures Analysis of
Variance procedure. However, when hypothesis tests are run using the NCSS: Repeated Measures
Analysis of Variance procedure, adjustments for multiplicity of tests across genes are no longer
made.

Step 4 – Follow-Up Analysis


Twenty-four pre-processed values should have been saved for 2 genes, X31962_at and
X94766_at.
More specific analyses of X31962_at, for example, may be obtained using the NCSS: Repeated
Measures Analysis of Variance procedure.

1 Open the NCSS: Repeated Measures Analysis of Variance window.


• On the menus, select Analysis, then Analysis of Variance (ANOVA), then Repeated
Measures Analysis of Variance. The NCSS: Repeated Measures Analysis of Variance
procedure will be displayed.
• On the menus, select File, then New Template. This will fill the procedure with the
default template.

2 Specify the variables.


• Select the Variables tab.
• Set Response Variable(s) to X31962_at.
• Set the Between Factor 1 to Gender.
• Set the Subject Variable to Subject.
• Set the Within Factor 1 to Treatment.

3 Specify the model.


• Select the Model tab.
• Under Which Model Terms, select Full Model. Use all terms.

4 Run the procedure.


• From the Run menu, select Run Procedure. Alternatively, just click the Run button (the
left-most button on the button bar at the top).

NCSS Repeated Measures Analysis of Variance Output


Expected Mean Squares Section
Source Term Denominator Expected
Term DF Fixed? Term Mean Square
A: Gender 1 Yes B(A) S+csB+bcsA
B(A): Subject 10 No S(ABC) S+csB
C: Treatment 1 Yes BC(A) S+sBC+absC
AC 1 Yes BC(A) S+sBC+bsAC
BC(A) 10 No S(ABC) S+sBC
S(ABC) 0 No S
Note: Expected Mean Squares are for the balanced cell-frequency case.

Analysis of Variance Table


Source Sum of Mean Prob Power
Term DF Squares Square F-Ratio Level (Alpha=0.05)
A: Gender 1 7.958636E-02 7.958636E-02 0.23 0.643893 0.071698
B(A): Subject 10 3.503788 0.3503788
C: Treatment 1 10.46043 10.46043 103.98 0.000001* 1.000000
AC 1 5.132553E-02 5.132553E-02 0.51 0.491398 0.099319
BC(A) 10 1.006031 0.1006031
S 0
Total (Adjusted) 23 15.10116
Total 24
* Term significant at alpha = 0.05
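
Each F-Ratio in this table is the term's mean square divided by the mean square of its Denominator Term from the Expected Mean Squares Section. The short sketch below repeats that division with the reported values; the dictionary and function names are hypothetical.

```python
# Mean squares copied from the Analysis of Variance Table for X31962_at.
MEAN_SQUARES = {
    "A": 7.958636e-02,
    "B(A)": 0.3503788,
    "C": 10.46043,
    "AC": 5.132553e-02,
    "BC(A)": 0.1006031,
}
# Denominator Term for each tested term, from the Expected Mean Squares Section.
DENOMINATOR = {"A": "B(A)", "C": "BC(A)", "AC": "BC(A)"}


def f_ratios(mean_squares, denominator):
    """Divide each tested term's mean square by its denominator mean square."""
    return {term: mean_squares[term] / mean_squares[den]
            for term, den in denominator.items()}


for term, value in f_ratios(MEAN_SQUARES, DENOMINATOR).items():
    print(f"{term:3s} F = {value:.3f}")
```

The ratio for C: Treatment should agree with the F Value of 103.977 reported for 31962_at in the GESS F-Test Detail report, which confirms that both procedures are running the same test on this gene.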

Probability Levels for F-Tests with Geisser-Greenhouse Adjustments


Lower Geisser Huynh
Bound Greenhouse Feldt
Regular Epsilon Epsilon Epsilon
Source Prob Prob Prob Prob
Term DF F-Ratio Level Level Level Level
A: Gender 1 0.23 0.643893
B(A): Subject 10
C: Treatment 1 103.98 0.000001* 0.000001* 0.000001* 0.000001*
AC 1 0.51 0.491398 0.491398 0.491398 0.491398
BC(A) 10
S 0

Power Values for F-Tests with Geisser-Greenhouse Adjustments Section


Lower Geisser Huynh
Bound Greenhouse Feldt
Regular Epsilon Epsilon Epsilon
Source Power Power Power Power
Term DF F-Ratio (Alpha=0.05) (Alpha=0.05) (Alpha=0.05) (Alpha=0.05)
A: Gender 1 0.23 0.071698
B(A): Subject 10
C: Treatment 1 103.98 1.000000 1.000000 1.000000 1.000000
AC 1 0.51 0.099319 0.099319 0.099319 0.099319
BC(A) 10
S 0

Box's M Test for Equality of Between-Group Covariance Matrices Section


Covariance
Source F Prob Chi2 Prob Matrices
Term Box's M DF1 DF2 Value Level Value Level Equal?
BC(A) 2.27 3.0 18000.0 0.59 0.620328 1.77 0.620413 Okay

Covariance Matrix Circularity Section


Lower Geisser Huynh Mauchly Covariance
Source Bound Greenhouse Feldt Test Chi2 Prob Matrix
Term Epsilon Epsilon Epsilon Statistic Value DF Level Circularity?
BC(A) 1.000000 1.000000 1.000000 1.000000 0.0 0.0 1.000000 Okay

Note: Mauchly's statistic actually tests the more restrictive assumption that the pooled covariance matrix
has compound symmetry.
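
All three epsilon values equal 1 in this example because the within factor has only two levels. The lower bound for epsilon is 1/(c - 1), where c is the number of within-factor levels, and epsilon cannot exceed 1, so with c = 2 no adjustment is possible. The sketch below shows the bound and how an epsilon scales the degrees of freedom; the function names are hypothetical.

```python
def lower_bound_epsilon(levels):
    """Smallest possible epsilon for a within factor with the given levels."""
    if levels < 2:
        raise ValueError("a within factor needs at least two levels")
    return 1.0 / (levels - 1)


def corrected_df(df1, df2, epsilon):
    """Numerator and denominator DF after multiplying both by epsilon."""
    return df1 * epsilon, df2 * epsilon


print(lower_bound_epsilon(2))  # Example 2: Treatment has two levels
print(lower_bound_epsilon(3))  # Example 1: Time has three levels
print(corrected_df(2, 12, lower_bound_epsilon(3)))
```

This is also why the regular and adjusted probability levels are identical in the Geisser-Greenhouse tables above, while Example 1, with three time points, is a case where a correction can change the result.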

Means and Standard Error Section


Standard
Term Count Mean Error
All 24 4.806888
A: Gender
F 12 4.749302 0.1708749
M 12 4.864473 0.1708749
C: Treatment
Trt1 12 4.146698 9.156196E-02
Trt2 12 5.467078 9.156196E-02
AC: Gender,Treatment
F,Trt1 6 4.135356 0.1294882
F,Trt2 6 5.363247 0.1294882
M,Trt1 6 4.158038 0.1294882
M,Trt2 6 5.570908 0.1294882
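
The standard errors in this section appear consistent with the square root of the denominator term's mean square divided by the count: B(A) for Gender, and BC(A) for Treatment and for the Gender by Treatment cells. This relationship is inferred from the reported numbers, and the function name standard_error is hypothetical.

```python
import math


def standard_error(ms_denominator, count):
    """Standard error of a mean based on `count` observations."""
    return math.sqrt(ms_denominator / count)


MS_SUBJECT = 0.3503788  # B(A) mean square from the Analysis of Variance Table
MS_WITHIN = 0.1006031   # BC(A) mean square from the Analysis of Variance Table

print(f"Gender means:    {standard_error(MS_SUBJECT, 12):.7f}")
print(f"Treatment means: {standard_error(MS_WITHIN, 12):.7f}")
print(f"Cell means:      {standard_error(MS_WITHIN, 6):.7f}")
```

The Treatment means have a much smaller standard error than the Gender means because the within-subject error term, BC(A), is smaller than the between-subject error term, B(A).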

Plots Section

[Five plots of X31962_at (vertical axis from 3.50 to 7.00): Means of X31962_at by Gender; Means of
X31962_at by Treatment; Means of X31962_at by Gender with a separate line for each Treatment
(Trt1, Trt2); Means of X31962_at by Treatment with a separate line for each Gender (F, M); and
X31962_at vs Treatment by Subject.]

The full analysis of variance and means tables, tests of assumptions, and graphics can be used to
further study the results of each gene that is found to be statistically significant. Notice, however,
that no correction is made for multiple testing across genes in NCSS. Details of the NCSS:
Repeated Measures Analysis of Variance procedure are in Chapter 214 of the NCSS manuals.
