
CETM11

Comparison of means

Objectives
At the end of this session you will be able to:
Plot a histogram to assess the normality of
the distribution of data
Know when to apply a parametric or nonparametric test
Apply the parametric Independent T Test
Apply the non-parametric Mann-Whitney test

Aims
T-tests
Dependent (aka paired, matched)
Independent

Rationale for the tests


Assumptions

Interpretation
Reporting results
Calculating an Effect Size
T-tests as a GLM

Experiments
The simplest form of experiment that can be
done is one with only one independent
variable that is manipulated in only two ways
and only one outcome is measured.
More often than not the manipulation of the
independent variable involves having an
experimental condition and a control.
E.g., Is the movie Scream 2 scarier than the original
Scream? We could measure heart rates (which
indicate anxiety) during both films and compare
them.

This situation can be analysed with a t-test

T-test
Dependent t-test
Compares two means based on related data.
E.g., Data from the same people measured at
different times.
Data from matched samples.

Independent t-test
Compares two means based on independent data
E.g., data from different groups of people

Significance testing
Testing the significance of Pearson's correlation coefficient
Testing the significance of b in regression.

Rationale for the t-test

$$
t = \frac{\left(\text{observed difference between sample means}\right) - \left(\text{expected difference between population means, if null hypothesis is true}\right)}{\text{estimate of the standard error of the difference between two sample means}}
$$

Assumptions of the t-test


Both the independent t-test and the dependent
t-test are parametric tests based on the normal
distribution. Therefore, they assume:
The sampling distribution is normally distributed. In
the dependent t-test this means that the sampling
distribution of the differences between scores should
be normal, not the scores themselves.
Data are measured at least at the interval level.

The independent t-test, because it is used to test different groups of people, also assumes:
Variances in these populations are roughly equal
(homogeneity of variance).
Scores in different treatment conditions are
independent (because they come from different
people).
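
As a rough illustration of how the independent t-test uses the homogeneity-of-variance assumption, the sketch below pools the two sample variances before estimating the standard error of the difference between means. The function name `independent_t` and the two score lists are hypothetical, made up for this example, and the code only computes t and its degrees of freedom (the p-value would still come from t tables or a statistics package).

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Return (t, df) for an independent t-test using a pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # Sample variances (n - 1 in the denominator).
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooling is only reasonable if the two variances are roughly equal.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    # Estimated standard error of the difference between the two sample means.
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    # Under the null hypothesis the expected difference is zero.
    t = (mean_a - mean_b) / se_diff
    return t, df


# Hypothetical scores from two different groups of people.
group_a = [5, 7, 8, 6, 9]
group_b = [3, 4, 6, 5, 2]

t, df = independent_t(group_a, group_b)
print(f"t({df}) = {t:.2f}")
```

With these made-up scores both groups have a sample variance of 2.5, so pooling changes nothing; with unequal variances the pooled estimate would be a weighted average of the two.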

Realistic distributions
Most data sets found in nature are not
normally distributed
How can you tell and what can you do if
they are not?

There are standard statistical tests, but the easiest way to assess normality approximately is to inspect a histogram of the data.
Plot a histogram of the following data of children's ages at a party:
Child_age: 6, 9, 3, 4, 1, 4, 0, 7, 7, 10, 6, 6, 5, 8, 12, 6, 7, 5, 6, 7, 8

[Figure: histogram of Child_age (x-axis) against Count (y-axis)]

The distribution is approximately symmetrical
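
A quick way to inspect the shape without a plotting package is a text histogram. The sketch below assumes the 21 age values that appear on this slide are the sample to be plotted; the variable names are illustrative only.

```python
from collections import Counter

# Age values as they appear on the slide (assumed to be the sample to plot).
child_age = [6, 9, 3, 4, 1, 4, 0, 7, 7, 10, 6, 6, 5, 8, 12, 6, 7, 5, 6, 7, 8]

# Count how many children have each age.
counts = Counter(child_age)

# One row per age, one '#' per child, so the bar lengths show the distribution.
for age in range(min(child_age), max(child_age) + 1):
    print(f"{age:>2} | {'#' * counts[age]}")
```

If the bars rise to a single peak near the middle and fall away similarly on both sides, the distribution can be treated as approximately symmetrical.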

When to use the t-test


Independent t-test
It is used to test whether there is a
significant difference between two means
collected from independent samples
Used where there are two experimental
conditions and different participants have
been used in each condition

When Assumptions are Broken


Dependent t-test
Wilcoxon Signed-Rank Test

Independent t-test
Mann-Whitney Test
Wilcoxon rank-sum test

Robust Tests:
Bootstrapping
Trimmed means

Example
Is arachnophobia (fear of spiders) specific to
real spiders or is a picture enough?
Participants
12 spider phobic individuals

Manipulation
Each participant was exposed to a real spider and a
picture of the same spider at two points in time.
(This is therefore a Dependent t-test)

Outcome
Anxiety
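
Because each participant provides a score in both conditions, the dependent t-test works on the differences between paired scores. The sketch below shows that calculation; the function name `dependent_t` and the anxiety scores are hypothetical and are not the data from this study.

```python
import math
import statistics


def dependent_t(cond_1, cond_2):
    """Return (t, df) for a dependent (paired) t-test."""
    # One difference score per participant.
    diffs = [b - a for a, b in zip(cond_1, cond_2)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference.
    se_diff = statistics.stdev(diffs) / math.sqrt(n)
    # Under the null hypothesis the expected mean difference is zero.
    return mean_diff / se_diff, n - 1


# Hypothetical anxiety scores for 12 participants (NOT the study's data).
picture = [20, 22, 25, 27, 30, 31, 33, 35, 36, 38, 40, 42]
real = [22, 26, 31, 35, 32, 35, 39, 43, 38, 42, 46, 50]

t, df = dependent_t(picture, real)
print(f"t({df}) = {t:.2f}")
```

Note that the normality assumption applies to the difference scores, not to the raw scores in either condition.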

Dependent t-test Output

Significance: p < .05, therefore the results are significant

Histogram and descriptive statistics

[Figure: histogram of Child_age (x-axis) against Count (y-axis)]

Mean: 6
Standard Error: 0.57735
Median:
Mode:
Standard Deviation: 2.708013
Sample Variance: 7.333333
Kurtosis: 0.888107
Skewness: -0.15826
Range:
Minimum:
Maximum: 12
Sum: 132
Count: 22
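
The descriptive statistics on this slide are linked by simple formulas, which gives a quick consistency check. The sketch below assumes that 7.333333 is the sample variance, 132 the sum and 22 the count, as the slide's labels suggest; the variable names are illustrative.

```python
import math

# Values read from the slide (assumed mapping of numbers to labels).
sample_variance = 7.333333
total = 132
count = 22

mean = total / count                     # sum divided by count
sd = math.sqrt(sample_variance)          # standard deviation
standard_error = sd / math.sqrt(count)   # standard error of the mean

print(f"mean = {mean:g}")
print(f"standard deviation = {sd:.6f}")
print(f"standard error = {standard_error:.5f}")
```

The results agree with the standard deviation (2.708013) and standard error (0.57735) reported on the slide.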

The t-test (also known as Student's t-test)
We are going to use this test to find the
statistical significance of the difference
between two sample means
It is also used for confidence intervals for
the differences between two population
means
The t-distribution, or Student's t-distribution, is a probability distribution
that arises in the problem of estimating
the mean of a normally distributed
population when the sample size is small.

Mann-Whitney

Definition: A non-parametric (distribution-free) test used
to compare two independent groups of sampled data.
Assumptions: Unlike the parametric t-test, this non-parametric
test does not assume that the data follow a particular
distribution (e.g. normality).
Characteristics: This test is an alternative to the
independent group t-test when the assumption of
normality or equality of variance is not met. Like many
non-parametric tests, it uses the ranks of the data rather than
their raw values to calculate the statistic. Because it does not
use a distributional assumption, it is generally less powerful
than the t-test when the t-test's assumptions are met.
Test: The hypotheses for the comparison of two
independent groups are:
H0: The two samples come from identical populations
H1: The two samples come from different populations

Mann-Whitney test
In order to apply the Mann-Whitney test, the
raw data from samples A and B must first be
combined into a set of N = na + nb elements.
These are then ranked from lowest to highest,
including tied rank values where appropriate.
These rankings are then re-sorted into the two
separate samples.
If your data have not yet been rank-ordered in
this fashion, SPSS allows you to import raw
data from a spreadsheet.
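
The ranking procedure described above can be sketched in a few lines. This is a minimal illustration, not SPSS's implementation: the helper names `average_ranks` and `mann_whitney_u` and the two samples are hypothetical, and the code stops at the U statistic rather than computing a p-value.

```python
def average_ranks(values):
    """Rank values from lowest to highest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend 'end' over the run of values tied with the one at 'start'.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Mean of the 1-based ranks start+1 .. end+1.
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U, rank sum of sample A) for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Combine both samples into one set of N = na + nb elements and rank it.
    ranks = average_ranks(list(sample_a) + list(sample_b))
    # Re-sort the rankings into the separate samples: A's ranks come first.
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b), rank_sum_a


# Hypothetical samples containing ties within and between groups.
sample_a = [12, 15, 15, 18, 20]
sample_b = [9, 11, 12, 14, 15]

u, rank_sum_a = mann_whitney_u(sample_a, sample_b)
print(f"U = {u}, rank sum of A = {rank_sum_a}")
```

A small U means the ranks of one sample sit mostly below those of the other, which is evidence against the hypothesis that both samples come from identical populations.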

Scores
These test scores were
obtained by students taught
by one method in 2003 and
by a new web-based method
in 2004
The research question is:
Is there a difference in the
mean score between the
years?
