Detecting Outliers

Outlier detection is both easy and difficult.
• It is easy since there are several relatively straightforward tests for the presence of outliers.
• It is difficult since there are no firm rules as to when outlier removal is appropriate.

Detecting Outliers Using SPSS:
Calculating Z-scores

Analyze > Descriptive Statistics

9/20/2019 Aniceto B. Naval 2


This opens a window that allows us to select the variables for which z-scores will be computed.

Check "Save standardized values as variables".

Expected output:

The only thing that will show in the output window is a table like the above for the descriptive statistics.

The z-scores (standardized scores) will show as a
new variable in the Data Editor window:

The easiest way to look for outliers with the z-scores is to scan the list visually, looking for numbers that are greater than 3 in absolute value.

This would indicate an outlier.
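
The same check can be sketched outside SPSS. The Python below is only an illustration: the `scores` data and the function names are made up for this example, and the sample standard deviation (n - 1) is assumed for the z-scores.

```python
import statistics

def z_scores(values):
    """Standardize each value: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(v - mean) / sd for v in values]

def flag_outliers(values, threshold=3.0):
    """Return (index, value, z) for every case with |z| greater than threshold."""
    return [
        (i, v, z)
        for i, (v, z) in enumerate(zip(values, z_scores(values)))
        if abs(z) > threshold
    ]

# Hypothetical data: 19 ordinary scores and one very large one.
scores = [48, 50, 52, 49, 51, 50, 47, 53, 50, 49,
          51, 52, 48, 50, 49, 51, 50, 52, 48, 120]

for index, value, z in flag_outliers(scores):
    print(f"case {index}: value={value}, z={z:.2f}")
```

Only the last case (the value 120) has a z-score above 3 in absolute value here.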

Outlier tests are an iterative process.
1. Check the most extreme value for being an outlier.
2. If it is, remove it.
3. Check the next most extreme value using the new, smaller sample. It is smaller because the first outlier was removed.
4. Repeat the process.

Once all outliers are removed, the sample can be analyzed.
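
A minimal sketch of this loop in Python, assuming the |z| > 3 rule is the outlier test at each step (the slides do not fix a particular test for the iteration). The function name and the data are hypothetical.

```python
import statistics

def remove_outliers_iteratively(values, threshold=3.0):
    """Repeatedly drop the most extreme value while its |z| exceeds threshold."""
    remaining = list(values)
    removed = []
    while len(remaining) > 2:
        mean = statistics.mean(remaining)
        sd = statistics.stdev(remaining)  # assumption: sample SD (n - 1)
        if sd == 0:
            break  # all values identical, nothing can be an outlier
        most_extreme = max(remaining, key=lambda v: abs(v - mean))
        if abs(most_extreme - mean) / sd <= threshold:
            break  # the most extreme value is not an outlier, so stop
        remaining.remove(most_extreme)
        removed.append(most_extreme)
    return remaining, removed

# Hypothetical data: 19 ordinary scores plus two large values.
scores = [48, 50, 52, 49, 51, 50, 47, 53, 50, 49,
          51, 52, 48, 50, 49, 51, 50, 52, 48, 120, 80]

cleaned, dropped = remove_outliers_iteratively(scores)
print(dropped)
```

With this data the value 80 is not flagged on the first pass, because 120 inflates the standard deviation. It is flagged only after 120 has been removed, which is why the test is repeated on the smaller sample.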

Procedure for Identifying Outliers:

From the menu at the top of the screen, click on Analyze, then click on Descriptive Statistics, then Explore.

• In the Display section, make sure Both is selected. This provides both Statistics and Plots.
• Click on your variable (e.g. technology), and move it into the Dependent List box. Choose a factor (e.g. Gender of the Respondents) and move it into the Factor List box.
• Click Statistics.
• Click on Descriptives
and Outliers.
• Then click Continue.

• Click on Plots.

• Click on Histogram.
• The Case Processing Summary table displays the valid and missing cases.

• The Descriptives table gives an indication of how much of a problem the outlying cases are likely to be.
• The value of interest is the 5% Trimmed Mean.
• SPSS removes the top and bottom 5 per cent of the cases and calculates a new mean value to obtain this Trimmed Mean value.
• Compare the original mean and the new trimmed mean. If these two mean values are very different, the data points need to be investigated further.
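
The comparison of the two means can be sketched in Python. This is a simplified version that drops whole cases from each end; SPSS may treat fractional case counts differently, so its value will not always match exactly. The function name and the data are hypothetical.

```python
import statistics

def trimmed_mean(values, proportion=0.05):
    """Mean after dropping the lowest and highest `proportion` of cases."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # whole cases trimmed from each end
    return statistics.mean(ordered[k:len(ordered) - k])

# Hypothetical data: 19 ordinary scores and one very large one.
scores = [48, 50, 52, 49, 51, 50, 47, 53, 50, 49,
          51, 52, 48, 50, 49, 51, 50, 52, 48, 120]

original = statistics.mean(scores)
trimmed = trimmed_mean(scores)
print(f"mean={original:.2f}, 5% trimmed mean={trimmed:.2f}")
```

Here the single large value pulls the ordinary mean well above the trimmed mean, which is the kind of gap that calls for a closer look at the extreme cases.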
• The Extreme Values table gives the highest and the lowest values recorded for that variable and also provides the ID of the person with that score.
• It helps to identify the cases that have the outlying values.
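
A rough Python equivalent of such a table, listing the highest and lowest values together with their case IDs. The choice of five cases per end, the function name, and the data are assumptions made for this sketch.

```python
def extreme_values(cases, n=5):
    """Return the n highest and n lowest (case_id, value) pairs."""
    ordered = sorted(cases, key=lambda case: case[1])
    lowest = ordered[:n]
    highest = ordered[-n:][::-1]  # largest value first
    return highest, lowest

# Hypothetical data: case IDs 1..20 paired with their scores.
scores = [48, 50, 52, 49, 51, 50, 47, 53, 50, 49,
          51, 52, 48, 50, 49, 51, 50, 52, 48, 120]
cases = list(enumerate(scores, start=1))

highest, lowest = extreme_values(cases)
print("Highest:", highest)
print("Lowest: ", lowest)
```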

Have a look at the Histogram and check the tails of the distribution for data points sitting on their own, out at the extremes.

Inspect the Boxplot to see whether SPSS identifies any outliers. These outliers are displayed as little circles with an ID number attached.
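
A boxplot-style check can be approximated in Python. The sketch assumes the common rule that a point is an outlier when it lies more than 1.5 box lengths (interquartile ranges) beyond the edge of the box. The quartiles come from `statistics.quantiles`, which may differ slightly from the hinges SPSS uses. The function name and the data are hypothetical.

```python
import statistics

def boxplot_outliers(cases, factor=1.5):
    """Return (case_id, value) pairs lying beyond factor * IQR from the box."""
    values = [value for _, value in cases]
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile estimates
    iqr = q3 - q1
    low_fence = q1 - factor * iqr
    high_fence = q3 + factor * iqr
    return [(cid, v) for cid, v in cases if v < low_fence or v > high_fence]

# Hypothetical data: case IDs 1..20 paired with their scores.
scores = [48, 50, 52, 49, 51, 50, 47, 53, 50, 49,
          51, 52, 48, 50, 49, 51, 50, 52, 48, 120]
cases = list(enumerate(scores, start=1))

print(boxplot_outliers(cases))
```

Returning the case ID next to each value mirrors the ID labels attached to the circles in the plot.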
