
Which Measure of Central Tendency to Use? Mode, Mean, or Median?

A concept that may need more explanation is which average is appropriate for a given question. What is the best measure of central tendency? When would you use a median instead of a mean, or a mode instead? For any numerical data set you can compute all three averages, but the following basic guidelines help you choose the one that best describes your data.

1. For data that are roughly symmetric, such as a normal distribution, the mean is preferred.
2. For a skewed data set, the median is more appropriate than the mean. Extreme data points pull the mean toward the tail, so it ends up farther from the bulk of the data than the median and is therefore less central.
3. The mode can be used for non-numerical data, e.g. hair colour in a classroom.

Here are a few examples of where each would be appropriate:

Mean:
1) students' heights in a classroom
2) temperature over a length of time

Median:
1) income of a group of people
2) test scores for a group of students

Mode:
1) finding the most common hair colour in a room
2) finding the most common car in a parking lot

Central Tendency

The term central tendency refers to the "middle" value or perhaps a typical value of the data, and is measured using the mean, median, or mode. Each of these measures is calculated differently, and the one that is best to use depends upon the situation.

Mean

The mean is the most commonly used measure of central tendency. When we talk about an "average", we usually are referring to the mean. The mean is simply the sum of the values divided by the total number of items in the set. The result is referred to as the arithmetic mean. Sometimes it is useful to give more weighting to certain data points, in which case the result is called the weighted arithmetic mean.

The notation used to express the mean depends on whether we are talking about the population mean or the sample mean:

$\mu$ = population mean
$\bar{x}$ = sample mean

The population mean then is defined as:

$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$

where
$N$ = number of data points in the population
$x_i$ = value of each data point $i$.

The mean is valid only for interval data or ratio data. Since it uses the values of all of the data points in the population or sample, the mean is influenced by outliers that may be at the extremes of the data set.
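As a minimal sketch of the arithmetic mean and the weighted arithmetic mean described above, the following Python uses the standard `statistics` module. The helper name `weighted_mean`, the sample values, and the weights are illustrative assumptions, not part of the original text.

```python
from statistics import fmean


def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i).

    Hypothetical helper written for illustration only.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


data = [1, 2, 3, 4, 10]

# Plain arithmetic mean: sum of the values divided by their count.
plain = fmean(data)

# Give the extreme point (10) half the weight of the others (assumed weights).
down_weighted = weighted_mean(data, [1, 1, 1, 1, 0.5])

print(plain, down_weighted)
```

With equal weights the weighted mean reduces to the ordinary mean, which is one quick way to sanity-check such a helper.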
Median

The median is determined by sorting the data set from lowest to highest values and taking the data point in the middle of the sequence. There is an equal number of points above and below the median. For example, in the data set {1,2,3,4,5} the median is 3; there are two data points greater than this value and two data points less than this value. In this case, the median is equal to the mean. But consider the data set {1,2,3,4,10}. In this data set, the median is still 3, but the mean is equal to 4. If there is an even number of data points in the set, then there is no single point at the middle and the median is calculated by taking the mean of the two middle points.

The median can be determined for ordinal data as well as interval and ratio data. Unlike the mean, the median is not influenced by outliers at the extremes of the data set. For this reason, the median often is used when there are a few extreme values that could greatly influence the mean and distort what might be considered typical. This often is the case with home prices and with income data for a group of people, which often is very skewed. For such data, the median often is reported instead of the mean. For example, in a group of people, if the salary of one person is 10 times the mean, the mean salary of the group will be higher because of the unusually large salary. In this case, the median may better represent the typical salary level of the group.
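The two data sets from the text can be checked directly with Python's `statistics` module. The sketch below also adds an even-sized set and a salary list; both are made-up values chosen only to illustrate the point about outliers.

```python
from statistics import mean, median

symmetric = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 10]
even_sized = [1, 2, 3, 4]  # assumed example of an even number of points

print(mean(symmetric), median(symmetric))        # 3 and 3
print(mean(with_outlier), median(with_outlier))  # 4 and 3
print(median(even_sized))                        # 2.5, mean of the two middle points

# Hypothetical salaries: one very large value pulls the mean far above
# what most of the group earns, while the median stays put.
salaries = [30_000, 32_000, 35_000, 38_000, 400_000]
print(mean(salaries), median(salaries))
```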
Mode

The mode is the most frequently occurring value in the data set. For example, in the data set {1,2,3,4,4}, the mode is equal to 4. A data set can have more than a single mode, in which case it is multimodal. In the data set {1,1,2,3,3} there are two modes: 1 and 3. The mode can be very useful for dealing with categorical data. For example, if a sandwich shop sells 10 different types of sandwiches, the mode would represent the most popular sandwich. The mode also can be used with ordinal, interval, and ratio data. However, in interval and ratio scales, the data may be spread thinly with no data points having the same value. In such cases, the mode may not exist or may not be very meaningful.
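A short sketch of these cases, using `statistics.multimode` from the Python standard library (available from Python 3.8). The sandwich names and the measurement values are invented for illustration.

```python
from statistics import multimode

# Single mode and two modes, using the data sets from the text.
single = multimode([1, 2, 3, 4, 4])
bimodal = multimode([1, 1, 2, 3, 3])

# Categorical data: the most popular item (hypothetical orders).
orders = ["ham", "turkey", "ham", "veggie", "ham", "turkey"]
most_popular = multimode(orders)

# Thinly spread interval/ratio data: every value occurs once, so
# multimode returns all of them and the result is not informative.
measurements = [1.2, 3.4, 5.6]
uninformative = multimode(measurements)

print(single, bimodal, most_popular, uninformative)
```

`multimode` returns a list, so the multimodal and "no meaningful mode" cases can be detected by checking the length of the result.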
When to use Mean, Median, and Mode

The following table summarizes the appropriate methods of determining the middle or typical value of a data set based on the measurement scale of the data.

Measurement Scale        Best Measure of the "Middle"

Nominal (Categorical)    Mode
Ordinal                  Median
Interval                 Symmetrical data: Mean; Skewed data: Median
Ratio                    Symmetrical data: Mean; Skewed data: Median
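The table can be read as a small decision rule. The following Python is a sketch of that rule under stated assumptions: the function name `typical_value` is hypothetical, the caller decides whether the data count as skewed, and ordinal data are assumed to be encoded as numeric rank codes so that they sort correctly.

```python
from statistics import mean, median, median_low, multimode


def typical_value(data, scale, skewed=False):
    """Pick a measure of the "middle" by measurement scale (illustrative only)."""
    scale = scale.lower()
    if scale == "nominal":
        # Categories have no order, so only the mode applies.
        return multimode(data)
    if scale == "ordinal":
        # median_low always returns an actual data value, which suits
        # rank codes where averaging two categories may be meaningless.
        return median_low(data)
    if scale in ("interval", "ratio"):
        return median(data) if skewed else mean(data)
    raise ValueError(f"unknown measurement scale: {scale!r}")


print(typical_value(["red", "blue", "red"], "nominal"))
print(typical_value([1, 2, 2, 3], "ordinal"))
print(typical_value([1, 2, 3, 4, 5], "interval"))
print(typical_value([1, 2, 3, 4, 10], "ratio", skewed=True))
```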

The measures of central tendency are the mean, median, and mode. The range is often taught alongside them, but it measures the spread of the data rather than its centre.

Let's Review

To find the:
Mean: (same as average) Add the numbers and divide by the number of numbers.
Mode: The number that occurs the most.
Median: The number in the middle of the data set (after the numbers are ordered least to greatest or vice versa).
Range: The difference between the highest and lowest numbers (subtract).
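The four quantities in the review can be computed in a few lines of Python. The scores below are made up, and `data_range` is a hypothetical helper name (the standard library has no range statistic of its own).

```python
from statistics import mean, median, multimode


def data_range(values):
    """Difference between the highest and lowest values."""
    return max(values) - min(values)


scores = [7, 3, 9, 3, 8]  # hypothetical data

print("mean:", mean(scores))         # add the numbers, divide by how many
print("mode:", multimode(scores))    # value(s) occurring most often
print("median:", median(scores))     # middle value after sorting
print("range:", data_range(scores))  # highest minus lowest
```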

Measures of central tendency: Median and mode


S. Manikandan

INTRODUCTION
Apart from the mean, the median and the mode are the two commonly used measures of central tendency. The median is sometimes referred to as a measure of location as it tells us where the data are.[1] This article describes the median and the mode, along with guidelines for selecting the appropriate measure of central tendency.

MEDIAN
The median is the value which occupies the middle position when all the observations are arranged in ascending or descending order. It divides the frequency distribution exactly into two halves. Fifty percent of observations in a distribution have scores at or below the median. Hence the median is the 50th percentile.[2] The median is also known as the positional average.[3] It is easy to calculate the median. If the number of observations n is odd, then the ((n + 1)/2)th observation (in the ordered set) is the median. When the number of observations is even, the median is given by the mean of the (n/2)th and (n/2 + 1)th observations.[2]
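The positional rule above translates directly into code once the 1-based positions are converted to 0-based list indices. This is a sketch for illustration; the function name `positional_median` is an assumption, and in practice `statistics.median` from the standard library does the same job.

```python
from statistics import median


def positional_median(observations):
    """Median via the (n + 1)/2 rule for odd n and the
    mean of positions n/2 and n/2 + 1 for even n (1-based positions)."""
    ordered = sorted(observations)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        # 1-based position (n + 1)/2 is 0-based index (n + 1)//2 - 1.
        return ordered[(n + 1) // 2 - 1]
    # 1-based positions n/2 and n/2 + 1 are 0-based indices n//2 - 1 and n//2.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


odd_case = positional_median([5, 1, 3])
even_case = positional_median([4, 1, 3, 2])
print(odd_case, even_case)

# Cross-check against the standard library implementation.
print(median([5, 1, 3]), median([4, 1, 3, 2]))
```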
Advantages
1. It is easy to compute and comprehend.
2. It is not distorted by outliers/skewed data.[4]
3. It can be determined for ratio, interval, and ordinal scale.

Disadvantages
1. It does not take into account the precise value of each observation and hence does not use all information available in the data.
2. Unlike mean, median is not amenable to further mathematical calculation and hence is not used in many statistical tests.
3. If we pool the observations of two groups, median of the pooled group cannot be expressed in terms of the individual medians of the pooled groups.

MODE
Mode is defined as the value that occurs most frequently in the data. Some data sets do not have a mode because each value occurs only once. On the other hand, some data sets can have more than one mode. This happens when the data set has two or more values of equal frequency which is greater than that of any other value. Mode is rarely used as a summary statistic except to describe a bimodal distribution. In a bimodal distribution, the taller peak is called the major mode and the shorter one is the minor mode.
Advantages
1. It is the only measure of central tendency that can be used for data measured in a nominal scale.[5]
2. It can be calculated easily.

Disadvantages
1. It is not used in statistical analysis, as it is not algebraically defined and the frequency of each observation fluctuates more when the sample size is small.

POSITION OF MEASURES OF CENTRAL TENDENCY


The relative position of the three measures of central tendency (mean, median, and mode) depends on the shape of the distribution. All three measures are identical in a normal distribution [Figure 1a]. As the mean is always pulled toward the extreme observations, the mean is shifted to the tail in a skewed distribution [Figure 1b and c]. Mode is the most frequently occurring score and hence it lies in the hump of the skewed distribution. Median lies in between the mean and the mode in a skewed distribution.[6,7]
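The ordering can be seen on a small example. The data set below is hypothetical, chosen to be right skewed, and its mirror image is used as a left-skewed counterpart. The sketch shows the pattern for these particular values only.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: most values are small, with a long upper tail.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

# Mirror the data to obtain a left-skewed set with a long lower tail.
left_skewed = [15 - x for x in right_skewed]


def summarize(data):
    """Return the three measures of central tendency for one data set."""
    return {"mode": mode(data), "median": median(data), "mean": mean(data)}


right = summarize(right_skewed)
left = summarize(left_skewed)

print(right)  # mode < median < mean: the mean sits toward the upper tail
print(left)   # mean < median < mode: the mean sits toward the lower tail
```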

Figure 1. The relative position of the various measures of central tendency. (a) Normal distribution (b) Positively (right) skewed distribution (c) Negatively (left) skewed distribution

SELECTING THE APPROPRIATE MEASURE


Mean is generally considered the best measure of central tendency and the most frequently used one. However, there are some situations where the other measures of central tendency are preferred. Median is preferred to mean[3] when
1. There are few extreme scores in the distribution.
2. Some scores have undetermined values.
3. There is an open ended distribution.
4. Data are measured in an ordinal scale.

Mode is the preferred measure when data are measured in a nominal scale. Geometric mean is the preferred measure of central tendency when data are measured in a logarithmic scale.[8]
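To illustrate the last point, the sketch below compares the arithmetic and geometric means on values that span several orders of magnitude. The values are invented; `statistics.geometric_mean` is part of the Python standard library from version 3.8.

```python
import math
from statistics import fmean, geometric_mean

# Hypothetical measurements spread evenly on a logarithmic scale.
values = [1, 10, 100, 1000]

arithmetic = fmean(values)
geometric = geometric_mean(values)

# The geometric mean equals the exponential of the mean of the logarithms,
# so it is the ordinary mean taken on the log scale.
via_logs = math.exp(fmean(math.log(v) for v in values))

print(arithmetic)  # dominated by the largest value
print(geometric)   # about 31.6, the midpoint on the log scale
print(via_logs)
```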


Footnotes
Source of Support: Nil Conflict of Interest: None declared.

REFERENCES
1. Swinscow TD, Campbell MJ. Statistics at square one. 10th ed (Indian). New Delhi: Viva Books Private Limited; 2003.
2. Gravetter FJ, Wallnau LB. Statistics for the behavioral sciences. 5th ed. Belmont: Wadsworth Thomson Learning; 2000.
3. Sundaram KR, Dwivedi SN, Sreenivas V. Medical statistics principles and methods. 1st ed. New Delhi: B.I Publications Pvt Ltd; 2010.
4. Petrie A, Sabin C. Medical statistics at a glance. 3rd ed. Oxford: Wiley-Blackwell; 2009.
5. Norman GR, Streiner DL. Biostatistics the bare essentials. 2nd ed. Hamilton: B.C. Decker Inc; 2000.
6. SundarRao PS, Richard J. Introduction to biostatistics and research methods. 4th ed. New Delhi: Prentice Hall of India Pvt Ltd; 2006.
7. Glaser AN. High Yield Biostatistics. 1st Indian ed. New Delhi: Lippincott Williams and Wilkins; 2000.
8. Dawson B, Trapp RG. Basic and Clinical Biostatistics. 4th ed. New York: McGraw-Hill; 2004.
