Documente Academic
Documente Profesional
Documente Cultură
MEAN
Mean is the most commonly used measure of central tendency. There are different types of
mean, viz. arithmetic mean, weighted mean, geometric mean (GM) and harmonic mean (HM). If
mentioned without an adjective (as mean), it generally refers to the arithmetic mean.
Arithmetic mean
Arithmetic mean (or, simply, “mean”) is nothing but the average. It is computed by adding all
the values in the data set divided by the number of observations in it. If we have the raw data,
mean is given by the formula
Where, ∑ (the uppercase Greek letter sigma), X refers to summation, refers to the individual
value and n is the number of observations in the sample (sample size). The research articles
published in journals do not provide raw data and, in such a situation, the readers can compute
the mean by calculating it from the frequency distribution (if provided).
Where, f is the frequency and X is the midpoint of the class interval and n is the number of
observations.
ADVANTAGES
The mean uses every value in the data and hence is a good representative of the data. The irony
in this is that most of the times this value never appears in the raw data.
Repeated samples drawn from the same population tend to have similar means. The mean is
therefore the measure of central tendency that best resists the fluctuation between different
samples.
It is closely related to standard deviation, the most common measure of dispersion.
DISADVANTAGES
The important disadvantage of mean is that it is sensitive to extreme values/outliers, especially
when the sample size is small. Therefore, it is not an appropriate measure of central tendency for
skewed distribution.
Mean cannot be calculated for nominal or nonnominal ordinal data. Even though mean can be
calculated for numerical ordinal data, many times it does not give a meaningful value, e.g. stage
of cancer.
Weighted mean
Weighted mean is calculated when certain values in a data set are more important than the
others. A weight wi is attached to each of the values xi to reflect this importance.
For example, When weighted mean is used to represent the average duration of stay by a patient
in a hospital, the total number of cases presenting to each ward is taken as the weight.
Geometric Mean
It is defined as the arithmetic mean of the values taken on a log scale. It is also expressed as the
nth root of the product of an observation.
Harmonic mean
It is the reciprocal of the arithmetic mean of the observations.
HM is appropriate in situations where the reciprocals of values are more useful. HM is used
when we want to determine the average sample size of a number of groups, each of which has a
different sample size.
MEDIAN
Median is the value which occupies the middle position when all the observations are arranged in
an ascending/descending order. It divides the frequency distribution exactly into two halves.
Fifty percent of observations in a distribution have scores at or below the median. Hence median
is the 50th percentile. Median is also known as ‘positional average.
It is easy to calculate the median. If the number of observations are odd, then (n + 1)/2th
observation (in the ordered set) is the median. When the total number of observations are even, it
is given by the mean of n/2th and (n/2 + 1)th observation.
Advantages
1. It is easy to compute and comprehend.
2. It is not distorted by outliers/skewed data.
3. It can be determined for ratio, interval, and ordinal scale.
Disadvantages
1. It does not take into account the precise value of each observation and hence does not use
all information available in the data.
2. Unlike mean, median is not amenable to further mathematical calculation and hence is
not used in many statistical tests.
3. If we pool the observations of two groups, median of the pooled group cannot be
expressed in terms of the individual medians of the pooled groups.
Calculation of Median
In individual series, where data is given in the raw form, the first step towards median calculation is
to arrange the data in ascending or descending order. Now calculate the number of observations
denoted by N. The next step is decided by whether the value of N is even or odd.
1. If the value of N is odd then simply the value of (N+1)/2 th item is median for the data.
2. If the value of N is even, then use this formula: Median = [ size of (N+1)/2 term + size of
(N/2 + 1)th term]÷2
MODE
Mode is defined as the value that occurs most frequently in the data. Some data sets do not have
a mode because each value occurs only once. On the other hand, some data sets can have more
than one mode. This happens when the data set has two or more values of equal frequency which
is greater than that of any other value. Mode is rarely used as a summary statistic except to
describe a bimodal distribution. In a bimodal distribution, the taller peak is called the major
mode and the shorter one is the minor mode.
Advantages
1. It is the only measure of central tendency that can be used for data measured in a nominal
scale.
2. It can be calculated easily.
Disadvantages
1. It is not used in statistical analysis as it is not algebraically defined and the fluctuation in
the frequency of observation is more when the sample size is small.
Calculation of Mode