Nowadays the word statistics is used in the following three senses.
Firstly, it is used in the plural sense to refer to numerical facts in any field of study. The
word “data” is used in the same sense and is always plural.
Secondly, the word statistics is used in the singular sense. In this sense, it refers to the
science comprising the methods of presentation, analysis and interpretation of numerical
data.
Thirdly, the word statistics is used in a technical sense as the plural of statistic. By a
statistic we mean a quantity calculated from sample observations.
Characteristics of Statistics
Statistics (as data) have the following characteristics.
Statistics are aggregates of facts.
Statistics are affected to a great extent by a multiplicity of causes.
Statistics are numerically expressed.
Statistics are collected in a systematic manner.
Importance of Statistics
Statistics is important (in application and scope) in different fields.
Statistics plays an important role in business.
It serves as the eyes of the administration of the state.
Statistical data and methods are needed by insurance companies.
It has a pivotal position in almost all the natural and social sciences.
Secondary data
The data published or used by an organization other than the one that originally collected
them are called secondary data. For example, the data in the Economic Survey of Pakistan are
secondary data.
Average
A single value that represents all the values of a data set is called an average.
Why is an average called a measure of central tendency?
An average tends to lie at the centre of the data, which is why it is called a measure of central tendency.
Qualities of a good Average
A good average should possess the following qualities.
It should be clearly defined by a mathematical formula.
It should be based on all the values of the data.
It should be simple to understand and easy to calculate.
It should not be affected by extreme values.
It should have sampling stability.
Types of Average
The following are the types of average.
The Arithmetic mean or mean ($\bar{x}$)
The Geometric mean (G.M)
The Harmonic mean (H.M)
The Median ($\tilde{x}$)
The Mode ($\hat{x}$)
$\bar{X} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_k x_k}{f_1 + f_2 + \cdots + f_k} = \frac{\sum fx}{\sum f}$
Methods to calculate the Arithmetic Mean
Three methods are used to calculate the arithmetic mean.
Direct Method
Short cut Method
Coding Method
Direct Method
Ungrouped data: $\bar{X} = \frac{\sum X}{n}$
Grouped data: $\bar{X} = \frac{\sum fx}{\sum f}$
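The direct method above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the sample data are assumptions made for the example, not part of the notes.

```python
def mean_ungrouped(values):
    """Direct method for ungrouped data: sum(X) / n."""
    return sum(values) / len(values)


def mean_grouped(midpoints, frequencies):
    """Direct method for grouped data: sum(f * x) / sum(f)."""
    total_fx = sum(f * x for x, f in zip(midpoints, frequencies))
    return total_fx / sum(frequencies)


# Hypothetical data, chosen only to illustrate the formulas.
print(mean_ungrouped([2, 4, 6, 8]))          # 20 / 4 = 5.0
print(mean_grouped([5, 15, 25], [2, 5, 3]))  # (10 + 75 + 75) / 10 = 16.0
```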
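Coding Method
Ungrouped data: $\bar{X} = A + \frac{\sum U}{n} \times h$
Grouped data: $\bar{X} = A + \frac{\sum fU}{\sum f} \times h$
where $U = \frac{X - A}{h}$, $A$ is an arbitrary constant, and $h$ is the common class interval in grouped data or the common difference between values in ungrouped data.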
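As a rough check of the formula with $U$, $A$ and $h$, the sketch below codes a small ungrouped data set and recovers the same mean as the direct method. The function name and the data are hypothetical.

```python
def mean_by_coding(values, A, h):
    """Mean via coded values U = (X - A) / h: mean = A + (sum(U) / n) * h."""
    u = [(x - A) / h for x in values]
    return A + (sum(u) / len(u)) * h


# Hypothetical values with a common difference of 10.
values = [10, 20, 30, 40, 50]
print(mean_by_coding(values, A=20, h=10))  # U = -1, 0, 1, 2, 3, so 20 + (5 / 5) * 10 = 30.0
print(sum(values) / len(values))           # direct method gives the same 30.0
```

Any choice of $A$ gives the same mean; a value near the middle of the data simply keeps the coded values small.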
If you have $\sum fx = 308$ and $\sum f = 30$, can we find the mean, and what is it?
Yes, we can find the mean, and its value is obtained as
$\bar{X} = \frac{\sum fx}{\sum f} = \frac{308}{30} = 10.27$
The sum of deviations of 15 values from 20 is 45. Find the arithmetic mean.
Here
$n = 15$, $A = 20$ and $\sum D = 45$
Then, by the short cut method,
$\bar{X} = A + \frac{\sum D}{n} = 20 + \frac{45}{15} = 20 + 3 = 23$
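The same short cut calculation can be written as a one-line function. The function name is an assumption made for this sketch; the inputs are the figures from the question.

```python
def mean_from_deviations(A, sum_of_deviations, n):
    """Short cut method: mean = A + sum(D) / n, where D = X - A."""
    return A + sum_of_deviations / n


print(mean_from_deviations(A=20, sum_of_deviations=45, n=15))  # 23.0
```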
Properties of Arithmetic mean
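The arithmetic mean has the following properties.
The sum of the deviations of values from their mean is always equal to zero.
The sum of the squared deviations of values from a constant $A$ is minimum if and only if $A = \bar{X}$.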
If for any distribution $\sum(x - 10) = -14$, $\sum(x - 20) = 14$ and $\sum(x - 15) = 0$, then
what is the mean?
By the property of the arithmetic mean, we know that “the sum of the deviations of values from their mean is
always equal to zero”.
Therefore, since $\sum(x - 15) = 0$, $\bar{X} = 15$.
We have $\sum(x - 15)^2 = 868$, $\sum(x - 16)^2 = 720$ and $\sum(x - 20)^2 = 982$. What is the
value of the mean, and why?
By the property of the arithmetic mean, we know that “the sum of the squared deviations of values from a
constant $A$ is minimum if and only if $A = \bar{X}$”.
Therefore, since $\sum(x - 16)^2 = 720$ is the minimum, $\bar{X} = 16$.
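Both properties used in the two questions above can be checked numerically. The sketch below uses a small hypothetical data set whose mean is 16; the function names are illustrative only.

```python
def sum_of_deviations(values, a):
    """Sum of (x - a) over all values."""
    return sum(x - a for x in values)


def sum_of_squared_deviations(values, a):
    """Sum of (x - a)^2 over all values."""
    return sum((x - a) ** 2 for x in values)


values = [12, 15, 16, 18, 19]      # hypothetical data
mean = sum(values) / len(values)   # 80 / 5 = 16.0

print(sum_of_deviations(values, mean))  # 0.0: deviations from the mean cancel out
for a in (15, 16, 20):
    # Squared deviations are 35, 30 and 110; the smallest occurs at a = 16, the mean.
    print(a, sum_of_squared_deviations(values, a))
```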
Weighted Arithmetic mean.
In the simple arithmetic mean, equal importance (weight) is given to all values of the data. When
the values in a data set are not of equal importance, we assign them weights ($w_1, w_2, \ldots, w_n$)
according to their relative importance; the mean so computed is called the weighted mean, denoted by $\bar{X}_w$.
If $x_1, x_2, \ldots, x_n$ are $n$ values with corresponding weights $w_1, w_2, \ldots, w_n$, then $\bar{X}_w$ is defined as
$\bar{X}_w = \frac{\sum wx}{\sum w}$
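A minimal sketch of the weighted mean follows. The marks and weights are hypothetical, and the function name is an assumption made for the example.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical marks in three subjects, weighted 3, 2 and 1.
marks = [70, 80, 90]
weights = [3, 2, 1]
print(round(weighted_mean(marks, weights), 2))  # (210 + 160 + 90) / 6 = 76.67
print(sum(marks) / len(marks))                  # the simple mean, 80.0, ignores the weights
```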
Geometric mean
The geometric mean, denoted by G.M, is defined as the nth root of the product of n positive values.
Let $X_1, X_2, X_3, \ldots, X_n$ be positive values ($X_i > 0$); then the geometric mean is defined as
$G.M = \sqrt[n]{X_1 \cdot X_2 \cdots X_n}$
Similarly, in grouped data, where each $X$ has a corresponding class frequency,
$G.M = \sqrt[\sum f]{X_1^{f_1} \cdot X_2^{f_2} \cdots X_k^{f_k}}$
Taking the log of each value, the geometric mean is also obtained as
$G.M = \text{antilog}\left(\frac{\sum \log X}{n}\right)$
and for grouped data
$G.M = \text{antilog}\left(\frac{\sum f \log X}{\sum f}\right)$
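The logarithmic form is the one usually used in computation, since it avoids multiplying many values together. The sketch below assumes base-10 logs, so the antilog is a power of 10; the function names and data are hypothetical.

```python
import math


def geometric_mean(values):
    """G.M = antilog(sum(log X) / n), using base-10 logarithms."""
    if any(x <= 0 for x in values):
        raise ValueError("all values must be positive")
    mean_log = sum(math.log10(x) for x in values) / len(values)
    return 10 ** mean_log  # antilog


def geometric_mean_grouped(values, frequencies):
    """G.M = antilog(sum(f * log X) / sum(f)) for grouped data."""
    if any(x <= 0 for x in values):
        raise ValueError("all values must be positive")
    total = sum(f * math.log10(x) for x, f in zip(values, frequencies))
    return 10 ** (total / sum(frequencies))


print(round(geometric_mean([2, 4, 8]), 6))               # cube root of 64 = 4.0
print(round(geometric_mean_grouped([2, 8], [2, 1]), 4))  # cube root of 2*2*8 = 32, about 3.1748
```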
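The G.M of three positive numbers is 4. By including a fourth number in the series, the G.M
becomes 2. What is the fourth number?
Let a, b, c and d be four positive numbers. The G.M of the first three numbers is
$G.M = \sqrt[3]{abc} = (abc)^{1/3} = 4$, so $abc = 4^3 = 64$.
The G.M of all four numbers is $(abcd)^{1/4} = 2$, so $abcd = 2^4 = 16$.
Therefore $d = \frac{16}{64} = \frac{1}{4}$.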
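The question above can be checked numerically: the product of n values equals the G.M raised to the power n, so the added value is the ratio of the two products. The function name below is an assumption made for this sketch.

```python
def added_value(gm_before, n_before, gm_after):
    """Value that changes the G.M of n_before values from gm_before to gm_after."""
    product_before = gm_before ** n_before       # abc = 4^3 = 64
    product_after = gm_after ** (n_before + 1)   # abcd = 2^4 = 16
    return product_after / product_before


print(added_value(gm_before=4, n_before=3, gm_after=2))  # 16 / 64 = 0.25
```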
$P_1 = \frac{(n + 1)}{100}$th value, $P_2 = \frac{2(n + 1)}{100}$th value, …, $P_{99} = \frac{99(n + 1)}{100}$th value
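The position formula above can be turned into a small function. The notes give only the position of each percentile; the linear interpolation for fractional positions and the clamping at the ends of the data are assumptions of this sketch, as are the function name and the data.

```python
def percentile(values, k):
    """k-th percentile, located at the k(n + 1)/100-th ordered value."""
    data = sorted(values)
    n = len(data)
    position = k * (n + 1) / 100   # 1-based position in the ordered data
    whole = int(position)
    fraction = position - whole
    if whole < 1:                  # position falls before the first value
        return data[0]
    if whole >= n:                 # position falls at or beyond the last value
        return data[-1]
    # Assumed rule: interpolate linearly between neighbouring ordered values.
    return data[whole - 1] + fraction * (data[whole] - data[whole - 1])


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical, n = 9
print(percentile(data, 50))          # position 5.0, the 5th value, 5.0
print(percentile(data, 25))          # position 2.5, halfway between 2 and 3, 2.5
```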
Define Quantiles.
Collectively, the quartiles, deciles, percentiles and other values obtained by equal sub-
division of the data are called quantiles.
Empirical relation between mean, median and mode
In a symmetrical distribution the mean, median and mode coincide. In a moderately skewed
(asymmetrical) distribution, the median lies between the mean and the mode, and it is twice as far
from the mode as from the mean. The following approximate relation, called the empirical relation,
therefore holds between these three averages:
Mode = 3 Median − 2 Mean
Coefficient of Range
$\frac{X_m - X_0}{X_m + X_0}$
where $X_m$ is the largest and $X_0$ the smallest value.
Coefficient of Quartile deviation
$\frac{Q_3 - Q_1}{Q_3 + Q_1}$
Coefficient of mean deviation
By mean it is obtained as $\frac{M.D_{\bar{x}}}{\bar{x}}$, and by median it is obtained as $\frac{M.D_{\tilde{x}}}{\tilde{x}}$.
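The coefficients of quartile deviation and mean deviation can be sketched as follows. The function names, the quartile values and the data are hypothetical; the mean deviation is taken here as the mean of the absolute deviations from the chosen average.

```python
def coefficient_of_quartile_deviation(q1, q3):
    """(Q3 - Q1) / (Q3 + Q1)."""
    return (q3 - q1) / (q3 + q1)


def coefficient_of_mean_deviation(values, center):
    """Mean absolute deviation about `center` (mean or median), divided by `center`."""
    mean_deviation = sum(abs(x - center) for x in values) / len(values)
    return mean_deviation / center


values = [2, 4, 6, 8, 10]          # hypothetical data; mean and median are both 6
mean = sum(values) / len(values)

print(coefficient_of_quartile_deviation(q1=3, q3=9))           # 6 / 12 = 0.5
print(round(coefficient_of_mean_deviation(values, mean), 4))   # M.D = 2.4, so 2.4 / 6 = 0.4
```

Being ratios, these coefficients have no units, which is what makes them usable for comparing the spread of data sets measured on different scales.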
Moments
Moments are defined as the means of the different powers of the deviations of the observations
taken from their mean. For sample data, the first four moments about the mean $\bar{X}$ are defined
as
$m_1 = \frac{\sum(x - \bar{x})}{n} = 0$ (always), $m_2 = \frac{\sum(x - \bar{x})^2}{n}$ (variance),
$m_3 = \frac{\sum(x - \bar{x})^3}{n}$, $m_4 = \frac{\sum(x - \bar{x})^4}{n}$
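A minimal sketch of the moments about the mean, using a hypothetical symmetric data set. The function name is an assumption made for the example.

```python
def moment_about_mean(values, r):
    """r-th moment about the mean: sum((x - mean)^r) / n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n


values = [2, 4, 6, 8, 10]   # hypothetical data, mean = 6
for r in (1, 2, 3, 4):
    print(r, moment_about_mean(values, r))
# m1 = 0.0 (always), m2 = 8.0 (variance), m3 = 0.0 (the data are symmetric), m4 = 108.8
```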
Role of Moments
Moments play a vital role in describing a distribution. We can obtain the following information
from the moments.
The center value of the distribution
The measure of dispersion (variance )
The measure of symmetry of the distribution
The measure of peakedness or flatness of the distribution
The first two moments about $X = 4$ are 1 and 16. Find the C.V.
Here
$A = 4$, $m_1' = 1$ and $m_2' = 16$
Then $\bar{X} = A + m_1' = 4 + 1 = 5$
and $m_2 = S^2 = m_2' - (m_1')^2 = 16 - (1)^2 = 15$, so $S = \sqrt{15} = 3.873$
Now
$C.V = \frac{S}{\bar{X}} \times 100 = \frac{3.873}{5} \times 100 = 77.46\%$
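The steps of this solution can be reproduced in code. The function name is an assumption made for the sketch; the inputs are the figures from the question.

```python
import math


def cv_from_moments_about_origin(A, m1_prime, m2_prime):
    """C.V (%) from the first two moments about an arbitrary origin A."""
    mean = A + m1_prime                      # 4 + 1 = 5
    variance = m2_prime - m1_prime ** 2      # 16 - 1 = 15
    return math.sqrt(variance) / mean * 100


print(round(cv_from_moments_about_origin(A=4, m1_prime=1, m2_prime=16), 2))  # 77.46
```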
Symmetry
The property of a unimodal distribution (a distribution having one mode) that values
equidistant from the point of maximum height have equal heights (frequencies) is called symmetry.
Skewness
The lack of symmetry in a distribution around some central value (mean, median or
mode) is called skewness.
The following methods are used to measure skewness.
1. Karl Pearson's coefficient of skewness
$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$
Sometimes the mode is difficult to find. Therefore, using the empirical relation between the mean,
median and mode, the coefficient of skewness can be written as
$S_k = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}}$
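The two forms of Karl Pearson's coefficient agree whenever the mode satisfies the empirical relation Mode = 3 Median − 2 Mean. The sketch below illustrates this with hypothetical summary values; the function names are assumptions made for the example.

```python
def skewness_from_mode(mean, mode, sd):
    """Sk = (Mean - Mode) / Standard deviation."""
    return (mean - mode) / sd


def skewness_from_median(mean, median, sd):
    """Sk = 3 (Mean - Median) / Standard deviation."""
    return 3 * (mean - median) / sd


# Hypothetical summary values.
mean, median, sd = 50, 48, 6
mode = 3 * median - 2 * mean                    # empirical relation: 144 - 100 = 44

print(skewness_from_mode(mean, mode, sd))       # (50 - 44) / 6 = 1.0
print(skewness_from_median(mean, median, sd))   # 3 * (50 - 48) / 6 = 1.0
```

A positive value indicates that the mean exceeds the mode, and a value of zero corresponds to a symmetrical distribution.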