BUSINESS STATISTICS (MBA)-015

Definition of statistics:
Statistics are measurements, enumerations or estimates of natural or social phenomena,
systematically arranged so as to exhibit their interrelations.
OR
Statistics is a science and an art which deals with collection, classification, tabulation,
presentation, establishment of relationship, interpretation and forecasting of data in connection
with social, economic, natural and other problems so that the predetermined aims may be
achieved.
Importance and Scope of Statistics
The scope of application of statistics has assumed unprecedented dimensions these
days. Statistical methods are applicable in diverse fields such as economics, trade,
industry, commerce, agriculture, bio-sciences, physical sciences, education etc.
Statistics in Economics:
Statistical methods are extensively used in all branches of economics. For example:
1. Time series analysis is used for studying the behavior of prices, production and
consumption of commodities, money in circulation, and bank deposits and
clearings.
2. Index numbers are useful in economic planning as they indicate the changes
over a specified period of time in (a) prices of commodities, (b) imports and
exports, (c) industrial/agricultural production, (d) cost of living, and the like.
3. Demand analysis is used to study the relationship between the price of a
commodity and its output (supply).
4. Forecasting techniques are used for curve fitting by the principle of least squares
and exponential smoothing to predict inflation rate, unemployment rate, or
manufacturing capacity utilization.
Statistics in Business Management:
Marketing: Before a product is launched, the market research team of an organization,
through a pilot survey, makes use of various techniques of statistics to analyze data on
population, purchasing power, habits of the consumers, competitors, pricing, and a
host of other aspects.
N.P.Singh (RATM) B.Statistics, MBA-015. 1
Analysis of sales volume in relation to the purchasing power and concentration of
population is helpful in establishing sales territories, routing of salesmen, and
advertising strategies to improve sales.
Production:
Statistical methods are used to carry out R & D programmes for improvement in the
quality of the existing products and setting quality control standards for new ones.
Decisions about the quantity and time of either self-manufacturing or buying from
outside are based on statistically analysed data.
Finance:
A statistical study through correlation analysis of profits and dividends helps to predict
and decide probable dividends for future years. Statistics applied to analyses of data on
assets and liabilities and income and expenditure helps to ascertain the financial results
of various operations.
Financial forecasts, break-even analysis and investment decisions under
uncertainty all involve the application of relevant statistical methods for analysis.
Statistics in Social Sciences:
The following definitions reflect the importance of statistics in social sciences.
1. Statistics is the science of the measurement of the social organism, regarded as a
whole in all its manifestations.
2. The science of statistics is the method of judging collective, natural or social
phenomena from the results obtained from the analysis, enumeration or
collection of estimates.
Statistics in Medical Sciences:
The knowledge of statistical techniques in all natural sciences (zoology, botany,
meteorology and medicine) is of great importance. For example, for proper diagnosis of a
disease, the doctor needs and relies heavily on factual data relating to pulse rate, body
temperature, blood pressure, heart beats, and body weight.
An important application of statistics lies in using the test of
significance for testing the efficacy of a particular drug or injection meant to cure a
specific disease. Comparative studies of the effectiveness of a particular drug/injection
manufactured by different companies can also be made by using statistical techniques
such as the t-test and F-test.
To study plant life, a botanist has to rely on data about the effect of
temperature, type of environment, rainfall and so on.
Statistical data:
There are various bases on which to classify statistical data.
1. Raw data and processed data:
Statistical data in its original (unorganized) form is called raw data. When the raw data has
passed through at least one statistical method, it is called processed data.
2. Internal data and external data:
Data obtained from internal sources are known as internal data and data obtained from
external sources are known as external data.
3. Primary data and secondary data:
Data obtained for the first time by the investigator himself for his own purpose are
called primary data and are thus original in character. Data which have already been
collected by some other person for their own purpose and are then used by the present
investigator are called secondary data.
4. Qualitative data and quantitative data:
A trait or characteristic that manifests differences (variations) in magnitude is called a variable,
for example height, weight, distance etc. A trait or characteristic that manifests differences in
kind or in quality is called an attribute, for example deafness, blindness, etc.
5. Other types of data:
Data related to a particular field of study are known by its name. For example, data related to
population are known as population data.
Variable:
A quantitative or measurable characteristic of individuals which can assume
different values for different individuals is called a variable.
Frequency:
The number of similar observations of a variable is called the frequency of that observation.
In other words, the number of times each value of a variable occurs is known as its frequency.
Central Tendency:
The tendency of the observations to concentrate around a central point in a series (or frequency
distribution) is known as central tendency.
Measures of Central Tendency:
The measures which tell us the location or position of the central point in a series are known as
measures of location or measures of central tendency.
The various measures of central tendency are given below.
1. Mathematical Average:
Arithmetic average or arithmetic mean or A.M.
Geometric average
Harmonic average
2. Positional Average:
Median
Mode
Partition values (quartiles, deciles, percentiles)
Arithmetic Average, Arithmetic Mean, Mean:
The arithmetic mean of a group of observations is the quotient obtained by dividing the sum of all
the observations by their number. It is represented by $\bar{X}$.
Individual series
$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Where
$\sum_{i=1}^{n} x_i$ = sum of all observations.
n = number of observations
Discrete series
$$\bar{X} = \frac{\sum_{i=1}^{n} f_i x_i}{N}$$
where f = frequency
N = sum of all frequencies = $f_1 + f_2 + f_3 + \cdots + f_n$.
Group series
$$\bar{X} = \frac{\sum_{i=1}^{n} f_i m_i}{N}$$
$m_i$ = mid-value of the class interval.
f = frequency
Combined mean of two series
If a distribution of $n_1$ variate values has mean $\bar{X}_1$, and a distribution of $n_2$ variate
values has mean $\bar{X}_2$, then the mean of the distribution of $n_1 + n_2$ variate values obtained
by combining these two distributions is given by:
$$\bar{X} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$$
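The mean formulae above can be sketched in Python as follows. This is only an illustrative sketch: the function names and the sample figures are made up for this example and are not part of the notes. The same weighted function serves both the discrete series (pass the values) and the group series (pass the class mid-values).

```python
from typing import Sequence


def mean_individual(values: Sequence[float]) -> float:
    """Arithmetic mean of an individual series: sum of observations / n."""
    return sum(values) / len(values)


def mean_weighted(points: Sequence[float], freqs: Sequence[float]) -> float:
    """Mean of a discrete series (points = values) or of a group series
    (points = class mid-values): sum(f * x) / N, where N = sum(f)."""
    total_freq = sum(freqs)
    return sum(f * x for x, f in zip(points, freqs)) / total_freq


def combined_mean(n1: int, mean1: float, n2: int, mean2: float) -> float:
    """Mean of two series pooled together: (n1*X1 + n2*X2) / (n1 + n2)."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
print(mean_individual([10, 20, 30, 40, 50]))   # 150 / 5 = 30.0
print(mean_weighted([1, 2, 3], [2, 3, 5]))     # (2 + 6 + 15) / 10 = 2.3
print(combined_mean(3, 10.0, 2, 20.0))         # (30 + 40) / 5 = 14.0
```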
Merits:
1. It is rigidly defined and determinate.
2. It is based on all the observations made.
3. It is easy to calculate and simple to understand.
4. It is capable of further mathematical treatment.
Demerits:
1. It is highly affected by abnormal (extreme) values.
2. It cannot be calculated if even one value is missing.
3. Its calculation is difficult in comparison to the mode and median, as it can hardly be located
by inspection.
4. It does not tell anything about the composition or nature of the distribution.
Median:
If a group of n variate values is arranged in ascending or descending order of magnitude, then
the middle value is called the median of this group.
Formulae:
Individual series
a) If n is odd,
$$M_d = \left(\frac{n+1}{2}\right)\text{th value}$$
b) If n is even,
$$M_d = \frac{\left(\frac{n}{2}\right)\text{th value} + \left(\frac{n}{2}+1\right)\text{th value}}{2}$$
Group series
$$M_d = L + \frac{\frac{N}{2} - F}{f} \times h$$
Where:
N = total number of variate values / sum of all frequencies.
h = class interval (width) of the median class.
L = lower limit of the median class.
f = frequency of the median class.
F = cumulative frequency of the class preceding the median class.
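A minimal sketch of the median formulae, assuming continuous classes of equal width listed in ascending order. The function names and the frequency table are hypothetical and serve only to illustrate the calculation.

```python
def median_individual(values):
    """Median of an individual series: the middle value for odd n,
    the average of the two middle values for even n."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                    # the ((n + 1) / 2)-th value
    return (ordered[mid - 1] + ordered[mid]) / 2


def median_grouped(lower_limits, width, freqs):
    """Median of a group series: L + ((N/2 - F) / f) * h."""
    total = sum(freqs)
    if total <= 0:
        raise ValueError("frequencies must sum to a positive number")
    cumulative = 0                             # F: cumulative frequency so far
    for lower, f in zip(lower_limits, freqs):
        # The median class is the first class whose cumulative
        # frequency reaches N/2.
        if cumulative + f >= total / 2:
            return lower + (total / 2 - cumulative) / f * width
        cumulative += f
    raise ValueError("median class not found")


print(median_individual([7, 3, 9]))            # 7
print(median_individual([4, 1, 3, 2]))         # (2 + 3) / 2 = 2.5
# Classes 0-10, 10-20, 20-30, 30-40 with frequencies 5, 10, 20, 5:
# N/2 = 20, median class 20-30, F = 15, f = 20, h = 10.
print(median_grouped([0, 10, 20, 30], 10, [5, 10, 20, 5]))   # 22.5
```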
Merits:
1. It is rigidly defined and easy to understand.
2. It is readily calculated and can be located exactly.
3. It can be calculated for distributions with open-end classes.
4. It is not affected by extreme values.
Demerits:
1. It may not be a true representative of the data.
2. In the case of an even number of observations it cannot be determined exactly.
3. It is not based on all the observations.
4. It requires the data to be arranged in ascending or descending order before it can be
determined, an operation which involves considerable work.
Partition values:
The variate values which divide the total frequency into a number of equal parts, when these
values are arranged in ascending or descending order of magnitude, are called partition values.
Quartiles:
The three points on the scale of observations which divide the total frequency into four equal
parts are called quartiles, for data arranged in ascending or descending order of magnitude.
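Formulae:
Individual series
If n is odd:
$$Q_1 = \left(\frac{n+1}{4}\right)\text{th value}$$
$$Q_2 = \left(\frac{2(n+1)}{4}\right)\text{th value}$$
$$Q_3 = \left(\frac{3(n+1)}{4}\right)\text{th value}$$
If n is even:
$$Q_1 = \left(\frac{n}{4}\right)\text{th value}$$
$$Q_2 = \left(\frac{2n}{4}\right)\text{th value}$$
$$Q_3 = \left(\frac{3n}{4}\right)\text{th value}$$
Discrete series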
$$Q_1 = \left(\frac{N+1}{4}\right)\text{th value}$$
$$Q_2 = \left(\frac{2(N+1)}{4}\right)\text{th value}$$
$$Q_3 = \left(\frac{3(N+1)}{4}\right)\text{th value}$$
Group series
$$Q_m = L + \frac{\frac{mN}{4} - F}{f} \times h, \quad m = 1, 2, 3.$$
Where:
$Q_m$ = mth quartile
L = lower limit of the mth quartile class.
f = frequency of the mth quartile class.
h = class interval (width) of the mth quartile class.
F = cumulative frequency of the class preceding the mth quartile class.
Deciles:
The nine points on the scale of observations which divide the total frequency into ten equal parts
are called deciles when the data are arranged in ascending or descending order of magnitude.
Formulae:
Individual series
$$D_j = \left(\frac{j(n+1)}{10}\right)\text{th value}, \quad j = 1, 2, \ldots, 9$$
Discrete series
$$D_j = \left(\frac{j(N+1)}{10}\right)\text{th value}, \quad j = 1, 2, \ldots, 9$$
Group series
$$D_j = L + \frac{\frac{jN}{10} - F}{f} \times h, \quad j = 1, 2, \ldots, 9$$
Where:
L = lower limit of the jth decile class.
f = frequency of the jth decile class.
F = cumulative frequency of the class preceding the jth decile class.
h = class interval (width) of the decile class.
Percentiles:
The ninety-nine points on the scale of variate values which divide the total frequency into 100
equal parts are called percentiles when the values are arranged in ascending or descending
order of magnitude.
Formulae:
Individual series
$$P_k = \left(\frac{k(n+1)}{100}\right)\text{th value}, \quad k = 1, 2, 3, \ldots, 99$$
Discrete series
$$P_k = \left(\frac{k(N+1)}{100}\right)\text{th value}, \quad k = 1, 2, 3, \ldots, 99$$
Group series
$$P_k = L + \frac{\frac{kN}{100} - F}{f} \times h, \quad k = 1, 2, 3, \ldots, 99$$
Where:
L = lower limit of the kth percentile class.
f = frequency of the kth percentile class.
F = cumulative frequency of the class preceding the kth percentile class.
h = class interval (width) of the percentile class.
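The group-series formulae for quartiles, deciles and percentiles differ only in the divisor (4, 10 or 100), so one function can cover all three. The sketch below assumes continuous, equal-width classes in ascending order. The function name `partition_value_grouped` and the frequency table are hypothetical.

```python
def partition_value_grouped(lower_limits, width, freqs, j, parts):
    """j-th partition value of a group series: L + ((j*N/parts - F) / f) * h.
    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles."""
    if not 0 < j < parts:
        raise ValueError("j must lie strictly between 0 and parts")
    total = sum(freqs)
    target = j * total / parts                 # j*N/4, j*N/10 or j*N/100
    cumulative = 0                             # F: cumulative frequency so far
    for lower, f in zip(lower_limits, freqs):
        # The required class is the first non-empty class whose
        # cumulative frequency reaches the target.
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("frequency table is empty")


# Classes 0-10, 10-20, 20-30, 30-40 with frequencies 5, 10, 20, 5 (N = 40).
lowers, h, f = [0, 10, 20, 30], 10, [5, 10, 20, 5]
print(partition_value_grouped(lowers, h, f, 1, 4))     # Q1  = 15.0
print(partition_value_grouped(lowers, h, f, 3, 4))     # Q3  = 27.5
print(partition_value_grouped(lowers, h, f, 9, 10))    # D9  = 32.0
print(partition_value_grouped(lowers, h, f, 10, 100))  # P10 = 8.0
```

Calling the function with j = 2 and parts = 4 reproduces the grouped median, since the second quartile and the median coincide.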
Merits and demerits of partition values:
1. The partition values are averages of parts of a series and hence are not
representative of the whole series.
2. The partition values can be easily obtained from a graph.
3. They give valuable information regarding the series, such as the spread of the various variate
values around the median.
Mode:
The value of a variable which occurs most frequently is called its mode. In other words, "the
value of the variable for which the frequency is maximum is called the mode or the modal value."
Location of mode
Individual series
1. If the number of observations is small, we can find the mode at a glance by looking for the
observation that occurs most frequently.
2. If the number of observations is large, we convert the individual series into a discrete series and
locate the mode from the frequency table.
Discrete series
1. If there is a single mode, we can locate it at a glance by looking in the frequency
column for the maximum frequency.
2. However, if the values cluster at more than one point, then to find a single value
we use the grouping method.
Group series
1. If there is a single class with maximum frequency, we call this class the modal class, and within
this class the mode is obtained by the formula:
$$\text{Mode} = Z = L + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times h$$
Where:
$f_m$ = frequency of the modal class.
$f_1$ = frequency of the class preceding the modal class.
$f_2$ = frequency of the class following the modal class.
L = lower limit of the modal class.
h = class interval (width) of the modal class.
2. However, if the values cluster at more than one class interval, we decide the modal class by the
grouping method and then use the formula given above. If this formula fails, then we use:
$$\text{Mode} = Z = L + \frac{f_2}{f_1 + f_2} \times h$$
3. If the class intervals are not of equal magnitude, we convert the series into equal class intervals
and then obtain the mode. If such a conversion is not possible, then in the case of moderately
asymmetrical distributions we use the formula
Mode = 3 Median - 2 Mean
4. We can also locate the mode graphically by drawing a histogram.
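The rules for locating the mode can be sketched as below. This is an illustration under stated assumptions, not a full implementation: the grouping method is not implemented, a single modal class is assumed, and a missing neighbouring class at either end of the table is treated as having frequency 0. All function names and data are hypothetical.

```python
from collections import Counter


def mode_discrete(values):
    """Most frequent value of an individual or discrete series.
    Raises if the maximum frequency is shared, since the notes then
    call for the grouping method (not implemented here)."""
    counts = Counter(values)
    top = max(counts.values())
    modes = [v for v, c in counts.items() if c == top]
    if len(modes) > 1:
        raise ValueError(f"no single mode, candidates: {sorted(modes)}")
    return modes[0]


def mode_grouped(lower_limits, width, freqs):
    """Mode = L + ((fm - f1) / (2*fm - f1 - f2)) * h for a single modal class."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])   # index of modal class
    fm = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0                    # assumption: 0 if absent
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0       # assumption: 0 if absent
    denominator = 2 * fm - f1 - f2
    if denominator == 0:
        raise ValueError("formula fails: 2*fm - f1 - f2 is zero")
    return lower_limits[i] + (fm - f1) / denominator * width


def mode_empirical(mean, median):
    """Mode = 3 Median - 2 Mean, for moderately asymmetrical distributions."""
    return 3 * median - 2 * mean


print(mode_discrete([2, 3, 3, 5, 3, 7]))                  # 3
# Modal class 20-30: fm = 20, f1 = 10, f2 = 5, h = 10.
print(mode_grouped([0, 10, 20, 30], 10, [5, 10, 20, 5]))  # 20 + (10/25)*10 = 24.0
print(mode_empirical(mean=20, median=22))                 # 66 - 40 = 26
```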
Merits
1. It is readily comprehensible and easily understood.
2. It is easy to calculate. In some cases it is located by mere inspection. It can be very easily
determined from a graph.
3. It can be calculated for distributions with open-end classes.
4. It is not affected by extreme values, provided they are not in the modal class.
Demerits
1. It is ill-defined.
2. It is indefinite and indeterminate, and in some cases it is impossible to find a definite value.
3. It is not based on all the observations made.
4. It is not capable of further mathematical treatment.
Measures of Dispersion
Dispersion
In a series of numerical values, the values are not all uniform in size.
The word dispersion is used in two senses in statistics.
i. The scatteredness of the values of a variable due to variation among themselves is called
dispersion.
ii. The deviations from a measure of central tendency or any other fixed value are not uniform in
size; this spread of the deviations is also called dispersion.
Measures of Dispersion:
1. Measures based on the scatteredness of the values of a variable among themselves.
a) Range.
b) Interquartile Range.
c) Semi-interquartile Range or Quartile Deviation.
2. Measures based on the spread of the deviations about some point.
a) Mean Deviation.
b) Standard Deviation.
The most common relative measures of dispersion are:
1) Range Coefficient of Dispersion $= \frac{\text{Difference between extreme values}}{\text{Sum of the extreme values}} \times 100$
2) Quartile Coefficient of Dispersion $= \frac{Q_3 - Q_1}{Q_3 + Q_1}$
3) Coefficient of Mean Dispersion $= \frac{\text{Mean deviation about any point } a}{a}$
4) Coefficient of Variation (V) $= \frac{\text{S.D.}}{\text{Mean}} \times 100$
Range:
The difference between the two extreme observations of data arranged in ascending order
is called the range, i.e. the difference between the largest and the smallest values of a set of
observations.
$$\text{Range} = X_n - X_1$$
Where:
$X_n$ = largest value of the series.
$X_1$ = smallest value of the series.
$$\text{Range Coefficient of Dispersion} = \frac{\text{Difference of the extreme values}}{\text{Sum of the extreme values}} = \frac{X_n - X_1}{X_n + X_1}$$
(multiplied by 100 when expressed as a percentage).
Merits and Demerits of Range:
Merits:
1. Range is the simplest measure of dispersion.
2. Range is rigidly defined.
Demerits:
1. It is based only on the two extreme observations, which may be purely accidental.
2. Its sampling fluctuation is quite high in most cases and is rather difficult to ascertain.
3. It is affected by extreme values.
Interquartile Range:
Definition: The difference between the third quartile and the first quartile is known as the
interquartile range.
$$\text{Interquartile Range} = Q_3 - Q_1$$
Semi-interquartile Range or Quartile Deviation:
Definition: Half of the difference between the third quartile and the first quartile is called the
semi-interquartile range or quartile deviation.
$$\text{Q.D.} = \frac{Q_3 - Q_1}{2}$$
$$\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
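The range-based and quartile-based measures can be sketched for an individual series as follows. The quartile positions follow the rule given in these notes ((n+1)/4 for odd n, n/4 for even n). Linear interpolation for fractional positions is an assumption added here, since the notes only speak of "the ...th value". Function names and data are hypothetical.

```python
def value_at_position(ordered, pos):
    """Value at 1-based position pos in sorted data. Fractional positions
    are interpolated linearly (an assumption, not stated in the notes)."""
    lo = int(pos)
    frac = pos - lo
    if lo < 1:
        return ordered[0]
    if lo >= len(ordered):
        return ordered[-1]
    return ordered[lo - 1] + frac * (ordered[lo] - ordered[lo - 1])


def quartile_individual(values, m):
    """m-th quartile (m = 1, 2, 3) of an individual series."""
    ordered = sorted(values)
    n = len(ordered)
    base = n + 1 if n % 2 == 1 else n      # (n + 1) for odd n, n for even n
    return value_at_position(ordered, m * base / 4)


def dispersion_measures(values):
    """Range, interquartile range, quartile deviation and their coefficients."""
    smallest, largest = min(values), max(values)
    q1 = quartile_individual(values, 1)
    q3 = quartile_individual(values, 3)
    return {
        "range": largest - smallest,
        "range_coefficient": (largest - smallest) / (largest + smallest),
        "interquartile_range": q3 - q1,
        "quartile_deviation": (q3 - q1) / 2,
        "quartile_coefficient": (q3 - q1) / (q3 + q1),
    }


# n = 7, so Q1 is the 2nd value (4) and Q3 is the 6th value (12).
for name, value in dispersion_measures([2, 4, 6, 8, 10, 12, 14]).items():
    print(name, value)
```

For this sample the range is 12, the range coefficient 0.75, the interquartile range 8, the quartile deviation 4 and the quartile coefficient 0.5.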
Mean Deviation:
Definition: The arithmetic mean of the absolute deviations about any point a is called the mean
deviation or mean absolute deviation about the point a.
Formulae:
Individual series:
Mean deviation about "a":
$$\delta_a = \frac{1}{n}\sum_{i=1}^{n} |x_i - a|$$
Mean deviation about $\bar{x}$ (mean):
$$\delta_{\bar{x}} = \frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$$
Discrete series:
Mean deviation about mean:
$$\delta_{\bar{x}} = \frac{1}{N}\sum_{i=1}^{n} f_i |x_i - \bar{x}|$$
Mean deviation about median:
$$\delta_{M_d} = \frac{1}{N}\sum_{i=1}^{n} f_i |x_i - M_d|$$
Mean deviation about mode:
$$\delta_Z = \frac{1}{N}\sum_{i=1}^{n} f_i |x_i - Z|$$
Group series:
In the case of a grouped frequency distribution we find the mid-values of the class intervals and
proceed as in the case of a discrete series.
Mean deviation about mean:
$$\delta_{\bar{x}} = \frac{1}{N}\sum_{i=1}^{n} f_i |m_i - \bar{x}|$$
Coefficient of Mean Deviation:
To compare two or more groups of observations, we define the coefficient of mean deviation. The
quotient obtained by dividing the mean deviation by the point about which it is calculated is
called the coefficient of mean deviation.
$$\text{Coefficient of mean deviation} = \frac{\delta_a}{a}$$
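A short sketch of the mean deviation and its coefficient follows. One function handles the individual series (no frequencies), the discrete series (values with frequencies) and the group series (mid-values with frequencies). The names and the data are hypothetical, and the point `about` may be the mean, the median, the mode or any other value.

```python
def mean_deviation(points, about, freqs=None):
    """Mean absolute deviation about the point `about`:
    (1/N) * sum(f * |x - a|). With freqs=None every f is taken as 1."""
    if freqs is None:
        freqs = [1] * len(points)
    total = sum(freqs)
    return sum(f * abs(x - about) for x, f in zip(points, freqs)) / total


def coefficient_of_mean_deviation(points, about, freqs=None):
    """Mean deviation divided by the point about which it is calculated."""
    return mean_deviation(points, about, freqs) / about


# Individual series with mean 4: |deviations| = 3, 2, 1, 0, 6, total 12.
data = [1, 2, 3, 4, 10]
print(mean_deviation(data, about=4))                   # 12 / 5 = 2.4
print(coefficient_of_mean_deviation(data, about=4))    # 2.4 / 4 = 0.6

# Discrete series about its median 2: (1*1 + 2*0 + 1*1) / 4.
print(mean_deviation([1, 2, 3], about=2, freqs=[1, 2, 1]))   # 0.5
```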

Standard Deviation:
The positive square root of the arithmetic mean of the squares of the deviations of observations
in a series about its arithmetic mean is called the standard deviation and is denoted by $\sigma$.
The standard deviation may be defined as the root mean square deviation when the deviations
are taken about the arithmetic mean.
Formulae:
Individual series
$$\sigma = \sqrt{\frac{1}{n}\sum (X_i - \bar{X})^2}$$
Or
$$\sigma = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2}$$
Where $d = x_i - A$, and A = assumed mean
Discrete series
$$\sigma = \sqrt{\frac{1}{N}\sum f_i (X_i - \bar{X})^2}$$
Or
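$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}$$
Group series
$$\sigma = \sqrt{\frac{1}{N}\sum f_i (m_i - \bar{X})^2}$$
Or
$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}$$
Where $d = x - A$, or, using step deviations,
$$\sigma = \sqrt{\frac{\sum f u^2}{N} - \left(\frac{\sum f u}{N}\right)^2} \times h, \quad u = \frac{X_i - A}{h}$$
h = class interval
Coefficient of Variation:
To compare the dispersion in two or more series we define the coefficient of variation as:
$$\text{Coefficient of variation} = \frac{\sigma}{\bar{X}}$$
(multiplied by 100 when expressed as a percentage).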
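The direct and step-deviation formulae for the standard deviation give the same result, which the sketch below checks on a small grouped table. Function names, the assumed mean A = 25 and the data are hypothetical choices for illustration. The divisor is N (not N - 1), as in the formulae above.

```python
import math


def std_dev(points, freqs=None):
    """Standard deviation: sqrt((1/N) * sum(f * (x - mean)^2))."""
    if freqs is None:
        freqs = [1] * len(points)
    total = sum(freqs)
    mean = sum(f * x for x, f in zip(points, freqs)) / total
    variance = sum(f * (x - mean) ** 2 for x, f in zip(points, freqs)) / total
    return math.sqrt(variance)


def std_dev_step(points, freqs, assumed_mean, width):
    """Step-deviation shortcut:
    h * sqrt(sum(f*u^2)/N - (sum(f*u)/N)^2), with u = (x - A) / h."""
    total = sum(freqs)
    us = [(x - assumed_mean) / width for x in points]
    mean_u = sum(f * u for u, f in zip(us, freqs)) / total
    mean_u2 = sum(f * u * u for u, f in zip(us, freqs)) / total
    return width * math.sqrt(mean_u2 - mean_u ** 2)


def coefficient_of_variation(points, freqs=None):
    """sigma / mean; multiply by 100 for the percentage form."""
    if freqs is None:
        freqs = [1] * len(points)
    mean = sum(f * x for x, f in zip(points, freqs)) / sum(freqs)
    return std_dev(points, freqs) / mean


data = [2, 4, 4, 4, 5, 5, 7, 9]            # mean 5, squared deviations sum to 32
print(std_dev(data))                        # sqrt(32 / 8) = 2.0
print(coefficient_of_variation(data))       # 2 / 5 = 0.4

mids, freqs = [5, 15, 25, 35], [5, 10, 20, 5]
direct = std_dev(mids, freqs)
shortcut = std_dev_step(mids, freqs, assumed_mean=25, width=10)
print(math.isclose(direct, shortcut))       # True
```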
Merits of S.D.:
1. Standard deviation is rigidly defined.
2. Standard deviation is based on all the observations.
3. It is not much affected by sampling fluctuations.
4. It is capable of further mathematical manipulation.
Demerits:
1. It is difficult to calculate.
2. It is not easy to understand.
3. It gives greater weight to extreme values.
4. It is impossible to find it exactly in the case of open-end classes.
Uses:
1. Standard deviation should be used widely in statistics unless there is some definite
reason for not using it.
2. Standard deviation should be used when we need a measure of greater stability.
3. Standard deviation plays a great role in sampling theory and correlation theory.
4. In fact, standard deviation is regarded as the best and most powerful measure of
dispersion.
MOMENTS, SKEWNESS AND KURTOSIS
Moments:
The arithmetic mean of the rth powers of the deviations of all values of a variable about its
arithmetic mean is called the rth moment about the mean, or rth central moment, and is denoted
by $\mu_r$.
Formulae:
Individual series
$$\mu_r = \frac{1}{n}\sum (X_i - \bar{X})^r, \quad r = 1, 2, 3, 4.$$
Discrete series
$$\mu_r = \frac{1}{N}\sum f_i (X_i - \bar{X})^r, \quad r = 1, 2, 3, 4.$$
Group series
$$\mu_r = \frac{1}{N}\sum f_i (m_i - \bar{X})^r, \quad r = 1, 2, 3, 4.$$
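The central moments can be computed with one small function. This is a sketch with a hypothetical name and sample data. Passing class mid-values together with frequencies covers the group series.

```python
def central_moment(points, r, freqs=None):
    """r-th central moment: (1/N) * sum(f * (x - mean)^r).
    With freqs=None every f is taken as 1 (individual series)."""
    if freqs is None:
        freqs = [1] * len(points)
    total = sum(freqs)
    mean = sum(f * x for x, f in zip(points, freqs)) / total
    return sum(f * (x - mean) ** r for x, f in zip(points, freqs)) / total


data = [2, 4, 4, 4, 5, 5, 7, 9]     # mean 5; deviations -3, -1, -1, -1, 0, 0, 2, 4
print(central_moment(data, 1))      # 0.0 (the first central moment is always zero)
print(central_moment(data, 2))      # 32 / 8 = 4.0 (the variance)
print(central_moment(data, 3))      # 42 / 8 = 5.25
print(central_moment(data, 4))      # 356 / 8 = 44.5
```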
Skewness:
A frequency distribution is said to be skewed if the frequencies decrease with markedly greater
rapidity on one side of the central maximum than on the other. This characteristic of a frequency
distribution is known as skewness.
Types of skewness
1. Positive skewness:
A frequency distribution is said to be positively skewed, or skewed to the right, if the observations
pile up at the lower values of the variable. This property is known as positive skewness.
2. Negative skewness:
A frequency distribution is said to be negatively skewed, or skewed to the left, if the
observations pile up at the higher values of the variable. This property is known as negative
skewness.
Measures of Skewness
1. Karl Pearson's Coefficient of Skewness:
$$S_k = \frac{\bar{X} - Z}{\sigma}$$
Or
$$S_k = \frac{3(\bar{X} - M_d)}{\sigma}$$
If $\bar{X} = Z$: no skewness
If $\bar{X} > Z$: positive skewness
If $\bar{X} < Z$: negative skewness
2. Bowley's Coefficient of Skewness:
$$S_k^{B} = \frac{Q_3 + Q_1 - 2M_d}{Q_3 - Q_1}$$
If $Q_3 - Q_2 = Q_2 - Q_1$: no skewness
If $Q_3 - Q_2 > Q_2 - Q_1$: positive skewness
If $Q_3 - Q_2 < Q_2 - Q_1$: negative skewness
3. Kelly's Coefficient of Skewness:
$$S_k^{K} = \frac{P_{90} + P_{10} - 2P_{50}}{P_{90} - P_{10}}$$
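The three coefficients of skewness above translate directly into code. In the sketch below the function names and the input figures are hypothetical. In practice the mean, mode, median, quartiles and percentiles would first be computed from the data.

```python
def pearson_skewness(mean, mode, sd):
    """Karl Pearson's coefficient: (mean - mode) / sigma."""
    return (mean - mode) / sd


def pearson_skewness_median(mean, median, sd):
    """Alternative form using the median: 3 * (mean - median) / sigma."""
    return 3 * (mean - median) / sd


def bowley_skewness(q1, median, q3):
    """Bowley's coefficient: (Q3 + Q1 - 2*Md) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


def kelly_skewness(p10, p50, p90):
    """Kelly's coefficient: (P90 + P10 - 2*P50) / (P90 - P10)."""
    return (p90 + p10 - 2 * p50) / (p90 - p10)


# Made-up summary figures for a right-skewed distribution.
print(pearson_skewness(mean=5, mode=4, sd=2))             # 1 / 2 = 0.5
print(pearson_skewness_median(mean=5, median=4.5, sd=2))  # 1.5 / 2 = 0.75
print(bowley_skewness(q1=4, median=4.5, q3=6))            # (10 - 9) / 2 = 0.5
print(kelly_skewness(p10=2, p50=5, p90=11))               # (13 - 10) / 9, about 0.33
```

All four values are positive here, which matches the sign rules stated above: the mean exceeds the mode, and the upper quartile lies farther from the median than the lower quartile does.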
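Beta and gamma coefficients:
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}$$
$$\gamma_1 = \pm\sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}}$$
(the sign of $\gamma_1$ is that of $\mu_3$).
If $\gamma_1 < 0$: negative skewness
If $\gamma_1 = 0$: no skewness
If $\gamma_1 > 0$: positive skewness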
Kurtosis:
The characteristic related to the nature of the concentration of the items in the central part of a
frequency distribution is called kurtosis.
Karl Pearson introduced three terms:
1. Mesokurtic ($\beta_2 = 3$)
2. Leptokurtic ($\beta_2 > 3$)
3. Platykurtic ($\beta_2 < 3$)
Beta and gamma coefficients:
$$\beta_2 = \frac{\mu_4}{\mu_2^2}$$
$$\gamma_2 = \beta_2 - 3$$
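The beta and gamma coefficients for skewness and kurtosis follow from the second, third and fourth central moments. The sketch below works on an individual series. The function name and data are hypothetical, and the labels use the standard convention that a distribution is mesokurtic when gamma_2 = 0, leptokurtic when gamma_2 > 0 and platykurtic when gamma_2 < 0.

```python
def shape_coefficients(values):
    """Return beta1, gamma1, beta2, gamma2 and a kurtosis label."""
    n = len(values)
    mean = sum(values) / n
    mu2 = sum((x - mean) ** 2 for x in values) / n
    mu3 = sum((x - mean) ** 3 for x in values) / n
    mu4 = sum((x - mean) ** 4 for x in values) / n

    beta1 = mu3 ** 2 / mu2 ** 3
    gamma1 = mu3 / mu2 ** 1.5          # carries the sign of mu3
    beta2 = mu4 / mu2 ** 2
    gamma2 = beta2 - 3

    if gamma2 > 0:
        label = "leptokurtic"
    elif gamma2 < 0:
        label = "platykurtic"
    else:
        label = "mesokurtic"
    return {"beta1": beta1, "gamma1": gamma1,
            "beta2": beta2, "gamma2": gamma2, "kurtosis": label}


# For this sample mu2 = 4, mu3 = 5.25 and mu4 = 44.5.
result = shape_coefficients([2, 4, 4, 4, 5, 5, 7, 9])
print(result["gamma1"])     # 5.25 / 8 = 0.65625 (positive skewness)
print(result["beta2"])      # 44.5 / 16 = 2.78125
print(result["gamma2"])     # 2.78125 - 3 = -0.21875
print(result["kurtosis"])   # platykurtic
```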