Prepared by:
Rahul Ranganathan (121)
Rajiv V (125)
Prakash Sethu (107)
Prateek Kumar Kureel (108)
Pruthvi Raj (114)
Data Collected:
The data used for the statistical analysis is the Consumption of Conventional Energy in India
(Peta Joules).
Dataset 1
Year
1970-71
1975-76
1980-81
1985-86
1990-91
1995-96
2000-01
2005-06
2006-07
2007-08
2008-09
2009-10
2010-11
Data Brief:
The data on consumption of conventional energy was reported in the 19th issue of Energy
Statistics (for the year 2012) released by the Central Statistics Office, Government of India. The
data itself is collected in collaboration with the following entities:
1. Office of Coal Controller, Ministry of Coal
2. Ministry of Petroleum & Natural Gas
3. Central Electricity Authority
Why the data was maintained by the source and its importance?
The Central Statistics Office, part of the Ministry of Statistics and Programme Implementation in
India, maintains not only energy statistics but also industrial, social and price statistics.
The CPI, WPI and IIP indices and the energy statistics that it releases periodically give the
government a basis for policy decisions and for tracking progress.
The data on consumption of conventional energy sources was published as part of a wider report
that covers both renewable and non-renewable sources of energy and monitors trends in
consumption and the future outlook. For instance, the data used for this assignment shows the
rate at which energy consumption in India has increased in the recent past.
Statistics Assignment No 1
Type of data:
The type of data used for analysis is NUMERICAL data.
Statistical Analysis:
Concept Name: Frequency Distribution
Selection of Variable: Consumption of Conventional Energy in India (Peta Joules)
X-Axis: Class intervals
Y-Axis: The frequency of occurrence of the value in the corresponding class interval
Formula and calculation steps:
We will use Dataset 1 to construct a frequency distribution chart. The data range from 0 to
around 43000 Peta Joules, so six class intervals of width 8000 are formed.
The frequency distribution table is as follows:
Class Intervals    Frequency
0-8000             3
8001-16000         2
16001-24000        2
24001-32000        2
32001-40000        2
40001-48000        2
Total              13
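The table above can be reproduced with a short sketch. It assumes the thirteen consumption values listed later in the document and the class boundaries chosen in the text; the names `consumption`, `frequency_table`, `CLASS_WIDTH` and `NUM_CLASSES` are illustrative only.

```python
# Consumption values (Peta Joules) as listed in the document, in ascending order.
consumption = [3860, 5074, 6394, 9471, 13313, 18188, 22198,
               28299, 31040, 34428, 36330, 40353, 42664]

CLASS_WIDTH = 8000   # interval width chosen in the text
NUM_CLASSES = 6      # 0-8000 up to 40001-48000


def frequency_table(values, width, num_classes):
    """Count values per class; class k covers k*width+1 .. (k+1)*width,
    with 0 itself counted in the first class."""
    counts = [0] * num_classes
    for v in values:
        k = 0 if v == 0 else (v - 1) // width
        counts[k] += 1
    return counts


frequencies = frequency_table(consumption, CLASS_WIDTH, NUM_CLASSES)
relative = [round(f / len(consumption), 2) for f in frequencies]

for k, (f, r) in enumerate(zip(frequencies, relative)):
    lower = 0 if k == 0 else k * CLASS_WIDTH + 1
    upper = (k + 1) * CLASS_WIDTH
    print(f"{lower}-{upper}: frequency={f}, relative frequency={r}")
```

The relative frequency is simply each class frequency divided by the total number of observations (13).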
Class Intervals    Frequency    Relative frequency
0-8000             3            0.23
8001-16000         2            0.15
16001-24000        2            0.15
24001-32000        2            0.15
32001-40000        2            0.15
40001-48000        2            0.15
Total              13           1.00
Class Intervals    Cumulative frequency
0-8000             3
8001-16000         5
16001-24000        7
24001-32000        9
32001-40000        11
40001-48000        13
Total              13
Frequency: 1, 3, 2, 1, 3, 3, 0
The Ogive is plotted using the same frequency distribution table as before. The difference lies
in the type of information that can be derived from it: an Ogive tells us the number of
values less than or more than a particular value.
Findings and Interpretation of results:
An Ogive tells us how many data points are less than or greater than a particular value. For
example, the Less than Ogive shows that 5 values in our data are less than 16000 (and 3 are
less than 8000).
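The points of the two Ogives can be derived from the class frequencies alone. The following is a minimal sketch under that assumption; the variable names are illustrative, and the class limits are the ones used in the frequency distribution table.

```python
from itertools import accumulate

upper_limits = [8000, 16000, 24000, 32000, 40000, 48000]
frequencies = [3, 2, 2, 2, 2, 2]

# "Less than" Ogive: running total of frequencies at each upper class limit.
less_than = list(accumulate(frequencies))

# "More than" Ogive: number of values above each lower class limit.
total = sum(frequencies)
lower_limits = [0] + upper_limits[:-1]
more_than = [total - below for below in [0] + less_than[:-1]]

for limit, count in zip(upper_limits, less_than):
    print(f"values less than or equal to {limit}: {count}")
for limit, count in zip(lower_limits, more_than):
    print(f"values more than {limit}: {count}")
```

Plotting `less_than` against `upper_limits` (and `more_than` against `lower_limits`) gives the two curves.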
Crude Petroleum: 8632
Natural Gas: 1974
Electricity: 21879
Stem-and-leaf display: stems 4, 3, 2, 1, 0; leaves (as extracted) 5, 8, 8, 4, 3
No.   Consumption Qty   Relative Distribution   Cumulative %
1     42664             0.146303993             14.6303993
2     40353             0.138379079             28.4683072
3     36330             0.12458335              40.92664225
4     34428             0.118060985             52.73274077
5     31040             0.106442808             63.37702152
6     28299             0.097043332             73.08135468
7     22198             0.076121696             80.69352427
8     18188             0.062370547             86.93057899
9     13313             0.045653128             91.4958918
10    9471              0.032478087             94.74370053
11    6394              0.021926395             96.93634007
12    5074              0.017399833             98.67632333
13    3860              0.013236767             100
With this data, we can identify the cumulative percentage of consumption when the values are
arranged in decreasing order of consumption. The Pareto diagram is drawn from this table.
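The relative distribution and cumulative percentage columns can be recomputed as follows. This is a sketch that assumes the thirteen consumption values from the document; `pareto_rows` is a hypothetical helper name.

```python
consumption = [3860, 5074, 6394, 9471, 13313, 18188, 22198,
               28299, 31040, 34428, 36330, 40353, 42664]


def pareto_rows(values):
    """Return (rank, quantity, share of total, cumulative %) in decreasing order."""
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    running = 0
    rows = []
    for rank, qty in enumerate(ordered, start=1):
        running += qty
        rows.append((rank, qty, qty / total, running * 100 / total))
    return rows


rows = pareto_rows(consumption)
for rank, qty, share, cum_pct in rows:
    print(f"{rank:>2}  {qty:>6}  {share:.9f}  {cum_pct:.4f}")
```

Keeping the running total as an integer and dividing only at the end avoids accumulating rounding error in the cumulative column.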
Consumption
3860
5074
6394
9471
13313
18188
22198
28299
31040
34428
36330
40353
42664
\[ \text{Median} = \left( \frac{(n + 1)/2 - (F + 1)}{f_m} \right) w + L_m \]
Where, n = number of values in the series
F = sum of the class frequencies up to, but not including, the median class
fm = frequency of the median class
w = class interval width
Lm = lower limit of the median class interval
Using the formula, we can estimate the median as 5333.33
Findings and Interpretation of results:
From the original data series, we know that the median of this data set is 22198. There is a
large deviation between the median estimated by the formula and the actual median. Thus, the
formula is useful when only grouped data is available, but the estimate can deviate
considerably from the actual value.
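The grouped-data median formula can be sketched in code as below. The sketch assumes that F is the cumulative frequency of the classes before the median class and that the median class is the first class whose cumulative frequency reaches (n + 1)/2; `grouped_median` is an illustrative name. Under these assumptions the sketch returns 20001, which does not reproduce the 5333.33 quoted in the text; the quoted figure may rest on a different choice of median class or of F.

```python
import statistics

consumption = [3860, 5074, 6394, 9471, 13313, 18188, 22198,
               28299, 31040, 34428, 36330, 40353, 42664]
lower_limits = [0, 8001, 16001, 24001, 32001, 40001]
frequencies = [3, 2, 2, 2, 2, 2]
WIDTH = 8000


def grouped_median(lowers, freqs, width):
    """Median = ((n + 1)/2 - (F + 1)) / fm * w + Lm for grouped data."""
    n = sum(freqs)
    position = (n + 1) / 2          # position of the median item
    cumulative = 0                  # F: frequencies before the current class
    for lower, fm in zip(lowers, freqs):
        if cumulative + fm >= position:
            return (position - (cumulative + 1)) / fm * width + lower
        cumulative += fm
    raise ValueError("frequencies must contain at least one observation")


estimate = grouped_median(lower_limits, frequencies, WIDTH)
actual = statistics.median(consumption)
print(f"grouped estimate = {estimate}, median of raw data = {actual}")
```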
where x is the element and μ is the arithmetic mean of the data.
\[ \text{Mean} = \mu = \frac{\sum x}{N} = 22431.69 \]
x        Deviation (x - μ)    Absolute Deviation
3860     -18571.69            18571.69
5074     -17357.69            17357.69
6394     -16037.69            16037.69
9471     -12960.69            12960.69
13313    -9118.692            9118.692
18188    -4243.692            4243.692
22198    -233.6923            233.6923
28299    5867.308             5867.31
31040    8608.308             8608.31
34428    11996.31             11996.31
36330    13898.31             13898.31
40353    17921.31             17921.31
42664    20232.31             20232.31
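The deviation columns above can be recomputed directly from the data. The sketch below assumes the deviation is taken as x minus the mean and that the mean absolute deviation is the average of the absolute deviations over all 13 observations; the variable names are illustrative.

```python
consumption = [3860, 5074, 6394, 9471, 13313, 18188, 22198,
               28299, 31040, 34428, 36330, 40353, 42664]

n = len(consumption)
mean = sum(consumption) / n

deviations = [x - mean for x in consumption]           # signed, x - mean
absolute_deviations = [abs(d) for d in deviations]

# Mean absolute deviation: average distance of the observations from the mean.
mean_absolute_deviation = sum(absolute_deviations) / n

print(f"mean = {mean:.2f}")
print(f"mean absolute deviation = {mean_absolute_deviation:.2f}")
```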
\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} = 180670816 \]
Where
1. X = Observation
2. μ = mean of the data
3. N = number of observations
4. Σ = sum of all values of (X - μ)²
5. σ² = variance
6. σ = standard deviation

x        μ        x - μ     (x - μ)²
3860     22432    -18572    344907755
5074     22432    -17358    301289482
6394     22432    -16038    257207575
9471     22432    -12961    167979545
13313    22432    -9119     83150549.4
18188    22432    -4244     18008924.4
22198    22432    -234      54612.0947
28299    22432    5867      34425299.6
31040    22432    8608      74102961.3
34428    22432    11996     143911398
36330    22432    13898     193162957
40353    22432    17921     321173269
42664    22432    20232     409346275
Summation                   2348720603

\[ \sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} = 13441.38 \]
Where
1. X = Observation
2. μ = mean of the data
3. N = number of observations
4. Σ = sum of all values of (X - μ)²
5. σ = standard deviation
\[ \text{Coefficient of Variation} = \frac{\sigma}{\mu} \times 100 \]
Where
1. μ = mean of the data
2. σ = standard deviation
Findings and Interpretation of results:
With the coefficient of variation at 60%, it can be concluded that the distribution of data is highly
dispersed with respect to the mean.
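The variance, standard deviation and coefficient of variation can be checked with a few lines of code. This sketch assumes the divisor N (all 13 observations), which matches the variance of 180670816 reported in the document; the variable names are illustrative.

```python
import math

consumption = [3860, 5074, 6394, 9471, 13313, 18188, 22198,
               28299, 31040, 34428, 36330, 40353, 42664]

n = len(consumption)
mean = sum(consumption) / n

# Sum of squared deviations from the mean, then divide by N.
sum_squared = sum((x - mean) ** 2 for x in consumption)
variance = sum_squared / n
std_dev = math.sqrt(variance)

# Coefficient of variation: standard deviation as a percentage of the mean.
coefficient_of_variation = std_dev / mean * 100

print(f"sum of squared deviations = {sum_squared:.0f}")
print(f"variance = {variance:.0f}")
print(f"standard deviation = {std_dev:.2f}")
print(f"coefficient of variation = {coefficient_of_variation:.0f}%")
```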
Coefficient of Skewness:
\[ \frac{\sum (X - \mu)^3}{(N - 1)\,\sigma^3} = 0.018 \]
Where
1. X = Observation
2. μ = mean of the data
3. N = number of observations
4. Σ = sum of all values of (X - μ)³
5. σ² = variance
6. σ = standard deviation

x        μ        x - μ     (x - μ)³
3860     22432    -18572    -6405520703583.78
5074     22432    -17358    -5229690128413.75
6394     22432    -16038    -4125015939940.37
9471     22432    -12961    -2177131197958.20
13313    22432    -9119     -758224275215.75
18188    22432    -4244     -76424333956.14
22198    22432    -234      -12762426.43
28299    22432    5867      201983824896.17
31040    22432    8608      637901092000.60
34428    22432    11996     1726405413819.48
36330    22432    13898     2684638207112.31
40353    22432    17921     5755844983504.25
42664    22432    20232     8282019779521.16
Summation                   516773959359.54
Coefficient of Kurtosis:
\[ \frac{\sum (X - \mu)^4}{(N - 1)\,\sigma^4} = 1.65 \]

x        μ        x - μ     (x - μ)⁴
3860     22432    -18572    118961359577511000.00
5074     22432    -17358    90775352113581700.00
6394     22432    -16038    66155736409089900.00
9471     22432    -12961    28217127570213800.00
13313    22432    -9119     6914013865915450.00
18188    22432    -4244     324321358130165.00
22198    22432    -234      2982480884.74
28299    22432    5867      1185101249535000.00
31040    22432    8608      5491248877200220.00
34428    22432    11996     20710490545844300.00
36330    22432    13898     37311927844972200.00
40353    22432    17921     103152268978605000.00
42664    22432    20232     167564372493050000.00
Summation                   646763323866130000.00
Where
1. X = Observation
2. μ = mean of the data
3. N = number of observations
4. Σ = sum of all values of (X - μ)⁴
5. σ² = variance
6. σ = standard deviation
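The two coefficients can be recomputed from the raw data with the sketch below. It assumes deviations are taken as x minus the mean, that σ is computed with divisor N, and that the coefficient divides the sum of powered deviations by (N - 1)σ^k, as the table sums suggest; `moment_coefficient` is a hypothetical helper name.

```python
import math

consumption = [3860, 5074, 6394, 9471, 13313, 18188, 22198,
               28299, 31040, 34428, 36330, 40353, 42664]

n = len(consumption)
mean = sum(consumption) / n
sigma = math.sqrt(sum((x - mean) ** 2 for x in consumption) / n)


def moment_coefficient(values, mu, sd, power):
    """Sum of (x - mu)**power divided by (N - 1) * sd**power."""
    total = sum((x - mu) ** power for x in values)
    return total / ((len(values) - 1) * sd ** power)


skewness = moment_coefficient(consumption, mean, sigma, 3)
kurtosis = moment_coefficient(consumption, mean, sigma, 4)

print(f"coefficient of skewness = {skewness:.3f}")
print(f"coefficient of kurtosis = {kurtosis:.2f}")
```

Note that the sign of the skewness depends on the direction of the deviation: tabulating μ - x instead of x - μ flips the sign of the cubed sum but leaves the squared and fourth-power sums unchanged.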
Data Collected:
The data used for the statistical analysis is: Exports and Imports of India (Rs crore) and the
Index of Industrial Production (IIP).
Dataset 2
Year       Exports (Rs crore)   Imports (Rs crore)   IIP index
2011-12    1454066              2342217              170.2
2010-11    1142922              1683467              165.5
2009-10    845534               1363736              152.9
2008-09    840754               1374434              145.2
2007-08    655863               1012312              141.7
Data Brief:
The data collected gives a measure of the imports and exports from India which, in turn, gives an
account of the trade deficit for the Indian economy over the last five years. Also included in the data
set is the Index of Industrial Production which is calculated with a base of 100 for the year 2004-05.
Why the data was maintained by the source and its importance?
The data was collected from the Macro Economic Indicators section of Economic and Political
Weekly, a popular weekly magazine that publishes articles, commentary and editorials on
current topics related to the economy and politics. The articles are written by eminent
academicians and members of industry.
The Macro Economic Indicators section is a regular section of the Economic and Political Weekly
that gives an overall view of the Indian economy with respect to the trade balance, money and
banking, and index numbers of wholesale prices.
Type of data:
The type of data used for analysis is NUMERICAL data.
Concept Name: Geometric mean
Selection of Variable: Index of Industrial Production (IIP) data
Formula and calculation steps:
The Index of Industrial Production, as mentioned before, is relative to a base of 100 for the year
2004-05. Hence, in order to find the average index over the course of 5 years, the arithmetic mean is
not a suitable measure whereas the Geometric mean is.
In order to calculate the geometric mean, the index numbers are divided by 100 to factor to a base
of 1 instead of 100 resulting in the table below:
Year       IIP index   Scaled to 1
2011-12    170.2       1.702
2010-11    165.5       1.655
2009-10    152.9       1.529
2008-09    145.2       1.452
2007-08    141.7       1.417
Geometric mean is then calculated as the 5th root of product of all index numbers across 5 years.
Hence, geometric mean = (1.702 * 1.655 * 1.529 * 1.452 * 1.417)^0.2 = 1.547
Findings and Interpretation of results:
Hence, from the geometric mean of 1.547, we can conclude that the average index of industrial
production across the 5 years starting from 2007-08 is 154.7 on a base of 100.
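The geometric mean calculation can be verified with a short sketch. It assumes the five IIP values from Dataset 2 and uses `math.prod`, which requires Python 3.8 or later; the variable names are illustrative.

```python
import math

iip_index = [170.2, 165.5, 152.9, 145.2, 141.7]   # 2011-12 back to 2007-08

# Rescale from a base of 100 to a base of 1, as in the table.
scaled = [value / 100 for value in iip_index]

# Geometric mean: the n-th root of the product of the n values.
geometric_mean = math.prod(scaled) ** (1 / len(scaled))

print(f"geometric mean (base 1)   = {geometric_mean:.3f}")
print(f"geometric mean (base 100) = {geometric_mean * 100:.1f}")
```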
The Scatter plot is a graph which describes the relationship between two variables. In this case, the
Scatter Plot is plotted with the Exports on the x-axis and the Imports on the y-axis to provide a
relationship between the exports and imports of India.
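The points of the scatter plot are the (exports, imports) pairs for each year. As an illustrative supplement that is not part of the original analysis, the sketch below also computes the trade deficit mentioned in the data brief and a Pearson correlation coefficient as one possible summary of how closely the two series move together; `pearson_r` is a hypothetical helper name.

```python
import math

# Dataset 2 values in Rs crore, ordered from 2007-08 to 2011-12.
exports = [655863, 840754, 845534, 1142922, 1454066]
imports = [1012312, 1374434, 1363736, 1683467, 2342217]

# One (x, y) point per year: exports on the x-axis, imports on the y-axis.
points = list(zip(exports, imports))

# Trade deficit per year: imports minus exports.
deficits = [m - x for x, m in points]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(exports, imports)
print(f"trade deficits (Rs crore) = {deficits}")
print(f"correlation between exports and imports = {r:.2f}")
```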