SET-I
Q.1 Write short notes on the following methods of classification in a statistical survey:
i. One Way Classification.
ii. Two Way Classification
iii. Manifold Classification.
Answer:
The figure below depicts the number of students who have secured more than 60% in various
sub-modules of statistics. This can be classified using the one-way classification method.
The figure below depicts the classification, by gender, of students who have secured more
than 60% in the respective sub-modules of statistics. In the sub-module titled ‘Basic Concepts’, ten
students got more than 60%. Of these ten students, four are male and six are female.
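The difference between the one-way and two-way counts can be sketched in a few lines of Python. This is only an illustration: the record layout and variable names are assumptions, and the only figures taken from the text are the ten 'Basic Concepts' students (four male, six female).

```python
from collections import Counter

# Hypothetical records of (sub-module, gender) for students above 60%.
# Only the 'Basic Concepts' split (4 male, 6 female) comes from the text.
records = [("Basic Concepts", "Male")] * 4 + [("Basic Concepts", "Female")] * 6

# One-way classification: count by a single attribute (sub-module).
one_way = Counter(sub_module for sub_module, _gender in records)

# Two-way classification: count by two attributes (sub-module and gender).
two_way = Counter(records)

print(one_way["Basic Concepts"])              # 10
print(two_way[("Basic Concepts", "Male")])    # 4
print(two_way[("Basic Concepts", "Female")])  # 6
```

Adding a third attribute to each record and counting on all three would give a manifold classification in the same way.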
c. Manifold classification – Classification done according to more than two attributes or variables
is known as manifold classification.
Figure below depicts the classification of employees according to skill, sex and education.
Q.2 What do you mean by Statistical Averages? List various requisites of a Good Average.
Answer:
The statistical average, or simply an average, is a measure of the middle (central) value of a
data set. The objectives of a statistical average are to:
Present mass data in a concise form: The mass data is condensed to make the data
readable and to use it for further analysis. It is very difficult for the human mind to grasp a
large body of numerical figures. A measure of average is used to summarise such data
into a single figure, which makes it easier to understand.
Facilitate comparison: It is difficult to compare two different sets of mass data. However,
we can compare the two sets after computing the average of each. While
comparing, the same measure of average should be used for both sets. For example,
comparing the mean salary of one group of employees with the median salary of
another leads to incorrect conclusions.
Establish relationship between data sets: The average can be used to draw inferences
about the unknown relationships between the data sets. Computing the averages of
sample data sets is also helpful for estimating the average of the population.
Provide basis for decision-making: In many fields such as business, finance, insurance and
other sectors, managers compute the averages and draw useful inferences or conclusions
for taking effective decisions.
A good average should satisfy the following requisites:
It should be simple to calculate and easy to understand.
It should be based on all values.
It should not be affected by extreme values.
It should not be affected by sampling fluctuation.
It should be rigidly defined, preferably by an algebraic formula, so that different persons
obtain the same value for a given set of data.
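The requisite about extreme values can be illustrated with a small sketch comparing the mean and the median. The salary figures below are hypothetical and chosen only for illustration; they do not come from the text.

```python
from statistics import mean, median

# Hypothetical salaries (in thousands); not taken from the text.
salaries = [20, 22, 23, 25, 27]
with_extreme = salaries + [200]  # the same data plus one extreme value

mean_shift = mean(with_extreme) - mean(salaries)
median_shift = median(with_extreme) - median(salaries)

# The mean uses every value, so one extreme value pulls it strongly;
# the median depends only on the middle position and moves much less.
print(f"mean:   {mean(salaries):.1f} -> {mean(with_extreme):.1f}")
print(f"median: {median(salaries):.1f} -> {median(with_extreme):.1f}")
```

No single average meets every requisite: here the mean is based on all values but is sensitive to the extreme one, while the median resists it but ignores the magnitudes of most values.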
Q.3 In a beauty contest, the ranks given by three different judges to 10 competitors are shown
in the following table. Find out which pair of judges is most closely associated in terms of
ranking pattern.
Competitors   A   B   C   D   E   F   G   H   I   J
Judge 1       3   4   6   7   9   8   2  10   1   5
Judge 2       4   5   6   8   7  10   1   9   2   3
Judge 3       5   7   9   8  10   6   3   4   1   2
Answer:
To find out which pair of judges agrees most closely, we have to find the rank correlation
for each pair of judges: Judge 1 and Judge 2, Judge 1 and Judge 3, and Judge 2 and Judge 3.
The ranks given by the three judges are denoted by R1, R2 and R3.
= 1 – 0.4485
= 0.55
Interpretation:
The correlations between all pairs of judges are positive, so the opinions of the judges are of
a similar type, i.e., their likes and dislikes have much in common.
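The pairwise coefficients can be checked with a short sketch. It assumes the intended measure is Spearman's rank correlation, 1 - 6∑d²/(n(n² - 1)), with no tied ranks; the function name `spearman_rho` is illustrative. The ranks are taken from the table in the question.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rank correlation for two untied rankings of equal length."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Ranks from the question, competitors A to J.
r1 = [3, 4, 6, 7, 9, 8, 2, 10, 1, 5]
r2 = [4, 5, 6, 8, 7, 10, 1, 9, 2, 3]
r3 = [5, 7, 9, 8, 10, 6, 3, 4, 1, 2]

results = {
    "Judge 1 and Judge 2": spearman_rho(r1, r2),
    "Judge 1 and Judge 3": spearman_rho(r1, r3),
    "Judge 2 and Judge 3": spearman_rho(r2, r3),
}
for pair, rho in results.items():
    print(f"{pair}: {rho:.2f}")

closest_pair = max(results, key=results.get)
print("Most closely associated:", closest_pair)
```

Under these assumptions the sums of squared differences are 18, 74 and 70, giving coefficients of about 0.89, 0.55 and 0.58. The 1 - 0.4485 step in the working matches the Judge 1 and Judge 3 pair, and Judge 1 and Judge 2 come out as the most closely associated pair.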
SET-II
Q.1 Write short notes on
a. Type I and Type II error
b. Level of Significance
c. Null Hypothesis
d. Two–tailed Tests and One–tailed Tests
e. Test Statistics
Answer:
Type I error: Rejecting a true null hypothesis. The probability of a type I error is indicated by
alpha (α).
Type II error: Not rejecting a false null hypothesis. The probability of a type II error is indicated by
beta (β).
Level of significance: The maximum probability of committing a type I error (rejecting a true
null hypothesis) that is accepted for the test. It is denoted by α and fixed in advance, commonly
at 0.05 (5%). The P-value of the test, also called the observed significance level, is the chance
of getting a sample at least as extreme as the one being analysed if the null hypothesis were
true. If the P-value is less than the chosen level of significance, the null hypothesis is rejected
in favour of the alternative. A small P-value implies that getting such a sample would be highly
unlikely under the null hypothesis, which is evidence that the null hypothesis is probably not
true.
Null hypothesis: The hypothesis that is assumed to be true and tested for possible rejection.
In the test for independence, for example, the null hypothesis is that the row and column
variables are independent of each other. Hypothesis testing is carried out under the
assumption that the null hypothesis is true.
The data are the observed frequencies.
The data are arranged in the form of a contingency table.
A two-tailed test of a hypothesis will reject the null hypothesis if the sample mean is significantly
higher or significantly lower than the hypothesised population mean. Thus, in a two-tailed test, the
rejection region is split into two parts, one in each tail of the distribution curve.
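The split rejection region can be made concrete with a small sketch that computes critical values under a standard normal distribution, with the one-tailed value shown alongside for comparison. The 5% level of significance is an assumed value used only for illustration.

```python
from statistics import NormalDist

alpha = 0.05               # assumed level of significance
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Two-tailed test: alpha is split equally, alpha/2 in each tail.
two_tailed_critical = std_normal.inv_cdf(1 - alpha / 2)

# One-tailed (upper) test: the whole of alpha lies in one tail.
one_tailed_critical = std_normal.inv_cdf(1 - alpha)

print(f"two-tailed: reject if |z| > {two_tailed_critical:.3f}")
print(f"one-tailed: reject if z > {one_tailed_critical:.3f}")
```

At the same level of significance the one-tailed critical value is smaller, because the entire rejection probability is placed in a single tail.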
In statistical significance testing, a one-tailed test and a two-tailed test are alternative ways of
computing the statistical significance of a parameter inferred from a data set, in terms of a test
statistic. A two-tailed test is used if deviations of the estimated parameter in either direction
from some benchmark value are considered theoretically possible; in contrast, a one-tailed test
is used if only deviations in one direction are considered possible. Alternative names are
one-sided and two-sided tests; the terminology "tail" is used because the extreme portions of
distributions, where observations lead to rejection of the null hypothesis, are small and often
"tail off" toward zero as in the normal distribution or "bell curve".
A test statistic is a statistic (a quantity derived from the sample) used in statistical hypothesis
testing. A hypothesis test is typically specified in terms of a test statistic, considered as a
numerical summary of a data-set that reduces the data to one value that can be used to perform
the hypothesis test. In general, a test statistic is selected or defined in such a way as to quantify,
within observed data, behaviours that would distinguish the null from the alternative hypothesis,
where such an alternative is prescribed, or that would characterize the null hypothesis if there is
no explicitly stated alternative hypothesis.
An important property of a test statistic is that its sampling distribution under the null hypothesis
must be calculable, either exactly or approximately, which allows p-values to be calculated. A
test statistic shares some of the same qualities of a descriptive statistic, and many statistics can
be used as both test statistics and descriptive statistics. However, a test statistic is specifically
intended for use in statistical testing, whereas the main quality of a descriptive statistic is that it
is easily interpretable. Some informative descriptive statistics, such as the sample range, do not
make good test statistics since it is difficult to determine their sampling distribution.
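As an illustration of how a test statistic reduces a sample to one value, the sketch below computes a one-sample z statistic and its two-tailed P-value. All input figures are hypothetical, and the z statistic is used only as a familiar example; the text does not prescribe a particular statistic.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical inputs; none of these figures come from the text.
sample_mean = 52.0  # observed sample mean
mu0 = 50.0          # population mean under the null hypothesis
sigma = 6.0         # population standard deviation, assumed known
n = 36              # sample size

# Test statistic: distance of the sample mean from mu0 in standard errors.
z = (sample_mean - mu0) / (sigma / sqrt(n))

# Its sampling distribution under the null hypothesis is standard normal,
# which is what allows a P-value to be calculated.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, two-tailed P-value = {p_two_tailed:.4f}")
```

With these figures z equals 2.0 and the P-value is a little under 0.05, so the null hypothesis would be rejected at the 5% level but not at the 1% level.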
Answer:
Direct method
Mean = ∑X/N
= 1051/10
= 105.1
X       dx = X - 96
110     14
120     24
135     39
110     14
96      0
145     49
55      -41
95      -1
125     29
Total   91
Mean = a + ∑dx/N
= 96 + 91/10
= 96 + 9.1
= 105.1
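Both methods can be cross-checked with a short sketch. Only nine observations are listed, while N = 10, ∑X = 1051 and ∑dx = 91; the sketch therefore assumes a tenth observation of 60 (deviation -36), which is the only value consistent with those totals. That value is an inference, not something shown in the listing.

```python
# Nine listed observations plus one inferred value (60), chosen so that
# N = 10, sum(X) = 1051 and sum(dx) = 91 as stated in the working.
values = [60, 110, 120, 135, 110, 96, 145, 55, 95, 125]
assumed_mean = 96  # the assumed mean 'a' used in the working
n = len(values)

# Direct method: mean = sum(X) / N
direct_mean = sum(values) / n

# Short-cut method: mean = a + sum(dx) / N, where dx = X - a
deviations = [x - assumed_mean for x in values]
shortcut_mean = assumed_mean + sum(deviations) / n

print(sum(values), sum(deviations))  # 1051 91
print(f"{direct_mean:.1f} {shortcut_mean:.1f}")
```

The two methods always agree, whatever assumed mean is chosen; a value near the centre of the data simply keeps the deviations small.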
Q.3 Production for the last 7 years of XYZ Ltd is given in the following table:
Answer:
The trend line can be fitted by using the method of least squares for the given data.
Year    Production (Y)   X = Year - 2012   XY     X^2
2009     8               -3                -24    9
2010    12               -2                -24    4
2011    13               -1                -13    1
2012    17                0                  0    0
2013    25                1                 25    1
2014    22                2                 44    4
2015    30                3                 90    9
Total  127                0                 98    28
∑Y = Na, therefore,
a = ∑Y/N
= 127/7
= 18.14
∑XY = b∑X^2, therefore, b = ∑XY/∑X^2
= 98/28
= 3.5
Y = a + bX = 18.14 + 3.5X
For 2016, X = 4
Y = 18.14 + (3.5 * 4)
= 18.14 + 14
= 32.14
For 2017, X = 5
Y = 18.14 + (3.5 * 5)
= 18.14 + 17.5
= 35.64
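The fitted trend line and the two forecasts can be reproduced with a minimal sketch. It relies on the simplification used in the working: with 2012 as the origin, ∑X = 0, so a = ∑Y/N and b = ∑XY/∑X^2. The function name `trend` is illustrative.

```python
years = [2009, 2010, 2011, 2012, 2013, 2014, 2015]
production = [8, 12, 13, 17, 25, 22, 30]

origin = 2012  # middle year, so the coded X values sum to zero
x = [year - origin for year in years]
n = len(years)

# These simplified formulas are valid only because sum(x) == 0.
a = sum(production) / n
b = sum(xi * yi for xi, yi in zip(x, production)) / sum(xi ** 2 for xi in x)

def trend(year):
    """Estimated production for a given year from Y = a + bX."""
    return a + b * (year - origin)

print(f"a = {a:.2f}, b = {b:.1f}")
for year in (2016, 2017):
    print(year, f"{trend(year):.2f}")
```

If the origin were not the middle year, ∑X would not be zero and both normal equations would have to be solved together.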