Web: http://statdu.ac.bd/akanda/
E-mail: akanda@du.ac.bd
Sample
(1) A small but representative part with
finite number of individuals or items of
a population is called a sample.
(2) A sample is always finite
(3) The statistical measures obtained from
the sample observations are called
statistics.
(4) Sample size is always smaller than the
Population size.
(5) Sample survey deals with the sample.
(6) Sample is a subset of the population.
(7) Small letters are used to denote sample
size, usually n.
Statistic
(1) Any function of the sample observations
is called a statistic.
(2) A statistic does not contain any unknown
constant.
(3) Statistics are used to estimate population
characteristics (such as parameters).
(4) Statistics are subject to sampling and
non-sampling errors.
(5) A statistic has a distribution, which is called
its sampling distribution.
(6) The sample mean $\bar{x}$, sample variance $s^2$, etc. are
called statistics.
Variable: A variable is a measurable quantity or quality which can assume any of a
prescribed set of values, called the domain of the variable. Thus the height of a person, the
yield of a crop, the price of a commodity, and the number of children in a family are some
examples of variables.
Constant: The term constant refers to a property whereby the members of a group or
category remain fixed and do not differ one from another.
Qualitative Variable: A qualitative variable is one for which numerical measurement is not
possible, such as hair color (brown, black, white, etc.) or religion (Muslim, Hindu, Christian,
etc.).
Quantitative Variable: A quantitative variable is one for which the resulting observations
are numeric and thus possess a natural ordering. Examples of such variables include height,
weight, family size, age, number of accidents, etc.
Difference between quantitative and qualitative variables:
Qualitative variable
(1) A qualitative variable is one for which
numerical measurement is not possible.
(2) Qualitative variable can only be counted.
(3) These types of variable are always
discrete.
(4) The average of qualitative variable can
be measured by median and mode.
(5) Intelligence, beauty are the examples of
qualitative variable.
Quantitative variable
(1) Quantitative variable can be expressed
numerically.
(2) Quantitative variable can be counted and
measured.
(3) These types of variable may be discrete
or continuous.
(4) The average of quantitative variable can
be measured by any measure of central
tendency.
(5) Heights, weights, price of commodity are
the examples of quantitative variable.
Attribute: The distinct categories of qualitative variables are sometimes called attribute. In
other words, the characteristics used to classify an individual into different categories are
called an attribute. A worker when reported to be smoking is attributable to the category
smoker. His smoking behavior is used to classify him as smoker and thus it is an attribute.
Types of Quantitative Variable: Quantitative Variable may be either discrete or continuous.
Discrete Variable: When a variable can assume only isolated values, it is called a discrete
variable. For example, if the number of children in a family is the variable of interest, it is
obvious that it cannot assume fractional values and hence it is a discrete variable.
Continuous Variable: A variable is said to be continuous if it can theoretically assume any
value within a given range or ranges. Such variables, for instance, are height of a person,
price of a commodity and time.
Difference between discrete and continuous variables:
Discrete variable
(1) When a variable can assume only isolated values, it is called a discrete variable.
(2) A discrete variable can only assume integral values.
(3) Discrete variables are countable.
(4) The number of children in a family is an example of a discrete variable.
Continuous variable
(1) A variable is said to be continuous if it can theoretically assume any value within a given range or ranges.
(2) A continuous variable can assume both integral and fractional values.
(3) Continuous variables are measurable.
(4) The height of the students of a class is an example of a continuous variable.
Examples of variables measured on different scales:
Gender, Religion
Economic Status
Temperature, IQ score
Age, Family size
The important points in the selection of measurement scale for a variable are:
(i) Scale selected should be appropriate for the variables one wishes to categorize.
(ii) Scale should be of practical use.
(iii) Scale should be clearly defined.
(iv) The number of categories created (when necessary) should cover all possible values.
(v) The categories created (when necessary) should not overlap; that is, they
should be mutually exclusive.
(vi) The scale should be sufficiently powerful.
Data: Data are the raw, disorganized facts and figures collected from any field of inquiry.
Types of data: Statistical data depending upon the sources are of two types
(a) Primary data
(b) Secondary data
Primary data: The data which are originally collected by an investigator or an agent for the
first time for the purpose of statistical enquiry are known as primary data.
Example: An investigator wants to study the salaries of teachers working in the campuses.
Then the data collected for this purpose by the investigator himself or with the help of his
representative, are primary data.
Secondary data: Data which were originally collected by someone else and are obtained from
published or unpublished sources are secondary data.
Example: The reports and publications made by Bangladesh Bureau of Statistics are primary
for that organization but secondary for those who use it.
Difference between the primary and secondary data:
The difference between primary and secondary data is only one of degree. Data which
are primary in the hands of one become secondary in the hands of another. That is, primary data
once collected and published become secondary data for other investigators. For example:
the data relating the population of Bangladesh published by Bangladesh Bureau of Statistics
are primary for that organization but secondary for those who use it.
There are the following differences between primary and secondary data:
(1) Definition
Primary data: The data which are obtained by direct observation from the population or sample are called primary data.
Secondary data: The data which have already been obtained by some other persons or organizations and have already been published or utilized are called secondary data.
(2) Originality
Primary data: They are original. Primary data are collected from the original sources.
Secondary data: They are not original. Secondary data are collected from organizations, journals, newspapers, etc.
(3) Expenses
Primary data: Collection involves large expenses in terms of time, energy and money.
Secondary data: It is a relatively less costly method.
(4) Suitability
Primary data: If the data has been collected in ... the survey.
(5) Reliability
Secondary data: Secondary data are less reliable than primary data.
(6) Dependency
Secondary data: Secondary data depend on the primary data.
(7) Precautions
Secondary data: They should be used with care.
(8) Qualified interviewers
Classification: Classification is the process of arranging data into sequences and groups
according to their common characteristics, or separating them into different but related parts.
Types of Classification:
Broadly, the data can be classified on the following four bases:
(i) Geographical, i.e., area-wise, e.g., cities, districts, etc.
(ii) Chronological, i.e., on the basis of time.
(iii) Qualitative, i.e., according to some attributes.
(iv) Quantitative, i.e., in terms of magnitudes.
(i) Geographical classification: In geographical classification data are classified on the
basis of geographical or location differences between the various items. For example,
when we present the production of sugarcane, wheat, rice, etc., for various districts,
this would be called geographical classification.
(ii) Chronological classification: When data are observed over a period of time the type
of classification is known as chronological classification. Time series are usually listed
in chronological order, normally starting with the earliest period. When the major
emphasis falls on the most recent events, a reverse time order may be used.
(iii) Qualitative classification: In qualitative classification, data are classified on the basis
of some attribute or quality such as sex, color of hair, literacy, religion, etc. The point
to note in this type of classification is that the attribute under study cannot be
measured: one can only find out whether it is present or absent in the units of the
population under study.
(iv) Quantitative classification: Quantitative classification refers to the classification of
data according to some characteristics that can be measured, such as height, weight,
income, sales, etc.
Tabulation: Tabulation is a scientific process of presenting classified data
in an orderly manner so as to bring out their essential features and chief characteristics. The
purpose of the tabulation is to simplify the presentation of data and to facilitate comparison
between related information.
Different parts of a table: The different parts of a table depend upon the nature of the data
and the purpose of investigation. Generally, the main parts of a table are mentioned below:
1. Title of the table: Every table should have a suitable title. Title should be brief, clear,
simple and self-explanatory.
2. Table number: Every table must have a number for proper identification and for easy and
ready reference in future.
3. Caption: The title of a column is known as caption.
4. Stubs: Stubs mean the row headings.
5. Body of the Table: This is the important part of the table. The body of table is formed by
the arrangement of the data according to the description given in the captions and stubs.
6. Head note: A head note is a statement written below the title, centered and enclosed in
brackets. It helps to clarify points relating to the content of the table that have not been
included in the title, captions or stubs.
7. Footnote: Anything in table which cannot be understood by the reader from the title,
captions and stubs should be explained in footnotes. Footnotes are written directly below the
body of the table whenever necessary.
8. Source: The source from which the data have been taken should be mentioned.
Frequency distribution: A set of values together with the frequencies of occurrence of values
in each class in a given set of data, presented in a tabular form, is referred to as a frequency
distribution.
Principle of frequency distribution:
In statistics the most important form of tabulation is the frequency distribution. In a frequency
distribution the following principles should be taken into account:
Raw data are grouped into classes of appropriate size.
The number of observations belonging to each class is recorded.
The number of observations in a particular class is called the class frequency or frequency.
Construction of a frequency distribution: The first step in the construction of a frequency
distribution is to decide on the size of the groups or the class intervals. Generally, about 5
to 25 classes are used. The exact number of classes to be used will depend on the nature
and characteristics of data, the accuracy desired and the purpose of grouping. In particular, it
will depend on
(i) the range of the data and
(ii) the total number of observations
Suggested below are some useful rules for the construction of a frequency distribution:
(1) Find the range of the variable by subtracting the lowest value from the highest value.
(2) Divide the range by 5 and by 25, and round the results to the same degree of accuracy
as found in the original data. These two quotients are the largest and smallest class intervals
to consider; the class interval should normally lie between them. By a little trial and error
determine a suitable interval and the starting points of the class intervals.
(3) Arrange a sheet with the headings: class interval, mid value, tally marks, frequency
and cumulative frequency. Begin at the top with the class interval which contains the
smallest value, and continue until the interval with the highest value is reached.
(4) Read off the items on the original table and put, for each value, a tally mark against
the appropriate class interval. It is convenient to mark each fifth by a diagonal.
(5) Count the number of tally marks opposite each interval, and write the result in the
frequency column.
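The tallying steps above can be sketched in Python. This is only a minimal illustration: the function name `frequency_table`, its parameters and the list `scores` are hypothetical, and the classes are assumed to be of equal width with the upper limit excluded.

```python
def frequency_table(values, start, width, n_classes):
    """Count how many values fall in each class [lower, upper)."""
    counts = [0] * n_classes
    for v in values:
        # Index of the class that contains v (upper limit excluded).
        index = int((v - start) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"{v} lies outside the chosen classes")
        counts[index] += 1
    table = []
    for i, f in enumerate(counts):
        lower = start + i * width
        table.append((lower, lower + width, f))
    return table


# Hypothetical raw data, used only to illustrate the steps.
scores = [12, 7, 3, 18, 9, 14, 5, 11, 16, 8]
for lower, upper, f in frequency_table(scores, start=0, width=5, n_classes=4):
    # One tally stroke per observation, followed by the class frequency.
    print(f"{lower}-{upper}: {'|' * f} ({f})")
```

Each stroke printed plays the role of one tally mark, and the number in brackets is what would be written in the frequency column.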
Some important terms involved in a frequency distribution:
Class: In the process of condensation, raw data are assigned to some chosen groups of
appropriate size. These groups are called classes.
Frequency: The number of observations or values falling into each group or class is called
class frequency or simply frequency. For example, if in a set of data, a value 10 occurs 6
times, then 6 is the frequency of 10.
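Relative frequency: The relative frequency of a class is the proportion or percentage of the data
that falls in that class. To find the relative frequency of a class, divide the class frequency $f$ by the
sample size $n$:
$$\text{Relative frequency} = \frac{\text{class frequency}}{\text{sample size}} = \frac{f}{n}$$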
Cumulative frequency: The cumulative frequency of a class is the sum of the frequencies of
that class and all previous classes. The cumulative frequency of the last class is equal to the
sample size $n$.
Class Interval: Ordinarily, for numerical data, the frequencies of a particular class are
bounded by two values. The width or length of the class formed by these two boundary
values is known as the class interval.
Class width: The size of the class is referred to as the class width and is the distance between
the lower (or upper) limits of consecutive classes. For a class with 45 as lower limit and 50 as
upper limit, the interval 45-50 has a class width of 5 and a mid-point of
$$\frac{1}{2}(45 + 50) = 47.5.$$
Class limits: The smallest value of a class is technically known as the lower limit of the
interval, while the largest value is known as the upper class limit of the interval. Thus for a
class interval 15-19, 15 is the lower limit and 19 is the upper limit.
Midpoint: The midpoint of a class is the sum of the lower and upper limits of the class
divided by two. The midpoint is sometimes called the class mark.
$$\text{Midpoint} = \frac{\text{lower limit} + \text{upper limit}}{2}$$
Number of class intervals: This is the number of classes into which the total range of the data is
divided. The number should be chosen so that it gives a good description of the data presented in
the frequency table. Sturges' rule for deciding the number of classes (k) is given by the following
formula:
$$k = 1 + 3.322 \log_{10} N$$
where $N$ = total number of observations.
By use of this formula, the size of class interval $C$ can be written as
$$C = \frac{\text{Range}}{1 + 3.322 \log_{10} N}$$
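As a quick numerical check of the rule, the two formulas can be evaluated in Python. The function names and the inputs (a range of 53 over 30 observations) are chosen here purely for illustration.

```python
import math


def sturges_classes(n):
    """Suggested number of classes: k = 1 + 3.322 * log10(N)."""
    return 1 + 3.322 * math.log10(n)


def sturges_width(data_range, n):
    """Suggested class interval: C = Range / (1 + 3.322 * log10(N))."""
    return data_range / sturges_classes(n)


# Illustrative inputs: range 53, 30 observations.
k = sturges_classes(30)
c = sturges_width(53, 30)
print(round(k, 2), round(c, 2))
```

The result is only a starting point; in practice the width is usually rounded to a convenient value.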
Methods of forming class-intervals:
Here we classify the data according to class interval. There are two ways of forming class
interval:
1. Exclusive method
2. Inclusive method
1. Exclusive method: The formation of class interval by this method is that the upper limit
of one class is the lower limit of the next class so as to make continuous without any gap.
This type of method is mostly useful in case of continuous variable. In exclusive method
class interval is obtained by taking the difference between the lower and upper limits. In
this case, if a value is exactly equal to the upper limit it will be included in the next class.
For example, if a value is 20 it will be included in the class 20-30.
Rainfall (in mm): 10-20 20-30 30-40 40-50 50-60
2. Inclusive method: In this method both the lower
and the upper limits are included in a particular class. This method is mostly used in case
of a discrete variable. In the inclusive method the class interval is obtained by taking the
difference between the upper limits of two consecutive classes.
Number of students: 10-19 20-29 30-39 40-49 50-59
Some notes:
As far as possible one should avoid odd values of class intervals e.g. 3,11,26,39 etc.
Preferably, one should have class intervals of either five or multiples of five like
5,10,20,25,100 etc.
The starting point, i.e. lower limit of the first class, should either be zero or 5 or
multiple of 5.
For the inclusive method class boundaries are necessary; they are obtained by
$$x_{iL} = x_{il} - \frac{s}{2}, \qquad x_{iH} = x_{ih} + \frac{s}{2}$$
where $x_{il}$-$x_{ih}$ is the ith class interval, $x_{iL}$-$x_{iH}$ is the ith class boundary and $s$ is the
smallest unit of the scale of measurement.
Nearest unit (s = 1)             Nearest 10th (s = 0.1)           Nearest 100th (s = 0.01)
Class interval  Class boundary   Class interval  Class boundary   Class interval  Class boundary
25-29           24.5-29.5        25.0-29.9       24.95-29.95      25.0-29.99      24.995-29.995
30-34           29.5-34.5        30.0-34.9       29.95-34.95      30.0-34.99      29.995-34.995
35-39           34.5-39.5        35.0-39.9       34.95-39.95      35.0-39.99      34.995-39.995
40-44           39.5-44.5        40.0-44.9       39.95-44.95      40.0-44.99      39.995-44.995
45-49           44.5-49.5        45.0-49.9       44.95-49.95      45.0-49.99      44.995-49.995
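The boundary rule (subtract half the smallest unit from the lower limit, add half to the upper limit) can be sketched in Python. The function name `class_boundaries` is hypothetical, and `Decimal` is used here only so that values such as 24.95 are reproduced exactly.

```python
from decimal import Decimal


def class_boundaries(lower_limit, upper_limit, unit):
    """Return (lower boundary, upper boundary) for an inclusive class.

    All arguments are passed as strings so Decimal keeps them exact.
    """
    half_unit = Decimal(unit) / 2
    return Decimal(lower_limit) - half_unit, Decimal(upper_limit) + half_unit


# First row of the table for each level of accuracy.
print(class_boundaries("25", "29", "1"))
print(class_boundaries("25.0", "29.9", "0.1"))
print(class_boundaries("25.0", "29.99", "0.01"))
```

Plain floating-point numbers would also work for drawing a histogram, but may show small rounding artefacts when printed.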
Example: The following are the marks obtained by the candidates for selection to a post in a
reputed pharmaceutical company.
20 18 25 22 16 29 35 23 58 37
65 37 35 42 49 48 63 53 49 55
65 45 39 58 48 57 67 42 65 69
Construct a frequency distribution table by taking a suitable class interval.
Solution:
Let us determine a suitable class interval with the help of Sturges' rule:
$$C = \frac{R}{1 + 3.322 \log_{10} n}$$
Here $R$ = Range = 69 - 16 = 53 and $n$ = 30, so
$$C = \frac{53}{1 + 3.322 \times 1.4771} = 8.97 \approx 9$$
Since values like 3, 7, 9, etc. should be avoided, we take 10 as the class interval and let the
first class be 15-25.
Frequency distribution table of the marks of 30 candidates
(Exclusive Method)
Class interval (Marks)  Tally marks  Frequency (f)  Relative frequency (r.f.)  Cumulative frequency (c.f.)
15-25                   ||||/        5              5/30                       5
25-35                   ||           2              2/30                       7
35-45                   ||||/ ||     7              7/30                       14
45-55                   ||||/ |      6              6/30                       20
55-65                   ||||/        5              5/30                       25
65-75                   ||||/        5              5/30                       30
Total                                n = 30
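The relative and cumulative frequency columns follow mechanically from the class frequencies. A minimal Python sketch, using the six class frequencies of this example (the variable names are illustrative):

```python
from itertools import accumulate

classes = ["15-25", "25-35", "35-45", "45-55", "55-65", "65-75"]
frequencies = [5, 2, 7, 6, 5, 5]

n = sum(frequencies)                        # sample size
relative = [f / n for f in frequencies]     # relative frequency f / n
cumulative = list(accumulate(frequencies))  # running total of frequencies

for label, f, rf, cf in zip(classes, frequencies, relative, cumulative):
    print(f"{label}: f = {f}, r.f. = {f}/{n} = {rf:.3f}, c.f. = {cf}")
```

The last cumulative frequency equals the sample size, and the relative frequencies add up to one, which gives a quick check on the table.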
Example: The daily room rents in taka of 25 hotels in Dhaka City in June 1995 were 115,
160, 170, 80, 60, 90, 90, 80, 70, 70, 80, 80, 100, 90, 100, 90, 110, 120, 110, 100, 110, 130,
120, 140, and 105. Represent the data in a suitable frequency table and find:
(a) the highest rent
(b) the lowest rent
(c) the five highest ranking rents
(d) how many hotels charged Tk. 90 or more as daily rent
(e) what percentage of hotels charged above Tk. 100 but less than Tk. 110 per day
Solution: Here we form a frequency table by using frequency array in which each item is
written against the class in which it lies:
Rent (in Tk.)  Mid value  Tally marks              Frequency (f)  Relative frequency (r.f.)  Cumulative frequency (c.f.)
60-70          65         | (60)                   1              1/25                       1
70-80          75         || (70, 70)              2              2/25                       3
80-90          85         |||| (80, 80, 80, 80)    4              4/25                       7
90-100         95         |||| (90, 90, 90, 90)    4              4/25                       11
100-110        105        |||| (100,100,100,105)   4              4/25                       15
110-120        115        |||| (115,110,110,110)   4              4/25                       19
120-130        125        || (120, 120)            2              2/25                       21
130-140        135        | (130)                  1              1/25                       22
140-150        145        | (140)                  1              1/25                       23
150-160        155                                 0              0                          23
160-170        165        | (160)                  1              1/25                       24
170-180        175        | (170)                  1              1/25                       25
From the frequency table given above, it is now easy to answer the remaining questions:
(a) The highest rent = Tk. 170
(b) The lowest rent = Tk. 60
(c) The five highest ranking rents are Tk. 170, Tk. 160, Tk. 140, Tk. 130 and Tk. 120.
(d) The number of hotels charging a daily rent of Tk. 90 or more =
4+4+4+2+1+1+1+1 = 18
(e) The number of hotels charging a daily rent above Tk. 100 but less than Tk. 110 = 1
The required percentage = $\frac{1}{25} \times 100 = 4\%$
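The same answers can be cross-checked directly from the raw rents. The following Python sketch uses the 25 values listed in the example; the variable names are illustrative.

```python
rents = [115, 160, 170, 80, 60, 90, 90, 80, 70, 70, 80, 80, 100,
         90, 100, 90, 110, 120, 110, 100, 110, 130, 120, 140, 105]

highest = max(rents)                                 # (a)
lowest = min(rents)                                  # (b)
top_five = sorted(set(rents), reverse=True)[:5]      # (c) five highest distinct rents
at_least_90 = sum(1 for r in rents if r >= 90)       # (d)
between = sum(1 for r in rents if 100 < r < 110)     # (e) strictly between 100 and 110
percent_between = 100 * between / len(rents)

print(highest, lowest, top_five, at_least_90, percent_between)
```

Part (c) is interpreted here as the five highest distinct rent values, which is how the solution above lists them.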
Example:
Form a frequency distribution by taking a suitable class interval for the following data giving the
ages of 52 employees in a pharmaceutical company.
67 34 36 48 49 31 61 34 43 45 38 32 28
61 29 47 36 50 46 34 46 32 30 33 45 49
48 41 53 36 37 47 47 30 46 50 28 35 35
38 46 43 34 36 62 69 50 28 44 43 60 39
Solution:
The lowest value is 28 and the largest value is 69.
So, $R$ = Range = 69 - 28 = 41.
Therefore, the class interval is
$$C = \frac{41}{1 + 3.322 \times 1.716} = 6.119 \approx 6$$
Since the class interval should preferably be a multiple of 5, we have taken 5 as the class interval.
Class (Ages)  Frequency (f)  Cumulative frequency (c.f.)
25-29         4              4
30-34         10             14
35-39         10             24
40-44         5              29
45-49         13             42
50-54         4              46
55-59         0              46
60-64         4              50
65-69         2              52
Total         52
30 90 54 57 64 58 69 78 28 44
83 88 17 70 93 33 59 55 20 46
23 91 51 63 18 53 38 69 27 65
61 73 85 41 87 95 67 75 15 40
Tabulate the results in the form of a frequency distribution grouping by intervals of 10
grades. Also with reference to the records of grades, find
(a) the highest grade
(b) the lowest grade
(c) the range
(d) the grades of five highest ranking students
(e) the grades of five lowest ranking students
(f) how many students received grades of 70 or higher
(g) how many students received grades below 70
(h) what percentage of students received grades higher than 70 but less than 95.
Exercise-2: The population of villages in a district is given below. Taking a
suitable class interval, prepare a frequency distribution:
42 34 33 29 27 37 51 39 21
31 42 21 38 42 49 52 38 53
39 71 17 33 61 59 27 19 54
61 59 43 53 37 39 57 16 41
42 42 57 37 53 37 44 7 66
Example 3: The following are the ages of 48 patients admitted to the emergency room of a
hospital. Construct a frequency distribution using a suitable class interval.
32 53 16 30 43 23 13 29 25 24 61 42
63 35 53 44 46 12 16 28 23 21 30 16
27 22 55 13 33 54 42 48 61 13 31 28
23 17 14 28 21 23 34 26 57 38 51 37
Example 4: The following are the number of babies born during a year in 60 community
hospitals.
30 40 54 59 34 37 59 48 42 24 32 43
42 53 47 39 45 54 31 24 55 34 53 32
53 52 29 31 35 28 55 58 45 42 57 26
46 32 21 56 56 56 29 24 57 57 54 30
57 59 27 53 22 46 50 52 49 49 54 29
Limitations:
(i) Graphs are sometimes misleading unless drawn and studied carefully.
(ii) Conclusions from graphs are crude.
Difference between diagrams and graphs:
1. Diagrams are constructed on plain paper whereas the graphs are on graph paper.
2. Diagrams are used only for the comparison but graphs help in studying the
mathematical relationship between two variables.
3. In diagrams, the numerical data are presented by bars, rectangles, circles, cubes etc.
whereas in graphs, the data are presented in terms of points and lines.
4. Diagrams are not suitable for presenting frequency distributions, whereas graphs are
more appropriate for presenting frequency distributions and time series.
Some graphs and diagrams:
(i) Bar diagram
(ii) Pie diagram
(iii) Line diagram
(iv) Histogram
(v) Frequency Polygon
(vi) Cumulative frequency polygon/ Ogive
(vii) Scatter diagram
Setting in X-axis             Setting in Y-axis
Class interval of the class   Frequency
Midpoint of the class         Frequency
Midpoint of the class         Frequency
Upper limit of the class      Cumulative frequency
Time                          Value of the variable
(ii) Construct a rectangle at each category of the variable with a height equal to the
frequency in the category.
(iii) Leave a space between each category to provide distinct, separate categories and to
clarify the presentation.
Example: Consider the health professional data. The number of responses in each category
was totaled to give the following distribution.
Response       Frequency
Frequently     49
Occasionally   71
Rarely         24
Never          6
[Figure: bar diagram of the frequency of each response (Frequently, Occasionally, Rarely, Never); vertical axis from 0 to 80]
Category         Value      Angle (degrees)
Evergreen        7820.45    (7820.45 / 16261.99) × 360 = 173.13
Most deciduous   1029.14    (1029.14 / 16261.99) × 360 = 22.78
Mangrove         7412.40    (7412.40 / 16261.99) × 360 = 164.09
Total            16261.99   360.00
[Figure: pie diagram with sectors Evergreen (7820.45), Most deciduous (1029.14) and Mangrove (7412.4)]
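The sector angles of a pie diagram are each category's share of the total multiplied by 360 degrees. A short Python sketch with the three values above (the dictionary name `values` is illustrative):

```python
values = {"Evergreen": 7820.45, "Most deciduous": 1029.14, "Mangrove": 7412.40}

total = sum(values.values())
# Angle of each sector = (category value / total) * 360 degrees.
angles = {name: value / total * 360 for name, value in values.items()}

for name, angle in angles.items():
    print(f"{name}: {angle:.2f} degrees")
```

The angles should add up to 360 degrees, apart from rounding.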
[Figure: line diagram of population (vertical axis, 0 to 120) against census year (horizontal axis, 1880 to 2000)]
(v) Histogram:
A histogram is a graphical representation of a frequency distribution in the form of a
diagram. It shows the pattern of the distribution, for example whether it is symmetrical or
not. The histogram is particularly important when the variable is continuous. A discrete
variable can also be treated as a continuous one while constructing a histogram.
Guide lines:
(i) Horizontal axes are divided into segments corresponding to the class boundaries of
the frequency distribution.
(ii) On each segment a rectangle with area proportional to the frequency in the class is
created.
(iii) The set of adjacent rectangles so constructed, constitute a histogram.
Note:
In a histogram it is the area, not the height, that represents the frequency of a class. Thus if a
histogram of a frequency distribution with unequal class widths is to be constructed, the
vertical height of each rectangle must be adjusted so that the area of the
rectangle represents the frequency.
Table 1: Data for constructing histogram with equal class intervals
Class interval  Class frequency  Class width  Height of the rectangles
4.5-9.5         8                5            8
9.5-14.5        29               5            29
14.5-19.5       27               5            27
19.5-24.5       12               5            12
24.5-29.5       4                5            4
Total           80
Table 2: Data for constructing histogram with unequal class intervals
Class interval  Class frequency  Class width  Height of the rectangles  Col. 4 × 10
48.5-58.5       4                10           4/10 = 0.4                4
58.5-68.5       8                10           8/10 = 0.8                8
68.5-73.5       5                5            5/5 = 1.0                 10
73.5-78.5       5                5            5/5 = 1.0                 10
78.5-98.5       28               20           28/20 = 1.4               14
Total           50               -            -                         -
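For unequal class widths the height of each rectangle is the frequency divided by the class width, so that area equals frequency. A minimal Python sketch using the classes of Table 2 (the variable names are illustrative):

```python
# (lower boundary, upper boundary, class frequency) for each class of Table 2
classes = [(48.5, 58.5, 4), (58.5, 68.5, 8), (68.5, 73.5, 5),
           (73.5, 78.5, 5), (78.5, 98.5, 28)]

# Height = frequency / class width, so that area = height * width = frequency.
heights = [f / (upper - lower) for lower, upper, f in classes]

for (lower, upper, f), h in zip(classes, heights):
    print(f"{lower}-{upper}: width = {upper - lower}, height = {h}")
```

Multiplying every height by the same constant (10 in the last column of the table) only rescales the vertical axis and leaves the shape of the histogram unchanged.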
(iv) If the coordinate points are joined by a free hand smooth curve, the resulting graph is
called a cumulative frequency curve or ogive.
Example: Construct ogive curve for the temperature data:
D   9    74
E   12   58
F   5    90
G   8    78
Solution:
Step 1 Draw and label the x and y axes.
Step 2 Plot each point on the graph, as shown in Figure
Stem | Leaf
1 | 79
2 | 2
3 | 839
4 | 57
5 | 345414
6 | 65
7 | 652
8 | 4
Key: 1|7 represents 17
Therefore the final figure is
Stem | Leaf
1 | 79
2 | 2
3 | 389
4 | 57
5 | 134445
6 | 56
7 | 256
8 | 4
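The construction of the ordered display can be sketched in Python. The list `data` below is read off the unordered display above (stem times ten plus leaf), and the function name `stem_and_leaf` is hypothetical; two-digit values are assumed.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return {stem: ordered leaves as a string} for two-digit values."""
    stems = defaultdict(list)
    for v in sorted(values):          # sorting first puts the leaves in order
        stems[v // 10].append(v % 10)  # tens digit is the stem, units digit the leaf
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in sorted(stems.items())}


# Values read off the unordered display (stem * 10 + leaf).
data = [17, 19, 22, 38, 33, 39, 45, 47, 53, 54, 55, 54, 51, 54,
        66, 65, 76, 75, 72, 84]

display = stem_and_leaf(data)
for stem, leaves in display.items():
    print(f"{stem} | {leaves}")
```

Sorting the values before splitting them is what turns the first, unordered display into the final one.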
Sample questions:
1. Define Classification and Tabulation. What is meant by frequency distribution?
Describe how you would construct a frequency distribution from raw data.
2. What do you mean by Graphical representation of data? State the uses and limitations
of graphs and diagrams.
3. The following table shows the number of hours 45 hospital patients slept following
the administration of a certain anesthetic.
7 10 12 4 8 7 3 8 5
12 11 3 8 1 1 13 10 4
4 5 5 8 7 7 3 2 3
8 13 1 7 17 3 4 5 5
3 1 17 10 4 7 7 11 8
a) From these data construct:
(i) A frequency distribution
(ii) A relative frequency distribution
(iii)A histogram
(iv) A frequency polygon
(v) A cumulative frequency curve/ogive
b) Construct a stem and leaf display from these data. Describe these data relative to
symmetry and skewness.
4. The following are the numbers of babies born during a year in 60 community hospitals.
30 55 27 45 56 48 45 49 57 47 56
37 55 52 34 54 52 32 59 46 24 57
32 26 40 28 53 54 29 42 54 53 59
39 56 59 58 49 53 30 21 34 28 50
52 57 43 46 54 31 22 24 24 57 29
(a) From these data construct:
(i) A frequency distribution
(ii) A histogram
(iii) A relative frequency distribution
(iv) A frequency polygon
(v) A cumulative frequency curve/ogive
(b) Construct a stem and leaf display from these data. Describe these data relative to
symmetry and skewness.
5. The following are the ages of 30 patients seen in the emergency room of a hospital on
a Friday night. Construct a stem and leaf display from these data. Describe these data
relative to symmetry and skewness.
35 36 12 45 45 38 21 35 43 45
36 22 10 44 56 39 37 54 64 55
45 32 34 55 45 60 53 22 56 57
6. In a study of physical endurance levels of male college freshmen the following
composite endurance scores based on several exercise routines were collected. From
these data construct:
(i) A frequency distribution
(ii) A relative frequency distribution
(iii) A histogram
(iv) A frequency polygon
(v) A cumulative frequency polygon/ogive
7. Ellis et al. (A-3) conducted a study to explore the platelet imipramine binding
characteristics in manic patients and to compare the results with equivalent data for
healthy controls and depressed patients. As part of the study the investigators obtained
maximal receptor binding (Bmax) values on their subjects. The following are the values
for the 57 subjects in the study who had a diagnosis of unipolar depression.
1074 372 473 797 385 769 797 485 334 670 510 299 333 303 768
392 475 319 301 556 300 339 488 1114 761 571 306 80 607 1017
286 511 147 476 416 528 419 328 1220 438 238 867 1657 790 479
179 530 446 328 348 773 697 520 341 604 420 394
a) From these data construct:
(i) A frequency distribution
(ii) A histogram
(iii) A relative frequency distribution
(iv) A frequency polygon
(v) A cumulative frequency distribution
(vi) A cumulative relative frequency distribution
(vii) A cumulative frequency curve/ogive
b) What percentage of the measurements are less than 500?
c) What percentages of the measurements are between 500 and 999 inclusive?
d) What percentage of the measurements are greater than 749?