Chapter 1 BFC34303 (Lyy)

CIVIL ENGINEERING
STATISTICS
BFC 34303
Chapter 1 :
Review on Descriptive Statistics
INTRODUCTION
These are Mathematics marks for 30
students who are taking Test 1
12 , 23, 24, 45, 34, 48, 56, 63, 23, 44,

69, 78, 84, 95, 98, 67, 73, 69, 58, 70,
40, 88, 59, 47, 37, 15, 17, 36, 63, 38
How to interpret these marks?

WHAT IS STATISTICS ?
~ Statistics is the science that deals
with collecting, classifying, presenting,
describing, analyzing and interpreting
data to enable us to draw conclusions
and make reasonable decisions
~ Can be divided into 2 categories
(a) Descriptive statistics
(b) Inferential statistics
Descriptive statistics
~ The activities of collecting, classifying, presenting
and describing quantitative data
~ Methods for organizing (frequency table), representing
(graphs) and summarizing data (central tendency and
variability).
Inferential statistics
~ The techniques and methods for interpreting the results
obtained from descriptive statistics of a sample in order to
draw conclusions about the population
WHAT IS POPULATION ?
~ Population is the entire (complete)
collection of data whose properties are
analyzed. It contains all the subjects of
interest.
~ Can be of any size, its items need not
be uniform but must share at least one
measurable feature.
WHAT IS SAMPLE?
~ A portion of the population selected
for study
~ Sample is any set of entities, cases,
subjects, items or experimental
units chosen from the population.
WHAT IS RANDOM SAMPLE?
~ A random sample is a sample
selected in such a way that each
element of the population has the
same chance of being selected
WHAT IS PARAMETER ?
~ Parameter is a numerical measurement
describing some characteristics of a
population
~ Eg: The population mean, μ, and the population variance, σ²
WHAT IS STATISTIC?
~ Statistic is a numerical measurement
describing some characteristics of a
sample
~ Eg: The sample mean, x̄, and the sample variance, S²
WHAT IS VARIABLE ?
~ Any measured characteristic or
attribute that differs for different
elements
~ For example, if the weight of 30

subjects were measured, then
weight would be a variable.
~ Can be classified as quantitative or

qualitative
WHAT IS QUANTITATIVE
VARIABLE ?
~ The variable being studied is
numeric
~ measured on an ordinal, interval,
or ratio scale
~ eg: If the time it took them to

respond were measured, then the
variable would be quantitative.
WHAT IS QUALITATIVE
VARIABLE ?
~ The variable being studied is non-numeric
~ Called "categorical variables"
~ Measured on a nominal scale

~ eg: gender, educational level, eye
colour
If five-year old students were asked to
name their favourite colour, then the
variable would be qualitative.
WHAT IS DATA ?
~ A set of data is a collection of
observation, measurements or
information obtained
~ Can be classified as quantitative or

qualitative
~ Can be presented in various ways

WHAT IS QUANTITATIVE
DATA ?
~ Quantitative data refers to
observations which can be
measured numerically or counted
~ Can be divided into discrete data
and continuous data
~ eg: length, time,
temperature and mass
WHAT IS QUALITATIVE DATA ?
~ Qualitative data are not in

numerical form but instead
assigned as attributes
~ eg: race, marital status, gender
Discrete data
~ is a set of data that can only take exact
and countable values
~ For example:
a) The number of students in a class.
b) The number of cars sold on any day
at a car dealership.
c) The number of persons in a family.
Continuous data
~ is data that can take any value over a
certain interval and can be measured
to a certain degree of accuracy
(correct to a certain number of decimal places)
~ For example:
a) The weight of students in a class.
b) The time taken to complete an
examination.
c) The amount of soda in a 150ml can.
d) The income of a family.
WHAT IS UNGROUPED DATA ?
~ (a) Raw data
(b) Not grouped into class intervals
(c) May be a frequency distribution whose values have
been arranged in order
~ Example:
(i) 3, 5, 6, 2, 5, 2, 4, 6, 5
(ii)
Number of books | 0 | 1 | 2 | 3
Frequency       | 3 | 7 | 4 | 2
WHAT IS GROUPED DATA ?
~ The data can be grouped into class
intervals before the frequency
distribution is constructed
~ The table constructed is called a
frequency distribution table
~ Example:
Height (cm) | 150-155 | 155-160 | 160-165 | 165-170
Frequency   | 2       | 8       | 6       | 5
WHAT IS FREQUENCY DISTRIBUTION?
• One method for simplifying and organizing data is to

construct a frequency distribution.
• A frequency distribution is an organized tabulation

showing exactly how many individuals are located in
each category on the scale of measurement.
Examples:
Determine whether the data obtained is discrete or continuous data.
(a) The number of books sold by a stationery shop.

(b) The time taken to travel from Kuala Terengganu to Batu Pahat
(c) The weight of FKAAS students
(d) The diameter of twenty spheres
REMARKS…
• All data are to be considered as sample

unless otherwise stated in the questions.
Example :
The number of male children in 20 families chosen at
random is as follows.
1 4 2 0 2 3 3 2 1 4 5 2 1 2 0 1 2 3 1 2
The above data is called raw data and it can be
summarized as a frequency distribution as shown:
Number of male children | 0 | 1 | 2 | 3 | 4 | 5
Frequency               | 2 | 5 | 7 | 3 | 2 | 1
The data shown in this frequency distribution table is known
as ungrouped data.
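The tallying step above can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` is an assumption introduced here, not part of the course notes.

```python
from collections import Counter

def frequency_table(data):
    """Return (value, frequency) pairs sorted by value (hypothetical helper)."""
    counts = Counter(data)          # tally how often each value occurs
    return sorted(counts.items())   # order the table by value

# Raw data: number of male children in 20 families
male_children = [1, 4, 2, 0, 2, 3, 3, 2, 1, 4, 5, 2, 1, 2, 0, 1, 2, 3, 1, 2]

table = frequency_table(male_children)
for value, freq in table:
    print(value, freq)
```

The frequencies should add up to the number of observations (20 here), which is a quick check that no value was missed.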
CENTRAL TENDENCY
• In general terms, central tendency
(mean, median, and mode) is a statistical
measure that determines a single value
that accurately describes the center of the
distribution and represents the entire
distribution of scores.
• The goal of central tendency is to identify

the single value that is the best
representative for the entire set of data.
MEASURES OF LOCATION
( CENTRAL TENDENCY)
MEAN
Given a set of data $x_1, x_2, x_3, \dots, x_n$, the mean, $\bar{x}$, is defined as
$$\bar{x} = \frac{\text{sum of all observations}}{\text{number of observations}} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$
For a set of data which can be represented in a frequency distribution
table, the mean is given by
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$
Example :
Find the mean of the following data
1 4 2 0 2 3 3 2 1 4 5 2 1 2 0 1 2 3 1 2
Solution:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{1 + 4 + 2 + \dots + 3 + 1 + 2}{20} = \frac{41}{20} = 2.05$$
OR
x | 0 | 1 | 2 | 3 | 4 | 5
f | 2 | 5 | 7 | 3 | 2 | 1
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} = \frac{2(0) + 5(1) + 7(2) + 3(3) + 2(4) + 1(5)}{20} = 2.05$$
Example :
To obtain grade A, Saleha must achieve an average
of at least 75 marks in four tests. If her average
mark for the first three tests is 70, calculate the
lowest mark she must get in her fourth test in order
to obtain grade A.
Solution:
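Let the marks for the four tests be w, x, y, z.
Mean for w, x, y : 70, so w + x + y = 3(70) = 210
Mean for w, x, y, z must be at least 75:
$$\frac{3(70) + z}{4} \ge 75$$
$$\frac{210 + z}{4} \ge 75$$
$$210 + z \ge 300$$
$$z \ge 90$$
So, the lowest mark she must get in her fourth test in order to
obtain grade A is 90.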
MEDIAN
The median is the middle value of a set of data that is arranged in
order of magnitude.
Let $x_{(k)}$ be the $k$th observation in a set of data which has been
arranged in ascending or descending order.
For example, consider the following set of numbers
9 2 7 10 5 16
After arrangement, it becomes
2 5 7 9 10 16
The middle lies between $x_{(3)} = 7$ and $x_{(4)} = 9$,
so the median is $\frac{1}{2}(7 + 9) = 8$.
The median of a set of data $x_1, x_2, \dots, x_n$ is denoted
by $x_m$ and may be calculated as:
$$x_m = \begin{cases} x_{\left(\frac{n+1}{2}\right)}, & \text{if } n \text{ is odd} \\ \frac{1}{2}\left[ x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)} \right], & \text{if } n \text{ is even} \end{cases}$$
Example :
Find the median for the following sets of data
a) 21, 24, 17, 28, 36, 20, 32
b) 3.56, 2.71, 5.48, 8.61, 4.35, 6.22
Solution:
a) The data arranged in ascending order :
17 , 20 , 21 , 24 , 28 , 32 , 36
Since n = 7 , which is odd, the median is
$$x_m = x_{\left(\frac{n+1}{2}\right)} = x_{(4)} = 24$$
b) The data arranged in ascending order :
2.71 , 3.56 , 4.35 , 5.48 , 6.22 , 8.61
Since n = 6 , which is even, the median is
$$x_m = \frac{1}{2}\left[ x_{\left(\frac{6}{2}\right)} + x_{\left(\frac{6}{2}+1\right)} \right] = \frac{1}{2}\left[ x_{(3)} + x_{(4)} \right] = \frac{1}{2}(4.35 + 5.48) = 4.915$$
MODE
• The mode of a set of data is the value that
occurs most frequently.
• The mode may not be unique, or there may be
no mode at all.
Example :
Find the mode for the following set of data
a) 2, 3, 3, 4, 5, 28, 5, 5
b) 2, 3, 5, 8, 10
c) 0.2, 0.4, 0.4, 0.4, 0.5, 0.7, 0.7, 0.7, 0.5
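The three data sets above can be checked with a short Python sketch. The function name `modes` is an assumption, and so is the convention it uses for "no mode": when every distinct value occurs equally often (and there is more than one distinct value), it returns an empty list.

```python
from collections import Counter

def modes(data):
    """Return all most-frequent values, or [] when there is no mode.

    Assumed convention: if every distinct value occurs equally often
    (and there are at least two distinct values), there is no mode.
    """
    counts = Counter(data)
    highest = max(counts.values())
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

set_a = [2, 3, 3, 4, 5, 28, 5, 5]
set_b = [2, 3, 5, 8, 10]
set_c = [0.2, 0.4, 0.4, 0.4, 0.5, 0.7, 0.7, 0.7, 0.5]

for name, data in (("a", set_a), ("b", set_b), ("c", set_c)):
    print(name, modes(data))
```

Set (a) has a single mode, set (b) has none, and set (c) has two values tied for the highest frequency, which illustrates that the mode need not be unique.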

QUARTILES
Quartiles divide a set of data which are arranged in
ascending order into 4 equal parts.
To find quartile ( $Q_k$ ):
Let $r = \frac{k}{4} n$
where : n = number of observations
k = quartile number for $Q_k$
(i) If r is an integer:
$$Q_k = \frac{1}{2}\left[ r\text{th observation} + (r+1)\text{th observation} \right]$$
(ii) If r is not an integer, then round up to the next
integer; $Q_k$ is the observation at that position.
$Q_2$ is also called the median.
Interquartile Range = $Q_3 - Q_1$
PERCENTILES
Percentiles divide a set of data which are arranged in
ascending order into 100 equal parts.
To find percentile ( $P_k$ ):
Let $r = \frac{k}{100} n$
where : n = number of observations
k = percentile number for $P_k$
(i) If r is an integer:
$$P_k = \frac{1}{2}\left[ r\text{th observation} + (r+1)\text{th observation} \right]$$
(ii) If r is not an integer, then round up to the next
integer; $P_k$ is the observation at that position.
Notes: $Q_1 = P_{25}$ , Median = $Q_2 = P_{50}$ , $Q_3 = P_{75}$

Example :
Find the median, first quartile ( $Q_1$ ), third
quartile ( $Q_3$ ) and 40th percentile ( $P_{40}$ ) for the
following sets of data
a) 21, 24, 17, 28, 36, 20, 32
b) 3.5, 2.7, 5.4, 8.6, 4.3, 6.2, 9.9, 7.6
Solution:
a) The data arranged in ascending order :
17 , 20 , 21 , 24 , 28 , 32 , 36
Median = $Q_2$ :
$r = \frac{k}{4} n = \frac{2}{4}(7) = 3.5$ ( not an integer )
∴ Median = $Q_2$ = 4th observation = 24
First quartile = $Q_1$ :
$r = \frac{k}{4} n = \frac{1}{4}(7) = 1.75$ ( not an integer )
∴ $Q_1$ = 2nd observation = 20
Third quartile = $Q_3$ :
$r = \frac{k}{4} n = \frac{3}{4}(7) = 5.25$ ( not an integer )
∴ $Q_3$ = 6th observation = 32
40th percentile = $P_{40}$ :
$r = \frac{k}{100} n = \frac{40}{100}(7) = 2.8$ ( not an integer )
∴ $P_{40}$ = 3rd observation = 21
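The position rule used in this solution (r = kn/4 or kn/100, averaging two observations when r is an integer and rounding up otherwise) can be sketched in Python as follows. The function name `position_quantile` and its `parts` argument are assumptions made for this illustration; integer arithmetic is used to test whether r is whole so that floating-point rounding does not interfere. Note that library routines such as `statistics.quantiles` use different interpolation rules and may give different answers.

```python
def position_quantile(data, k, parts):
    """k-th quantile when the data is split into `parts` equal parts.

    parts=4 gives quartiles, parts=100 gives percentiles.
    Follows the slide rule: r = k*n/parts; if r is an integer, average
    the r-th and (r+1)-th observations, otherwise round r up.
    """
    if not 0 < k < parts:
        raise ValueError("k must satisfy 0 < k < parts")
    ordered = sorted(data)
    n = len(ordered)
    if (k * n) % parts == 0:            # r is an integer
        r = k * n // parts
        return (ordered[r - 1] + ordered[r]) / 2
    r = k * n // parts + 1              # round r up to the next integer
    return ordered[r - 1]               # r-th observation (1-based)

set_a = [21, 24, 17, 28, 36, 20, 32]
set_b = [3.5, 2.7, 5.4, 8.6, 4.3, 6.2, 9.9, 7.6]

for name, data in (("a", set_a), ("b", set_b)):
    print(name,
          position_quantile(data, 1, 4),     # Q1
          position_quantile(data, 2, 4),     # median
          position_quantile(data, 3, 4),     # Q3
          position_quantile(data, 40, 100))  # P40
```

For set (b), n = 8, so r is an integer for all three quartiles and each one is the average of two neighbouring observations.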
Example :
The following table shows the marks obtained
by 30 students in a Mathematics quiz, where
the maximum mark is 10.
Marks           | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
No. of students | 2 | 4 | 3 | 6 | 4 | 5 | 4 | 1 | 1
Find the mean, mode, median, first and
third quartiles, interquartile range and
the 60th percentile.
Example :
Data 1: 6,7,8,6,9,6 mean = 7

Data 2: 5,7,2,6,13,9 mean = 7
• Most of the numbers in data 1 are around the mean value.

• Data 2 is more spread away from the mean.
• The difference in the spread can be determined by the measure of
dispersion
MEASURES OF DISPERSION
Variability
• The goal for variability is to obtain a measure
of how spread out the scores are in a
distribution.
• A measure of variability usually accompanies a
measure of central tendency as basic
descriptive statistics for a set of scores.
MEASURES OF DISPERSION
Three common measures of dispersion are:

• Range
• Variance
• Standard deviation
Range = Largest value – Smallest value
REMARK
• Range is not a good measure of dispersion because it is influenced by the
extreme values and the calculation does not cover all observations.
• Variance and standard deviation are the most useful and widely used
measures of dispersion. Although they are influenced by the extreme
values, the calculations cover all the observations.
REMARK
• Standard deviation measures how spread out the values in a data set are.
• If the data points are all close to the mean, then the standard deviation is
close to zero.
• If many data points are far from the mean, then the standard deviation is
far from zero.
• If all the data values are equal, then the standard deviation is zero.
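VARIANCE
$$\bar{x} = \frac{\sum x_i}{n} \quad \text{or} \quad \bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$
$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}, \quad \text{for } i = 1, 2, \dots, n$$
Commonly used formulae (with $n = \sum f_i$ for a frequency table):
$$S^2 = \frac{\sum x_i^2 - n\bar{x}^2}{n - 1} \qquad S^2 = \frac{\sum f_i x_i^2 - n\bar{x}^2}{n - 1}$$
$$S^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1} \qquad S^2 = \frac{\sum f_i x_i^2 - \frac{\left(\sum f_i x_i\right)^2}{n}}{n - 1}$$
STANDARD DEVIATION
$$S = \sqrt{\text{VARIANCE}} = \sqrt{S^2}$$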
Example :
Calculate the variance and standard deviation for the
following sets of sample data. Hence, determine which data
set is more dispersed about the mean.
Data 1 : 16, 10, 9, 2, 5, 2, 7
Data 2 : 10, 32, 8, 12, 14, 36, 20, 8, 40, 4, 32, 1
For Data 1:
x  | x²
2  | 4
2  | 4
5  | 25
7  | 49
9  | 81
10 | 100
16 | 256
$\sum_{i=1}^{n} x_i = 51$ , $\sum_{i=1}^{n} x_i^2 = 519$ , n = 7
$$S^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{519 - \frac{(51)^2}{7}}{6} = 24.5714$$
$$S = \sqrt{24.5714} = 4.957$$
For Data 2:
$\sum_{i=1}^{n} x_i = 217$ , $\sum_{i=1}^{n} x_i^2 = 5929$ , n = 12
$$S^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{5929 - \frac{(217)^2}{12}}{11} = 182.265$$
$$S = \sqrt{182.265} = 13.5$$
Hence, data 2 is more dispersed than data 1.
STEM-AND-LEAF DIAGRAMS
Used to display every data value in a dataset.
The digit(s) in the greatest place value(s) of the data
values are the stems.
The digits in the next greatest place values are
the leaves.
To construct a stem-and-leaf diagram:
1. Place the stems in order vertically from smallest to
largest.
2. Place the leaves in order in each row from smallest
to largest.
3. Create a key for the stem-and-leaf diagram so that
people know how to interpret the diagram.
Example :
Shape of distribution
A perfectly symmetric curve is one in which both sides of
the distribution would exactly match the other if the figure
were folded over its central point.
An example is shown below:
A symmetric, bell-shaped distribution, a relatively common

occurrence is called a normal distribution.
A distribution is said to be skewed to the right, or
positively skewed, when most of the data are
concentrated on the left of the distribution. The right tail
clearly extends farther from the distribution's centre than
the left tail, as shown below:
A distribution is said to be skewed to the left, or
negatively skewed, if most of the data are concentrated
on the right of the distribution. The left tail clearly extends
farther from the distribution's centre than the right tail, as
shown below:
Example:
If the stem and leaf plot is turned on its side, it will look like
the following:
The distribution shows that most data are clustered at the right.
The left tail extends farther from the data centre than the right
tail. Therefore, the distribution is skewed to the left or
negatively skewed.
Example :
Marks of a recent Mathematics test are as given below:
73, 42, 67, 78, 99, 84, 91, 82, 86, 94
Based on the marks given:
(a) Construct a stem-and-leaf diagram.
(b) What is the highest and lowest mark?
(c) Interpret the distribution.
Solution:
(a) Mathematics Test Mark
Stem | Leaf
4 | 2
5 |
6 | 7
7 | 3 8
8 | 2 4 6
9 | 1 4 9
Key: 9 | 9 means 99 marks
(b) Highest mark = 99, Lowest mark = 42
(c) Negatively skewed
Example :
Given the heights of 20 people are as follows:

154, 143, 148, 139, 143, 147, 153,
162, 136, 147, 144, 143, 139, 142,
143, 156, 151, 164, 157, 149.
Construct a stem-and-leaf diagram and state the shortest and
tallest height. Interpret the distribution.
Solution:
Stem | Leaf
13 | 6 9 9
14 | 2 3 3 3 3 4 7 7 8 9
15 | 1 3 4 6 7
16 | 2 4
Key: 13 | 6 means 136 cm
Shortest height = 136 cm
Tallest height = 164 cm
The distribution is positively skewed.
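The construction steps for a stem-and-leaf diagram can be sketched in Python. This is a minimal illustration that assumes non-negative whole-number data whose last digit is the leaf; the function name `stem_and_leaf` is an assumption introduced here.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Map each stem (all digits but the last) to its sorted leaves.

    Assumes non-negative integers. Stems with no data are kept as
    empty rows so that gaps in the distribution stay visible.
    """
    groups = defaultdict(list)
    for value in sorted(data):            # sorting first keeps leaves in order
        stem, leaf = divmod(value, 10)
        groups[stem].append(leaf)
    lowest, highest = min(groups), max(groups)
    return {stem: groups.get(stem, []) for stem in range(lowest, highest + 1)}

marks = [73, 42, 67, 78, 99, 84, 91, 82, 86, 94]
heights = [154, 143, 148, 139, 143, 147, 153, 162, 136, 147,
           144, 143, 139, 142, 143, 156, 151, 164, 157, 149]

for stem, leaves in stem_and_leaf(heights).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Data with decimals, such as the exercise that follows, would need a different split (for example, the whole-number part as the stem and the first decimal digit as the leaf).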
Exercise:
The lengths of a straight line, in mm, as estimated by 22
students are given below:
10.5, 8.5, 8.6, 8.1, 7.3, 4.4, 6.6, 6.6, 7.9, 8.7, 8.3,
6.0, 8.7, 7.5, 7.9, 6.0, 9.1, 7.2, 8.4, 8.1, 8.6, 9.3
Construct a stem-and-leaf diagram based on the given
data. Interpret the distribution.
BOX-AND-WHISKER PLOTS
[Figure: a horizontal and a vertical box-and-whisker plot, each marking min, Q1, Q2, Q3 and max along an axis from 0 to 70]
To construct a box-and-whisker plot:
STEP 1: Determine the five number summary.

STEP 2: Draw a horizontal axis on which the number
obtained in step 1 can be located. Above this
axis, mark all the five number summary with
vertical lines.
STEP 3: Connect the quartiles to each other to
make a box, and then connect the box
to the maximum and minimum lines.
STEP 4: Calculate the values of the upper and lower
inner fences to determine whether the data
contain any outliers.
Upper inner fence = Q3 + 1.5 (Q3 – Q1)
Lower inner fence = Q1 - 1.5 (Q3 – Q1)
[Figure: box-and-whisker plot on a 10 to 100 axis with min, Q1, Q2, Q3 and max lying between the lower inner fence and the upper inner fence]
The data lie within the upper and lower inner fences, so the data have no outliers.
[Figure: box-and-whisker plot on a 10 to 100 axis with one observation marked as an outlier]
An observation that lies outside the fences is known as an outlier.

SHAPE OF DATA DISTRIBUTION
(SYMMETRY AND SKEWNESS)
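Symmetrical distribution: the whiskers are
the same length and the median Q2 is in
the centre of the box.
[Figure: box-and-whisker plot of a symmetrical distribution, marking min, Q1, Q2, Q3 and max]
Positively skewed distribution: the left
whisker is shorter than the right whisker
and the median is nearer to Q1.
[Figure: box-and-whisker plot of a positively skewed distribution, marking min, Q1, Q2, Q3 and max]
Negatively skewed distribution: the left
whisker is longer than the right
whisker and the median is nearer to Q3.
[Figure: box-and-whisker plot of a negatively skewed distribution, marking min, Q1, Q2, Q3 and max]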
Example :
Data :
40, 32, 61, 52, 65, 68, 41, 61, 70, 66, 57, 55, 45,
51, 62, 69, 31, 50, 72, 66, 41, 54, 65, 79, 66
(a) Display the data in a stem-and-leaf diagram.
(b) Find the first, second and third quartiles, and the upper and lower inner
fences.
(c) Construct a box-and-whisker plot for the above data.
Solution :
(a)
Stem | Leaf
3 | 1 2
4 | 0 1 1 5
5 | 0 1 2 4 5 7
6 | 1 1 2 5 5 6 6 6 8 9
7 | 0 2 9
Key: 5 | 4 means 54
(b) Number of observations, n = 25, min = 31 , max = 79
$r = \frac{1}{4}(25) = 6.25$ , Q1 = the 7th observation = 50
$r = \frac{2}{4}(25) = 12.5$ , Q2 = the 13th observation = 61
$r = \frac{3}{4}(25) = 18.75$ , Q3 = the 19th observation = 66
Upper inner fence = Q3 + 1.5 (Q3 - Q1)
= 66 + 1.5(66 - 50)
= 90
Lower inner fence = Q1 - 1.5 (Q3 - Q1)
= 50 - 1.5(66 - 50)
= 26
(c)
[Figure: box-and-whisker plot on a 10 to 100 axis with min = 31, Q1 = 50, Q2 = 61, Q3 = 66, max = 79, lower inner fence = 26 and upper inner fence = 90]
No outlier. The data is negatively skewed (skewed to the left).

Example :
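Stem | Leaf
5 | 1 9
6 | 2 3 3 4 4 4 4 4 5
6 | 8 8 8 9 9 9
7 | 0 2 2 3 6 7
Key: 5 | 9 means 59°F
From the given stem-and-leaf diagram, construct a box-and-whisker
plot. Determine the outliers of the data.
Number of observations, n = 23, min = 51 , max = 77
$r = \frac{1}{4}(23) = 5.75$ , Q1 = the 6th observation = 64°F
$r = \frac{2}{4}(23) = 11.5$ , Q2 = the 12th observation = 68°F
$r = \frac{3}{4}(23) = 17.25$ , Q3 = the 18th observation = 70°F
Upper inner fence = Q3 + 1.5 (Q3 - Q1)
= 70 + 1.5(70 - 64)
= 79°F
Lower inner fence = Q1 - 1.5 (Q3 - Q1)
= 64 - 1.5(70 - 64)
= 55°F
[Figure: box-and-whisker plot on a 50 to 80 axis with min = 51, Q1 = 64, Q2 = 68, Q3 = 70, max = 77, lower inner fence = 55 and upper inner fence = 79; the value 51 is marked as an outlier]
From the boxplot, we can see that the minimum value
51°F is outside the fence and this value is the outlier.
Therefore the whiskers are drawn from 59°F to 77°F.
[Figure: box-and-whisker plot on a 50 to 80 axis with whiskers from 59 to 77, Q1 = 64, Q2 = 68, Q3 = 70, and the outlier 51 plotted separately below the lower inner fence of 55]
The data is negatively skewed (skewed to the left).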
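The quartile, fence and outlier calculations for a box-and-whisker plot can be sketched in Python. The helper names `quartile` and `box_plot_summary` are assumptions introduced for this illustration; the quartiles follow the position rule from these notes (average two observations when r is an integer, otherwise round r up), which may differ from the defaults of plotting libraries.

```python
def quartile(ordered, k):
    """k-th quartile of already-sorted data using the r = k*n/4 rule."""
    n = len(ordered)
    if (k * n) % 4 == 0:                 # r is an integer: average two values
        r = k * n // 4
        return (ordered[r - 1] + ordered[r]) / 2
    return ordered[k * n // 4]           # r rounded up, as a 0-based index

def box_plot_summary(data):
    """Quartiles, inner fences, outliers and whisker ends for a box plot."""
    ordered = sorted(data)
    q1, q2, q3 = (quartile(ordered, k) for k in (1, 2, 3))
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in ordered if lower_fence <= x <= upper_fence]
    outliers = [x for x in ordered if x < lower_fence or x > upper_fence]
    return {
        "q1": q1, "q2": q2, "q3": q3,
        "lower_fence": lower_fence, "upper_fence": upper_fence,
        "whiskers": (inside[0], inside[-1]),  # most extreme non-outliers
        "outliers": outliers,
    }

# Temperatures (°F) read off the stem-and-leaf diagram above
temps = [51, 59, 62, 63, 63, 64, 64, 64, 64, 64, 65, 68,
         68, 68, 69, 69, 69, 70, 72, 72, 73, 76, 77]

print(box_plot_summary(temps))
```

The whisker ends are taken as the smallest and largest observations that still lie inside the fences, which is why the lower whisker stops at 59 rather than at the outlier 51.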
GROUPED DATA
MEAN of a frequency distribution
The mean of a set of grouped data given in
the form of a frequency distribution is
defined as
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$
$\sum_{i=1}^{k} f_i$ = total frequency
$x_i$ = class mark
Example :
Find the mean for the following data
Class | Frequency, fi
0 ≤ x < 10 | 2
10 ≤ x < 20 | 17
20 ≤ x < 30 | 26
30 ≤ x < 40 | 10
40 ≤ x < 50 | 5
SOLUTION:
The class mark is the midpoint of the class, e.g. for the first class $x_1 = \frac{0 + 10}{2} = 5$.
Class | Class mark, xi | Frequency, fi | fixi
0 ≤ x < 10 | 5 | 2 | 10
10 ≤ x < 20 | 15 | 17 | 255
20 ≤ x < 30 | 25 | 26 | 650
30 ≤ x < 40 | 35 | 10 | 350
40 ≤ x < 50 | 45 | 5 | 225
$\sum f_i = 60$ , $\sum f_i x_i = 1490$
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} = \frac{1490}{60} = 24.83$$
MODE of a frequency distribution
$$\text{mode} = L_m + \left( \frac{d_1}{d_1 + d_2} \right) c$$
$L_m$ = lower boundary of the class containing the
mode
$d_1$ = the difference between the frequency of the mode
class and the frequency of the class
immediately before it
$d_2$ = the difference between the frequency of the mode
class and the frequency of the class
immediately after it
$c$ = size of the mode class
Example :
Find the mode of the frequency distribution given below:
Class | Frequency
15 - 19 | 1
20 - 24 | 4
25 - 29 | 22
30 - 34 | 35
35 - 39 | 20
40 - 44 | 8
SOLUTION:
The mode class is 30 - 34 and the
corresponding frequency is 35.
$L_m = 29.5$ , $d_1 = 35 - 22 = 13$ , $d_2 = 35 - 20 = 15$ , $c = 5$
$$\text{mode} = L_m + \left( \frac{d_1}{d_1 + d_2} \right) c = 29.5 + \left( \frac{13}{13 + 15} \right) 5 = 31.8$$
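The grouped-mode formula can be sketched in Python. The function name `grouped_mode` is an assumption, as are two conventions it adopts: when the mode class is the first or last class, the missing neighbour is treated as having frequency 0, and when several classes tie for the highest frequency, the first is used.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Estimate the mode: L_m + d1 / (d1 + d2) * c.

    Assumptions: equal class widths; a missing neighbouring class counts
    as frequency 0; the first class with the highest frequency is used.
    The formula is undefined when d1 + d2 = 0.
    """
    m = max(range(len(freqs)), key=freqs.__getitem__)  # index of mode class
    before = freqs[m - 1] if m > 0 else 0
    after = freqs[m + 1] if m + 1 < len(freqs) else 0
    d1 = freqs[m] - before
    d2 = freqs[m] - after
    return lower_boundaries[m] + d1 / (d1 + d2) * width

# Classes 15-19, 20-24, ..., 40-44 have lower boundaries 14.5, 19.5, ...
lower_boundaries = [14.5, 19.5, 24.5, 29.5, 34.5, 39.5]
freqs = [1, 4, 22, 35, 20, 8]

print(round(grouped_mode(lower_boundaries, freqs, width=5), 1))
```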
Mode from histogram
• Draw a line from the left upper corner of the highest vertical bar
to the left upper corner of the next vertical bar.
• Draw a line from the right upper corner of the highest vertical bar
to the right upper corner of the vertical bar before it.
• The mode is estimated from the intersection point of both lines,
read off the class boundaries axis.
• The histogram should be drawn on graph paper in order to obtain an
accurate answer.

Example :
For the data in example 2, find the mode
using the histogram
SOLUTION:
[Figure: histogram of frequency (0 to 35) against class boundaries 14.5, 19.5, 24.5, 29.5, 34.5, 39.5, 44.5]
Mode = 31.8
MEDIAN of a frequency distribution
NOTE :
The median of a frequency distribution can't be
found by counting as for ungrouped data
because the data has been grouped in
the form of classes. So, we will get an
estimated value of the median.
MEDIAN
$$m = L_m + \left( \frac{\frac{n}{2} - F_L}{f_m} \right) c$$
$L_m$ = lower boundary of the median class
$n$ = total frequency
$F_L$ = cumulative frequency of the class before the median class
$f_m$ = frequency of the median class
$c$ = size of the median class
Example :
Calculate the median for the following data
Class | Frequency, f
0 ≤ x < 5 | 7
5 ≤ x < 10 | 27
10 ≤ x < 15 | 35
15 ≤ x < 20 | 54
20 ≤ x < 25 | 63
25 ≤ x < 30 | 43
30 ≤ x < 35 | 25
35 ≤ x < 40 | 17
40 ≤ x < 45 | 9
45 ≤ x < 50 | 4
SOLUTION:
Class | Frequency, f | Cumulative frequency, F
0 ≤ x < 5 | 7 | 7
5 ≤ x < 10 | 27 | 34
10 ≤ x < 15 | 35 | 69
15 ≤ x < 20 | 54 | 123
20 ≤ x < 25 | 63 | 186
25 ≤ x < 30 | 43 | 229
30 ≤ x < 35 | 25 | 254
35 ≤ x < 40 | 17 | 271
40 ≤ x < 45 | 9 | 280
45 ≤ x < 50 | 4 | 284
$\sum f = 284$
Since $\frac{n}{2} = \frac{284}{2} = 142$, the median class is 20 ≤ x < 25 with the
corresponding frequency 63.
$L_m = 20$ , $n = \sum f = 284$ , $F_L = 123$ , $f_m = 63$ , $c = 5$
Hence, the median is
$$m = L_m + \left( \frac{\frac{n}{2} - F_L}{f_m} \right) c = 20 + \left( \frac{\frac{1}{2}(284) - 123}{63} \right) 5 = 21.51$$
Quartile
Quartiles divide a set of data which are
arranged in ascending order into 4 equal
parts
Percentile
Percentiles divide a set of data which are
arranged in ascending order into 100 equal
parts
Decile
Deciles divide a set of data which are
arranged in ascending order into 10 equal
parts
For grouped data:
$$Q_k = L_k + \left( \frac{\frac{k}{4} n - F_L}{f_k} \right) c_k, \quad k = 1, 2, 3$$
$$P_k = L_k + \left( \frac{\frac{k}{100} n - F_L}{f_k} \right) c_k, \quad k = 1, 2, 3, \dots, 99$$
$$D_k = L_k + \left( \frac{\frac{k}{10} n - F_L}{f_k} \right) c_k, \quad k = 1, 2, 3, \dots, 9$$
$L_k$ = lower boundary of the class where $Q_k, P_k, D_k$ lies
$n$ = total number of observations
$F_L$ = cumulative frequency before the class where $Q_k, P_k, D_k$ lies
$f_k$ = frequency of the class where $Q_k, P_k, D_k$ lies
$c_k$ = width of the class where $Q_k, P_k, D_k$ lies
Example :
Height (cm) | 3-5 | 6-8 | 9-11 | 12-14 | 15-17 | 18-20
Frequency   | 1   | 2   | 11   | 10    | 5     | 1
From the above data, calculate :
(a) the first and third quartiles and the interquartile range
(b) the 10th and 90th percentiles
(c) the 5th decile, $D_5$
Solution:
Class limits | Class boundaries | Frequency | Cumulative frequency
3-5 | 2.5-5.5 | 1 | 1
6-8 | 5.5-8.5 | 2 | 3
9-11 | 8.5-11.5 | 11 | 14
12-14 | 11.5-14.5 | 10 | 24
15-17 | 14.5-17.5 | 5 | 29
18-20 | 17.5-20.5 | 1 | 30
(a) First and third quartiles
$\frac{1}{4}(30) = 7.5$, so $Q_1$ is in the third class with boundaries (8.5 - 11.5).
Thus, $L_k = 8.5$, $f_k = 11$, $F_L = 3$, $c = 3$
$$Q_1 = P_{25} = 8.5 + \left( \frac{7.5 - 3}{11} \right) 3 = 9.73$$
$\frac{3}{4}(30) = 22.5$, so $Q_3$ is in the fourth class with boundaries (11.5 - 14.5).
Thus, $L_k = 11.5$, $f_k = 10$, $F_L = 14$, $c = 3$
$$Q_3 = P_{75} = 11.5 + \left( \frac{22.5 - 14}{10} \right) 3 = 14.05$$
$$Q_3 - Q_1 = 14.05 - 9.73 = 4.32$$
(b)
$$P_{10} = 5.5 + \left( \frac{3 - 1}{2} \right) 3 = 8.5$$
$$P_{90} = 14.5 + \left( \frac{27 - 24}{5} \right) 3 = 16.3$$
(c)
$$D_5 = P_{50} = \text{Median} = 11.5 + \left( \frac{15 - 14}{10} \right) 3 = 11.8$$
RANGE
Range = upper boundary of the last class
- lower boundary of the first class
INTERQUARTILE RANGE
• Defined as the difference between the
third quartile and the first quartile
Interquartile range = Q3 - Q1
$$\text{Variance, } S^2 = \frac{\sum f x^2 - \frac{\left(\sum f x\right)^2}{\sum f}}{\sum f - 1}$$
$$\text{Standard deviation, } S = \sqrt{\text{Variance}} = \sqrt{S^2}$$
Example :
Find the range, variance and standard deviation
Class intervals | Frequency, f | Class mark, x | fx | fx²
1-3 | 5 | 2 | 10 | 20
4-6 | 3 | 5 | 15 | 75
7-9 | 2 | 8 | 16 | 128
10-12 | 1 | 11 | 11 | 121
13-15 | 6 | 14 | 84 | 1176
16-18 | 4 | 17 | 68 | 1156
$\sum f = 21$ , $\sum fx = 204$ , $\sum fx^2 = 2676$
Solution:
Range = upper boundary of the last class
- lower boundary of the first class
= 18.5 - 0.5 = 18
$$S^2 = \frac{\sum f x^2 - \frac{\left(\sum f x\right)^2}{\sum f}}{\sum f - 1} = \frac{2676 - \frac{(204)^2}{21}}{20} = 34.71$$
$$S = \sqrt{34.71} = 5.892$$
Example :
Find the mean, variance and standard deviation.
Marks | Number of students
0 ≤ x < 20 | 9
20 ≤ x < 40 | 29
40 ≤ x < 60 | 42
60 ≤ x < 80 | 26
80 ≤ x < 100 | 14
