61 68 65 67 68 71 69 63 74 64
66 65 62 67 60 73 69 70 70 71
A B C A C B B A B C
B A B B B A C D D B
Arrays
An arrangement of numerical raw data in ascending order or descending order of magnitude
60 61 62 63 64 65 65 66 67 67
68 68 69 69 70 70 71 71 73 74
Ungrouped data
Contains information on each member of a sample or population individually
Examples: Data presented in Table 1 and Table 2
Grouped data
Data presented in classes or intervals.
Example:
UCCM2623 Scores       10-12   13-15   16-18   19-21
Number of students      4      12      20      14
Example 1.1. A sample of 25 students who were planning to go to college was taken. The course each
student intended to choose is listed below:
Engineering Infotech Engineering Business Business
Business Business Other Biotech Biotech
Biotech Biotech Infotech Biotech Biotech
Other Business Engineering Business Other
Engineering Biotech Biotech Other Infotech
Construct a frequency distribution table for these data.
UECM2623 Numerical Methods and Statistics/UECM1693 Mathematics for Physics II
Solution.
Course        Tally        Frequency
Biotech       ||||| |||        8
Business      ||||| |          6
Engineering   ||||             4
Infotech      |||              3
Others        ||||             4
Total:                        25
Example 1.2. Determine the relative frequency and percentage distributions for the data in Example 1.1.
Solution.
Course        Relative Frequency   Percentage
Biotech             0.32              32%
Business            0.24              24%
Engineering         0.16              16%
Infotech            0.12              12%
Others              0.16              16%
Total:              1                100%
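The counting in Examples 1.1 and 1.2 can be checked with a short Python sketch. The variable names are illustrative only, and the label "Other" is kept exactly as it appears in the raw data.

```python
from collections import Counter

# Intended courses of the 25 students in Example 1.1.
courses = [
    "Engineering", "Infotech", "Engineering", "Business", "Business",
    "Business", "Business", "Other", "Biotech", "Biotech",
    "Biotech", "Biotech", "Infotech", "Biotech", "Biotech",
    "Other", "Business", "Engineering", "Business", "Other",
    "Engineering", "Biotech", "Biotech", "Other", "Infotech",
]

frequency = Counter(courses)   # frequency distribution
n = len(courses)               # total number of observations

# Relative frequency = frequency / n; percentage = relative frequency x 100.
relative = {course: f / n for course, f in frequency.items()}
percentage = {course: 100 * f / n for course, f in frequency.items()}

for course in sorted(frequency):
    print(f"{course:<12} {frequency[course]:>3} "
          f"{relative[course]:>6.2f} {percentage[course]:>5.0f}%")
```

The relative frequencies should sum to 1 and the percentages to 100, which is a quick check on any hand-built table.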
Example 1.3. Construct a bar chart for the data in Example 1.1.
Solution.
[Bar chart: Frequency (vertical axis, scale 0 to 8) against Course (horizontal axis: Biotech, Business, Engineering, Infotech, Others).]
Note:
Generally, the grouping process destroys some of the original information
The classes are non-overlapping i.e. each value belongs to one and only one class
Class
An interval that includes all the values that fall between two numbers, the lower and upper limits
Class limits
Endpoints of each interval
Class Boundary
Class boundary is the dividing line between two classes. It is given by the midpoint of the upper limit of
one class and the lower limit of the next higher class
1. Determine the number of classes. This usually varies from 5 to 20, depending mainly on the number of
observations in the data set.
Find $2^k$, where $k$ is the smallest number such that $2^k$ is greater than the number of observations
($n$); $k$ is then the suggested number of classes.
3. Determine the lower limit of the first class or the starting point.
Any convenient number that is equal to or less than the smallest value in the data set can be used
as the lower limit of the first class.
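The $2^k$ rule in step 1 can be sketched in Python as follows. The function name `number_of_classes` is hypothetical, and the result is only a starting suggestion that may be adjusted for convenience.

```python
def number_of_classes(n):
    """Smallest k such that 2**k is strictly greater than n observations."""
    k = 0
    while 2 ** k <= n:
        k += 1
    return k

# For the 50 birth-weights of Example 1.4: 2**5 = 32 is not greater than 50,
# while 2**6 = 64 is, so the rule suggests 6 classes.
print(number_of_classes(50))
```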
Example 1.4. A sample of birth-weights (oz) from 50 consecutive deliveries is given below. Construct a
frequency distribution table.
Solution.
Example 1.5. Calculate the relative frequency and percentage distributions for the data in Example
1.4.
Solution.
1.3.3 Histogram
Three types of histogram
1. Frequency histogram
2. Relative frequency histogram
3. Percentage histogram
1.3.4 Polygon
A polygon is a line graph formed by joining the midpoints of the tops of successive bars in a histogram.
To close the polygon, we mark two more classes (with zero frequencies), one at each end, and mark their midpoints.
[Figures: two plots of the birth-weight data, one with vertical scale 0.05 to 0.30 and one with vertical scale 5 to 30, each against Birth-weight (oz) on the horizontal axis, marked at the class boundaries 79.5, 89.5, 99.5, 109.5, 119.5, 129.5, 139.5, 149.5.]
Example 1.7. The frequency distribution gives the weight of 35 objects, measured to the nearest kg.
Draw a histogram to illustrate the data.
Weight (kg)   6-8   9-11   12-17   18-20   21-29
Frequency      4     6      10       3      12
Solution.
$$\text{adjusted frequency} = \frac{\text{standard class width}}{\text{class width}} \times \text{frequency}$$
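As a rough check of the adjusted-frequency formula on the data of Example 1.7, the sketch below assumes a standard class width of 3 (the most common width in the table); the notes do not state which width is taken as standard, so this choice is an assumption.

```python
# Class limits (kg) and frequencies from Example 1.7.
classes = [(6, 8), (9, 11), (12, 17), (18, 20), (21, 29)]
frequencies = [4, 6, 10, 3, 12]

STANDARD_WIDTH = 3  # assumption: the most common class width is the standard

def class_width(lower, upper):
    # Weights are measured to the nearest kg, so the class 6-8 runs from the
    # boundary 5.5 to the boundary 8.5 and has width 8 - 6 + 1 = 3.
    return upper - lower + 1

# adjusted frequency = (standard class width / class width) x frequency
adjusted = [
    f * STANDARD_WIDTH / class_width(lo, hi)
    for (lo, hi), f in zip(classes, frequencies)
]
print(adjusted)
```

With this choice, the wider classes 12-17 and 21-29 are scaled down so that bar areas, not bar heights, stay proportional to the frequencies.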
[Histogram for Example 1.7: Adjusted Frequency on the vertical axis, scale 1 to 6.]
Example 1.8. Referring to the data in Example 1.4, construct the cumulative frequency, cumulative
relative frequency and cumulative percentage distributions.
Note:
1. The ogive starts at the lower boundary of the first class and ends at the upper boundary of the last
class.
2. If relative cumulative frequency is used in place of cumulative frequency, the graph is called a
relative cumulative frequency curve or percentage ogive.
Example 1.9. Draw an ogive for the data in Example 1.4. Estimate from the ogive
a) the total number of deliveries with birth-weights less than 95 oz.
b) the value of X, if 20% of the deliveries had birth-weights of X oz or more.
Solution.
[Ogive: Cumulative frequency (vertical axis, scale 0 to 55) against Birth-Weight (oz) (horizontal axis, marked at the class boundaries 79.5, 89.5, 99.5, 109.5, 119.5, 129.5, 139.5, 149.5).]
1.4.1 Median
Median is the value of the middle term in a data set that has been ranked in increasing or decreasing order.
Median is the value of the $\left(\frac{n+1}{2}\right)$th term in a ranked data set, where $n$ = total number of elements in the set.
Note:
1. If n is odd, then median is the value of the middle term in the ranked data.
2. If n is even, then median is the average value of the two middle terms.
Example 1.10. Find the median of set A = { 10, 5, 19, 8, 3 } and set B = { 2, 7, 3, 6, 4, 5 }
Solution.
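One way to check the $\frac{n+1}{2}$ rule on sets A and B is the Python sketch below; the function name `median` is illustrative, and the standard library's `statistics.median` could be used instead.

```python
def median(data):
    """Median of a data set using the (n + 1) / 2 position rule."""
    ranked = sorted(data)          # rank the data in increasing order
    n = len(ranked)
    middle = (n + 1) // 2          # whole part of the position (n + 1) / 2
    if n % 2 == 1:
        # n odd: the position is a whole number, so take that term.
        return ranked[middle - 1]
    # n even: the position ends in .5, so average the two middle terms.
    return (ranked[middle - 1] + ranked[middle]) / 2

A = [10, 5, 19, 8, 3]
B = [2, 7, 3, 6, 4, 5]
print(median(A), median(B))
```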
Note:
The median is not influenced by extreme values. (Extreme values are values that are very small or very
large relative to the majority of the values in a data set.)
No. of children 0 1 2 3 4 5
Frequency 3 5 12 9 4 2
Solution.
1.4.2 Mode
Mode is the value that occurs with the highest frequency in a data set.
Example 1.12. Find the mode of each of the following data sets.
i) 74, 9, 5, 8, 3, 8, 8
ii) 2, 2, 6, 6, 8, 8, 9, 9
iii) 2, 6, 6, 6, 3, 8, 8, 8, 3
iv) B, C, D, A, A, C, C, C, B, A
Solution.
Note:
1. The mode is not influenced by extreme values.
2. A data set may have no mode, one mode (unimodal), two modes (bimodal) or more than two
modes (multimodal).
3. The mode can be used for both quantitative and qualitative data.
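The cases in the note can be explored with the following Python sketch. The function name `modes` is hypothetical, and the code assumes one common convention that the notes do not state explicitly: a data set in which every value occurs equally often is treated as having no mode.

```python
from collections import Counter

def modes(data):
    """Return the list of modes; an empty list means no mode."""
    counts = Counter(data)
    if not counts:
        return []
    highest = max(counts.values())
    # Assumption: if all distinct values share the same frequency,
    # the data set is treated as having no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(value for value, c in counts.items() if c == highest)

print(modes([74, 9, 5, 8, 3, 8, 8]))          # one mode
print(modes([2, 2, 6, 6, 8, 8, 9, 9]))        # no mode under the assumption
print(modes([2, 6, 6, 6, 3, 8, 8, 8, 3]))     # two modes
print(modes(list("BCDAACCCBA")))              # qualitative data
```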
No. of children 0 1 2 3 4 5
Frequency 3 5 12 9 4 2
Solution.
1.4.3 Mean
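The mean for population data $x_1, x_2, \dots, x_N$ is denoted by $\mu$ and is defined as
$$\mu = \frac{x_1 + x_2 + \dots + x_N}{N} = \frac{1}{N}\sum_{i=1}^{N} x_i$$
The mean for sample data $x_1, x_2, \dots, x_n$ is denoted by $\bar{X}$ and is defined as
$$\bar{X} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$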
Example 1.14. Find the arithmetic mean for the data set { 158, 189, 265, 127, 191 }
Solution.
Note:
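1. The mean does not necessarily take one of the values in the original data.
2. The mean is influenced by extreme values.
For data given in a frequency table, where the value $x_i$ occurs with frequency $f_i$:
$$\bar{X} = \frac{f_1 x_1 + f_2 x_2 + \dots + f_n x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} f_i x_i = \frac{\sum f_i x_i}{\sum f_i}$$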
$x_i$   2   5   6   8
$f_i$   1   3   4   2
Solution.
$x_i$       2    5    6    8
$f_i$       1    3    4    2
$f_i x_i$   2   15   24   16
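The mean of this frequency table can be checked with a brief Python sketch, taking $n$ as the sum of the frequencies. Variable names are illustrative.

```python
values = [2, 5, 6, 8]        # x_i
frequencies = [1, 3, 4, 2]   # f_i

n = sum(frequencies)                                      # total frequency
total = sum(f * x for f, x in zip(frequencies, values))   # sum of f_i * x_i
mean = total / n

print(n, total, mean)
```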
mean for population data: $\mu = \frac{\sum f_i m_i}{N}$
mean for sample data: $\bar{X} = \frac{\sum f_i m_i}{n}$
Weight (kg)   6-8   9-11   12-17   18-20   21-29
Frequency      4     6      10       3      12
Solution.
Class interval             6-8    9-11   12-17   18-20   21-29
Class midpoint ($m_i$)      7     10     14.5    19      25
Frequency ($f_i$)           4      6     10       3      12
$f_i m_i$                  28     60    145      57     300
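For grouped data the only extra step is computing the class midpoints. A minimal Python sketch for the weight data follows, assuming the midpoint of each class is the average of its lower and upper limits; variable names are illustrative.

```python
# Class limits (kg) and frequencies of the weight data.
classes = [(6, 8), (9, 11), (12, 17), (18, 20), (21, 29)]
frequencies = [4, 6, 10, 3, 12]

# Class midpoint m_i = (lower limit + upper limit) / 2
midpoints = [(lo + hi) / 2 for lo, hi in classes]

n = sum(frequencies)
sum_fm = sum(f * m for f, m in zip(frequencies, midpoints))
grouped_mean = sum_fm / n

print(midpoints, sum_fm, round(grouped_mean, 2))
```

Because every value in a class is represented by the class midpoint, the result is an approximation to the mean of the raw data.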
Example 1.17. Find the range for data set A and data set B above.
Variance
The variance is the average of the squared deviations of the data from the mean.
Population variance $= \sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu^2$
Standard Deviation
The standard deviation is the positive square root of the variance
Sample standard deviation $= s = \sqrt{s^2}$
Population standard deviation $= \sigma = \sqrt{\sigma^2}$
Note: 1. A small standard deviation means that the data are distributed closely to their mean.
2. A large standard deviation means that the data are widely scattered about their mean.
3. It is influenced by extreme values.
Example 1.18. The data below show the salary per day of all 6 employees of a small company.
29.50, 16.50, 35.40, 21.30, 49.70, 24.60
Calculate the variance and standard deviation for these data.
Solution.
Mean, $\mu$ =
$x_i$      $x_i - \mu$      $(x_i - \mu)^2$      $x_i^2$
Method 1:
Population variance $= \sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$
Method 2:
$\sum x_i^2 =$
Population variance $= \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu^2$
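Both methods can be compared on the salary data of Example 1.18 with the Python sketch below. Variable names are illustrative, and small floating-point differences between the two results are expected.

```python
import math

# Daily salaries of all 6 employees (a population), from Example 1.18.
salaries = [29.50, 16.50, 35.40, 21.30, 49.70, 24.60]

N = len(salaries)
mu = sum(salaries) / N   # population mean

# Method 1: average of the squared deviations from the mean.
variance_1 = sum((x - mu) ** 2 for x in salaries) / N

# Method 2: mean of the squares minus the square of the mean.
variance_2 = sum(x ** 2 for x in salaries) / N - mu ** 2

sigma = math.sqrt(variance_1)   # population standard deviation

print(round(mu, 2), round(variance_1, 4), round(variance_2, 4), round(sigma, 4))
```

Method 2 avoids computing each deviation, which is why it is often preferred for hand calculation.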
Example 1.19. A sample consists of 5 data values: 72, 49, 79, 55 and 57. Calculate the variance and
standard deviation.
Solution.
$n = 5$, $\sum x_i =$
$\sum x_i^2 =$
Sample variance $= s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right] =$
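The shortcut formula for the sample variance can be evaluated on the five values of Example 1.19 with this Python sketch; variable names are illustrative. Note the divisor $n - 1$ rather than $n$.

```python
import math

sample = [72, 49, 79, 55, 57]   # data values from Example 1.19

n = len(sample)
sum_x = sum(sample)                    # sum of x_i
sum_x2 = sum(x ** 2 for x in sample)   # sum of x_i squared

# s^2 = [ sum(x_i^2) - (sum(x_i))^2 / n ] / (n - 1)
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
s = math.sqrt(s2)   # sample standard deviation

print(sum_x, sum_x2, round(s2, 2), round(s, 2))
```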
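Population variance $= \sigma^2 = \frac{\sum f_i m_i^2}{N} - \left(\frac{\sum f_i m_i}{N}\right)^2$
Sample variance $= s^2 = \frac{1}{n-1}\sum_{i=1}^{n} f_i (m_i - \bar{X})^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} f_i m_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} f_i m_i\right)^2\right]$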
Example 1.20. Find the variance from the following frequency distribution if it represents
a) population
b) sample
Height (m)   20-22   23-25   26-28   29-31   32-34
Frequency      3       6      12       9       2
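Solution.
$$\sigma^2 = \frac{\sum f_i m_i^2}{N} - \left(\frac{\sum f_i m_i}{N}\right)^2$$
$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} f_i m_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} f_i m_i\right)^2\right]$$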
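The two grouped-data formulas can be evaluated on the height distribution of Example 1.20 with the Python sketch below. It assumes each class midpoint is the average of the class limits and that the total frequency plays the role of $N$ (population) or $n$ (sample); variable names are illustrative.

```python
# Class limits and frequencies from Example 1.20.
classes = [(20, 22), (23, 25), (26, 28), (29, 31), (32, 34)]
frequencies = [3, 6, 12, 9, 2]

midpoints = [(lo + hi) / 2 for lo, hi in classes]   # m_i
total = sum(frequencies)                            # N or n

sum_fm = sum(f * m for f, m in zip(frequencies, midpoints))        # sum f_i m_i
sum_fm2 = sum(f * m ** 2 for f, m in zip(frequencies, midpoints))  # sum f_i m_i^2

# a) treated as a population
population_variance = sum_fm2 / total - (sum_fm / total) ** 2

# b) treated as a sample
sample_variance = (sum_fm2 - sum_fm ** 2 / total) / (total - 1)

print(round(population_variance, 4), round(sample_variance, 4))
```

The sample variance is slightly larger because it divides by $n - 1$ instead of $N$.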
1.6.1 Quartiles
Quartiles are 3 summary measures that divide a ranked data set into 4 equal parts.
- second quartile (Q2) is the median of a data set.
- first quartile (Q1) is the value of the middle term among the observations that are less than
the median.
- third quartile (Q3) is the value of the middle term among the observations that are greater
than the median.
The first quartile = Lower quartile = $Q_1$ = $\frac{1}{4}(n+1)$th value
The second quartile = Median = $Q_2$ = $\frac{1}{2}(n+1)$th value
The third quartile = Upper quartile = $Q_3$ = $\frac{3}{4}(n+1)$th value
When n is odd, the rule locates the exact position of the quartiles.
When n is even,
a) When $n$ is even and $\frac{n}{2}$ is even, round all decimal values of $\frac{1}{4}(n+1)$ or $\frac{3}{4}(n+1)$
to the .5 value, for example:
2.25 → 2.5
6.75 → 6.5
b) When $n$ is even and $\frac{n}{2}$ is odd, round the value of $\frac{1}{4}(n+1)$ or $\frac{3}{4}(n+1)$ up if its
decimal part is greater than .5 and down if its decimal part is smaller than .5, for
example:
3.75 → 4
2.25 → 2
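The position rules for odd and even $n$ can be sketched in Python as below. The function names are hypothetical, exact fractions are used to avoid rounding surprises, and the 12 scores are those of Example 1.21 later in this chapter.

```python
from fractions import Fraction

def quartile_positions(n):
    """Positions of Q1, Q2, Q3 in a ranked data set of size n."""
    q2 = Fraction(n + 1, 2)
    raw = [Fraction(n + 1, 4), Fraction(3 * (n + 1), 4)]
    if n % 2 == 1:
        q1, q3 = raw                     # already a whole number or .5 value
    elif (n // 2) % 2 == 0:
        # Rule a): move .25 and .75 values to the .5 value.
        q1, q3 = [Fraction(int(p)) + Fraction(1, 2) for p in raw]
    else:
        # Rule b): round to the nearest whole number.
        q1, q3 = [Fraction(round(p)) for p in raw]
    return q1, q2, q3

def value_at(ranked, position):
    """Value at a whole or .5 position (1-based) in ranked data."""
    lower = int(position)
    if position == lower:
        return ranked[lower - 1]
    # A .5 position is the average of the two neighbouring terms.
    return (ranked[lower - 1] + ranked[lower]) / 2

scores = [75, 80, 68, 53, 99, 58, 76, 73, 85, 88, 91, 79]
ranked = sorted(scores)
quartiles = [value_at(ranked, p) for p in quartile_positions(len(ranked))]
print(quartiles)
```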
1.6.3 Percentiles
The (approximate) value of the kth percentile, denoted by $P_k$, is
$P_k$ = value of the $\left(\frac{kn}{100}\right)$th term in a ranked data set
where $k$ denotes the number of the percentile and $n$ represents the sample size. Note that $\frac{kn}{100}$ is rounded to
the nearest integer or .5 value, for example:
2.2 → 2.0
2.3 → 2.5
2.7 → 2.5
2.8 → 3.0
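A minimal Python sketch of the percentile rule follows. The function names are hypothetical, and the handling of exact ties (values ending in .25 or .75) is left to Python's built-in `round`, since the notes do not say how such ties are resolved. The scores are those of Example 1.21.

```python
def percentile_position(k, n):
    """kn/100 rounded to the nearest integer or .5 value."""
    # Doubling, rounding to a whole number and halving rounds to the nearest 0.5.
    return round(2 * k * n / 100) / 2

def percentile(data, k):
    """Approximate kth percentile of the data by the position rule."""
    ranked = sorted(data)
    position = percentile_position(k, len(ranked))
    if not 1 <= position <= len(ranked):
        raise ValueError("position falls outside the ranked data set")
    lower = int(position)
    if position == lower:
        return ranked[lower - 1]
    # A .5 position is the average of the two neighbouring terms.
    return (ranked[lower - 1] + ranked[lower]) / 2

scores = [75, 80, 68, 53, 99, 58, 76, 73, 85, 88, 91, 79]
print(percentile_position(62, len(scores)), percentile(scores, 62))
```

For the 62nd percentile of 12 scores, $\frac{kn}{100} = 7.44$, which rounds to 7.5, so the result is the average of the 7th and 8th ranked scores.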
Example 1.21. The following are the scores of 12 students in a mathematics class.
75 80 68 53 99 58 76 73 85 88 91 79
a) Find the values of the three quartiles. Where does the score of 88 lie in relation to these quartiles?
b) Find the interquartile range.
c) Find the quartile deviation.
d) Find the value of the 62nd percentile.
Solution.