PROBABILITY AND
STATISTICS
SYLLABUS FOR ENGINEERING
PROBABILITY AND STATISTICS
Course Description:
This course focuses on the descriptive branch of
statistics, which comprises data analysis and the
organization of raw data into frequency tables,
measures of central tendency and dispersion,
an introduction to probability and counting
techniques, probability laws, Bayes' rule, random
variables, and discrete and continuous probability
distributions and their applications to real-world
settings.
COURSE OUTLINE:
PRELIM PERIOD
Definition of terms
Types of Statistics (Descriptive and Inferential statistics)
Level of data measurements
Grouped and Ungrouped data
Measures of Central Tendencies (Mean, median and Mode)
Measures of Variability (Range, variance and std. dev.)
Organizing data - Construction of frequency table, and its
graphical representation (freq. histogram and polygon)
Measures of Position (Percentile, Decile and Quartile)
Shape of data (Skewness and Kurtosis)
COURSE OUTLINE:
MIDTERM
Probability concept and theories
Events and Sample Space
Counting Rules
Tree Diagram
Venn Diagram
Addition Rule (Mutually exclusive and non-mutually exclusive events)
Multiplication Rule (dependent and Independent events)
Conditional Probability
Bayes Rule
Concept of random variables
- Discrete and continuous probability
COURSE OUTLINE:
FINAL PERIOD
Random Variable and Mathematical expectations
Special Discrete Probability Distributions
(binomial, multinomial, geometric, hypergeometric, neg.
binomial and Poisson distributions)
Special Continuous Probability Distributions
(Uniform and Normal distributions)
Z and T test
Testing of Hypothesis
Quizzes: There are two exams in the course per period.
The time limit on all exams is 1.5 hours for T-Th classes
and 1 hour for MWF classes.
Assignment: There is one homework (problem set) per
period. Show all your calculations. You will receive
credit for honest attempts to answer all the questions,
even if your answers are incorrect. Homework that is
sloppy or incomplete will not earn full credit.
Frequency Polygon
Constructed by plotting class frequencies against class marks
and connecting the consecutive points by straight lines.
DESCRIPTIVE STATISTICS
Measures of Central Location
When the data are grouped into a frequency distribution, the median is obtained
by finding the class (cell) that contains the middle observation and then
interpolating within that class.
\[
\tilde{x} = L_b + \left( \frac{n/2 - {<}cf_{i-1}}{f_i} \right) i
\quad \text{or} \quad
\tilde{x} = U_b - \left( \frac{n/2 - {>}cf_{i-1}}{f_i} \right) i
\]
where:
Lb = lower class boundary of the interpolated interval
Ub = upper class boundary of the interpolated interval
<cfi-1 = less-than cumulative frequency of the class before the interpolated interval
>cfi-1 = greater-than cumulative frequency of the class before the interpolated interval (counting down from the highest class)
fi = frequency of the interpolated interval
i = class size
n = number of data points
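The two interpolation formulas can be checked against each other with a short Python sketch. It uses the class limits and frequencies of the 100-family income example from these notes; the function names are illustrative, and the half-unit offset used to turn class limits into class boundaries assumes integer-valued limits.

```python
# Grouped-data median by interpolation, computed from both ends.
# Data: the income example in these notes (class size 5, n = 100).
lower_limits = [10, 15, 20, 25, 30, 35, 40]
frequencies = [3, 12, 19, 20, 23, 18, 5]
class_size = 5  # i in the formula


def grouped_median_from_lower(lower_limits, frequencies, class_size):
    """Median = Lb + ((n/2 - <cf) / f) * i."""
    n = sum(frequencies)
    cumulative = 0  # less-than cumulative frequency before the current class
    for limit, f in zip(lower_limits, frequencies):
        if cumulative + f >= n / 2:
            lb = limit - 0.5  # lower class boundary (assumes integer limits)
            return lb + (n / 2 - cumulative) / f * class_size
        cumulative += f
    raise ValueError("empty frequency table")


def grouped_median_from_upper(lower_limits, frequencies, class_size):
    """Median = Ub - ((n/2 - >cf) / f) * i."""
    n = sum(frequencies)
    above = 0  # greater-than cumulative frequency of the classes above
    for limit, f in zip(reversed(lower_limits), reversed(frequencies)):
        if above + f >= n / 2:
            ub = limit + class_size - 0.5  # upper class boundary
            return ub - (n / 2 - above) / f * class_size
        above += f
    raise ValueError("empty frequency table")


median_low = grouped_median_from_lower(lower_limits, frequencies, class_size)
median_high = grouped_median_from_upper(lower_limits, frequencies, class_size)
print(median_low, median_high)
```

Both routes land in the 25 - 29 class and should agree, since they interpolate to the same point from opposite boundaries.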
THE MODE
The last measure of central tendency is the mode: the value
that is observed most frequently. The mode is undefined for
sequences in which no observation is repeated.
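As a small illustration of this definition, the sketch below counts occurrences and returns every value tied for the highest count. The function name `modes` and the sample lists are made up for illustration; returning an empty list when nothing repeats is one way to represent "undefined".

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); empty list if no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no observation is repeated, so the mode is undefined
    return [value for value, count in counts.items() if count == highest]


print(modes([2, 4, 4, 5, 7]))     # one mode
print(modes([1, 1, 3, 3, 5]))     # two values tie for most frequent
print(modes([1, 2, 3]))           # nothing repeats
```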
The total frequency of all values less than the upper class
boundary of a given interval, up to and including that interval, is
called the cumulative frequency.
Example:
One hundred families were chosen at random, and their yearly income was
recorded.
Income of 100 families
Frequency distribution table

Income in thousands    No. of families
10 - 14                  3
15 - 19                 12
20 - 24                 19
25 - 29                 20
30 - 34                 23
35 - 39                 18
40 - 44                  5
Total                  100

Each row of the first column is a class interval; the count beside it is the class frequency.
10 and 14 are called class limits: 10 is the lower limit and 14 is the upper limit.
In the table, the class width is 5.
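The terms used with this table (class width, class mark, cumulative frequency) can be computed directly from the class limits and frequencies. The following Python sketch uses the income table's values; the variable names are illustrative only.

```python
from itertools import accumulate

# Class limits (in thousands) and frequencies from the income table.
classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39), (40, 44)]
frequencies = [3, 12, 19, 20, 23, 18, 5]

# Class width: distance between consecutive lower limits.
class_width = classes[1][0] - classes[0][0]

# Class mark: midpoint of each class interval.
class_marks = [(low + high) / 2 for low, high in classes]

# Less-than cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

for (low, high), f, mark, cf in zip(classes, frequencies, class_marks, cumulative):
    print(f"{low:>2} - {high:<2}  f={f:>2}  mark={mark:>4}  cf={cf:>3}")
```

The last cumulative frequency equals the total number of families, which is a quick check that no class was dropped.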
27 32 28 32 31
35 28 44 45 36
33 40 41 36 35
39 37 39 37 44
41 41 35 35 33
23 60 79 32 57 74 52 70 82 36
80 77 81 95 41 65 92 85 55 76
52 10 64 75 78 25 80 98 81 67
41 71 83 54 64 72 88 62 74 43
60 78 89 76 84 48 84 90 15 79
34 67 17 82 69 74 63 80 85 61
Formula:
Range = Highest score - Lowest score, or R = H - L
s = sqrt (0.6325)
= 0.795298686 or 0.80 (sample standard deviation)
The frequency table below represents the final
examination scores for a statistics course. Find the population range,
population variance, and population standard deviation.
Class limits    Frequency (f)    Class mark (x)    Cumulative frequency
10 - 19          3               14.5               3
20 - 29          2               24.5               5
30 - 39          3               34.5               8
40 - 49          4               44.5              12
50 - 59          5               54.5              17
60 - 69         11               64.5              28
70 - 79         14               74.5              42
80 - 89         14               84.5              56
90 - 99          4               94.5              60
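A frequency table like this one is built by tallying raw scores into classes. The Python sketch below does that for the 60 raw scores listed earlier in these notes; treating those scores as the source of this table is an assumption (the notes do not say so explicitly), and the function name `tally` is illustrative.

```python
from itertools import accumulate

# The 60 raw scores listed earlier in the notes (assumed to be the exam data).
scores = [
    23, 60, 79, 32, 57, 74, 52, 70, 82, 36,
    80, 77, 81, 95, 41, 65, 92, 85, 55, 76,
    52, 10, 64, 75, 78, 25, 80, 98, 81, 67,
    41, 71, 83, 54, 64, 72, 88, 62, 74, 43,
    60, 78, 89, 76, 84, 48, 84, 90, 15, 79,
    34, 67, 17, 82, 69, 74, 63, 80, 85, 61,
]


def tally(values, start=10, width=10, n_classes=9):
    """Count how many values fall in each class start..start+width-1, etc."""
    counts = [0] * n_classes
    for value in values:
        counts[(value - start) // width] += 1
    return counts


frequencies = tally(scores)
class_marks = [10 + 10 * k + 4.5 for k in range(9)]  # midpoint of 10-19 is 14.5
cumulative = list(accumulate(frequencies))
print(frequencies)
print(cumulative)
```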
Range = Highest Upper Class Boundary - Smallest Lower Class Boundary
= 99.5 - 9.5
= 90
\[
\sigma^2 = \frac{\sum f (x - \mu)^2}{N}, \qquad \mu = 66
\]
\[
\sigma^2 = \frac{3(14.5 - 66)^2 + 2(24.5 - 66)^2 + 3(34.5 - 66)^2 + 4(44.5 - 66)^2 + 5(54.5 - 66)^2 + 11(64.5 - 66)^2 + 14(74.5 - 66)^2 + 14(84.5 - 66)^2 + 4(94.5 - 66)^2}{60} = 432.75
\]
\[
\sigma = \sqrt{432.75} = 20.80264406 \text{ or } 20.80
\]
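The grouped-data computation above can be reproduced in a few lines of Python. This is a sketch using the class marks and frequencies from the exam-score table; variable names are illustrative, and the population formulas (dividing by N) follow the problem statement.

```python
import math

# Class marks and frequencies from the exam-score frequency table.
class_marks = [14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
frequencies = [3, 2, 3, 4, 5, 11, 14, 14, 4]

total = sum(frequencies)  # N, the population size

# Grouped mean: each class mark weighted by its frequency.
mean = sum(f * x for f, x in zip(frequencies, class_marks)) / total

# Population variance: frequency-weighted squared deviations divided by N.
variance = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, class_marks)) / total
std_dev = math.sqrt(variance)

# Range from class boundaries: highest upper boundary minus lowest lower boundary.
value_range = 99.5 - 9.5

print(total, mean, variance, round(std_dev, 2), value_range)
```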
Measures of Shape
SKEWNESS
KURTOSIS
SKEWNESS
- degree of symmetry of a distribution
\[
SK = \frac{\sum \left[ (X_i - \bar{X}) / s \right]^3}{n}
\]
where:
Xi - individual reading
s - standard deviation
X - sample mean
n - sample size
SK = 0: Symmetric (Normal)
SK > 0: Positively Skewed
SK < 0: Negatively Skewed
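Reading the skewness formula as SK = sum of [(Xi - X)/s]^3 divided by n, a minimal Python sketch looks like this. The sample lists are made up for illustration, and using the sample standard deviation (n - 1 divisor) for s is an assumption, since the notes only say "standard deviation".

```python
import statistics


def skewness(values):
    """SK = sum(((x - mean) / s) ** 3) / n, with s the sample std. dev."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # assumption: n - 1 divisor
    return sum(((x - mean) / s) ** 3 for x in values) / n


def describe_skewness(sk, tolerance=1e-12):
    """Map the sign of SK to the labels used in the notes."""
    if abs(sk) <= tolerance:
        return "symmetric"
    return "positively skewed" if sk > 0 else "negatively skewed"


for sample in ([1, 2, 3, 4, 5], [1, 2, 3, 4, 10], [1, 7, 8, 9, 10]):
    sk = skewness(sample)
    print(sample, round(sk, 4), describe_skewness(sk))
```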
KURTOSIS
- flatness or peakedness of a distribution
\[
K = \frac{\sum \left[ (X_i - \bar{X}) / s \right]^4}{n}
\]
where:
Xi - individual reading
s - standard deviation
X - sample mean
n - sample size
K = 3: MesoKurtic (Normal)
K > 3: LeptoKurtic
K < 3: PlatyKurtic
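The kurtosis measure, read as the sum of [(Xi - X)/s]^4 divided by n and compared against 3, can be sketched in Python as follows. The sample lists are invented for illustration, the symbol K and the function names are not from the notes, and using the sample standard deviation (n - 1 divisor) for s is an assumption.

```python
import statistics


def kurtosis(values):
    """K = sum(((x - mean) / s) ** 4) / n, with s the sample std. dev."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # assumption: n - 1 divisor
    return sum(((x - mean) / s) ** 4 for x in values) / n


def describe_kurtosis(k, tolerance=1e-9):
    """Compare K with 3, the value for a normal distribution."""
    if abs(k - 3) <= tolerance:
        return "mesokurtic"
    return "leptokurtic" if k > 3 else "platykurtic"


flat = [1, 2, 3, 4, 5]                         # evenly spread values
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]     # tight center, heavy tails

for sample in (flat, peaked):
    k = kurtosis(sample)
    print(round(k, 3), describe_kurtosis(k))
```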
Measures of Position:
PERCENTILES, DECILES AND QUARTILES
Measures of position are used to describe the location of a particular
observation in relation to the rest of the data set.
Percentiles are values that divide the ranked data set into 100 equal
parts.
The pth percentile is the value that separates the bottom p% of the
ranked scores from the top (100 - p)%.
Quartiles are values that divide the ranked data set into four equal parts.
The three quartiles are denoted by Q1, Q2, Q3.
Deciles are values that divide the ranked data set into ten equal parts.
There are nine deciles, denoted by D1, D2, ..., D9, which partition the data
into 10 groups with about 10% of the data in each group.
The table below gives the ages of commercial aircraft randomly selected
from several airlines.
2 7 11 15 19
2 7 11 15 19
2 7 12 15 20
2 7 12 15 20
4 7 12 15 20
4 10 14 15 22
4 10 14 16 24
4 10 14 16 25
5 10 14 17 25
5 10 15 17 27
Find the percentiles for the ages 10, 15, and 20.
Answers: 10 = P30, 15 = P58, 20 = P84
Find P90, D8, and Q3.
The percentile for observation x is found by dividing the
number of observations less than x by the total number of
observations and then multiplying this quantity by 100. The result is
then rounded to the nearest whole number.
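This rule is easy to check against the aircraft-age data. The Python sketch below applies it to the 50 ages in the table; the function name `percentile_of` is illustrative.

```python
# Ages of the 50 commercial aircraft from the table, read down the columns.
ages = [
    2, 2, 2, 2, 4, 4, 4, 4, 5, 5,
    7, 7, 7, 7, 7, 10, 10, 10, 10, 10,
    11, 11, 12, 12, 12, 14, 14, 14, 14, 15,
    15, 15, 15, 15, 15, 15, 16, 16, 17, 17,
    19, 19, 20, 20, 20, 22, 24, 25, 25, 27,
]


def percentile_of(data, x):
    """Percent of observations strictly less than x, rounded to a whole number."""
    below = sum(1 for value in data if value < x)
    # Note: Python's round() sends exact .5 ties to the nearest even integer.
    return round(100 * below / len(data))


for age in (10, 15, 20):
    print(f"age {age} is at P{percentile_of(ages, age)}")
```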
Median = P50 = D5 = Q2
IQR = Q3 - Q1
RAW DATA NEED TO BE RANKED PRIOR TO
FINDING MEASURES OF POSITION.
3.0 5.0 6.2 7.6 9.4
3.3 5.2 6.3 7.6 9.5
3.5 5.5 6.4 7.7 9.5
3.5 5.5 6.6 7.8 10.0
3.6 5.5 6.6 7.8 10.5
4.0 5.8 6.8 8.5 10.8
4.0 5.8 6.8 8.5 10.9
4.2 5.9 6.8 8.8 11.0
4.6 6.0 7.0 8.8 11.0
To find the tenth percentile of the data, compute i = 10(45)/100 = 4.5.
The next integer greater than 4.5 is 5. The observation in the fifth position in the
ranked data above (read down the columns) is 3.6; therefore P10 = 3.6. Note that at
least 10% of the data are 3.6 or less and at least 90% of the data are 3.6 or more.
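The locating step can be sketched in Python using the 45 ranked values above. The notes only describe the case where i is not a whole number (move up to the next integer position). When i is a whole number, this sketch averages the observations in positions i and i + 1, which is a common textbook convention and an assumption here, not something stated in the notes. The function name `percentile` is illustrative.

```python
# The 45 ranked observations from the table, read down the columns.
data = [
    3.0, 3.3, 3.5, 3.5, 3.6, 4.0, 4.0, 4.2, 4.6,
    5.0, 5.2, 5.5, 5.5, 5.5, 5.8, 5.8, 5.9, 6.0,
    6.2, 6.3, 6.4, 6.6, 6.6, 6.8, 6.8, 6.8, 7.0,
    7.6, 7.6, 7.7, 7.8, 7.8, 8.5, 8.5, 8.8, 8.8,
    9.4, 9.5, 9.5, 10.0, 10.5, 10.8, 10.9, 11.0, 11.0,
]


def percentile(ranked, p):
    """Locate the pth percentile (p an integer from 1 to 99) in ranked data."""
    if not 1 <= p <= 99:
        raise ValueError("p must be between 1 and 99")
    n = len(ranked)
    whole, remainder = divmod(p * n, 100)  # i = p * n / 100, kept in integers
    if remainder != 0:
        # i is not a whole number: use the next integer position (1-based).
        return ranked[whole]
    # Assumption: i is a whole number, so average positions i and i + 1.
    return (ranked[whole - 1] + ranked[whole]) / 2


print(percentile(data, 10))  # i = 4.5, so the fifth observation is used
print(percentile(data, 50))  # i = 22.5, so the 23rd observation is used
```

Because the data must already be ranked, pass `sorted(values)` when starting from raw observations.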