“The essence of mathematics is not to make simple things complicated, but to make complicated things simple.”
PRINCIPLE OF
BIOSTATISTICS AND ITS
APPLICATION IN
DENTISTRY
Contents
• Introduction
• Basic terminologies
• Statistical packages
• Conclusions
Introduction
Statistics has been derived from the Latin word
status.
Object
Variables
Qualitative: characteristic of people or objects that can't be naturally expressed in a numeric value.
• Sex (male, female)
• Orthodontic facial type (brachy / dolicho / mesocephalic)
• Level of oral hygiene (G/F/P)
Quantitative: characteristic of people or objects that can be naturally expressed in a numeric value.
• Age
• Height
• Blood pressure
• Attachment level
• Survival time of implants
• Fluoride concentration of water
• Bone loss affected by periodontitis
• Random variable:
o A variable that can assume a number of different values, such that any particular value is obtained purely by chance.
TYPES
• Discrete variable
• Continuous variable
• Discrete variable:
o Is a random variable that can take on a finite number of values or a countably infinite number of values (as many as there are whole numbers).
• Ex:
o The number of DMF teeth. It can be any one of the 33 values 0, 1, 2, 3, ..., 32
o Size of the family
• Continuous variable:
o A random variable that can take any value within an interval
• Ex:
o Amount of new bone
growth
o Force required to
extract teeth
o Amount of blood loss in
surgical procedures
Levels of Measurement
• Nominal
o Values are unordered categories
o Have no quantitative relationships
o Cannot be quantified
• Ordinal
o The categories can be ordered or ranked
• Interval
o Can be ordered, and precise differences between units of measure exist
o There is no meaning for absolute zero
• Ratio
o Possesses the same properties of the interval scale
o There exists a true zero
DATA:
• Quantitative - OR - Qualitative
• Primary - OR - Secondary
Basic terminology
• Quantitative Data:
o Data that can be quantified; that is, the characteristic under consideration can be expressed as a numeric value.
o Ex. Plaque score, incisor width, height, weight, pulse rate, etc.
Basic terminology
• Qualitative data:
o Data that cannot be quantified; that is, the characteristic under consideration cannot be expressed as a numeric value.
o Ex. Beauty, oral hygiene status (G, F, P), etc.
Basic terminology
• Primary Data:
o Data collected afresh and for the first time, and thus original in character.
Basic terminology
• Secondary Data:
o Data that have already been collected by someone else and have already been passed through the statistical process.
Principles of Biostatistics
1. Collection of the Data
2. Organization of the Data
3. Analyzing the Data
4. Presenting the Data
5. Interpretation of the Data
Sampling from a population
• Probability sampling
• Non probability sampling
Probability sampling
• Simple random sampling
• Systematic sampling
• Stratified random sampling
• Cluster sampling
• Multistage sampling
• Multiphase sampling
Simple Random Sampling (UNRESTRICTED RANDOM SAMPLING)
• Applicable when the population is small, homogenous and readily available.
• Used mainly in experimental medicine or clinical trials to check the efficacy of a particular drug.
Systematic sampling
• Simple procedure.
• Utilized when a complete list of the population from which the sample is to be drawn is available.
• A systematic procedure is followed to choose a sample by taking every kth house or patient, where k refers to the sampling interval
homogenous.
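The "every kth unit" rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `systematic_sample`, the random starting point within the first interval, and the list of 100 numbered patients are all hypothetical and not taken from the slides.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick every k-th unit after a random start; k = N // n is the sampling interval."""
    if sample_size <= 0 or sample_size > len(population):
        raise ValueError("sample_size must be between 1 and the population size")
    k = len(population) // sample_size      # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)                # random start inside the first interval
    return [population[start + i * k] for i in range(sample_size)]

# Hypothetical complete list of 100 patients, numbered 1..100
patients = list(range(1, 101))
sample = systematic_sample(patients, 10, seed=1)
print(sample)
```

With 100 patients and a sample of 10, the interval k is 10, so consecutive selected patients are always 10 positions apart.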
Treatment types
Non probability sampling
• Voluntary sampling: the sample is self-selected
• Convenience sampling: sampling those most convenient
• Quota sampling: convenience sampling within groups of the population
Primary data
Secondary data
Primary data Collection
• Observation method
• Interview method
• Through questionnaire
• Through schedules
• Other methods
o Warranty cards
o Distribution audits
o Content analysis etc
Observation methods
• This method is mainly
used in studies relating to
behavioral sciences
• This method becomes a scientific tool when the method of collection of data serves a formulated research purpose
Observation Methods
Advantages
• Independent of respondent's willingness to respond.
• Information obtained by this method relates to what is currently happening
• Subjective bias is eliminated
Disadvantages
• Information provided by this method is limited
• Expensive method
Interview method
• Involves collecting data through oral verbal stimuli and
reply in terms of oral verbal responses
Interview
• Personal interview
• Telephone interview
Personal interview
ADVANTAGES
• Easy
• More information obtained
• Interviewer by his own skills can overcome the resistance
DISADVANTAGES
• Expensive method
• More time consuming
Telephone interview
• Merits :
o Faster than other methods
o Recall is easy
embarrassment to respondents.
Questions
• Closed ended (MCQs)
• Open ended
o Adopted by private individuals, research ... governments
o A questionnaire consists of a
Coding the Data
• Qualitative data is converted to quantitative data
• Ex: sex (male/female) is coded as Male = 1, Female = 2
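As a small illustration of the coding step, the sketch below maps the qualitative values to the numeric codes given on the slide (Male = 1, Female = 2). The names `SEX_CODES` and `code_sex` and the sample responses are hypothetical.

```python
# Codes taken from the slide: Male = 1, Female = 2
SEX_CODES = {"male": 1, "female": 2}

def code_sex(values):
    """Convert qualitative labels to numeric codes, ignoring case and stray spaces."""
    return [SEX_CODES[v.strip().lower()] for v in values]

# Hypothetical raw responses
coded = code_sex(["Male", "female", "Female ", "male"])
print(coded)  # [1, 2, 2, 1]
```

The codes are labels only; arithmetic on them (for example, averaging) has no meaning for nominal data.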
Organizing the Data
• Editing the Data:
o Examining the raw data for errors & omissions &
correcting them
o Condensed into manageable groups & tables which
are amenable for further analysis.
Organizing the Data
• Frequency Distribution
o The researcher organizes the raw data by using a frequency distribution.
o The frequency is the number of values in a
specific class of data.
o A frequency distribution is the organizing of raw
data in table form, using classes and frequencies.
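A frequency distribution of this kind can be sketched as follows. The DMF counts, the class width of 5 and the function name `frequency_distribution` are assumptions made for illustration; they are not the data set referred to in the slides.

```python
from collections import Counter

def frequency_distribution(values, lower, width, n_classes):
    """Return (class_start, class_end, frequency) for equal-width integer classes."""
    counts = Counter()
    for v in values:
        idx = (v - lower) // width          # index of the class that contains v
        if 0 <= idx < n_classes:
            counts[idx] += 1
    return [(lower + i * width, lower + (i + 1) * width - 1, counts[i])
            for i in range(n_classes)]

# Hypothetical raw data: number of DMF teeth in 10 patients
dmf = [0, 3, 4, 7, 8, 8, 12, 15, 21, 30]
table = frequency_distribution(dmf, lower=0, width=5, n_classes=7)
for start, end, freq in table:
    print(f"{start:2d}-{end:2d}: {freq}")
```

Each row of the output is one class with its frequency, which is the table form described above.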
Organizing the Data
• For the first data set, a frequency distribution is shown as follows:
Numerical data
• Measure of central tendency (constant)
• Dispersion (variation)
• Shape
Measurement of central tendency
03/12/2020 65
IN SIMPLE TERMS, MEASURE OF CENTRAL TENDENCY MEANS
OBJECTIVES OF CENTRAL TENDENCY
• Condense the entire mass of data
• Facilitate comparison
MEASURE OF CENTRAL TENDENCY
• Arithmetic mean
• Weighted mean
• Median
• Mode
• Geometric mean
• Harmonic mean
ARITHMETIC MEAN
• Mean is obtained by summing up all the observations
and dividing the total by the number of observations
Example
• Community dentist selected 7 chronic periodontitis patients
and measured their attachment loss in mm
• His observations were (in mm): 2.5, 3.1, 1.9, 2.0, 2.97,
1.75, 3.7.
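• The mean attachment loss in millimeters of the seven periodontitis patients is given by
$\bar{x} = \frac{\sum x_i}{n} = \frac{2.5 + 3.1 + 1.9 + 2.0 + 2.97 + 1.75 + 3.7}{7} = \frac{17.92}{7} = 2.56$ mm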
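The arithmetic for this example can be checked with Python's standard `statistics` module. The variable names are illustrative; the seven values are the attachment-loss observations listed above.

```python
from statistics import mean

# Attachment loss (mm) of the seven chronic periodontitis patients
attachment_loss_mm = [2.5, 3.1, 1.9, 2.0, 2.97, 1.75, 3.7]

x_bar = mean(attachment_loss_mm)   # sum of observations / number of observations
print(round(x_bar, 2))             # 2.56
```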
For grouped data, the mean can be calculated by using the formula:
$\bar{x} = \frac{\sum f x}{n}$
Where,
x = Mid value of the class
f = Frequency of the class
n = Total number of observations
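A minimal sketch of the grouped-data formula, using the same symbols (x for the class mid value, f for the class frequency, n for the total number of observations). The three classes below are hypothetical.

```python
def grouped_mean(classes):
    """classes: list of (mid_value x, frequency f) pairs; returns sum(f * x) / n."""
    n = sum(f for _, f in classes)              # total number of observations
    return sum(x * f for x, f in classes) / n

# Hypothetical classes: (mid value, frequency)
classes = [(0.75, 4), (1.25, 10), (1.75, 6)]
print(grouped_mean(classes))  # (3 + 12.5 + 10.5) / 20 = 1.3
```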
Merits Demerits
What is the difference between MEAN and AVERAGE?
WEIGHTED MEAN
• In order to properly reflect the relative importance of the observations, it is necessary to assign them weights and then calculate a mean:
$\bar{X}_w = w_1 X_1 + w_2 X_2 + \dots + w_n X_n = \sum_{i=1}^{n} w_i X_i$
where the weights $w_i$ sum to 1.
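The weighted mean can be sketched as below. Dividing by the sum of the weights is an added convenience, so the function also works when the weights have not been scaled to sum to 1; the scores and weights are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of w_i * X_i, divided by the total weight (equal to 1 for normalised weights)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    return sum(w * x for w, x in zip(weights, values)) / total_weight

# Hypothetical example: two scores with weights 0.25 and 0.75
print(weighted_mean([80, 90], [0.25, 0.75]))  # 87.5
```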
MEDIAN
• If n is an odd number, the median is the middle term of the ordered observations.
• If n is an even number, the median is the mean of the middle two terms.
Here n = 8:
2, 3, 4, 4, 5, 5, 6, 7, so the median is (4 + 5) / 2 = 4.5
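The even-n rule can be checked on the eight values from the slide with the standard library. The second list, which drops the last value to make n odd, is added only for contrast.

```python
from statistics import median

even_case = [2, 3, 4, 4, 5, 5, 6, 7]      # n = 8, values from the slide
odd_case = [2, 3, 4, 4, 5, 5, 6]          # n = 7, illustrative only

median_even = median(even_case)           # mean of the two middle terms, 4 and 5
median_odd = median(odd_case)             # the single middle term
print(median_even, median_odd)            # 4.5 4
```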
Median
Merits
• Easy to understand and calculate.
• Not affected by extreme values.
Demerits
• It is not based upon all the observations.
• It is not amenable to algebraic treatment.
MODE
It is the value of the variable which occurs most frequently in a series of observations.
E.g.: Find the mode respiration rate per minute in 9 cases when the rate was found to be 23, 22, 20, 24, 16, 17, 22, 18, 19.
The value of the mode is 22, since it occurs twice and every other value occurs once.
Empirical relationship (for moderately skewed distributions): Mode = 3 Median - 2 Mean.
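The respiration-rate example can be reproduced by counting how often each value occurs. This sketch uses only the standard library; the variable names are illustrative.

```python
from collections import Counter
from statistics import mode

# Respiration rate per minute in the 9 cases from the example
rates = [23, 22, 20, 24, 16, 17, 22, 18, 19]

counts = Counter(rates)        # frequency of each observed value
most_common_rate = mode(rates)
print(most_common_rate, counts[most_common_rate])  # 22 2
```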
Mode
Merits
Demerits
Calculated both from
qualitative and quantitative
data
Geometric mean
• The geometric mean is used when the rate of growth is multiplicative, not additive
• Ex:
o During a flu epidemic, 80 cases were reported to the county public health department in the first week, 160 cases in the second week, 320 cases in the third week, and 640 cases in the fourth week
$\bar{X}_G = \sqrt[n]{X_1 \cdot X_2 \cdots X_n} = (X_1 \cdot X_2 \cdots X_n)^{1/n}$
• Ex:
o The geometric mean of the two values 4 and 16 is $\sqrt{4 \cdot 16} = 8$
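A short sketch of the geometric mean, computed through logarithms (the n-th root of a product equals the exponential of the mean log), which avoids very large intermediate products. The function name is an assumption; the weekly flu counts are the ones quoted in the example.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive values, computed via logarithms."""
    if any(v <= 0 for v in values):
        raise ValueError("the geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

flu_cases = [80, 160, 320, 640]              # weekly counts from the example
print(round(geometric_mean(flu_cases), 2))   # about 226.27
print(round(geometric_mean([4, 16]), 6))     # 8.0
```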
Harmonic mean
• Example :
o Suppose a dental clinic is 10 miles away from Raj's home. On the way to his office the traffic was light and Raj was able to drive 60 miles per hour. However, on the return trip the traffic was heavy and he drove 30 miles per hour, so in total he travelled 20 miles in 30 min. What was the average speed?
• $\frac{2}{\frac{1}{60} + \frac{1}{30}} = \frac{20 \text{ miles}}{0.5 \text{ h}} = 40$ mph
Harmonic mean
• It is defined as the reciprocal of the arithmetic mean of the reciprocals of the n observations
• Harmonic mean is given by
$\bar{X}_{HM} = \frac{1}{\frac{1}{n} \sum_{i=1}^{n} (1 / X_i)}$
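The definition can be applied directly to the two driving speeds from the example (60 mph and 30 mph over equal distances). The function name is an assumption; the result is cross-checked against total distance divided by total time.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    n = len(values)
    return 1 / (sum(1 / v for v in values) / n)

speeds_mph = [60, 30]                       # outward and return speed
average_speed = harmonic_mean(speeds_mph)

# Cross-check: 20 miles in total, 10/60 h out plus 10/30 h back
total_time_h = 10 / 60 + 10 / 30
print(round(average_speed, 6), round(20 / total_time_h, 6))  # 40.0 40.0
```

The plain arithmetic mean of the two speeds (45 mph) would overstate the average, because more time is spent at the slower speed.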
Measures of variability
• Range
• Inter-quartile range
• Percentile
• Standard deviation
• Coefficient of variation
Range
• Distance between the largest and smallest observation
• R = X max – X min
• Ex:
o From the last example,
• First step: calculate the mean. Mean = 394
• Second step: calculate each dog's difference from the mean
• Third step: calculate the variance, i.e. take each difference, square it, and then average the results
• Fourth step: calculate the standard deviation, i.e. take the square root of the variance
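The four steps can be written out as a short function. Since the raw values for the dogs example are not reproduced in the slides, the data below are hypothetical. The variance here averages over n, as the third step describes; a sample variance would divide by n - 1 instead.

```python
import math

def population_sd(values):
    """Follow the four steps: mean, differences, variance (average square), SD."""
    mean = sum(values) / len(values)                            # step 1
    differences = [v - mean for v in values]                    # step 2
    variance = sum(d * d for d in differences) / len(values)    # step 3
    return mean, variance, math.sqrt(variance)                  # step 4

# Hypothetical observations
mean, variance, sd = population_sd([2, 4, 4, 4, 5, 5, 7, 9])
print(mean, variance, sd)  # 5.0 4.0 2.0
```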
CV of the fracture load for CFP: $CV_{CFP} = \frac{26.57}{67.57} \times 100 \approx 39.33\%$
CV of the fracture load for PFRP: $CV_{PFRP} = \frac{36.19}{132.55} \times 100 \approx 27.30\%$
Standard error is $SE = \frac{\text{Standard deviation}}{\sqrt{\text{Sample size}}}$
Skewness
• Skewness means lack of symmetry.
• Skewness is said to be ‘positive’ if the curve is more elongated to the right side, i.e. mean > median.
• Skewness is said to be ‘negative’ if the curve is more elongated to the left side, i.e. median > mean.
Kurtosis
• The relative flatness or peakedness of the frequency
curve is called kurtosis.
If the value of the skewness is zero and the value of the kurtosis is 3, then the frequency distribution is known as a normal distribution
Good Morning
PRINCIPLE OF
BIOSTATISTICS AND ITS
APPLICATION IN
DENTISTRY
part 2
BY:
MANOHAR BHAT
1ST yr. PG
NORMAL DISTRIBUTION
The limits of the acceptance and rejection regions may be constructed from mean ± 1.96 SE (5%) and mean ± 2.58 SE (1%).
[Figure: normal curve with the acceptance region between -1.96 and +1.96 (95% CL) and a rejection region in each tail]
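The limits mean ± 1.96 SE and mean ± 2.58 SE can be sketched as follows, with the standard error taken as SD divided by the square root of the sample size. The mean, SD and sample size used here are hypothetical.

```python
import math

def standard_error(sd, n):
    """Standard error = standard deviation / sqrt(sample size)."""
    return sd / math.sqrt(n)

def acceptance_limits(mean, sd, n, z=1.96):
    """Limits of the acceptance region: mean - z*SE and mean + z*SE."""
    se = standard_error(sd, n)
    return mean - z * se, mean + z * se

# Hypothetical sample: mean 2.56 mm, SD 0.7 mm, n = 49, so SE = 0.1
low95, high95 = acceptance_limits(2.56, 0.7, 49, z=1.96)   # 5% level
low99, high99 = acceptance_limits(2.56, 0.7, 49, z=2.58)   # 1% level
print(round(low95, 3), round(high95, 3))
print(round(low99, 3), round(high99, 3))
```

The 1% limits are wider than the 5% limits, so a value has to be further from the mean before it falls in the rejection region.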
Type 1 & type 2 error
• True positive: patient has the disease & diagnostic test detects the condition
• True negative: patient doesn't have the disease & diagnostic test doesn't detect the disease
• False negative (Type 2 error): patient has the disease but diagnostic test is negative
• False positive (Type 1 error): patient doesn't have the disease & diagnostic test detects the disease
Type 1 and type 2 errors
                      Diseased               Healthy
Diagnosed positive    T+                     F+ (TYPE 1 ERROR)
Diagnosed negative    F- (TYPE 2 ERROR)      T-
• α error (Type 1): reject the Null Hypothesis (H0) when it is true
• β error (Type 2): accept the Null Hypothesis (H0) when it is false
Power of the test
TESTS OF SIGNIFICANCE
P value
• “p” values are used to assess the degree of
dissimilarity between two or more sets of
measurements or between one set of measurements
and a standard.
• p values measure the strength of evidence in scientific
studies.
Procedure and steps
6. Comparison of the calculated test criterion value with the theoretical value at the prefixed level of significance
Types of Tests of
Significance
1. Parametric tests
2. Non-Parametric tests
Types of tests of significance
Parametric tests
• Z-test for large samples & Z-proportionality test
• Student's t-test
• Unpaired t-test for small samples
• Paired t-test for small samples
• Chi-square test
• Poisson test
• Analysis of Variance
• Analysis of Covariance
Z-TEST (Large Samples)
• This test is for testing significant difference between
two means (n>30).
• They compare between two means to suggest whether
both samples come from the same population
CRITERIA
1. Random Samples
2. Quantitative Data
$z = \frac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)}$
where $SE(\bar{x}_1 - \bar{x}_2)$ is the standard error of the difference between the two means.
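A minimal sketch of the z statistic for two large samples. It assumes the standard error of the difference is the square root of (SD1²/n1 + SD2²/n2); the group summaries below are hypothetical.

```python
import math

def z_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """z = (mean1 - mean2) / SE, with SE = sqrt(sd1^2/n1 + sd2^2/n2) (assumed form)."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se

# Hypothetical large samples (n > 30 in both groups)
z = z_statistic(mean1=2.8, sd1=0.6, n1=36, mean2=2.5, sd2=0.8, n2=64)
print(round(z, 3))               # about 2.121
print(abs(z) > 1.96)             # True: outside the 5% acceptance region
```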
Z-PROPORTIONALITY TEST
$z = \frac{P_1 - P_2}{SE(P_1 - P_2)}$
t-TEST:
This test is applied to small samples
UNPAIRED t-TEST
$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
where $\sigma_1$ and $\sigma_2$ are the respective S.D.s
PAIRED t-TEST
It is applied to paired data of independent observations from one sample only, when each individual gives a pair of observations
$t = \frac{\bar{d}}{SD / \sqrt{n}}$
Where,
$\bar{d}$ = mean of the differences between x1 and x2
SD = Std. deviation of the differences
n = sample size (number of pairs)
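The paired statistic can be sketched as below. The before/after scores are hypothetical, and the SD of the differences is taken as the sample standard deviation (n - 1 in the denominator), which is an assumption about the slide's intent.

```python
import math
from statistics import mean, stdev

def paired_t(x1, x2):
    """t = mean(d) / (SD(d) / sqrt(n)), where d = x1 - x2 for each individual."""
    if len(x1) != len(x2):
        raise ValueError("paired data need the same number of observations")
    d = [a - b for a, b in zip(x1, x2)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))

# Hypothetical plaque scores of four patients before and after treatment
before = [3, 4, 5, 6]
after = [2, 2, 4, 4]
t = paired_t(before, after)
print(round(t, 3))  # about 5.196, with n - 1 = 3 degrees of freedom
```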
CHI-SQUARE TEST:
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Where, O = Observed frequency
E = Expected frequency & it is given by
3) Goodness of fit:
(to check if data is normally distributed)
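The chi-square statistic can be sketched for a contingency table as follows. The expected frequency is assumed here to be (row total × column total) / grand total, the usual choice for a test of association; the slide's own expression for E is not reproduced in the text, and the 2 × 2 counts are hypothetical.

```python
def chi_square(observed):
    """Sum of (O - E)^2 / E over all cells of a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total   # assumed form of E
            chi2 += (o - e) ** 2 / e
    return chi2

# Hypothetical 2 x 2 table: rows = treatment groups, columns = outcome yes/no
table_2x2 = [[10, 20], [20, 10]]
print(round(chi_square(table_2x2), 3))  # about 6.667, with 1 degree of freedom
```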
Poisson test
• It is a discrete distribution of the number of
$P = \frac{SD^2}{\bar{X}}$
Analysis Of Variance (ANOVA)
Multiple Comparison test Procedures
post hoc procedure
• Used if there are more than two treatment groups
• Ex: treatment groups A, B, C, and D
o Dunnett's method: comparing one control group with multiple test groups
• Ex: hypersensitivity study which we conducted
o Product A, Product B, Product C: ANOVA
o Product A, Product B, Product C + Age: ANCOVA
LIMITATIONS OF TESTS OF HYPOTHESIS
1 ) These tests are not decision making in themselves but only useful aids for decision making. Hence proper interpretation of the statistical evidence is important for intelligent decisions.
LIMITATIONS OF TESTS OF HYPOTHESIS
3 ) Results of significance tests are based on probability and as such can't be expressed with full certainty
NON- PARAMETRIC TESTS
• When the population from which the samples are drawn is not normally distributed, alternative procedures based on less stringent assumptions are used: non parametric tests (distribution free statistics)
NON- PARAMETRIC TESTS
ADVANTAGES
• Can be used without normality assumption
NON- PARAMETRIC TESTS
DISADVANTAGES
• They tend to use less information than parametric methods.
McNemar test
• It is one of the important tests, often used when the data are nominal and come from two related samples.
Sl No Situation Parametric test Non Parametric test
Correlation
Scatter diagrams
[Figures: example scatter diagrams]
• Correlation coefficient is given by the formula:
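The formula itself is not reproduced in the text. Assuming the Pearson product-moment coefficient is the one intended, it is $r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$, and a minimal sketch with hypothetical data looks like this:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient (assumed to be the formula the slide refers to)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))   # 1.0, perfect positive correlation
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))   # -1.0, perfect negative correlation
```

The coefficient always lies between -1 and +1; values near zero indicate little linear relationship.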
Simple linear regression
• To study the statistical relationship between two variables
Multiple regression
• In simple linear regression we study only two variables,
but the prediction will improve if we consider and
include other independent variables.
Advantages achieved by presenting data
PRESENTING THE DATA
Methods of presentation of data
1) Tabulation.
2) Drawing
Diagrams
Graphs
Methods of presentation of data
Tabulation.
Diagrammatic Representation
1.One Dimensional Diagrams
i. Simple Bar Diagram
ii. Multiple Bar Diagram
iii. Component Bar Diagram
iv. Percentage Bar Diagram
3. Pictograms/Picture diagram
Distribution of subjects by year
Year      No. of subjects
First     93
Second    84
Third     85
Fourth    51
Total     313
[Simple bar diagram: number of subjects for each year, First to Fourth]
Distribution of subjects by Year and sex
Year      Male   Female
First     39     54
Second    38     46
Third     47     38
Fourth    34     17
[Multiple Bar Diagram: male and female subjects for each year, First to Fourth]
Component Bar Diagram/proportional
Bar diagram
• Distribution of subjects by blood groups and sex:
• The bars are constructed on the basis of the total, and the total is divided into its components.
Blood group   Male   Female   Total
A             39     54       93
B             38     46       84
O             47     38       85
AB            34     17       51
[Component bar diagram: No. of students (0 to 100) by blood group (A, B, AB, O), each bar divided into Male and Female]
Percentage Bar Diagram
The absolute values are converted into percentage, and are
presented accordingly.
Percentage distribution of subjects by blood groups
Percentage Bar Diagram
• Percentage distribution of subjects by blood groups and
sex:
[Percentage bar diagram: percentage (0% to 100%) by blood group (A, B, AB, O)]
sector.
• Size of each angle is calculated with the following
formula:
Pie- Diagram
• Year-Wise Distribution of Study Population
Year      No. of subjects   Angle (degrees)
First     93                107
Second    84                97
Third     85                98
Fourth    51                59
[Pie diagram: one sector per year, e.g. THIRD year = 98 degrees]
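The angles in this table are consistent with the usual rule angle = (class frequency / total) × 360 degrees, which is presumably the formula the slide refers to. A short sketch using the year-wise counts from the table:

```python
def pie_angles(frequencies):
    """Angle of each sector in degrees: (class frequency / total) * 360."""
    total = sum(frequencies.values())
    return {label: count / total * 360 for label, count in frequencies.items()}

# Year-wise distribution of the study population (total 313)
subjects = {"First": 93, "Second": 84, "Third": 85, "Fourth": 51}
angles = pie_angles(subjects)
rounded = {label: round(angle) for label, angle in angles.items()}
print(rounded)  # {'First': 107, 'Second': 97, 'Third': 98, 'Fourth': 59}
```

The unrounded angles add up to exactly 360 degrees; the rounded values in the table add up to 361 because each was rounded separately.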
Pictogram or Picture diagram
Popular method to explain the frequency of an attribute.
• Eg. Number of deaths due to cholera in four cities
City   Deaths
A      200
B      400
C      800
D      600
Map diagram or spot map
Geographical distribution of frequencies of a
characteristic.
A dot indicates one unit of occurrence.
Graphical Representation
(Quantitative Data)
• Histogram
• Frequency Polygon
Frequency Distribution Table
• Distribution of patients according to their plaque scores
Histogram
• It is a graphical representation of frequency distribution.
Histogram
• Distribution of patients according to their plaque score
[Histogram: frequency (0 to 28) against plaque class (0.5-1.0, 1.0-1.5, 1.5-2.0, 2.0-2.5, 2.5-3.0, 3.0-3.5, 3.5-4.0)]
Frequency polygon
• It is an area diagram of frequency distribution
Frequency polygon
• Distribution of patients according to their plaque score.
[Frequency polygon: frequency (0 to 28) against plaque class (0.5-1.0, 1.0-1.5, 1.5-2.0, 2.0-2.5, 2.5-3.0, 3.0-3.5, 3.5-4.0)]
Line Chart or Graph
Frequency polygon representing variations by a line.
Shows trend of an event occurring over a period of time.
[Line chart: population in millions (200 to 600) by year, 1901 to 1971]
Stem and Leaf Plot
• Uses part of the data as
“Stem” and part of the
data as “Leaf”
• They are grouped in such a way that individual observed values are retained while the shape of the distribution of observations is shown
Scatter or Dot diagram
Correlation diagram.
Graphic presentation showing the nature of
correlation between 2 variables.
[Scatter diagram showing +ve correlation]
Interpreting the data
-George Eliot
Interpretation of the data
• Numbers do not speak for themselves.
Interpretation
• Interpretation demands fair and careful judgments.
Often the same data can be interpreted in different
ways. So, it is helpful to involve others or take time to
hear how different people interpret the same
information
• Interpretation is done based on the knowledge of collection, organization, analysis and presentation of the data
Statistical packages
• STATA
• SPSS
• Statistica
• Biostat
• Epi Info
o Ralloc
o nMASTER
Some of the online
biostatistics calculators
• STATISTIC CALCULATOR - VERSION 3
o http://www.danielsoper.com/statcalc3/default.aspx
• STAT TREK
o http://stattrek.com/tables/stattables.aspx
• GRAPHPAD SOFTWARE
o http://www.graphpad.com/quickcalcs/index.cfm
treatments interact, and evaluate many life and death situations in dental and
medical sciences.
• In this era of evidence based studies, biostatistics lays down the scientific foundation for rational thinking.
References
• Biostatistics for Oral Healthcare, 1st edition, by Jay S. Kim