Bayesian classifiers are statistical classifiers: they can predict class membership probabilities, such as the probability that a given tuple belongs to a particular class. Bayesian classification is based on Bayes' theorem. A simple Bayesian classifier known as the naive Bayesian classifier has been found comparable in performance to decision tree classifiers. Bayesian classifiers have also exhibited high accuracy and speed when applied to large databases.
Bayesian Classification

1. Bayes' Theorem
Given training data X, the posterior probability of a hypothesis H, P(H|X), follows from Bayes' theorem:
P(H|X) = P(X|H) P(H) / P(X)
where:
X is the evidence (an observed data tuple).
H is some hypothesis, such as that X belongs to a specified class.
P(H|X) is the posterior probability of H conditioned on X.
P(X|H) is the posterior probability of X conditioned on H (the likelihood).
P(H) and P(X) are the prior probabilities of H and X, respectively.
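As a quick numeric check, Bayes' theorem can be applied directly. The prior, likelihood, and evidence values below are illustrative assumptions, not values from the example dataset:

```python
def posterior(p_x_given_h, p_h, p_x):
    """Posterior probability P(H|X) = P(X|H) * P(H) / P(X) via Bayes' theorem."""
    return p_x_given_h * p_h / p_x

# Hypothesis H: a customer buys a computer; evidence X: the customer is a student.
# All three probabilities below are assumed for illustration.
p_h = 0.6          # prior P(H)
p_x_given_h = 0.7  # likelihood P(X|H)
p_x = 0.5          # evidence probability P(X)

print(round(posterior(p_x_given_h, p_h, p_x), 2))  # 0.84
```

Note that P(X) only rescales the result; when comparing hypotheses on the same evidence it can be dropped, which is exactly what the classification rule later in this section does.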
This greatly reduces the computation cost: only the class distribution needs to be counted.
If Ak is categorical, P(xk|Ci) is the number of tuples in Ci having value xk for Ak, divided by |Ci,D| (the number of tuples of Ci in D).
If Ak is continuous-valued, P(xk|Ci) is usually computed from a Gaussian distribution with mean μ and standard deviation σ:

g(x, μ, σ) = (1 / (√(2π) σ)) e^(−(x − μ)² / (2σ²))

so that

P(xk|Ci) = g(xk, μCi, σCi)
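The Gaussian density above can be evaluated directly for a continuous attribute. The class mean and standard deviation below are assumed values for illustration:

```python
import math

def gaussian(x, mu, sigma):
    """Gaussian density g(x, mu, sigma) used to estimate P(xk|Ci) for continuous attributes."""
    return (1.0 / (math.sqrt(2 * math.pi) * sigma)) * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

# Assumed statistics for a continuous attribute (e.g., age) within class Ci:
mu_ci, sigma_ci = 38.0, 12.0

# P(age = 35 | Ci) is estimated by the density evaluated at x = 35:
print(gaussian(35.0, mu_ci, sigma_ci))
```

In practice, μCi and σCi are estimated from the training tuples of class Ci before the density is evaluated.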
The data tuples are described by the attributes age, income, student, and credit_rating. The class label attribute, buys_computer, has two distinct values, {yes, no}. Let C1 correspond to the class buys_computer = yes and C2 to buys_computer = no.
We need to maximize P(X|Ci)P(Ci), for i = 1, 2. P(Ci), the prior probability of each class, can be computed from the training tuples.
P(Ci):
P(buys_computer = yes) = 9/14 = 0.643
P(buys_computer = no) = 5/14 = 0.357

P(age <= 30 | buys_computer = yes) = 2/9 = 0.222
P(age <= 30 | buys_computer = no) = 3/5 = 0.6
P(income = medium | buys_computer = yes) = 4/9 = 0.444
P(income = medium | buys_computer = no) = 2/5 = 0.4
P(student = yes | buys_computer = yes) = 6/9 = 0.667
P(student = yes | buys_computer = no) = 1/5 = 0.2
P(credit_rating = fair | buys_computer = yes) = 6/9 = 0.667
P(credit_rating = fair | buys_computer = no) = 2/5 = 0.4
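The classification step can be finished numerically. The sketch below multiplies the per-attribute conditional probabilities above for the tuple X = (age <= 30, income = medium, student = yes, credit_rating = fair); the priors assume 9 "yes" and 5 "no" training tuples, as implied by the denominators 9 and 5 above:

```python
# Naive Bayes decision for X = (age<=30, income=medium, student=yes, credit_rating=fair).
# Priors assume 9 "yes" and 5 "no" training tuples (the denominators in the table above).
p_yes = 9 / 14
p_no = 5 / 14

# P(X|Ci) is the product of the per-attribute conditional probabilities.
p_x_given_yes = (2 / 9) * (4 / 9) * (6 / 9) * (6 / 9)
p_x_given_no = (3 / 5) * (2 / 5) * (1 / 5) * (2 / 5)

# Compare P(X|Ci) * P(Ci) for each class.
score_yes = p_x_given_yes * p_yes
score_no = p_x_given_no * p_no

print(round(score_yes, 3), round(score_no, 3))  # 0.028 0.007
prediction = "yes" if score_yes > score_no else "no"
print(prediction)  # yes
```

Since 0.028 > 0.007, the naive Bayesian classifier predicts buys_computer = yes for tuple X.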
How do we deal with dependencies between attributes? With the help of Bayesian belief networks.
[Figure: an example directed acyclic graph with nodes X, Y, Z, and P.]

X and Y are the parents of Z, and Y is the parent of P. There is no dependency between Z and P. The graph has no loops or cycles.
[Figure: a belief network for lung cancer with nodes Smoker, LungCancer, Emphysema, PositiveXRay, and Dyspnea.]

The CPT shows the conditional probability for each possible combination of the values of a node's parents. For LungCancer (LC), with one column per combination of its parents' values:

LC:  0.8  0.5  0.7  0.1
~LC: 0.2  0.5  0.3  0.9
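A CPT can be represented as a dictionary keyed by the parents' value combinations. The sketch below assumes, for illustration, that LungCancer's two binary parents are FamilyHistory (FH) and Smoker (S), as in the standard lung-cancer belief-network example; the probabilities are those from the CPT above:

```python
# CPT for LungCancer (LC), keyed by the values of its two parents.
# Parent names FamilyHistory (FH) and Smoker (S) are assumed for illustration.
cpt_lc = {
    (True, True): 0.8,    # P(LC | FH, S)
    (True, False): 0.5,   # P(LC | FH, ~S)
    (False, True): 0.7,   # P(LC | ~FH, S)
    (False, False): 0.1,  # P(LC | ~FH, ~S)
}

def p_lung_cancer(family_history, smoker, has_cancer=True):
    """Look up P(LC | parents); P(~LC | parents) is the complement."""
    p = cpt_lc[(family_history, smoker)]
    return p if has_cancer else 1.0 - p

print(p_lung_cancer(True, True))                    # 0.8
print(round(p_lung_cancer(False, True, False), 1))  # 0.3
```

Storing only the LC row suffices, since each ~LC entry is the complement of the LC entry in the same column.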
Issues in Classification
1. Accuracy
2. Training time
3. Robustness
4. Interpretability
5. Scalability

Typical applications
1. Credit approval
2. Target marketing
3. Medical diagnosis
4. Fraud detection
5. Weather forecasting
6. Stock marketing