Semester 2, 2004
The aim of this exercise is to learn how to construct and use a Naïve Bayes classifier for data with
categorical attributes.
The weather dataset has five categorical attributes: outlook, temperature, humidity, windy, and play. We
are interested in building a system that lets us decide whether or not to play the game on the
basis of the weather conditions, i.e. we wish to classify the data into two classes, one where the attribute
play has the value “yes”, and the other where it has the value “no”. This classification is based on
the values of the attributes outlook, temperature, humidity, and windy.
A data instance X is written as the tuple of its n attribute values,
$$X = (x_1, x_2, x_3, \ldots, x_n). \tag{1}$$
For the weather data, n = 4. The first data instance in Table 1 would be written X = (sunny, hot, high, false).
In a Bayesian classifier which assigns each data instance to one of m classes $C_1, C_2, \ldots, C_m$, a data
instance X is assigned to the class for which it has the highest posterior probability conditioned on X,
i.e. the class which is most probable given the prior probabilities of the classes and the data X (Duda
et al., 2000). That is to say, X is assigned to class $C_i$ if and only if
$$P(C_i \mid X) > P(C_j \mid X) \quad \text{for all } j \text{ such that } 1 \le j \le m,\ j \ne i. \tag{2}$$
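The decision rule in Equation 2 amounts to picking the class with the largest posterior probability. A minimal Python sketch of that rule follows. The function name `assign_class` and the posterior values are hypothetical, chosen only to illustrate the rule; they are not taken from Table 1.

```python
def assign_class(posteriors):
    """Return the class label with the highest posterior P(C_i | X).

    `posteriors` maps each class label to its posterior probability.
    Equation 2 uses a strict inequality, so it does not say what to do
    on a tie; this sketch simply returns the first maximal label.
    """
    return max(posteriors, key=posteriors.get)


# Hypothetical posteriors for a two-class problem (illustrative values only).
example_posteriors = {"yes": 0.3, "no": 0.7}
chosen = assign_class(example_posteriors)
print(chosen)  # no
```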
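Under the naïve assumption that the attribute values are conditionally independent given the class, the class-conditional probability factorises as
$$P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i), \tag{4}$$
i.e. the product of the probabilities of each of the values of the attributes of X for the given class $C_i$.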
To see how this works, let us consider an example. What is the probability of outlook = sunny given
that play = no? Of the five cases where play = no, there are three where outlook = sunny, thus
P(outlook = sunny | play = no) = 3/5. In the notation of Equation 4, we may write
$$P(x_1 = \text{sunny} \mid C_2) = \frac{3}{5}.$$
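The counting argument above can be sketched in Python. Table 1 is not reproduced in this section, so the rows below assume it is the standard 14-instance weather dataset from Witten and Frank (1999); the names `WEATHER`, `ATTRIBUTES` and `conditional_probability` are hypothetical.

```python
from fractions import Fraction

# Assumed contents of Table 1: the standard weather dataset
# (outlook, temperature, humidity, windy, play).
WEATHER = [
    ("sunny", "hot", "high", "false", "no"),
    ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),
    ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),
    ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),
    ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),
    ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "false", "yes"),
    ("rainy", "mild", "high", "true", "no"),
]
ATTRIBUTES = ("outlook", "temperature", "humidity", "windy")


def conditional_probability(data, attribute, value, play):
    """Estimate P(attribute = value | play) by counting matching rows."""
    column = ATTRIBUTES.index(attribute)
    in_class = [row for row in data if row[-1] == play]
    matches = [row for row in in_class if row[column] == value]
    return Fraction(len(matches), len(in_class))


p_sunny_given_no = conditional_probability(WEATHER, "outlook", "sunny", "no")
print(p_sunny_given_no)  # 3/5, matching the hand calculation above
```

Exact fractions are used instead of floats so that the result can be compared directly with the hand calculation.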
Now we will consider how to put these attribute value probabilities together to calculate $P(X \mid C_i)$
according to Equation 4. Let us consider the probability of the first data instance in Table 1, given the
class $C_2$ (i.e. given that play = no). We have
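The product over attribute values for this instance can be sketched in Python. The five rows below are the play = no instances of the standard weather dataset of Witten and Frank (1999), which is assumed here to be what Table 1 contains; the helper name `likelihood` is hypothetical.

```python
from fractions import Fraction
from math import prod

# Assumed play = no rows of Table 1 (outlook, temperature, humidity, windy).
NO_ROWS = [
    ("sunny", "hot", "high", "false"),
    ("sunny", "hot", "high", "true"),
    ("rainy", "cool", "normal", "true"),
    ("sunny", "mild", "high", "false"),
    ("rainy", "mild", "high", "true"),
]


def likelihood(instance, class_rows):
    """Compute P(X | C) as the product over attributes of P(x_k | C)."""
    total = len(class_rows)
    factors = [
        # Fraction of rows in the class whose k-th attribute equals x_k.
        Fraction(sum(row[k] == value for row in class_rows), total)
        for k, value in enumerate(instance)
    ]
    return prod(factors)


first_instance = ("sunny", "hot", "high", "false")
p_x_given_no = likelihood(first_instance, NO_ROWS)
print(p_x_given_no)  # 48/625, i.e. 3/5 * 2/5 * 4/5 * 2/5 under the assumed rows
```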
3 Questions
You may answer the following questions using calculations done by hand, as above. If you wish, you
may set up an Excel spreadsheet to help, or even write a small program in the language of your choice.
Question 1 Calculate $P(C_1 \mid X)$ for X = (sunny, hot, high, false). How would the Naïve Bayes
classifier classify the data instance X = (sunny, hot, high, false)?
Question 2 Does this agree with the classification given in Table 1 for the data instance
X = (sunny, hot, high, false)?
Question 3 Consider a new data instance X′ = (overcast, cool, high, true). How would
the Naïve Bayes classifier classify X′?
Question 4 Some algorithms (e.g. ID3) are able to produce a classifier that classifies the
data in Table 1 without errors. Does the Naïve Bayes classifier achieve the
same performance? (N.B. This will take some time to compute by hand.)
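For readers who prefer to write a small program, one possible shape is sketched below in Python. It is only a sketch under stated assumptions: the rows assume Table 1 is the standard 14-instance weather dataset from Witten and Frank (1999), the function names `train`, `posteriors` and `classify` are hypothetical, probabilities are estimated by plain counting with no smoothing, and the posterior is obtained by normalising the product of the prior and the class-conditional probability over all classes.

```python
from collections import Counter
from fractions import Fraction

# Assumed contents of Table 1 (outlook, temperature, humidity, windy, play).
WEATHER = [
    ("sunny", "hot", "high", "false", "no"),
    ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),
    ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),
    ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),
    ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),
    ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "false", "yes"),
    ("rainy", "mild", "high", "true", "no"),
]


def train(data):
    """Count class frequencies and (attribute index, value, class) frequencies."""
    class_counts = Counter(row[-1] for row in data)
    value_counts = Counter(
        (k, value, row[-1]) for row in data for k, value in enumerate(row[:-1])
    )
    return class_counts, value_counts


def posteriors(instance, class_counts, value_counts):
    """Return P(C_i | X) for every class, normalised so the values sum to 1."""
    total = sum(class_counts.values())
    joint = {}
    for label, count in class_counts.items():
        p = Fraction(count, total)  # prior P(C_i)
        for k, value in enumerate(instance):
            # P(x_k | C_i); a value never seen with this class contributes 0.
            p *= Fraction(value_counts[(k, value, label)], count)
        joint[label] = p  # P(X | C_i) * P(C_i)
    evidence = sum(joint.values())  # P(X); zero if no class can produce X
    return {label: p / evidence for label, p in joint.items()}


def classify(instance, class_counts, value_counts):
    """Assign the instance to the class with the highest posterior."""
    post = posteriors(instance, class_counts, value_counts)
    return max(post, key=post.get)


class_counts, value_counts = train(WEATHER)
post = posteriors(("sunny", "hot", "high", "false"), class_counts, value_counts)
label = classify(("sunny", "hot", "high", "false"), class_counts, value_counts)
```

Looping `classify` over every row of the dataset and comparing the result with the recorded play value is one way to approach Question 4.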
References
Duda, R. O., Hart, P. E. and Stork, D. G. (2000). Pattern Classification, 2nd edn, Wiley, New York, NY,
USA.
Witten, I. H. and Frank, E. (1999). Data Mining: Practical Machine Learning Tools and Techniques
with Java Implementations, Morgan Kaufmann, San Francisco, CA, USA.
URL: http://www.cs.waikato.ac.nz/~ml/weka/book.html