
National Institute of Technology Rourkela

Department of Computer Science and Engineering


B.Tech. (7th Semester) Mid Semester Examination (September), 2017
Subject: Data Warehousing and Mining (CS 425)
Unnecessarily long answers may attract negative marks. This is a 2-page question paper.

Full Marks: 30 Time: 2 Hours

1. Modified Weighted k-NNC assigns a non-linear weight $e^{-d}$ to each nearest neighbor of an unseen object (u),
where d is the distance from the unseen object to that neighbor. The sorted distances from the unseen object to its
neighbors (first to last) and the neighbors' class labels are given. Identify the class label of the unseen object.
K-NN(u) = { $x_1^{C_1}$, $x_2^{C_2}$, $x_3^{C_2}$, $x_4^{C_3}$, $x_5^{C_2}$ }. The superscript represents the class label.
Distance vector from u to the neighbors = (1, 4, 5, 7, 10) [ 3]
2. A training dataset is given in Table 1 with two attributes X and Y, and two classes "+" and "−". Each
attribute can take values from {0, 1, 2}. Answer the following questions.
(a) Build a decision tree on the training dataset.
(b) The concept for the "+" class is Y = 1 and the concept for the "−" class is X = 0 ∨ X = 2. Does your
decision tree capture this concept?
(c) What are the accuracy, precision, recall and F1-measure of the decision tree on the training set?
(d) What are the accuracy, precision, recall and F1-measure of the decision tree on the training set if the
following cost matrix is considered?

$$
C(i, j) =
\begin{cases}
0 & \text{if } i = j; \\
1 & \text{if } i = +,\ j = -; \\
\dfrac{\#\ \text{``$-$'' instances}}{\#\ \text{``$+$'' instances}} & \text{if } i = -,\ j = +.
\end{cases}
$$
[ 3 + 1 + 3 + 3]

3. Consider the dataset given in Table 1 and predict the class label of an unknown instance with X = 2, Y = 2
using a KNN classifier (K = 111). [ 4]
4. Consider the dataset given in Table 2 (overleaf) and predict the class label of an unknown object X =
(Yes, Single, Low) using a Naive Bayes classifier. [ 5]
5. What is model over-fitting? How do you estimate the generalization error of a decision tree? [ 4]
6. Apply Random Forest with T = 3. Each tree is built with one attribute, from a bootstrap sample of
5 instances. Identify the class label of X = (Yes, Single, Low) (Table 2). [ 4]
[ P.T.O]
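
The weighting rule described in Question 1 can be sketched in a few lines of Python. This is only an illustration under one assumption: the predicted class is taken to be the one with the largest summed weight $e^{-d}$ over the listed neighbors. The function name `weighted_knn_label` is hypothetical and does not come from the question paper.

```python
import math
from collections import defaultdict


def weighted_knn_label(distances, labels):
    """Sum the weight exp(-d) per class and return the best class with all scores."""
    scores = defaultdict(float)
    for d, label in zip(distances, labels):
        # Non-linear weight from Question 1: closer neighbors count far more.
        scores[label] += math.exp(-d)
    best = max(scores, key=scores.get)
    return best, dict(scores)


# Data as stated in Question 1 (sorted distances and neighbor class labels).
distances = [1, 4, 5, 7, 10]
labels = ["C1", "C2", "C2", "C3", "C2"]

predicted, scores = weighted_knn_label(distances, labels)
print(predicted)
for label, score in sorted(scores.items()):
    print(label, round(score, 6))
```

Because the weight decays exponentially, a single neighbor at distance 1 can outweigh several neighbors at distances 4 and beyond, which is the point of contrast with unweighted majority voting.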

Table 1: Training data for Question No. 2 and Question No. 3


X    Y    # "+" instances    # "-" instances
0    0    0                  100
1    0    0                  0
2    0    0                  100
0    1    10                 100
1    1    10                 0
2    1    10                 100
0    2    0                  100
1    2    0                  0
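
The evaluation measures asked for in Question 2(c) can be computed from the counts in Table 1 once a classifier is fixed. The sketch below is illustrative only: the rule "predict + when X = 1" is an assumed example classifier, not necessarily the tree the question expects, and the names `TABLE_1`, `confusion_counts` and `metrics` are hypothetical. It uses the rows of Table 1 exactly as printed and does not apply the cost matrix of part (d).

```python
# Table 1 as printed: (X, Y) -> (number of "+" instances, number of "-" instances)
TABLE_1 = {
    (0, 0): (0, 100), (1, 0): (0, 0), (2, 0): (0, 100),
    (0, 1): (10, 100), (1, 1): (10, 0), (2, 1): (10, 100),
    (0, 2): (0, 100), (1, 2): (0, 0),
}


def confusion_counts(table, predict_positive):
    """Return (TP, FP, FN, TN) for a rule mapping (x, y) to True for class "+"."""
    tp = fp = fn = tn = 0
    for (x, y), (pos, neg) in table.items():
        if predict_positive(x, y):
            tp += pos
            fp += neg
        else:
            fn += pos
            tn += neg
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 for the "+" class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Assumed example rule (an illustration, not the official answer).
counts = confusion_counts(TABLE_1, lambda x, y: x == 1)
accuracy, precision, recall, f1 = metrics(*counts)
print(counts)
print(accuracy, precision, recall, f1)
```

Swapping in a different lambda lets the same two helpers score any candidate tree expressed as a rule over X and Y.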

Table 2: Dataset: Question 4 and Question 6


Tid    Home Owner    Marital Status    Income    Defaulter (Class)
1      Yes           Single            High      No
2      No            Married           Medium    No
3      No            Single            Low       No
4      Yes           Married           High      No
5      No            Divorced          Low       Yes
6      No            Married           Low       No
7      Yes           Divorced          High      No
8      No            Single            Low       Yes
9      No            Married           Low       No
10     No            Single            Low       Yes
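
The Naive Bayes computation asked for in Question 4 multiplies the class prior by one conditional probability per attribute. The following sketch shows that arithmetic on Table 2. It assumes plain relative-frequency estimates by default; the optional `alpha` parameter (Laplace smoothing) and the names `ROWS` and `naive_bayes_scores` are additions for illustration and are not part of the question paper.

```python
from collections import Counter

# Table 2 rows: (Home Owner, Marital Status, Income, Defaulter class)
ROWS = [
    ("Yes", "Single", "High", "No"),
    ("No", "Married", "Medium", "No"),
    ("No", "Single", "Low", "No"),
    ("Yes", "Married", "High", "No"),
    ("No", "Divorced", "Low", "Yes"),
    ("No", "Married", "Low", "No"),
    ("Yes", "Divorced", "High", "No"),
    ("No", "Single", "Low", "Yes"),
    ("No", "Married", "Low", "No"),
    ("No", "Single", "Low", "Yes"),
]


def naive_bayes_scores(rows, query, alpha=0.0):
    """Return unnormalized P(class) * prod P(attribute value | class) per class.

    alpha = 0 gives raw frequency estimates; alpha = 1 is Laplace smoothing
    (an optional assumption, not required by the question).
    """
    class_counts = Counter(row[-1] for row in rows)
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / len(rows)  # class prior
        for i, value in enumerate(query):
            n_values = len({row[i] for row in rows})  # distinct attribute values
            matches = sum(1 for row in rows
                          if row[-1] == label and row[i] == value)
            score *= (matches + alpha) / (n_label + alpha * n_values)
        scores[label] = score
    return scores


query = ("Yes", "Single", "Low")
scores = naive_bayes_scores(ROWS, query)
smoothed = naive_bayes_scores(ROWS, query, alpha=1.0)
print(max(scores, key=scores.get), scores)
print(max(smoothed, key=smoothed.get), smoothed)
```

Without smoothing, any attribute value never observed together with a class drives that class's score to zero, which is why the smoothed variant is shown alongside for comparison.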
