
Assignment #2

Dataset of Celebrity Facial Attributes. This dataset describes each celebrity with a set of facial attributes and is used to classify whether the celebrity is young or not, based on the attributes listed in the dataset.
Attributes: 21
Number of objects: 189752

1. Fully-grown Tree: This fully grown decision tree predicts whether a celebrity is young or not
from his/her facial features. Each internal node represents a test on an attribute and
each leaf node represents a class, e.g. young or not young. A small sketch of how such an
unpruned tree can be grown is given after Figure 2.
Number of leaves: 1246
Size of the tree: 2491
Figure 1: Chart of Fully Grown tree.

Figure 2: Visualization of Fully grown tree.
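
As a rough illustration of how such a fully grown tree can be built, the sketch below trains an unpruned decision tree with scikit-learn. The file name "celeba_attributes.csv", the column name "young" and the 70/30 split are assumptions made for the example, not details taken from the assignment itself.

# Minimal sketch (assumed file/column names): grow a decision tree with no
# pruning limits, so it keeps splitting until the leaves are (nearly) pure.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("celeba_attributes.csv")      # hypothetical file of 0/1 attributes
X = df.drop(columns=["young"])                 # predictor attributes
y = df["young"]                                # class label: young or not young

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

full_tree = DecisionTreeClassifier(criterion="entropy")   # no depth/leaf limits
full_tree.fit(X_train, y_train)

print("number of leaves:", full_tree.get_n_leaves())
print("size of the tree:", full_tree.tree_.node_count)
print("test accuracy:", full_tree.score(X_test, y_test))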

2. Binary Tree: This tree splits into exactly two sub-branches at every internal node. Each internal node
represents a test on an attribute and each leaf node represents a class, e.g. young or not young.
A sketch of one way to force binary splits is given after Figure 4.
Number of leaves: 1246
Size of tree: 2491

Figure 3: Chart of Binary split tree.


Figure 4: Visualization of binary split tree.
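
The sketch below, continuing the hypothetical example above, shows one common way to obtain purely binary splits: a multi-valued nominal attribute is replaced by 0/1 indicator columns, so every test in the tree becomes a yes/no question with exactly two branches. The attribute name "hair_color" and its values are illustrative assumptions.

# Sketch (hypothetical attribute/values): one-hot indicator columns turn a
# multi-valued nominal attribute into yes/no tests, so every split is binary.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

toy = pd.DataFrame({
    "hair_color": ["gray", "black", "blond", "gray", "blond", "black"],
    "young":      [0,      1,       1,       0,      1,       1],
})

X_bin = pd.get_dummies(toy[["hair_color"]])   # hair_color_black, hair_color_blond, ...
binary_tree = DecisionTreeClassifier(criterion="entropy").fit(X_bin, toy["young"])

# Every internal node now tests a single indicator, e.g. "hair_color_gray <= 0.5",
# which splits the data into exactly two sub-branches.
print(X_bin.columns.tolist())
print("number of leaves:", binary_tree.get_n_leaves())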

3. Pruned Tree: Many branches of a fully grown tree reflect anomalies in the training data caused by noise
or outliers. Tree pruning addresses this overfitting problem. This tree is
pruned by setting the minimum number of objects to 20: any branch covering fewer than
20 objects is removed from the tree, which increases the accuracy on unseen data.
A corresponding pruning sketch is given after Figure 6.
Number of leaves: 118
Size of tree: 235

Figure 5: Chart of Pruned tree.


Figure 6: Visualization of pruned tree with minimum objects 20.
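
A corresponding pre-pruning step can be sketched by reusing the hypothetical X_train/y_train split from the first example and requiring at least 20 objects per leaf, which mirrors the "minimum number of objects = 20" setting described above.

# Sketch: same data as the first example, but every leaf must cover >= 20 objects,
# so branches supported by fewer than 20 training objects are never created.
from sklearn.tree import DecisionTreeClassifier

pruned_tree = DecisionTreeClassifier(criterion="entropy", min_samples_leaf=20)
pruned_tree.fit(X_train, y_train)              # X_train/y_train from the first sketch

print("number of leaves:", pruned_tree.get_n_leaves())   # far fewer than the full tree
print("test accuracy:", pruned_tree.score(X_test, y_test))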

Performance
Trees              Accuracy      Precision   Recall   F-measure
Fully Grown Tree   83.8966 %     0.831       0.839    0.818
Pruned Tree        84.0372 %     0.834       0.840    0.819
Binary Tree        83.8966 %     0.831       0.839    0.818

The pruned tree has the highest accuracy, precision, recall and F-measure. This means the pruned tree
classifies this dataset more accurately than either the binary tree or the fully grown tree, making it
the best of the three models for this dataset. A sketch of how these metrics can be computed follows below.
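
For reference, the metrics in the table can be reproduced roughly as sketched below; Weka reports class-weighted averages of precision, recall and F-measure, which corresponds to average="weighted" here. The variable names continue the hypothetical sketches above.

# Sketch: compute accuracy and weighted precision/recall/F-measure for a tree,
# in the style of the table above (variables from the earlier sketches).
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_pred = pruned_tree.predict(X_test)
acc = accuracy_score(y_test, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="weighted")

print(f"Accuracy {acc:.4%}  Precision {prec:.3f}  Recall {rec:.3f}  F-measure {f1:.3f}")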
Assignment #3
OneR Classifier
OneR, short for "One Rule", is a simple yet surprisingly accurate classification algorithm that generates one
rule for each predictor in the data and then selects the rule with the smallest total error as its "one
rule". On our data, the selected rule uses the gray-hair attribute: if an instance does not have gray hair
it is classified as young, and if it does have gray hair it is classified as not young.
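
A minimal OneR sketch, assuming 0/1 attributes and a binary "young" class, is shown below; the tiny data frame and attribute names are made up purely to illustrate how the rule is selected and are not taken from the real dataset.

# Sketch of OneR: build one rule per attribute (majority class per attribute value),
# then keep the attribute whose rule makes the fewest errors on the training data.
import pandas as pd

def one_r(X: pd.DataFrame, y: pd.Series):
    best_attr, best_rule, best_errors = None, None, len(y) + 1
    for attr in X.columns:
        # One rule for this attribute: predict the majority class for each value.
        rule = y.groupby(X[attr]).agg(lambda s: s.mode()[0])
        errors = int((X[attr].map(rule) != y).sum())
        if errors < best_errors:
            best_attr, best_rule, best_errors = attr, rule, errors
    return best_attr, best_rule, best_errors

# Made-up example: gray hair separates "not young" from "young" with no errors.
toy = pd.DataFrame({"gray_hair": [1, 1, 0, 0, 0],
                    "smiling":   [0, 1, 1, 0, 1],
                    "young":     [0, 0, 1, 1, 1]})
attr, rule, errors = one_r(toy[["gray_hair", "smiling"]], toy["young"])
print("chosen attribute:", attr)     # gray_hair
print("rule:", dict(rule))           # {0: 1, 1: 0} -> no gray hair => young
print("training errors:", errors)    # 0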

Naïve Bayes Classifier


Random Forest Tree
