College / Department:
Assignment No. 8
Online Education
Laboratory Exercise
To perform this activity, you need to download and install WEKA.
Data Transformation
The file formats that WEKA processes most readily are .arff and .csv.
An ARFF (Attribute-Relation File Format) file is an ASCII text file that describes a list of
instances sharing a set of attributes.
A CSV file is a simple format used to store tabular data, such as a spreadsheet or
database table. CSV stands for "comma-separated values".
@data
sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
overcast,hot,high,FALSE,yes
rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
rainy,cool,normal,TRUE,no
overcast,cool,normal,TRUE,yes
sunny,mild,high,FALSE,no
sunny,cool,normal,FALSE,yes
rainy,mild,normal,FALSE,yes
sunny,mild,normal,TRUE,yes
overcast,mild,high,TRUE,yes
overcast,hot,normal,FALSE,yes
rainy,mild,high,TRUE,no
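The listing above shows only the @data section. As a rough sketch of how a complete ARFF file could be assembled around it, the Python snippet below prepends an @relation line and one @attribute line per column. The relation name "weather", the attribute names other than 'play', and the order of the nominal values are assumptions made for illustration; they are not given in the listing.

```python
# Sketch: wrap the @data rows in an ARFF header.
# Assumed (not from the listing): relation name and all attribute names except 'play'.
DATA_ROWS = """\
sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
overcast,hot,high,FALSE,yes
rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
rainy,cool,normal,TRUE,no
overcast,cool,normal,TRUE,yes
sunny,mild,high,FALSE,no
sunny,cool,normal,FALSE,yes
rainy,mild,normal,FALSE,yes
sunny,mild,normal,TRUE,yes
overcast,mild,high,TRUE,yes
overcast,hot,normal,FALSE,yes
rainy,mild,high,TRUE,no
"""

# Each attribute is nominal: a name plus the set of values it may take.
ATTRIBUTES = [
    ("outlook", ["sunny", "overcast", "rainy"]),
    ("temperature", ["hot", "mild", "cool"]),
    ("humidity", ["high", "normal"]),
    ("windy", ["TRUE", "FALSE"]),
    ("play", ["yes", "no"]),
]


def build_arff(relation, attributes, rows_text):
    """Return ARFF text: header lines, then @data, then the rows unchanged."""
    lines = [f"@relation {relation}", ""]
    for name, values in attributes:
        lines.append(f"@attribute {name} {{{', '.join(values)}}}")
    lines += ["", "@data"]
    lines += [row.strip() for row in rows_text.strip().splitlines()]
    return "\n".join(lines) + "\n"


arff_text = build_arff("weather", ATTRIBUTES, DATA_ROWS)
print(arff_text)
```

Saving the printed text with an .arff extension should give a file that the Explorer can open, provided the assumed attribute names and values match the data.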
WEKA
The Explorer is the most useful interface for testing individual classifiers. Clicking the Explorer
button launches the Explorer interface.
1. Click the Explorer button.
The Explorer Interface
Data Visualization
The figure below shows that the given dataset has five attributes. The left
panel of the figure summarizes the data using simple descriptive statistics.
The dataset contains 14 observations (instances) with five (5) attributes. The ‘play’
attribute will be selected as the class attribute.
The activity aims to find patterns that predict whether ‘play’ is ‘yes’ or ‘no’ from the
other attributes and the given observations.
3. Click Visualize all to visualize the frequency distribution of each predictor.
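The frequency distributions that Visualize all displays can also be checked by hand. The following is a minimal sketch that counts how often each value occurs in each column of the 14 instances. The column names other than 'play' are assumptions; the data rows are copied from the listing in the Data Transformation section.

```python
import csv
from collections import Counter
from io import StringIO

# Assumed column names; only 'play' is named in the exercise text.
COLUMNS = ["outlook", "temperature", "humidity", "windy", "play"]

DATA = """\
sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
overcast,hot,high,FALSE,yes
rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
rainy,cool,normal,TRUE,no
overcast,cool,normal,TRUE,yes
sunny,mild,high,FALSE,no
sunny,cool,normal,FALSE,yes
rainy,mild,normal,FALSE,yes
sunny,mild,normal,TRUE,yes
overcast,mild,high,TRUE,yes
overcast,hot,normal,FALSE,yes
rainy,mild,high,TRUE,no
"""


def frequency_tables(text, columns):
    """Count the occurrences of every value, one Counter per column."""
    tables = {name: Counter() for name in columns}
    for row in csv.reader(StringIO(text)):
        if not row:  # skip blank lines
            continue
        for name, value in zip(columns, row):
            tables[name][value] += 1
    return tables


tables = frequency_tables(DATA, COLUMNS)
for name, counts in tables.items():
    print(name, dict(counts))
```

For the class attribute this gives 9 instances of 'yes' and 5 of 'no', which should match the bar heights WEKA shows for 'play'.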
Classify Tab
By default, the ZeroR classifier is selected. In this activity, select the ‘play’ attribute in
the dropdown list. This selection tells WEKA which attribute is the target variable. The
‘play’ attribute is suggested as the class attribute (i.e. the one that
will be predicted from the others).
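To make the role of the class attribute concrete, the sketch below imitates a majority-class baseline, which is what WEKA's ZeroR rule does (assuming that is the default classifier meant here): it ignores every predictor and always predicts the most frequent class value. The labels are the last column of the 14 data rows. The accuracy is measured on the same 14 instances, not by cross-validation, so it is not directly comparable to WEKA's evaluation output.

```python
from collections import Counter

# Class column ('play') of the 14 instances, in the order they appear in the data.
PLAY = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]


def majority_class(labels):
    """Return the most frequent label, i.e. what a majority-class baseline predicts."""
    return Counter(labels).most_common(1)[0][0]


def baseline_accuracy(labels):
    """Fraction of instances the majority-class prediction gets right."""
    prediction = majority_class(labels)
    return sum(1 for label in labels if label == prediction) / len(labels)


prediction = majority_class(PLAY)
accuracy = baseline_accuracy(PLAY)
print(prediction, round(accuracy, 3))  # yes 0.643
```

Any classifier that uses the predictors is usually judged against this kind of baseline.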
4. Get to the Classify mode (by clicking on the Classify tab) as shown below:
5. Next, select a machine learning classifier to apply to this data. The task is
classification, so the classifier is chosen on the ‘Classify’ tab near the top of the Explorer window.
J48 Results
The results will provide the following information:
Correctly Classified Instances 7 50 %
Incorrectly Classified Instances 7 50 %
These results indicate that the model derived from the dataset using the J48 method has an
accuracy of 50 percent: 7 of the 14 instances were classified correctly.
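The percentage follows directly from the two instance counts. A small sketch of the arithmetic, using a hypothetical helper name accuracy_percent:

```python
def accuracy_percent(correct, incorrect):
    """Accuracy as a percentage of all classified instances."""
    total = correct + incorrect
    if total == 0:
        raise ValueError("no instances were classified")
    return 100.0 * correct / total


# Counts taken from the J48 results: 7 correct, 7 incorrect.
print(accuracy_percent(7, 7))  # 50.0
```

With only 14 instances, each single misclassification shifts the figure by about 7 percentage points, so the estimate is coarse.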
Visualizing Data Model (Tree Diagram)
The panel on the lower left headed ‘Result list (right-click for options)’ provides access
to more information about the results. Right-clicking an entry opens a menu from which ‘Visualize
Tree’ can be selected. This will display the decision tree in a more attractive format:
The Generated RuleSets
J48 pruned tree
------------------
Number of Leaves : 5
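As background on how a decision tree learner chooses its splits, the sketch below computes the entropy of the class attribute and the information gain of each predictor on the 14 instances. This is an illustration, not a reproduction of J48: J48 is WEKA's implementation of C4.5, which ranks splits by gain ratio (information gain normalized by the split's own entropy) and also prunes the tree. The column names other than 'play' are assumptions.

```python
from collections import Counter, defaultdict
from math import log2

# Assumed column names; only 'play' is named in the exercise text.
COLUMNS = ["outlook", "temperature", "humidity", "windy", "play"]

DATA = """\
sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
overcast,hot,high,FALSE,yes
rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
rainy,cool,normal,TRUE,no
overcast,cool,normal,TRUE,yes
sunny,mild,high,FALSE,no
sunny,cool,normal,FALSE,yes
rainy,mild,normal,FALSE,yes
sunny,mild,normal,TRUE,yes
overcast,mild,high,TRUE,yes
overcast,hot,normal,FALSE,yes
rainy,mild,high,TRUE,no
"""
ROWS = [line.split(",") for line in DATA.strip().splitlines()]


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, column_index, class_index=-1):
    """Entropy of the class minus the weighted entropy after splitting on one column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column_index]].append(row[class_index])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[class_index] for row in rows]) - remainder


class_entropy = entropy([row[-1] for row in ROWS])
gains = {name: information_gain(ROWS, i) for i, name in enumerate(COLUMNS[:-1])}

print(f"entropy(play) = {class_entropy:.3f}")
for name, gain in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"gain({name}) = {gain:.3f}")
```

On this data the first column has the largest information gain (about 0.247 bits), so a learner that ranks splits by this measure would test it first.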