Abstract: E-Complaint documents provide information
that can be used to measure or evaluate the services
that a campus gives to its students, lecturers, staff,
and the public. Using text classification, the documents
can be classified by importance and urgency. This
classification helps the campus improve its services
and follow up on complaints faster than before. This
paper discusses the Directed Acyclic Graph Support
Vector Machine (DAGSVM) method based on the Analytic
Hierarchy Process (AHP) to classify E-Complaint
documents into four classes according to importance
and urgency. The highest accuracy obtained in this
research is 82.61%, with Sequential Training SVM
parameters λ = 0.5, constant of γ = 0.01, MaxIter = 10,
and ε = 0.00001, 70% training data, stemming, and a
Gaussian RBF kernel, without using AHP weight.
Keywords: document classification, E-Complaint,
AHP, DAGSVM.
I. INTRODUCTION
[Figure: flow between the E-Complaint user and the Center of Information, Documentation, and Complaints (PIDK), Brawijaya University, which continues the response to the user.]
[Figure: AHP hierarchy with Level 0, Level 1, and Level 2. Class nodes: Important and Urgent (weight 1), Important and Not Urgent (weight 2), Not Important and Urgent (weight 3), Not Important and Not Urgent (weight 4). Term nodes: term 1, ..., term n.]
Linguistic judgments
X is equally preferred over Y
X is equally to moderately preferred over Y
X is moderately preferred over Y
X is moderately to strongly preferred over Y
X is strongly preferred over Y
X is strongly to very strongly preferred over Y
X is very strongly preferred over Y
X is very strongly to extremely preferred over Y
X is extremely preferred over Y
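The linguistic judgments above are the verbal side of Saaty's pairwise comparison scale, where they are commonly mapped to the ratings 1 to 9 in the order listed. The following is a minimal sketch of how such ratings can be turned into AHP priority weights, using the column-normalization approximation and Saaty's consistency ratio. The comparison matrix is hypothetical and not taken from the paper, and the way the resulting weights enter the document features is not shown here.

```python
def ahp_priorities(matrix):
    """Approximate the AHP priority vector of a pairwise comparison matrix."""
    n = len(matrix)
    col_sums = [sum(row[c] for row in matrix) for c in range(n)]
    # Normalize every column to sum to 1, then average across each row.
    return [sum(matrix[r][c] / col_sums[c] for c in range(n)) / n
            for r in range(n)]


# Saaty's random consistency index for matrix sizes 1 to 9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI; values below 0.1 are usually accepted."""
    n = len(matrix)
    if RANDOM_INDEX[n] == 0.0:
        return 0.0  # 1x1 and 2x2 matrices are always consistent
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[r][c] * weights[c] for c in range(n)) for r in range(n)]
    lambda_max = sum(aw[r] / weights[r] for r in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical judgments for three terms (illustration only):
# term A is moderately preferred over B (3) and strongly preferred over C (5).
comparisons = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
weights = ahp_priorities(comparisons)
cr = consistency_ratio(comparisons, weights)
print([round(w, 3) for w in weights], round(cr, 3))
```

The reciprocal entries below the diagonal follow from the rule that if X is preferred over Y with rating r, then Y is compared to X with rating 1/r.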
[Figure: SVM optimal hyperplane $w \cdot x + b = 0$ between the hyperplanes H1 and H2 ($w \cdot x + b = \pm 1$), with the support vectors and the margin $2 / \|w\|$.]
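$$f(x) = w \cdot x + b$$
we have defined
$$w = \sum_{i=1}^{n} \alpha_i y_i x_i$$
and
$$b = -\frac{1}{2}\left(w \cdot x^{+} + w \cdot x^{-}\right)$$
$$D_{ij} = y_i y_j \left(K(x_i, x_j) + \lambda^2\right)$$
2.
$$\delta\alpha_i = \min\{\max[\gamma(1 - E_i), -\alpha_i],\; C - \alpha_i\}$$
$$\alpha_i = \alpha_i + \delta\alpha_i$$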
3.
4.
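The update rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the error term is taken as $E_i = \sum_j \alpha_j D_{ij}$, the learning rate as the constant of γ divided by $\max_i D_{ii}$, the stopping test as $\max_i |\delta\alpha_i| < \varepsilon$, and $x^{+}$, $x^{-}$ as the samples with the largest $\alpha$ in each class. None of these details survives in the text, and all function and parameter names are hypothetical.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def train_sequential_svm(X, y, lam=0.5, gamma_const=0.01, C=1.0,
                         max_iter=10, eps=1e-5, kernel=rbf_kernel):
    """Return the multipliers alpha for labels y in {+1, -1}."""
    n = len(X)
    # D_ij = y_i y_j (K(x_i, x_j) + lambda^2)
    D = [[y[i] * y[j] * (kernel(X[i], X[j]) + lam ** 2) for j in range(n)]
         for i in range(n)]
    # Assumption: learning rate = constant / max_i D_ii.
    gamma = gamma_const / max(D[i][i] for i in range(n))
    alpha = [0.0] * n
    for _ in range(max_iter):
        max_delta = 0.0
        for i in range(n):
            # Assumption: E_i = sum_j alpha_j D_ij.
            e_i = sum(alpha[j] * D[i][j] for j in range(n))
            delta = min(max(gamma * (1.0 - e_i), -alpha[i]), C - alpha[i])
            alpha[i] += delta
            max_delta = max(max_delta, abs(delta))
        if max_delta < eps:  # assumed convergence test
            break
    return alpha


def decision_value(x, X, y, alpha, kernel=rbf_kernel):
    """f(x) = sum_i alpha_i y_i K(x, x_i) + b, with b = -(w.x+ + w.x-) / 2."""
    def w_dot(z):
        return sum(a * yi * kernel(xi, z) for a, yi, xi in zip(alpha, y, X))
    # Assumption: x+ and x- are the samples with the largest alpha per class.
    i_plus = max((i for i in range(len(X)) if y[i] == 1), key=lambda i: alpha[i])
    i_minus = max((i for i in range(len(X)) if y[i] == -1), key=lambda i: alpha[i])
    b = -0.5 * (w_dot(X[i_plus]) + w_dot(X[i_minus]))
    return w_dot(x) + b


# Toy data (hypothetical): two well-separated clusters.
X_toy = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0), (3.0, 4.0)]
y_toy = [1, 1, -1, -1]
alpha_toy = train_sequential_svm(X_toy, y_toy)
predictions = [1 if decision_value(x, X_toy, y_toy, alpha_toy) > 0 else -1
               for x in X_toy]
print(predictions)
```

The inner `min`/`max` keeps every multiplier inside the box $0 \le \alpha_i \le C$, so no separate projection step is needed after the update.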
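$$f(x) = \sum_{i=1}^{n} \alpha_i y_i \, x \cdot x_i + b$$
$$K(x, y) = \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right)$$
$$f(x) = \sum_{i=1}^{n} \alpha_i y_i K(x, x_i) + b$$
[Figure: directed acyclic graph with root node 1 vs 4; second layer 2 vs 4 and 1 vs 3; third layer 3 vs 4, 2 vs 3, and 1 vs 2; leaves 4, 3, 2, 1. Each branch is labeled "not i" for the class it eliminates.]
Fig. 4. Illustration of DAGSVM classifier for four classes.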
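The traversal in Fig. 4 can be sketched as a list that shrinks by one class per binary decision: the first and last remaining classes are compared, and the loser is eliminated ("not i"). The pairwise classifiers below are hypothetical stand-ins (a nearest-class-index rule on a scalar input) used only to make the sketch runnable; in the paper each node would be a trained binary SVM.

```python
def dagsvm_predict(x, classes, pairwise):
    """Walk the DAG: compare first vs last remaining class, drop the loser.

    pairwise[(a, b)] is a callable returning a or b for input x.
    """
    remaining = list(classes)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        winner = pairwise[(a, b)](x)
        if winner == a:
            remaining.pop()     # "not b"
        else:
            remaining.pop(0)    # "not a"
    return remaining[0]


def make_toy_classifier(a, b):
    """Hypothetical binary classifier: pick the class index closer to x."""
    def classify(x):
        return a if abs(x - a) <= abs(x - b) else b
    return classify


classes = [1, 2, 3, 4]
pairwise = {(a, b): make_toy_classifier(a, b)
            for i, a in enumerate(classes) for b in classes[i + 1:]}

for sample in (1.1, 2.2, 2.9, 4.2):
    print(sample, dagsvm_predict(sample, classes, pairwise))
```

For k classes this evaluates k - 1 binary classifiers per document (three for the four classes here), although k(k - 1)/2 classifiers must be trained.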
[Figure: system flowchart, first steps: Input E-Complaint documents, Text Preprocessing, Feature Normalization.]
III. RESULT
The dataset used in this paper consists of 153
documents in the Indonesian language. The data are
divided into two kinds, training data and testing data.
The size of each kind is determined by the ratio of
training data to testing data. The totals of training
data and testing data for every ratio are shown in
Table II.
TABLE II
TOTAL OF TRAINING DATA AND TESTING DATA IN EVERY RATIO
Ratio        Total of Training Data    Total of Testing Data
80% : 20%    122                       31
70% : 30%    107                       46
60% : 40%    91                        62
50% : 50%    76                        77
40% : 60%    61                        92
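The counts in Table II are consistent with rounding the training share of the 153 documents down and assigning the remainder to testing. The short sketch below reproduces the table under that assumption; the rounding rule is inferred from the numbers and is not stated in the text, and the function name is hypothetical.

```python
TOTAL_DOCUMENTS = 153


def split_counts(total, train_percent):
    """Return (training, testing) counts, flooring the training share."""
    training = total * train_percent // 100  # integer floor, assumed rule
    return training, total - training


table_ii = {p: split_counts(TOTAL_DOCUMENTS, p) for p in (80, 70, 60, 50, 40)}
for train_percent, (training, testing) in table_ii.items():
    print(f"{train_percent}% : {100 - train_percent}%  {training}  {testing}")
```

Note that ordinary rounding would give 92 training documents for the 60% ratio, so flooring is the rule that matches every row.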
[Figure: system flowchart, remaining steps: AHP Weighting, DAGSVM, Output Classification Result, Finish.]
IV. CONCLUSION
Fig. 9. Accuracy Result of Testing Constant of γ Value in
Sequential Training SVM Without Using AHP Weight and
Using AHP Weight (λ = 0.5, C = 1, MaxIter = 10,
ε = 0.00001)
ACKNOWLEDGMENT
We would like to express our great appreciation to
Ms. Prima Vidya Asteria, M.Pd, for her advice and
assistance in the classification of the E-Complaint
documents and in term selection. We would also like
to thank the Center of Information, Documentation,
and Complaints (PIDK), Brawijaya University, for
enabling us to obtain the complaint documents.