[n])
5. EXPERIMENTAL RESULTS AND DISCUSSIONS
Tables 1 and 2 show the databases on which the modified Apriori was tested. Table 1 is a real-time database for finding the contents of drinking water. The algorithm was used to predict the common minerals and other ingredients found in water.
Table 2 is the 4-transaction database used to affirm the comparison results.
Taking the water-content database as the example, the stepwise process is explained here. The attributes under consideration include dissolved solids, carbonates, chlorides, nitrates and sulphates. Each transaction is scanned one after another and the probabilistic array PA0[n] is populated. C0 is then calculated by equation (1). C0 is the initial correlation threshold and is initialized to 2/9; the correlation constant is also initialized [ ] to 5/9. Data pruning is carried out by checking PA0[0] against C0.
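The first pass described above can be sketched as follows. This is only an illustrative sketch: it assumes that the probabilistic array holds, for each item, the fraction of transactions containing it, and it takes the threshold 2/9 as given rather than deriving it from equation (1). The transactions and the helper names build_probability_array and prune are hypothetical and are not the data of Table 1.

```python
from fractions import Fraction

# Hypothetical toy transactions over the five water attributes.
# These are NOT the values of Table 1; they only illustrate the mechanics.
transactions = [
    {"dissolved_solids", "carbonates", "chlorides"},
    {"dissolved_solids", "chlorides"},
    {"dissolved_solids", "nitrates"},
    {"carbonates", "chlorides", "sulphates"},
    {"dissolved_solids", "carbonates", "chlorides"},
    {"dissolved_solids", "sulphates"},
    {"chlorides", "carbonates"},
    {"dissolved_solids", "chlorides", "carbonates"},
    {"dissolved_solids"},
]


def build_probability_array(transactions):
    """Scan every transaction once and return, per item, the fraction
    of transactions that contain it (assumed meaning of PA0)."""
    n = len(transactions)
    counts = {}
    for t in transactions:
        for item in t:
            counts[item] = counts.get(item, 0) + 1
    return {item: Fraction(c, n) for item, c in counts.items()}


def prune(prob_array, threshold):
    """Keep only the items whose probability is not below the threshold."""
    return {item: p for item, p in prob_array.items() if p >= threshold}


C0 = Fraction(2, 9)  # initial correlation threshold quoted in the text
pa0 = build_probability_array(transactions)
frequent_1 = prune(pa0, C0)

for item in sorted(frequent_1):
    print(item, frequent_1[item])
```

With this toy data, nitrates occurs in one of nine transactions and is the only item that falls below 2/9, so it is the only one pruned.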
The frequent itemsets are listed again and the probabilistic array is modified by calculating a new correlation threshold. In the second step the 2-itemset threshold needs to be found. From equation (1), the new C1 is found to be approximately 0.20027. Some itemsets fall below C1 and are therefore pruned away. The process repeats until all the frequent items are visited. The correlation threshold in the next step was obtained as 0.18049. Table 3 shows the itemsets obtained in each iteration. The table clearly draws the distinction between the traditional Apriori and the proposed algorithm.
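A level-wise loop with a separate threshold per itemset size can be sketched as below. The sketch makes several assumptions: the three thresholds quoted in the text (2/9, 0.20027, 0.18049) are supplied as constants instead of being recomputed from equation (1), the value compared against the threshold is the ordinary relative support, and candidates are formed with the standard Apriori join and subset check. The transactions and the function names support and levelwise are hypothetical.

```python
from itertools import combinations

# Hypothetical toy transactions; not the data of Table 1 or Table 2.
transactions = [
    {"carbonates", "chlorides", "sulphates"},
    {"carbonates", "chlorides"},
    {"carbonates", "sulphates"},
    {"chlorides", "nitrates"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def levelwise(transactions, thresholds):
    """Return the frequent itemsets per level, using thresholds[k-1]
    as the pruning threshold for k-itemsets."""
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items]
    levels = []
    for k, threshold in enumerate(thresholds, start=1):
        frequent = [s for s in current if support(s, transactions) >= threshold]
        if not frequent:
            break
        levels.append(frequent)
        # Join step: merge pairs of frequent k-itemsets into (k+1)-itemsets.
        candidates = {a | b for a, b in combinations(frequent, 2)
                      if len(a | b) == k + 1}
        # Subset check: every k-subset of a candidate must itself be frequent.
        frequent_set = set(frequent)
        current = sorted(
            (c for c in candidates
             if all(frozenset(s) in frequent_set for s in combinations(c, k))),
            key=sorted,
        )
    return levels


# Thresholds quoted in the text, taken here as given constants.
levels = levelwise(transactions, [2 / 9, 0.20027, 0.18049])
for k, frequent in enumerate(levels, start=1):
    print(k, [sorted(s) for s in frequent])
```

Supplying the thresholds as a list keeps the sketch independent of the exact form of equation (1); a faithful implementation would recompute the threshold from the modified probabilistic array at each level.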
Observation of 4-Transaction database
The results affirm that the proposed algorithm generates more candidate keys than the traditional Apriori. The number of rules generated by an n-transaction (nT) database is 2n [ ]. Thus the number of rules generated in each step during itemset generation increases by a factor of 2n * a, where a is the difference in the number of frequent items between the traditional and the proposed algorithm. The comparison of the number of rules generated is shown in Graph 1. Note that throughout the problem the confidence rate was fixed at 70%. If more accurate results are needed, the confidence rate can be fixed at a higher level. The higher the confidence rate, the greater the performance of the algorithm.
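The effect of the fixed 70% confidence rate can be illustrated with the usual definition of rule confidence, confidence(A -> B) = support(A and B together) / support(A). The sketch below assumes this standard definition; the transactions and the function names support_count and rules_from_itemset are hypothetical and do not come from the tested databases.

```python
from itertools import combinations

# Hypothetical toy transactions; not the data of Table 1 or Table 2.
transactions = [
    {"carbonates", "chlorides", "sulphates"},
    {"carbonates", "chlorides"},
    {"carbonates", "sulphates"},
    {"chlorides", "nitrates"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def rules_from_itemset(itemset, transactions, min_confidence=0.7):
    """Return (antecedent, consequent, confidence) for every split of the
    itemset whose confidence reaches min_confidence."""
    itemset = frozenset(itemset)
    full = support_count(itemset, transactions)
    rules = []
    for r in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), r):
            a = frozenset(antecedent)
            a_count = support_count(a, transactions)
            if a_count == 0:
                continue  # antecedent never occurs, confidence undefined
            confidence = full / a_count
            if confidence >= min_confidence:
                rules.append((a, itemset - a, confidence))
    return rules


rules = rules_from_itemset({"carbonates", "sulphates"}, transactions)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), confidence)
```

In this toy data only the rule from sulphates to carbonates survives at 70%; lowering min_confidence to 0.6 would also admit the reverse rule, which shows how a higher confidence rate yields fewer but stronger rules.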
Time complexity of algorithm
6. CONCLUSIONS
In this paper, a correlation threshold was proposed to modify the traditional Apriori algorithm. By pruning the infrequent itemsets and retaining the frequent ones, strong rules are created. The database scan, which depended fully on the length of the frequent itemset, was supplanted by the introduction of the probabilistic array. This helped to attain a better time complexity. The results affirm that, with extended inter-transactional association, more comprehensive and more interesting relations could be mined from the database.
[Graph: axes "Frequent itemsets" (0 to 10) and "Itemset" (1 to 3); series: Proposed Apriori (9T), Existing Apriori (9T), Proposed Apriori (4T), Existing Apriori (4T)]
REFERENCES
[1] Sanjeev Rao, Priyanka Gupta, "Implementing Improved Algorithm over Apriori Data Mining Association Rule Algorithm", IJCST, Volume 3, Issue 1, Jan-March 2012, pp. 489-493.
[2] Huan Wu, Zhigang Lu, Lin Pan, Rongsheng Xu and Wenbao Jiang, "An Improved Apriori-based Algorithm for Association Rules Mining", Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009, pp. 51-55.
[3] Colin Cooper and Michele Zito, "Realistic Synthetic Data for Testing Association Rule Mining Algorithms for Market Basket Databases", Knowledge Discovery in Databases: PKDD 2007, Volume 4702/2007, pp. 398-405.
[4] Lei Ji, Baowen Zhang, Jianhua Li, "A New Improvement on Apriori Algorithm", IEEE 1-4244-0605-6/06, 2006, pp. 840-844.
[5] David L. Olson and Desheng Wu, "Decision making with uncertainty and data mining", in X. Li, S. Wang and Z.Y. Dong (Eds.), Lecture Notes in Artificial Intelligence, Berlin: Springer, 2005, pp. 1-9.
[6] Aparna S. Varde, Makiko Takahashi, Elke A. Rundensteiner, Matthew O. Ward, Mohammed Maniruzzaman and Richard D. Sisson Jr., "Apriori Algorithm and game of life for predictive analysis in materials science", International Journal of Knowledge based and Intelligent Engineering Systems 8, 2004, pp. 1-16.
[7] Nandagopal, S., "Mining of Meteorological Data Using Modified Apriori Algorithm", European Journal of Scientific Research, Volume 47 No. 2, 2010, pp. 295-308.
[8] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules", In Proc. VLDB 1994, pp. 487-499.