ABSTRACT
In the Internet era, the volume of data we deal with has
grown to terabytes and petabytes. As the volume of data
keeps growing, the types of data generated by
applications become richer than before. As a result,
traditional relational databases are challenged to capture,
store, search, share, analyze, and visualize data.
Traditional data modeling focuses on resolving the
complexity of relationships among schema-enabled data.
However, these considerations do not apply to non-relational,
schema-less databases. As a result, old ways
of data modeling no longer apply. We need a new
methodology to manage big data for maximum business
value. This paper discusses the HACE theorem, which
characterizes the features of the Big Data revolution,
and a Big Data processing model from the data mining
perspective.
I.
INTRODUCTION
II.
www.ijsret.org
768
International Journal of Scientific Research Engineering & Technology (IJSRET), ISSN 2278 0882
Volume 4, Issue 7, July 2015
III.
Our capacity for generating data has never been so
powerful and enormous since the invention of
information technology in the mid-nineteenth
century. For example, on 4 October 2012, the first
presidential debate between President
Barack Obama and Governor Mitt Romney triggered
more than 10 million tweets within 2 hours [3].
Among all these tweets, the specific moments
that generated the most discussion actually revealed
the public's interests, for example, the discussions
about Medicare and vouchers. Such online discussions
provide a new means to sense the public interests and
generate feedback in real-time, and are mostly appealing
compared to generic media, such as radio or TV
broadcasting.
IV.
PROBLEM STATEMENT
V.
EXISTING APPROACHES
Currently, Big Data processing mainly depends
on parallel programming models like MapReduce,
as well as on cloud computing platforms that provide
Big Data services to the public.
MapReduce is a batch-oriented parallel computing model.
There is still a certain gap in performance compared with
relational databases.
Improving the performance of MapReduce and enhancing
the real-time nature of large-scale data
processing have received a significant amount of
attention, with MapReduce parallel
programming being applied to many
machine learning and data mining algorithms.
Data mining algorithms usually need to scan
through the training data to obtain the
statistics needed to solve or optimize model
parameters. This calls for intensive computing to access the
large-scale data frequently. To
improve the efficiency of algorithms, Chu et al.
proposed a general-purpose parallel programming method,
which is applicable to a large number of machine
learning algorithms, based on the
simple MapReduce programming model on
multicore processors.
Ten classical data mining algorithms are
realized in the framework, including locally
weighted linear regression, k-Means, logistic
regression, naive Bayes, linear support vector machines,
independent component analysis, Gaussian discriminant
analysis, expectation maximization, and back-propagation
neural networks [1]. From the analysis of these
classical machine learning algorithms, we argue
that the computational operations in the algorithm
learning process could be transformed into a summation
operation over a number of training data sets.
Summation operations could be performed on different
subsets independently, and so can be parallelized
easily on the MapReduce programming platform [1].
Therefore, a large-scale data set could be
divided into several subsets and assigned to multiple
Mapper nodes.
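The summation idea can be illustrated with a small sketch. It is not the implementation of Chu et al. [1]; it assumes a plain least-squares line fit, chosen only because its parameters depend on the data through five sums. The names `split`, `mapper`, `reducer`, and `fit_line` are hypothetical, and the "Mapper nodes" are simulated by ordinary function calls in one process.

```python
from functools import reduce


def split(data, n_parts):
    """Divide the data set into n_parts subsets (round-robin)."""
    return [data[i::n_parts] for i in range(n_parts)]


def mapper(subset):
    """Compute the partial sums (n, sum x, sum y, sum x*x, sum x*y) on one subset."""
    n = len(subset)
    sx = sum(x for x, _ in subset)
    sy = sum(y for _, y in subset)
    sxx = sum(x * x for x, _ in subset)
    sxy = sum(x * y for x, y in subset)
    return (n, sx, sy, sxx, sxy)


def reducer(a, b):
    """Merge two tuples of partial sums by element-wise addition."""
    return tuple(u + v for u, v in zip(a, b))


def fit_line(data, n_parts=4):
    """Fit y = slope * x + intercept from the merged sums."""
    # Each subset is summarized independently; on a cluster these
    # calls would run on separate Mapper nodes.
    partials = map(mapper, split(data, n_parts))
    n, sx, sy, sxx, sxy = reduce(reducer, partials)
    # Closed-form least-squares solution expressed only through the sums.
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Synthetic data on the line y = 2x + 1 (made up for this example).
data = [(x, 2.0 * x + 1.0) for x in range(10)]
slope, intercept = fit_line(data)
print(slope, intercept)
```

Because addition is associative and commutative, the merged sums do not depend on how the data set is partitioned, which is what allows the subsets to be processed independently.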
Then, various summation operations could be
performed on the Mapper nodes to collect intermediate
results. Finally, learning algorithms are
VI.
RESEARCH INITIATIVES
VII.
PROPOSED SOLUTION
VIII.
Fig 3: FP Growth
IX.
CONCLUSION
REFERENCES
[1] C.T. Chu, S.K. Kim, Y.A. Lin, Y. Yu, G.R. Bradski,
A.Y. Ng, and K. Olukotun, "Map-Reduce for Machine
Learning on Multicore," Proc. 20th Ann. Conf. Neural
Information Processing Systems (NIPS '06), pp. 281-288, 2006.
[2] X. Wu, "Building Intelligent Learning Database
Systems," AI Magazine, vol. 21, no. 3, pp. 61-67, 2000.
[3] Twitter Blog, "Dispatch from the Denver Debate,"
http://blog.twitter.com/2012/10/dispatch-from-denverdebate.html, Oct. 2012.