1. Ambika M Patil, M.Tech Computer Science Engineering, Center for P G Studies, Jnana Sangama, VTU Belagavi, Belagavi, INDIA, Ambika702@gmail.com
2. Assistant Prof. Ranjana B Nadagoudar, Computer Science Engineering Department, Center for P G Studies, Jnana Sangama, VTU Belagavi, Belagavi, INDIA
3. Dhananjay A Potdar, Dhananjay.potdar@gmail.com
ABSTRACT - As Big Data has gradually become a hot topic in research and business and is now used across many industries, Big Data security and privacy have become a growing concern. However, there is an obvious tension between Big Data security and privacy and the widespread use of Big Data. Various privacy-preserving mechanisms have been developed to protect privacy at different stages of the big data life cycle (e.g., data generation, data storage, and data processing). The goal of this paper is to provide a comprehensive overview of privacy preservation mechanisms in big data and to present the challenges facing existing mechanisms. We also illustrate the infrastructure of big data and the state-of-the-art privacy-preserving mechanisms in each stage of the big data life cycle. This paper focuses on the anonymization process, which significantly improves the scalability and efficiency of TDS (top-down specialization) for data anonymization over existing approaches. Finally, we discuss the challenges and future research directions related to preserving privacy in big data.

KEYWORDS - Big data, privacy, big data storage, big data processing, data anonymization, top-down specialization, MapReduce, cloud, privacy preservation.

I. INTRODUCTION
As a result of recent technological development, the amount of data generated by social networking sites, sensor networks, the Internet, healthcare applications, and many other sources is increasing significantly day by day. The term Big Data reflects the trend and salient features of the data being produced from various sources. Big Data can be described by the 3Vs, which stand for Volume, Velocity and Variety. Volume refers to the huge amount of data being produced from multiple sources. Velocity concerns both how fast data is produced and collected and how fast some of the collected data changes. Variety refers to the highly distributed and heterogeneous nature of the data. The data generation rate is growing so rapidly that it is becoming very difficult to handle using traditional methods or systems [1]. In the 3Vs model, Variety indicates the various types of data, which include structured, semi-structured and unstructured data; Volume means that the data scale is large; and Velocity indicates that all Big Data processes must be quick and timely in order to maximize the value of Big Data, as shown in Fig. 1. These features, namely that Big Data handles huge amounts of data and uses various types of data, including unstructured data and attributes that were never used in the past, distinguish Big Data from data mining.

In 2011, IDC defined big data as follows: big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling the high-velocity capture, discovery, and/or analysis [2]. In this definition, the features of big data may be summarized as the 4Vs, i.e., Variety, Velocity, Volume and Value, where Variety, Velocity and Volume have the same meaning as in the 3Vs model and Value refers to the great social value of big data. The 4Vs model is widely recognized because it highlights the most critical problem: how to discover value from datasets that are enormous, of various types, and rapidly generated.