School of Law
University of Petroleum and Energy Studies
Dehradun
(November, 2019)
DECLARATION/UNDERTAKING OF ORIGINALITY
I undertake full responsibility for the contents of this Synopsis, which complies with the
‘Academic Integrity’ policy of UPES, and I understand that if this work is found to be in
violation of that policy, this may result in rejection of the Synopsis/Dissertation and entail
appropriate disciplinary proceedings as per the Rules of the University.
Signature
[Name of the Student]
Date……………
Place…………….
Signature
[Name of the Mentor]
Date……………
Introduction
These days, web applications and services offer users a wide range of facilities such as
e-commerce, e-banking and e-governance. To access them, users need to provide
private information such as social security numbers and card numbers. Companies and
organizations want to gather and analyze this information efficiently through data
mining processes. The aim of data mining is to turn raw data into useful information
that allows them to develop more effective marketing strategies, increase sales, decrease
costs, etc.1
The field of data mining is gaining significant recognition due to the availability of
large amounts of data, easily collected and stored via computer systems. The
information can be used to increase revenue, cut costs or both. Data mining software is
one of a number of analytical tools for analyzing data.2 It allows users to analyze data
[...] privacy is growing constantly. Data mining, popularly known as Knowledge Discovery
in Databases (KDD), is the non-trivial extraction of implicit, previously unknown and
potentially useful information from databases.3 Although data mining and KDD are
frequently treated as synonyms, data mining is actually one part of the knowledge discovery
process.4
Data mining is the process of analyzing data from different perspectives and
summarizing it into useful information: viewing it from many different angles, categorizing it, and
summarizing the relationships identified. Continuous innovations in computer
processing power, disk storage, and statistical software are dramatically increasing the
accuracy of analysis while driving down the cost.5 “Data mining, the discovery of new
and interesting patterns in large datasets, is an exploding field. One aspect is the use of
data mining to improve security, e.g., for intrusion detection. A second aspect is the
potential security hazards posed when an adversary has data mining capabilities.”6
1
https://ieeexplore.ieee.org/document/8123561
2
Introduction to Data Mining and Knowledge Discovery, Third Edition ISBN: 1-892095-02-5, Two
Crows Corporation, 10500 Falls Road, Potomac, MD 20854 (U.S.A.), 1999
3
Dunham, M. H., Sridhar S., “Data Mining: Introductory and Advanced Topics”, Pearson Education,
New Delhi, ISBN: 81-7758-785-4, 1st Edition, 2006
4
Fayyad, U., Piatetsky-Shapiro, G., and Smyth P., “From Data Mining to Knowledge Discovery in
Databases,” AI Magazine, American Association for Artificial Intelligence, 1996
5
L. Getoor, C. P. Diehl. “Link mining: a survey”, ACM SIGKDD Explorations, vol. 7, pp. 3-12, 2005.
6
http://ijcttjournal.org/Volume4/issue-2/IJCTT-V4I2P129.pdf
Privacy issues have attracted the attention of the media, government agencies, privacy
advocates and businesses.
“Data mining is the analysis of (often large) observational data sets to find unsuspected
relationships and to summarize the data in novel ways that are both understandable and
useful to the data owner”.8
The legal and policy foundation for data mining rests on specified protocols, which
establish penalties for breaches of data security and privacy, and on government
legislation, which requires that data mining be given a level of security
commensurate with the level of security provided for the data itself.10
PRIVACY
As additional information sharing and data mining initiatives have been announced,
increased attention has focused on the implications for privacy. Concerns about privacy
focus both on the actual projects proposed and on the potential for data
mining applications to be expanded beyond their original purposes. For example, some
7
Ibid
8
David Hand, Heikki Mannila, and Padhraic Smyth, “Principles of Data Mining”, MIT Press,
Cambridge, MA, 2001
9
Peter Cabena, Pablo Hadjinian, Rolf Stadler, Jaap Verhees, and Alessandro Zanasi, Discovering Data
Mining: From Concept to Implementation, Prentice Hall, Upper Saddle River, NJ, 1998.
10
http://ijcttjournal.org/Volume4/issue-2/IJCTT-V4I2P129.pdf
experts suggest that anti-terrorism data mining applications might also be useful for
combating other types of crime.11 “Observers contend that tradeoffs should be
made regarding privacy to ensure security. Others suggest that existing laws and
regulations regarding privacy protections are adequate, and that these initiatives do not
pose any threats to privacy.”12 Still other observers argue that not enough is known
about how data mining projects will be carried out, and that greater oversight is needed.
There is also some disagreement over how privacy concerns should be addressed. Some
observers suggest that technical solutions are adequate.13 From the security
perspective, data mining has been shown to be beneficial in countering various types
of attacks on computer systems. However, the same technology can be used to create
potential security hazards.14 In addition, data collection and analysis efforts by
government agencies and businesses have raised fears about privacy, which have motivated
research into privacy-preserving data mining.15
Statement of Problem
The central problem raised by data mining is that of individual privacy. Data
mining helps in analyzing business transactions and the like, and in gathering a significant amount
of information about individuals’ habits and preferences.
Organizations collect the data either first hand, by asking customers for it directly,
or second hand, by buying it from other organizations. What they then do
with the data is not known to the customers. Organizations use the data they
receive for digital profiling of individuals, which can be
detrimental to their privacy.
Since great importance is now attached to the data circulating on networks, and
society is becoming more and more money-minded, organizations are mining huge
11
Agrawal, R., and R. Srikant, “Privacy-preserving Data Mining,” Proceedings of the ACM SIGMOD
Conference, Dallas, TX, May 2000.
12
Clifton, C., M. Kantarcioglu and J. Vaidya, “Defining Privacy for Data Mining,” Purdue University,
2002.
13
Evfimievski, A., R. Srikant, R. Agrawal, and J. Gehrke, “Privacy Preserving Mining of Association
Rules,” In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining. Edmonton, Alberta, Canada, July 2002.
14
Fung B., Wang K., Yu P., “Top-Down Specialization for Information and Privacy Preservation,”
ICDE Conference, 2005.
15
Wang K., Yu P., Chakraborty S., “Bottom-Up Generalization: A Data Mining Solution to Privacy
Protection,” ICDM Conference, 2004.
amounts of data to sell for economic benefit. These entities are concerned not
with the damage caused to individuals but with the benefits they reap.
Review of Literature
16
http://ijcttjournal.org/Volume4/issue-2/IJCTT-V4I2P129.pdf
17
http://mg.scihub.ltd/10.1145/1014052.1014126
and mutual information loss, while privacy and privacy loss are quantified by
interval-based metrics. Two different types of problems are defined to identify
optimal randomization for PPDM. Illustrative examples and simulation results
are reported.18
18
http://mg.scihub.ltd/10.1145/1014052.1014153
and security of customer data. There is also the question of whose
welfare, preferences and opinions are to prevail in the formulation of big
data related laws and policies in the future. The increasing consumer
concerns are likely to force further regulatory response to ensure that
consumers' interests are protected.19
Research Objective
To understand the applicability of privacy laws to data mining and its
technologies.
Research Questions
Whether the analysis of data violates the privacy of the individuals to whom the data
relates?
Can privacy be preserved while mining the data?
Is masking of databases possible?
Whether the present laws on privacy are able to tackle the problems posed by data
mining?
What is the impact on privacy of the growth in technology?
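The second and third questions can be made concrete with a small illustration. The following is only a minimal sketch, not a method taken from the works cited here: it assumes two simple techniques, masking of an identifier and additive random noise on a numeric field, and every name in it (mask_card_number, randomize, the synthetic ages) is hypothetical. It suggests that an aggregate such as the mean can remain usable even when the individual values released are not the true ones.

```python
import random
import statistics


def mask_card_number(card_number: str, visible: int = 4) -> str:
    """Hide all but the last `visible` digits of a card number (illustrative only)."""
    digits = [c for c in card_number if c.isdigit()]
    hidden = max(len(digits) - visible, 0)
    return "*" * hidden + "".join(digits[hidden:])


def randomize(values, spread, rng):
    """Add independent uniform noise in [-spread, +spread] to every value."""
    return [v + rng.uniform(-spread, spread) for v in values]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic data standing in for a sensitive numeric attribute (assumption).
ages = [rng.randint(18, 80) for _ in range(10_000)]

# What the data holder would release instead of the true values.
released = randomize(ages, spread=10.0, rng=rng)

true_mean = statistics.mean(ages)
released_mean = statistics.mean(released)

print(mask_card_number("4111 1111 1111 1234"))
print(round(true_mean, 2), round(released_mean, 2))
```

Because the noise averages out over many records, the two means printed on the last line should be close, while any single released value may differ from the true one by up to the chosen spread. Whether such a trade-off satisfies a given privacy law is the legal question this study takes up.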
Hypothesis
That data mining, in a general sense, has a greatly detrimental effect on the privacy
of individuals, since the user data taken by organizations is often
used to link to and access crucial personal and private data of the individuals,
19
http://mg.scihub.ltd/10.1016/j.telpol.2014.10.002
which can be, and generally is, used to cause grievous damage to the physical and
mental health of the individuals.
Methodology
The research methodology adopted for this project is doctrinal research
methodology. Doctrinal research asks what the law is on a particular issue. It is
concerned with analysis of legal doctrine and how it has been developed and
applied. This type of research is also known as pure research.
The research methodology includes comparative study, an inductive approach, qualitative
analysis and, most importantly, historical and recent analysis.
Historical analysis is an integral component of the study of history.
Specifically, it entails interpretation and understanding of various historical events,
documents and processes. History is best understood not as a series of facts, but rather
as a series of competing interpretive narratives. Qualitative analysis, by contrast, is a
research method that uses open-ended interviewing to study and understand the
attitudes, opinions, feelings, and behavior of individuals or a group of individuals.
Scope of Study
The scope of this study is the use of the internet for the purposes of understanding the
process of data mining and its impact on the privacy of individuals.
Chapterization (tentative)
Introduction
Data Mining
Data Privacy
Impact of Mining on Privacy
Measures That Can Be Adopted to Prevent Loss of Privacy
Conclusion