Presented By:
PRAVEEN GANGULA, III/IV B.Tech (C.S.E), G.M.R.I.T., RAJAM. E-mail: sairamanababu@yahoo.com
SRIVASTHAV NANDANAVANAM, III/IV B.Tech (C.S.E), G.M.R.I.T., RAJAM. E-mail: nssagar@gmail.com
ABSTRACT
Organisations are today suffering from a malaise of data overflow.
Developments in transaction processing technology have given rise to a situation
where the amount and rate of data capture is very high, but the processing of this data
into information that can be utilised for decision making is not developing at the same
pace. Data Mining is the process of extracting valid, previously unknown,
comprehensible, and actionable information from large databases and using it to make
crucial business decisions. Data Mining uses two kinds of model: the predictive model and
the descriptive model. Its tasks include Classification, Regression, Time
series analysis, Prediction, Clustering, and Summarization.
The main aim of Data Mining is Knowledge Discovery in Databases
(KDD). KDD is used to derive the patterns that are useful for Data Extraction. The KDD
process consists of Selection, Preprocessing, Transformation, and
Interpretation. Data Mining depends chiefly on Classification and Clustering, which
give it its strength. Data mining metrics such as ROI (Return on Investment) are
applied to measure the effectiveness of its functions.
Data Mining draws on related concepts such as OLTP systems,
Fuzzy sets, and Web search engines. Data Mining can also be extended to Web
Mining, Spatial Mining, and Temporal Mining.
Our paper focuses on the need for information repositories and
the discovery of knowledge, and from there gives an overview of the much-hyped field of Data Mining.
INTRODUCTION:
One of the reasons behind maintaining any database is to enable the user
to find interesting trends in the data. Data mining has been defined as "The nontrivial
extraction of implicit, previously unknown, and potentially useful information from
data". It uses machine learning, statistical, and visualization techniques to discover and
present knowledge in a form that is easily comprehensible to humans.
DEFINITION: -
Data mining is the process of extracting valid, previously unknown, comprehensible,
and actionable information from large databases and using it to make crucial business
decisions.
Why data mining?
Data mining got its start in what is now known as “customer relationship
management” (CRM). It is widely recognized that companies of all sizes need to learn to
emulate what small, service-oriented businesses have always done well: creating one-to-one
relationships with their customers. In every industry, forward-looking companies are
trying to move towards the one-to-one ideal of understanding each customer individually
and using that understanding to make it easier for the customer to do business with them
rather than with a competitor. These same companies are learning to look at the lifetime
value of each customer so they know which ones are worth investing money and effort to
hold on to and which ones to let go.
As noted, a small business builds one-to-one relationships with its
customers by noticing their needs, remembering their preferences, and learning from past
interactions how to serve them better in the future. In large commercial enterprises, the
first step - noticing what the customer does - has already largely been automated. On-line
transaction processing (OLTP) systems are everywhere, collecting data on seemingly
everything. The customer-focused enterprise regards every record of an interaction with a
client or prospect as a learning opportunity. But learning requires more than simply
gathering data. In fact, many companies gather hundreds of gigabytes or terabytes of data
from and about their customers without learning anything. Data is gathered because it is
needed for some operational purpose, e.g. inventory control or billing.
DATA MINING MODELS: -
Data mining is mainly concerned with the use of software techniques for
finding hidden and unexpected patterns and relationships in sets of data. There are two
data mining models:
1. Predictive Model: This makes a prediction about values of data using known results
found from different data; the prediction may also be based on other historical data.
Predictive model data mining tasks are Classification, Regression, Time series analysis,
and Prediction. Example: - Credit Card Usage
2. Descriptive Model: This identifies patterns or relationships in data. It is used to
explore the properties of the data examined. Descriptive model data mining tasks are
Clustering, Summarization, Association rules, and Sequence discovery.
Example: - Manual Evaluation
DATA MINING
Predictive                Descriptive
Classification            Clustering
Regression                Summarization
Time Series Analysis      Association rules
Prediction                Sequence discovery
Fig: Data Mining Models and Tasks
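The difference between the two models can be illustrated with a minimal sketch. The data, the function names `classify_1nn` and `summarize`, and the choice of a nearest-neighbour rule are all assumptions made for illustration; they are not taken from any particular data mining tool. The predictive function labels a new customer from known examples, while the descriptive function only summarises the data already held.

```python
from math import dist

# Hypothetical toy data: (monthly card spend, number of transactions) per customer,
# each with a known usage class. These values are invented for illustration.
labeled = [
    ((200.0, 5), "low"),
    ((250.0, 8), "low"),
    ((1200.0, 40), "high"),
    ((1500.0, 55), "high"),
]


def classify_1nn(point, training):
    """Predictive task (classification): return the label of the nearest known example."""
    _, nearest_label = min(training, key=lambda example: dist(example[0], point))
    return nearest_label


def summarize(values):
    """Descriptive task (summarization): describe existing data without predicting anything."""
    count = len(values)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / count,
    }


# Predict the class of a new, unlabeled customer.
prediction = classify_1nn((1300.0, 45), labeled)

# Summarize the spend column of the data we already have.
spend_summary = summarize([features[0] for features, _ in labeled])

print(prediction)
print(spend_summary)
```

In this sketch the features are used unscaled, so spend dominates the distance; a real classifier would normally normalise the attributes first.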
ADVANCED TOPICS: -
1. Web Mining: -Web mining is mining of data related to the World Wide Web.
Web data can be classified into the following classes.
• Content of actual web pages
• Intra page structure, which includes the HTML or XML code for the page
• Inter page structure, which is the actual linkage structure between web pages
• User profiles, which include demographic and registration information obtained about
users. This could also include information found in cookies.
The above classes of Web data may be manipulated using different web mining tasks
such as: Web Content Mining, Web Structure Mining and Web Usage Mining.
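As a rough illustration of Web Structure Mining, the sketch below extracts the inter page linkage from a few in-memory pages and counts how many pages link to each one. The page names, the HTML snippets, and the helpers `LinkExtractor`, `build_link_graph`, and `in_degree` are hypothetical; only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags found in one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def build_link_graph(pages):
    """Map each page name to the list of pages it links to (inter page structure)."""
    graph = {}
    for name, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        graph[name] = parser.links
    return graph


def in_degree(graph):
    """Count, for every known page, how many links point to it."""
    counts = {page: 0 for page in graph}
    for targets in graph.values():
        for target in targets:
            if target in counts:
                counts[target] += 1
    return counts


# Hypothetical three-page site, invented for illustration.
pages = {
    "index.html": '<a href="a.html">A</a> <a href="b.html">B</a>',
    "a.html": '<a href="b.html">B</a>',
    "b.html": "<p>No links here</p>",
}

graph = build_link_graph(pages)
inbound = in_degree(graph)
print(graph)
print(inbound)
```

The inbound link count is only one very simple structural measure; it stands in here for the richer link analysis that web structure mining performs.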
2. Spatial Mining: -Spatial mining is data mining as applied to spatial databases or
spatial data.
Specialized operations and data structures are used to access spatial data. Some of
those are:
• Spatial queries
• Thematic maps
• Spatial data structures
• Image databases
Spatial data mining primitives: -
The primitive operations between spatial objects are:
• Disjoint
• Equals
• Overlaps or intersects
• Covered by, inside, or contained in
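The primitives listed above can be sketched for the simplest kind of spatial object, an axis-aligned rectangle. The `Rect` type, the function names, and the sample objects are assumptions made for illustration; real spatial databases handle arbitrary geometries. The sketch also assumes that rectangles which merely touch at a boundary count as intersecting.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle given by its lower-left and upper-right corners."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def disjoint(a: Rect, b: Rect) -> bool:
    """True if the rectangles share no point at all."""
    return a.xmax < b.xmin or b.xmax < a.xmin or a.ymax < b.ymin or b.ymax < a.ymin


def equals(a: Rect, b: Rect) -> bool:
    """True if both rectangles cover exactly the same area."""
    return a == b


def intersects(a: Rect, b: Rect) -> bool:
    """True if the rectangles share at least one point (touching counts)."""
    return not disjoint(a, b)


def inside(a: Rect, b: Rect) -> bool:
    """True if a is covered by (contained in) b."""
    return a.xmin >= b.xmin and a.ymin >= b.ymin and a.xmax <= b.xmax and a.ymax <= b.ymax


# Hypothetical map objects, invented for illustration.
park = Rect(0, 0, 10, 10)
pond = Rect(2, 2, 4, 4)
road = Rect(8, 8, 15, 9)
airport = Rect(20, 20, 30, 30)

print(inside(pond, park))      # pond lies within the park
print(intersects(road, park))  # road crosses the park boundary
print(disjoint(airport, park)) # airport is far from the park
```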
Generalization and Specialization: -The use of a concept hierarchy shows levels of
relationships among data. Spatial data mining techniques have involved both
generalization and specialization type approaches.
Spatial rules: -There are different types of rules found in spatial data mining.
• Spatial characteristic rules
• Spatial discriminant rules
Areas of applications of spatial data mining: -
• GIS systems, Geology, Environmental Science
• Resource management, Agriculture, Medicine, Robotics
3. Temporal Mining: -Temporal data mining is the mining of temporal data from
temporal databases. Temporal mining involves concepts such as modeling
temporal events, time series, and pattern detection.
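A very simple form of pattern detection in a time series can be sketched as follows. The series, the window size, the threshold, and the helper names `moving_average` and `find_spikes` are all assumptions chosen for illustration; the sketch smooths the series and flags values that rise well above the average of the preceding observations.

```python
def moving_average(series, window):
    """Smooth a series by averaging each run of `window` consecutive values."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def find_spikes(series, window, threshold):
    """Return the indices whose value exceeds the mean of the previous
    `window` values by more than `threshold`."""
    spikes = []
    for i in range(window, len(series)):
        baseline = sum(series[i - window : i]) / window
        if series[i] - baseline > threshold:
            spikes.append(i)
    return spikes


# Hypothetical daily sales figures, invented for illustration.
daily_sales = [10, 12, 11, 13, 40, 12, 11, 14, 13, 45]

smoothed = moving_average(daily_sales, 3)
spikes = find_spikes(daily_sales, 3, 15)

print(smoothed)
print(spikes)  # indices of the two unusually high days: [4, 9]
```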
There are other types of mining, such as distributed mining, ubiquitous mining,
constraint-based data mining, and phenomenal data mining.