
TERM PAPER/SEMINAR

On

DIFFERENT DATA MINING APPROACHES

Submitted to

Amity University, Uttar Pradesh

Guided By: Mr. Tanupriya
Submitted By: Siddharth Jain, 3CSE-8X, A2305216519

AMITY UNIVERSITY UTTAR PRADESH


TABLE OF CONTENTS

• Abstract of work done
• Acknowledgement
• Introduction
• Techniques
• Data implementation and preparation
• Conclusion
• References
ABSTRACT

This paper covers different approaches to data mining. Data mining retrieves data by
using a well-defined structure or model, which makes data retrieval easier. Key
techniques and methods are discussed: first association, then classification, followed
by clustering, and then decision trees, which are also a key tool. Data implementation
and preparation are discussed next. The conclusion of the research is laid out last,
followed by the references that helped to build knowledge about the topic.
ACKNOWLEDGEMENT

I, Siddharth Jain of 3CSE-8X, would like to thank Mr. Tanupriya for guiding me
throughout the project “Different data mining approaches”, which helped me gain
plenty of knowledge.

Siddharth Jain

(Signature of student)
INTRODUCTION

At its core, data mining is about processing data and identifying patterns and trends
in that information so that you can make decisions. Data mining principles have been
around for many years, but with the advent of big data they have become even more
prevalent.
Big data caused an explosion in the use of more extensive data mining approaches,
partly because the volume of information is much larger and partly because the
information tends to be more varied and extensive in its nature and content. With
large data sets, it is no longer enough to get relatively simple and straightforward
statistics out of the system. With 30 or 40 million records of detailed customer
information, knowing that two million of them live in one location is not enough. You
want to know whether those two million belong to a particular age group, and what
their average earnings are, so that you can target your customers' needs better.
These business-driven needs changed simple data retrieval and statistics into more
complex data mining. The business problem drives an examination of the data that
helps to build a model to describe the information, which ultimately leads to the
creation of the resulting report.

The process of data analysis often follows strict rules that can be applied
repeatedly and that identify the different data that can be retrieved. It is also
important to be able to map, relate, cluster, and associate that data with other data
to get a particular outcome.
Data mining is not restricted to any particular software or hardware; it can also be
performed with relatively simple software. The benefits of more complex data mining
algorithms are nonetheless increasingly appreciated.

[Figure: flow chart explaining the process of data mining]
Key Techniques and Examples

There are a number of different techniques that can be used in data mining; they
describe the different types of mining and the operations used to retrieve data.

The following techniques and examples illustrate the building blocks of data mining:

Association

Association is a well-known, well-understood, and probably the most widely used data
mining technique. A relationship between different items or different types of data
is observed and identified to establish a pattern. For example, in a sports shop it
may be observed that a person buying a bat often ends up buying a ball as well; if
this data is studied, the bat and the ball can be associated with each other to
anticipate future demand.
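
The bat-and-ball observation can be made measurable with two common association
measures, support and confidence. The following is a minimal sketch in Python; the
transaction data and the helper names (count_containing, support, confidence) are
assumptions made for illustration and do not come from any particular tool.

```python
# Hypothetical sports-shop transactions (illustrative data, not from the paper).
transactions = [
    {"bat", "ball"},
    {"bat", "ball", "gloves"},
    {"bat", "helmet"},
    {"ball"},
    {"bat", "ball", "helmet"},
]


def count_containing(itemset, baskets):
    """Number of baskets that contain every item in itemset."""
    return sum(1 for basket in baskets if set(itemset) <= basket)


def support(itemset, baskets):
    """Fraction of all baskets that contain the itemset."""
    return count_containing(itemset, baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Share of baskets with the antecedent that also hold the consequent."""
    both = count_containing(set(antecedent) | set(consequent), baskets)
    return both / count_containing(antecedent, baskets)


print(support({"bat", "ball"}, transactions))       # 3 of the 5 baskets
print(confidence({"bat"}, {"ball"}, transactions))  # 3 of the 4 bat baskets
```

A high confidence for the rule "bat implies ball" is the kind of evidence that would
justify associating the two items when planning for future demand.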

The association technique can be applied with the help of different tools. For
example, InfoSphere Warehouse is a tool that can be used for association.

Following is an example from the sample database:

Classification
Classification builds an idea of the different types of items or data by applying a
number of constraints, which define classes under which the data can be organised
efficiently. For example, cars are classified into types such as SUVs, sedans, and so
on. A given car can then be slotted into one of these categories by comparing its
attributes against the constraints.
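
As a rough illustration of slotting a car into a class by comparing constraints, the
sketch below uses hand-written rules in Python. The attribute names and the 180 mm
ground-clearance threshold are hypothetical values chosen for the example, not real
industry definitions.

```python
def classify_car(seats, ground_clearance_mm, has_separate_boot):
    """Assign a car to a class by checking simple constraints in order.

    The thresholds are illustrative assumptions, not official definitions.
    """
    if ground_clearance_mm >= 180 and seats >= 5:
        return "SUV"
    if has_separate_boot:
        return "sedan"
    return "hatchback"


print(classify_car(seats=5, ground_clearance_mm=200, has_separate_boot=False))
print(classify_car(seats=5, ground_clearance_mm=150, has_separate_boot=True))
```

In practice the constraints are usually learned from labelled examples rather than
written by hand, but the idea of comparing attributes against class rules is the same.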

Clustering

By examining one or more attributes or classes, data can be grouped together so that
the grouped data can be identified and examined. Commonly, clustering involves
studying two or more attributes of the data in order to observe the correlation
between them. Clustering is important for exploring large amounts of data and
examining the similarities within it.

The graph displayed shows a good example. In this example, the size of sale is
compared with the age of the customer.
From the data shown in the graph, it can be seen that where points cluster together,
a certain age group is most likely to purchase a certain amount of products.
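
The age-versus-sale-size grouping can be sketched with k-means, one common clustering
algorithm (the paper does not name a specific one, so this choice is an assumption).
The data points, the starting centroids, and the function name kmeans are
illustrative only.

```python
import math

# Hypothetical (customer age, size of sale) pairs, not taken from the paper.
points = [(22, 30), (25, 35), (24, 28), (51, 120), (55, 110), (53, 130)]


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Start from one younger and one older customer as initial guesses.
centroids, clusters = kmeans(points, [points[0], points[3]])
print(centroids)
print(clusters)
```

Each resulting cluster corresponds to an age group with a typical size of sale. With
real data the two axes would normally be rescaled first so that neither one dominates
the distance calculation.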

Decision trees

Decision trees are related to most of the other techniques; they are used either as
part of the selection criteria or to support the selection and use of specific data
within the overall structure. In a decision tree, a question may have two answers.
Each answer leads to a further question, which helps to identify the data so that it
can be slotted into a separate category.
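
The question-and-answer structure described above can be sketched as a nested
dictionary in Python, where each question has a "yes" and a "no" branch. The questions
and categories below are hypothetical and reuse the car example purely for
illustration.

```python
# Each inner node asks a yes/no question; each leaf is a category name.
# The questions and categories are illustrative assumptions.
tree = {
    "question": "has_separate_boot",
    "yes": "sedan",
    "no": {
        "question": "high_ground_clearance",
        "yes": "SUV",
        "no": "hatchback",
    },
}


def decide(node, answers):
    """Follow the yes/no answers down the tree until a leaf is reached."""
    while isinstance(node, dict):
        branch = "yes" if answers[node["question"]] else "no"
        node = node[branch]
    return node


car = {"has_separate_boot": False, "high_ground_clearance": True}
print(decide(tree, car))
```

Only the questions on the path actually taken are asked, which is why a decision tree
can sort a record into a category with very few comparisons.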
Data implementation and preparation

Data mining depends upon a well-defined structure or model that can retrieve the
correct or specific data the user wants. Data implementation works to make data
mining as effective and efficient as possible.

The most important step is often to build and transform the data rather than merely
to retrieve it. This involves the difficult process of identifying, aggregating,
simplifying, or expanding the data to suit the input data.
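
A minimal sketch of such preparation in Python is given below: inconsistent records
are simplified into one format and then aggregated. The record layout, the field
names, and the helper names (prepare, total_by_city) are assumptions made for
illustration.

```python
from collections import defaultdict

# Hypothetical raw sales records with inconsistent formatting and a gap.
raw_records = [
    {"customer": "A", "city": " Noida ", "amount": "120.50"},
    {"customer": "B", "city": "noida", "amount": "80"},
    {"customer": "C", "city": "Delhi", "amount": None},
    {"customer": "D", "city": "DELHI", "amount": "200"},
]


def prepare(records):
    """Simplify: drop records with no amount, normalise city names,
    and convert amounts from text to numbers."""
    cleaned = []
    for record in records:
        if record["amount"] is None:
            continue
        cleaned.append({
            "city": record["city"].strip().title(),
            "amount": float(record["amount"]),
        })
    return cleaned


def total_by_city(records):
    """Aggregate: sum the sale amounts for each city."""
    totals = defaultdict(float)
    for record in records:
        totals[record["city"]] += record["amount"]
    return dict(totals)


totals = total_by_city(prepare(raw_records))
print(totals)
```

Without the normalisation step, " Noida " and "noida" would be counted as two
different cities, which shows why preparation matters before any mining technique is
applied.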
Conclusion

Data mining is not only about running a command against the data present in the
database. The data must be organised, whether by structuring it or by building a
model, using SQL or software such as Hadoop. Getting the data into the format one
needs depends on the tools and techniques used. Once the data is in the required
format, any of the tools or techniques can be applied, regardless of the original
data type or structure.
REFERENCES

1. WIKIPEDIA
2. IBM
3. ZENTUT
4. SEMANTIC SCHOLAR
