NEW LINK BASED APPROACH FOR CATEGORICAL DATA CLUSTERING

KNOWLEDGE AND DATA ENGINEERING

By CHIRANTH B O, 4th Sem M.Tech

Presentation Outline
Introduction to Clustering
Abstract
Existing System
Proposed System
Experimental Design
Experimental Results
Conclusion
July 22, 2013

Clustering
Introduction
Clustering: grouping similar kinds of data. Data clustering concerns how to group a set of objects based on the similarity of their attributes.

Main methods:
Partitioning: K-Means
Hierarchical: BIRCH, ROCK
Density-based: DBSCAN

A good clustering method produces high-quality clusters with:
high intra-class similarity
low inter-class similarity


ABSTRACT
Categorical data clustering methods generate results based on incomplete information, which degrades the quality of the clustering result. This paper presents a new link-based approach for categorical data clustering that improves results by discovering unknown entries through the similarity between clusters.


Existing Methods

K-means cannot cluster categorical data directly, because its centroids are arithmetic means, which are undefined for categorical values.

SQUEEZER and CACTUS generate their final clustering using incomplete information: many data entries are left unknown.
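The point about K-means can be illustrated with a small example. This is a minimal sketch, assuming the simple matching (Hamming) dissimilarity that is commonly used for categorical records in place of Euclidean distance; the function name and the sample records are hypothetical and do not come from the slides.

```python
def simple_matching_dissimilarity(a, b):
    """Count the attributes on which two categorical records disagree."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(1 for x, y in zip(a, b) if x != y)


# Hypothetical records with attributes (colour, shape, size).
r1 = ("red", "round", "small")
r2 = ("red", "square", "small")
r3 = ("blue", "square", "large")

print(simple_matching_dissimilarity(r1, r2))  # 1 (only the shape differs)
print(simple_matching_dissimilarity(r1, r3))  # 3 (every attribute differs)
```

There is no meaningful "average" of "red" and "blue", which is why a mean-based centroid cannot be formed for such records.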


Proposed Methods

The link-based approach improves the matrix by discovering the unknown entries.

An efficient link-based algorithm is used to find the similarity between clusters.


Introduction to NLCD

Designed for very large data sets:
Time and memory are limited
Only one scan of the data is necessary
Does not need the whole data set in advance

Two key modules:
Scan the database to build a binary matrix.
Build the refined matrix using the Weighted Triple Quality (WTQ) algorithm.
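The first module can be sketched as follows. This is a minimal illustration, assuming the binary matrix is the usual cluster-association matrix of a cluster ensemble (one row per data point, one column per cluster, entry 1 when the point belongs to the cluster); the function name and the two sample clusterings are hypothetical.

```python
def binary_association_matrix(base_clusterings):
    """Build a binary matrix: rows are data points, columns are clusters.

    base_clusterings: list of label lists, one per base clustering,
    each holding one label per data point.
    Returns (matrix, columns), where columns[j] is the pair
    (clustering index, cluster label) described by column j.
    """
    n = len(base_clusterings[0])
    columns = []
    for m, labels in enumerate(base_clusterings):
        if len(labels) != n:
            raise ValueError("every clustering must label every data point")
        for label in sorted(set(labels)):
            columns.append((m, label))
    matrix = [
        [1 if base_clusterings[m][i] == label else 0 for m, label in columns]
        for i in range(n)
    ]
    return matrix, columns


# Hypothetical ensemble: two base clusterings of five data points.
ensemble = [
    ["a", "a", "b", "b", "b"],
    ["x", "x", "x", "y", "y"],
]
matrix, columns = binary_association_matrix(ensemble)
for row in matrix:
    print(row)
```

Each row contains exactly one 1 per base clustering; the remaining 0 entries are the "unknown" associations that the refined matrix is meant to fill in.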


Basic process
[Figure: Dataset X is partitioned by M base clusterings (Clustering 1, Clustering 2, ..., Clustering M), whose outputs are combined by a consensus function.]


Clustering

[Figure: the clustering results represented as a pairwise-similarity matrix and as a binary matrix.]
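A pairwise-similarity matrix can be sketched as below. This is an assumption about what the slide shows: the common co-association form, in which the similarity of two data points is the fraction of base clusterings that place them in the same cluster. The function name and sample data are hypothetical.

```python
def pairwise_similarity_matrix(base_clusterings):
    """Fraction of base clusterings in which each pair of points co-occurs."""
    n = len(base_clusterings[0])
    m = len(base_clusterings)
    counts = [[0] * n for _ in range(n)]
    for labels in base_clusterings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    # Divide once at the end to avoid accumulating rounding error.
    return [[c / m for c in row] for row in counts]


# Hypothetical ensemble: two base clusterings of five data points.
ensemble = [
    ["a", "a", "b", "b", "b"],
    ["x", "x", "x", "y", "y"],
]
sim = pairwise_similarity_matrix(ensemble)
print(sim[0][1])  # 1.0: points 0 and 1 share a cluster in both clusterings
print(sim[1][2])  # 0.5: they share a cluster only in the second clustering
print(sim[0][3])  # 0.0: they never share a cluster
```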


Weighted Triple Quality

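ALGORITHM: WTQ($G$, $C_x$, $C_y$)
$G = (V, W)$, a weighted graph, where $C_x, C_y \in V$;
$N_k \subset V$, the set of adjacent neighbors of $C_k \in V$;
$W_k = \sum_{C_t \in N_k} w_{tk}$;
$WTQ_{xy}$, the WTQ measure of $C_x$ and $C_y$;
(1) $WTQ_{xy} \leftarrow 0$
(2) For each $c \in N_x$
(3)     If $c \in N_y$
(4)         $WTQ_{xy} \leftarrow WTQ_{xy} + \frac{1}{W_c}$
(5) Return $WTQ_{xy}$

Following that, the similarity between clusters $C_x$ and $C_y$ can be estimated by

$Sim(C_x, C_y) = \frac{WTQ_{xy}}{WTQ_{max}} \times DC$

where $WTQ_{max}$ is the maximum $WTQ_{pq}$ value over any two clusters $C_p, C_q \in V$ and $DC \in (0, 1]$ is a constant decay factor.

Cluster Network
$w_{xy} \in W$, where $C_x, C_y \in V$

Overlapping Members
$w_{xy} = \frac{|L_x \cap L_y|}{|L_x \cup L_y|}$

where $L_z$ denotes the set of data points belonging to cluster $C_z$.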
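The cluster network and the WTQ measure can be sketched together. This is a minimal illustration under stated assumptions, not the authors' implementation: edge weights are taken to be the Jaccard overlap of cluster members, each cluster c adjacent to both input clusters contributes 1 / W_c (W_c being the total weight of the edges at c), and the similarity scales WTQ by its maximum over all cluster pairs and by a decay factor. The names `cluster_network`, `wtq`, `wtq_similarity`, the default `dc=0.9`, and the sample clusters are all hypothetical.

```python
from itertools import combinations


def cluster_network(members):
    """Weight each pair of clusters by the Jaccard overlap of their members.

    members: dict mapping a cluster name to the set of data point ids in it.
    Returns a symmetric dict of dicts holding only the non-zero weights.
    """
    weights = {name: {} for name in members}
    for x, y in combinations(members, 2):
        shared = members[x] & members[y]
        if shared:
            w = len(shared) / len(members[x] | members[y])
            weights[x][y] = w
            weights[y][x] = w
    return weights


def wtq(weights, cx, cy):
    """Sum 1 / W_c over every cluster c adjacent to both cx and cy."""
    total = 0.0
    for c in weights[cx]:
        if c in weights[cy]:
            w_c = sum(weights[c].values())  # total weight of edges at c
            total += 1.0 / w_c
    return total


def wtq_similarity(weights, cx, cy, dc=0.9):
    """Scale WTQ by its maximum over all cluster pairs, then by decay dc."""
    wtq_max = max(wtq(weights, p, q) for p, q in combinations(weights, 2))
    if wtq_max == 0:
        return 0.0
    return wtq(weights, cx, cy) / wtq_max * dc


# Hypothetical clusters from two base clusterings over data points 0..4.
members = {
    "a": {0, 1}, "b": {2, 3, 4},   # first clustering
    "x": {0, 1, 2}, "y": {3, 4},   # second clustering
}
weights = cluster_network(members)
print(wtq(weights, "a", "b"))             # shared neighbor: x
print(wtq_similarity(weights, "a", "b"))
print(wtq_similarity(weights, "a", "y"))  # no shared neighbor
```

Clusters "a" and "b" never overlap directly, yet both overlap "x", so the link-based measure still assigns them a non-zero similarity. This is the mechanism by which unknown entries can be estimated.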


Experimental Results
Input parameters:
Memory (M): 5% of data set
Disk space (R): 20% of M
Initial threshold (T): 0.0
Page size (P): 1024 bytes

Experimental Results
KMEANS clustering
No  Time  D     # Scan    DS  Time  D     # Scan
1   43.9  2.09  289       1o  33.8  1.97  197
2   13.2  4.43  51        2o  12.7  4.20  29
3   32.9  3.66  187       3o  36.0  4.35  241

NLCD clustering
No  Time  D     # Scan    DS  Time  D     # Scan
1   11.5  1.87  2         1o  13.6  1.87  2
2   10.7  1.99  2         2o  12.1  1.99  2
3   11.4  3.95  2         3o  12.2  3.99  2


Conclusions
NLCD is a new link-based clustering approach that stores the clustering features in a matrix.
Given a limited amount of main memory, NLCD can minimize the time required for I/O.
The problem of constructing the refined matrix is efficiently resolved using the similarity among categorical clusters.

Future Work
The first direction for future work is an extensive study of the behavior of other link-based similarity measures within this problem context.
The second is to apply the new method to specific domains, including tourism and medical data sets.



Q&A

Thank you for your patience
