Bine ați venit la Scribd!

COL870: Clustering Algorithms: Ragesh Jaiswal, CSE, IIT Delhi

Încărcat de

0% au considerat acest document util (0 voturi)

15 vizualizări16 pagini

This document provides information about the COL870: Clustering Algorithms course taught by Ragesh Jaiswal at IIT Delhi. The grading scheme and policies are outlined, including that cheating will result in an F grade. There are no required textbooks; reference materials will be provided online. The course website provides additional resources. Clustering involves grouping similar objects without predefined labels, unlike classification which uses supervised learning with labeled training data. The course will focus on understanding clustering techniques and algorithms, with optional programming assignments.

Descriere originală:

Titlu original

lec-1

Drepturi de autor

Formate disponibile

PDF, TXT sau citiți online pe Scribd

Partajați acest document

Partajați sau inserați document

Opțiuni de partajare

Vi se pare util acest document?

Este necorespunzător acest conținut?

Raportați acest document

Drepturi de autor:

Formate disponibile

Descărcați ca PDF, TXT sau citiți online pe Scribd

Indicator pentru conținut neadecvat

0% au considerat acest document util (0 voturi)

15 vizualizări16 pagini

COL870: Clustering Algorithms: Ragesh Jaiswal, CSE, IIT Delhi

Încărcat de

Pooja Sinha

Drepturi de autor:

Formate disponibile

Descărcați ca PDF, TXT sau citiți online pe Scribd

Indicator pentru conținut neadecvat

Salt la pagina

Sunteți pe pagina 1din 16

Căutați în document

COL870: Clustering Algorithms

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

Administrative Information

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

Administrative Information

Instructor
Ragesh Jaiswal
Office: 403, SIT Building
Email: rjaiswal@cse.iitd.ac.in

Please send email to set up a meeting with me.

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

Administrative Information

Grading Scheme
Homework: 25 (12.5 Theory and 12.5 programming)
Minor 1 and 2: 15 points each.
Major: 25 points.
4 Project: 18 points.
5 Attendance: 2 points.
1
2
3

Policy on cheating:
Anyone found using unfair means in the course will receive an
F grade.

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

Administrative Information

Textbook: There are no textbooks for this course. This is an

advanced level course. The reference material will include
book chapters, course notes, research papers, and other
material available online.

Course webpage:
http://www.cse.iitd.ac.in/~rjaiswal/2015/col870.
The site will contain course information, references, and
announcements. Please check this page regularly.

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

Introduction

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

Introduction

What is data clustering?

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

Introduction

What is data clustering?

The task of grouping a set of objects such that objects in the
same group (called cluster) are more similar than objects in
different groups.
Given a representation of n objects, find k groups based on a
measure of similarity (dissimilarity) such that the similarities
between objects in the same group are high while similarities
between objects in different groups are low.

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

Introduction
Why study data clustering?
Custering is usually the first step when trying to make sense of
Big Data.
Common tool across many different fields such as image
analysis, pattern recognition, bioinformatics, information
retrieval etc.

Data clustering has been used for the following three main
purposes [Jain09]:
Underlying structure: to gain insight into data, generate
hypotheses, detect anomalies, and identify salient features.
Natural classification: to identify the degree of similarity
among forms or organisms.
Compression: as a method for organizing the data and
summarizing it through cluster prototypes.

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

Introduction

How is clustering different from classification?

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

Introduction

How is clustering different from classification?

Classification
The task of assigning
objects to predefined
classes.

Clustering
The task of grouping objects without any knowledge of the classes.

The more general terms used are supervised learning versus

unsupervised learning.
Supervised Learning
- Classes/labels known
- Expert available
- Training data available

Ragesh Jaiswal, CSE, IIT Delhi

Unsupervised Learning
- Classes/labels unknown
- No experts
- No training data

COL870: Clustering Algorithms

Introduction
Frequently Asked Questions

What are the pre-requisites for the course?

Answer: Data Structures and Algorithms. I will assume that you
are familiar with Algorithm design and analysis techniques and
intractability concepts like NP-completeness etc.

What flavour will this course have, algorithms or machine

learning?
Answer: The course will have the rigour of typical algorithms
courses. That is, we will spend time proving things. However,
techniques we learn in the course will have applications in many
different areas.

Does the course involve programming?

Answer: Depends on the interest of the class. Our focus will be to
understand clustering techniques. However, if the class is
interested, I can set up programming assignments.

Does the course involve a project?

Answer: Yes. You will be asked to read and present research
papers on clustering topics towards the later part of the course.

Do we become more marketable after doing all this?

Answer: Perhaps yes!
Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

Introduction

What is data clustering?

Given a representation of n objects, find k groups based on a
measure of similarity (dissimilarity) such that the similarities
between objects in the same group are high while similarities
between objects in different groups are low.

Suppose the given objects to be clustered can be represented

as points in two-dimensional space (i.e., R2 ).
What is a reasonable notion of similarity between objects?

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

Introduction
What is data clustering?
Given a representation of n objects, find k groups based on a
measure of similarity (dissimilarity) such that the similarities
between objects in the same group are high while similarities
between objects in different groups are low.

Suppose the given objects to be clustered can be represented

as points in two-dimensional space (i.e., R2 ).
What is a reasonable notion of similarity between objects?
Distance between points.

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

Suppose the given objects to be clustered can be represented

as points in two-dimensional space (i.e., R2 ).
What is a reasonable notion of similarity between objects?
Distance between points.
The notion of similarity/dissimilarity has to be defined
carefully.

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

End

Ragesh Jaiswal, CSE, IIT Delhi

COL870: Clustering Algorithms

S-ar putea să vă placă și

Weekly 12
Document3 pagini
Weekly 12
Pablo Escobar
Încă nu există evaluări
Data Mining and Visualization Question Bank
Document11 pagini
Data Mining and Visualization Question Bank
ghost
100% (1)
Data Mining
Document9 pagini
Data Mining
Chethan.M
100% (4)
Ecml12 Verma
Document16 pagini
Ecml12 Verma
saurabh_34
Încă nu există evaluări
An Overview of Anomalous Sub Population: Mukti, Hari Singh
Document4 pagini
An Overview of Anomalous Sub Population: Mukti, Hari Singh
erpublication
Încă nu există evaluări
Machine Learning with Clustering: A Visual Guide for Beginners with Examples in Python
De la Everand
Machine Learning with Clustering: A Visual Guide for Beginners with Examples in Python
Artem Kovera
Încă nu există evaluări
CS583 Chapter 4 Supervised Learning
Document166 pagini
CS583 Chapter 4 Supervised Learning
Kunal Deore
Încă nu există evaluări
Sequence Alignment Dissertation
Document4 pagini
Sequence Alignment Dissertation
PayToWritePaperUK
100% (1)
Data Mining - Tasks
Document7 pagini
Data Mining - Tasks
vijayganesh pinisetti
Încă nu există evaluări
Ho01 DMM 01
Document15 pagini
Ho01 DMM 01
Jaleelah Abdallah
Încă nu există evaluări
DSA Presentation Group 6
Document34 pagini
DSA Presentation Group 6
AYUSHI WAKODE
Încă nu există evaluări
An Enhanced Technique For Database Clustering Using An Extensive Data Set For Multi-Valued Attributes
Document6 pagini
An Enhanced Technique For Database Clustering Using An Extensive Data Set For Multi-Valued Attributes
International Journal of Application or Innovation in Engineering & Management
Încă nu există evaluări
AI ML Unit 4 QB
Document38 pagini
AI ML Unit 4 QB
Sumedh Rane
Încă nu există evaluări
知识图谱三元组定义参考
Document9 pagini
知识图谱三元组定义参考
luoluo Li
Încă nu există evaluări
Data Clustering: 50 Years Beyond K-Means
Document35 pagini
Data Clustering: 50 Years Beyond K-Means
Soumen Chowdhury
Încă nu există evaluări
Design A Database
Document39 pagini
Design A Database
Birhanu Girmay
Încă nu există evaluări
Lec 1
Document16 pagini
Lec 1
Sahil Mittal
Încă nu există evaluări
Chapter 1 PART B
Document76 pagini
Chapter 1 PART B
sai rahul
Încă nu există evaluări
Unpacking Agile Enterprise Architecture Innovation Work Practices: A Qualitative Case Study of A Railroad Company
Document10 pagini
Unpacking Agile Enterprise Architecture Innovation Work Practices: A Qualitative Case Study of A Railroad Company
4zoli
Încă nu există evaluări
CS583 Supervised Learning
Document166 pagini
CS583 Supervised Learning
Ahmad Helmy
Încă nu există evaluări
SLIQ: A Fast Scalable Classifier For Data Mining
Document15 pagini
SLIQ: A Fast Scalable Classifier For Data Mining
Andrew Dotson
Încă nu există evaluări
Research
Document16 pagini
Research
martin martin
Încă nu există evaluări
DBA Lecture 2 ER
Document48 pagini
DBA Lecture 2 ER
arean
Încă nu există evaluări
Da Unit-4
Document43 pagini
Da Unit-4
Shruthi Sayam
Încă nu există evaluări
Activity 1 PDF
Document3 pagini
Activity 1 PDF
John Michael Reyes
Încă nu există evaluări
Data Mining Course Overview
Document38 pagini
Data Mining Course Overview
harishkode
Încă nu există evaluări
6 Text Clustering
Document66 pagini
6 Text Clustering
Tushar Shah
Încă nu există evaluări
Abstract Hi-Tech Hexagons PowerPoint Templates
Document10 pagini
Abstract Hi-Tech Hexagons PowerPoint Templates
Nikhat Parween
Încă nu există evaluări
Data Structure Interview Questions and Answers (Top 46)
Document21 pagini
Data Structure Interview Questions and Answers (Top 46)
Ritika Kamble
Încă nu există evaluări
Classvac
Document14 pagini
Classvac
نبيل يحيي
Încă nu există evaluări
BMW M-4
Document108 pagini
BMW M-4
Tarun K
Încă nu există evaluări
What Is A Set in Data Structure - Google Search
Document1 pagină
What Is A Set in Data Structure - Google Search
Srinidhi
Încă nu există evaluări
Data Analytics - Object Segmentation UNIT-IV
Document33 pagini
Data Analytics - Object Segmentation UNIT-IV
19524 Alekhya
Încă nu există evaluări
DWM Mid 2 Question Bank
Document5 pagini
DWM Mid 2 Question Bank
Indu Alluru
Încă nu există evaluări
HierarchicalClusteringASurvey - Published7 3 9 871
Document5 pagini
HierarchicalClusteringASurvey - Published7 3 9 871
thaneatharran santharasekaran
Încă nu există evaluări
Modelling Business Information: Entity relationship and class modelling for Business Analysts
De la Everand
Modelling Business Information: Entity relationship and class modelling for Business Analysts
Keith Gordon
Încă nu există evaluări
Semi-Supervised Learning A Brief Review
Document6 pagini
Semi-Supervised Learning A Brief Review
Reshma Khemchandani
Încă nu există evaluări
Data Mining With Clustering AND Classification
Document16 pagini
Data Mining With Clustering AND Classification
Amanjyot Singh Oberoi
Încă nu există evaluări
Association Rule Generation For Student Performance Analysis Using Apriori Algorithm
Document5 pagini
Association Rule Generation For Student Performance Analysis Using Apriori Algorithm
thesij
Încă nu există evaluări
Report 1 Coderindeed PDF
Document15 pagini
Report 1 Coderindeed PDF
Ayushi
Încă nu există evaluări
02 Database Design ER Model
Document22 pagini
02 Database Design ER Model
Unyil Persi
Încă nu există evaluări
Database Model
Document24 pagini
Database Model
shashwat2010
Încă nu există evaluări
A Complete Set of Systems: Thinking Skills
Document13 pagini
A Complete Set of Systems: Thinking Skills
system user
Încă nu există evaluări
Essentials of Systems Analysis and Design 6th Edition Valacich Solutions Manual
Document35 pagini
Essentials of Systems Analysis and Design 6th Edition Valacich Solutions Manual
mantlingunturnedggxqaz
100% (21)
DMW - Unit 1
Document21 pagini
DMW - Unit 1
Priya Bhalerao
Încă nu există evaluări
CA Home: Interview Preparation
Document32 pagini
CA Home: Interview Preparation
Sai Ram
Încă nu există evaluări
The Steps of Qualitative Data Analysis
Document92 pagini
The Steps of Qualitative Data Analysis
Emmanuel Michael
Încă nu există evaluări
Un-Supervised Machine Learning
Document9 pagini
Un-Supervised Machine Learning
ranamzeeshan
Încă nu există evaluări
An Efficient Classification Algorithm For Real Estate Domain
Document7 pagini
An Efficient Classification Algorithm For Real Estate Domain
IJMER
Încă nu există evaluări
Data Model - Important - Concepts
Document24 pagini
Data Model - Important - Concepts
MUKIZA Enos
Încă nu există evaluări
A Direct Data-Cluster Analysis Method Based On Neutrosophic Set Implication
Document18 pagini
A Direct Data-Cluster Analysis Method Based On Neutrosophic Set Implication
Science Direct
Încă nu există evaluări
About The Classification and Regression Supervised Learning Problems
Document3 pagini
About The Classification and Regression Supervised Learning Problems
saileshmns
Încă nu există evaluări
Explain The Concept of Foreign Key. How A Foreign Key Differs From A Primary Key? Can The Foreign Key Accept Nulls? Answer
Document5 pagini
Explain The Concept of Foreign Key. How A Foreign Key Differs From A Primary Key? Can The Foreign Key Accept Nulls? Answer
hayerpa
Încă nu există evaluări
ASSL Professor Paper
Document15 pagini
ASSL Professor Paper
minhaz21
Încă nu există evaluări
Importany Questions Unit 3 4
Document30 pagini
Importany Questions Unit 3 4
Mubena Hussain
Încă nu există evaluări
Machine Learning CS229/STATS229: Instructors: Moses Charikar, Tengyu Ma, and Chris Re
Document40 pagini
Machine Learning CS229/STATS229: Instructors: Moses Charikar, Tengyu Ma, and Chris Re
Saurabh suman
Încă nu există evaluări
Data Mining Query Languages
Document30 pagini
Data Mining Query Languages
Jámès Kõstã
Încă nu există evaluări
CSL105: Discrete Mathematical Structures: Ragesh Jaiswal, CSE, IIT Delhi
Document28 pagini
CSL105: Discrete Mathematical Structures: Ragesh Jaiswal, CSE, IIT Delhi
SRINIVASA RAO
Încă nu există evaluări
For Unit 4 Useful
Document107 pagini
For Unit 4 Useful
shilpa dirisala
Încă nu există evaluări
Chapter: Entity-Relationship Model
Document74 pagini
Chapter: Entity-Relationship Model
Raghav Rathi
Încă nu există evaluări
ISRO2016
Document29 pagini
ISRO2016
Pooja Sinha
Încă nu există evaluări
CSP 2015 Eng
Document30 pagini
CSP 2015 Eng
AakashVats
Încă nu există evaluări
ISRO2017
Document26 pagini
ISRO2017
Pooja Sinha
Încă nu există evaluări
Mobile Identiy
Document18 pagini
Mobile Identiy
Pooja Sinha
Încă nu există evaluări
Intro Analysis
Document269 pagini
Intro Analysis
Brian Precious
Încă nu există evaluări
Country Project
Document32 pagini
Country Project
SeemiAli
100% (1)
RBI Mains English Paper 2007 To 2011
Document14 pagini
RBI Mains English Paper 2007 To 2011
sssbulbul
Încă nu există evaluări
Economy
Document63 pagini
Economy
Pooja Sinha
Încă nu există evaluări
CSP 2015 Eng
Document30 pagini
CSP 2015 Eng
AakashVats
Încă nu există evaluări
Chemistry Cbse Solved
Document9 pagini
Chemistry Cbse Solved
a
Încă nu există evaluări
Ciml v0 - 8 All Machine Learning
Document189 pagini
Ciml v0 - 8 All Machine Learning
aglobal2
100% (2)
Politička Karta Svijeta
Document1 pagină
Politička Karta Svijeta
Dario Šakić
Încă nu există evaluări
11 July 2016 13:29: Unfiled Notes Page 1
Document1 pagină
11 July 2016 13:29: Unfiled Notes Page 1
Pooja Sinha
Încă nu există evaluări
World Physical 2011 Nov
Document1 pagină
World Physical 2011 Nov
Bruno Teixeira
Încă nu există evaluări
MM508 Topologi I Ugeseddel 1
Document1 pagină
MM508 Topologi I Ugeseddel 1
Pooja Sinha
Încă nu există evaluări
Completeness of R
Document10 pagini
Completeness of R
SIVA KRISHNA PRASAD ARJA
Încă nu există evaluări
Af 1
Document25 pagini
Af 1
Pooja Sinha
Încă nu există evaluări
Politička Karta Svijeta
Document1 pagină
Politička Karta Svijeta
Dario Šakić
Încă nu există evaluări
Cs 2kjkj VB
Document2 pagini
Cs 2kjkj VB
Alok Kumar Singh
Încă nu există evaluări
Mathematics IA Algebra and Geometry (Part I) Michaelmas Term 2002
Document30 pagini
Mathematics IA Algebra and Geometry (Part I) Michaelmas Term 2002
Pooja Sinha
Încă nu există evaluări
Q. 1 - Q. 25 Carry One Mark Each
Document12 pagini
Q. 1 - Q. 25 Carry One Mark Each
Pooja Sinha
Încă nu există evaluări
The Archimedean Property
Document1 pagină
The Archimedean Property
Pooja Sinha
Încă nu există evaluări
COS 511: Foundations of Machine Learning
Document7 pagini
COS 511: Foundations of Machine Learning
Pooja Sinha
Încă nu există evaluări
GATE 2014 Answer Keys For CS Computer Science and Information Technology
Document1 pagină
GATE 2014 Answer Keys For CS Computer Science and Information Technology
Manoj Bojja
Încă nu există evaluări
Q. 1 - Q. 25 Carry One Mark Each
Document12 pagini
Q. 1 - Q. 25 Carry One Mark Each
Pooja Sinha
Încă nu există evaluări
01 Streams
Document34 pagini
01 Streams
Pooja Sinha
Încă nu există evaluări
GATE 2014 Answer Keys For CS Computer Science and Information Technology
Document1 pagină
GATE 2014 Answer Keys For CS Computer Science and Information Technology
Manoj Bojja
Încă nu există evaluări
GATE 2014 Answer Keys For CS Computer Science and Information Technology
Document1 pagină
GATE 2014 Answer Keys For CS Computer Science and Information Technology
Manoj Bojja
Încă nu există evaluări
Q. 1 - Q. 25 Carry One Mark Each.: GX X X HX X GHX HGX HX GX X GX HX X X
Document11 pagini
Q. 1 - Q. 25 Carry One Mark Each.: GX X X HX X GHX HGX HX GX X GX HX X X
Fahad Rasool Dar
Încă nu există evaluări
CS: Computer Science and Information Technology 7Th Feb Shift2 65 100.0
Document25 pagini
CS: Computer Science and Information Technology 7Th Feb Shift2 65 100.0
Pooja Sinha
Încă nu există evaluări
20KVA Hipulse D - SIMHADRI POWER LTD.
Document36 pagini
20KVA Hipulse D - SIMHADRI POWER LTD.
seil iex
Încă nu există evaluări
LCP - Macatcatud PS School Year 2021-2022
Document7 pagini
LCP - Macatcatud PS School Year 2021-2022
Geoff Rey
Încă nu există evaluări
Smooth Start of Single Phase Induction Motor: Project Seminar On
Document11 pagini
Smooth Start of Single Phase Induction Motor: Project Seminar On
Mayur Pote
100% (1)
Form P
Document2 pagini
Form P
tanmoy
0% (1)
Merits and Demerits of Arithemetic Mean
Document10 pagini
Merits and Demerits of Arithemetic Mean
tarun
86% (7)
M3u Playlist 2022
Document9 pagini
M3u Playlist 2022
Carlos Danelon
Încă nu există evaluări
Project Report
Document61 pagini
Project Report
Sheth Khader
0% (1)
Prepared By: Y. Rohita Assistant Professor Dept. of IT
Document73 pagini
Prepared By: Y. Rohita Assistant Professor Dept. of IT
naman jaiswal
Încă nu există evaluări
Lecture 1 Robotics
Document34 pagini
Lecture 1 Robotics
Arshad
Încă nu există evaluări
LED Button
Document14 pagini
LED Button
labirint10
Încă nu există evaluări
Picoflow: Continuous Ow Measurement at Low Solid/Air Ratios
Document4 pagini
Picoflow: Continuous Ow Measurement at Low Solid/Air Ratios
Duc Duong Tich
Încă nu există evaluări
Internet of Things (IoT) Module 1
Document37 pagini
Internet of Things (IoT) Module 1
Richa Tengshe
Încă nu există evaluări
Exam 312-50 640
Document223 pagini
Exam 312-50 640
thyago rodrigues de souza
Încă nu există evaluări
Sale Register Report PMGKAY June 2022 PHH
Document10 pagini
Sale Register Report PMGKAY June 2022 PHH
saumya
Încă nu există evaluări
DSE7410-MKII-DSE7420-MKII Manual
Document2 pagini
DSE7410-MKII-DSE7420-MKII Manual
Carlo Jasmin
Încă nu există evaluări
Ecen 248 Lab 8 Report
Document7 pagini
Ecen 248 Lab 8 Report
api-241454978
Încă nu există evaluări
ERP
Document7 pagini
ERP
afzaljamil
Încă nu există evaluări
009-Vicitra-Kutumbam-01pdf 2 PDF
Document50 pagini
009-Vicitra-Kutumbam-01pdf 2 PDF
David Sandela
57% (7)
Threats, Vulnerabilities, and Attacks
Document9 pagini
Threats, Vulnerabilities, and Attacks
John Hector Sayno
Încă nu există evaluări
Signals and Systems PDF
Document1 pagină
Signals and Systems PDF
Vishal Garimella
Încă nu există evaluări
Internet Fundamentals: CS 299 - Web Programming and Design
Document17 pagini
Internet Fundamentals: CS 299 - Web Programming and Design
gopitheprince
Încă nu există evaluări
MAD Lab File
Document75 pagini
MAD Lab File
RAVINDER
Încă nu există evaluări
Module-4 Product and Service Design
Document30 pagini
Module-4 Product and Service Design
Ji Yu
Încă nu există evaluări
Counterfeit Bill Detection
Document4 pagini
Counterfeit Bill Detection
darel perez
Încă nu există evaluări
Star Dock
Document4 pagini
Star Dock
api-3740649
Încă nu există evaluări
Basic Electronics:: Carnegie Mellon Lab Manual
Document145 pagini
Basic Electronics:: Carnegie Mellon Lab Manual
Palak Ariwala
Încă nu există evaluări
Chapter 2: Programming Architecture 2.1 Model View Controller (MVC)
Document7 pagini
Chapter 2: Programming Architecture 2.1 Model View Controller (MVC)
Nabin Shrestha
Încă nu există evaluări
Analytics Cross Selling Retail Banking
Document11 pagini
Analytics Cross Selling Retail Banking
praveen_bpgc
Încă nu există evaluări
Phishing Websites Detection Based On Phishing Characteristics in The Webpage Source Code
Document9 pagini
Phishing Websites Detection Based On Phishing Characteristics in The Webpage Source Code
api-233113554
100% (1)
Mes3083 Set 1 PDF
Document7 pagini
Mes3083 Set 1 PDF
Ahmad Solihuddin
Încă nu există evaluări