Outline - Advanced Analytics 2017-19

Course: Advanced Analytics
Faculty: Prof. V. Nagadevara, nagadev@iimb.ernet.in

Term: Term 5 (2017-19); Pre-requisites: Core courses.
No. of sessions: 17 sessions of 90 minutes each.
Textbook: Bing Liu, “Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data”,
2nd ed., Springer, 2011.
In addition, a number of journal articles will be supplied for additional reading.
Course Objective
This course is planned to provide students with:
• an understanding of basic concepts and methods in text mining, such as document
  representation, information extraction, text classification and clustering
• the ability to use benchmark corpora and commercial and open-source text analysis and
  visualization tools to explore interesting patterns in textual data
• an understanding of various techniques and algorithms (such as support vector machines and
  naïve Bayes) for advanced text mining, text classification and clustering, and opinion mining,
  and their applications in real-world problems
• knowledge of the various components of web mining, such as web structure mining, web
  content mining and web usage mining
• familiarity with the basic concepts involved in image mining, including image processing and
  classification
Given the large amounts of unstructured data flooding the Internet, mining high-quality information
from text and the web is becoming increasingly critical. The actionable knowledge extracted from text
data facilitates effective decision making in a broad spectrum of areas, including business
intelligence, information acquisition, social behaviour analysis and strategy formulation. This course
will cover important topics in text mining, web mining and image mining, leading to text and web
analytics. Students will also be exposed to the use of software for text and web mining. The course
emphasises the use of techniques for different aspects of text and web mining, and strategy
formulation based on the mining results. In general, exposure to each topic will be supported by a
real-life case study. To make the course more relevant and practice-oriented, participants will use
the popular analytics package WEKA and apply the techniques to different corpora (text databases).
Each topic will include a real-life application.
Course Outline
S. No.  | Topic                                                   | Reading material          | Application
--------|---------------------------------------------------------|---------------------------|-----------------------------------------------------
1       | Introduction to mining unstructured data                | Chapter 6                 |
2 and 3 | Natural language processing and document representation | Article on NLP, Chapter 6 |
4       | Classification Techniques for Textual Documents         | Chapters 3.2 and 3.3      | Automated patent classification
5       | Support Vector Machines for Classification              | Chapter 3.8               | Sentiment analysis based on sports forum
6       | Naïve Bayes method for classification                   | Chapters 3.6 and 3.7      | Extraction of Product Attributes
7       | Clustering techniques for mining text data              | Chapters 4.1 to 4.4       | Automatic labelling of hierarchical clusters
8       | Latent Semantic Analysis                                | Chapter 6.7               | Documentation-to-Source-Code Traceability
9       | Sentiment analysis                                      | Chapter 11.1              | Sentiment analysis of Movie reviews
10      | Use of WEKA for Text Mining                             |                           |
11      | Introduction to web mining                              | Hand-out                  | Analyzing e-Commerce website: Flipkart
12      | Web structure mining                                    | Chapter 8                 | Effectiveness of web search engines
13      | Web content mining                                      | Chapters 9 and 10         | Case: Analyzing Customer Reviews (audio)
14      | Web usage mining                                        | Chapter 12                | Making automated recommendations and personalization
15      | Recommendation Systems                                  | Hand-out                  |
16      | Attribution Models                                      | Hand-out                  |
17      | Project Presentations                                   |                           |
Learning Outcomes
At the end of this course, a participant should be able to:
• Mine and analyse text and web data
• Identify the appropriate techniques for analysing data drawn from text and web sources
• Extract sentiment from social media and strategize based on sentiment analysis
• Use an open-source mining software package
• Strategize based on the analysis of unstructured data
Evaluation:
1. Term Paper (Individual) 30%
2. End term (Take-home, individual) 30%
3. Project (Group) 40%
