
Data Analytics

(CS40003)

Dr. Debasis Samanta


Associate Professor
Department of Computer Science & Engineering
Today’s discussion…
• Semester organization

• Syllabus

• Course objective

• Course plan

• Reference and study materials

• Course web page

• Contact details
…Course organization
• Title: Data Analytics

• Code: CS40003

• Credit: 3-0-0 = 3

• Slot: F

• Timing
Wednesday: 10:00-10:55
Thursday: 09:00-09:55
Friday: 11:00-11:55 (+12:55)

• Venue: V2, Vikramshila Complex


…Course objective
This course will cover fundamental algorithms and techniques used in Data
Analytics. The statistical foundations will be covered first, followed by various
machine learning and data mining algorithms. Technological aspects like data
management, scalable computation and visualization will also be covered.
In summary, this course will provide exposure to theory as well as practical
systems and software used in data analytics.

After completing this course, you will be able to:

• Find a meaningful pattern in data


• Graphically interpret data
• Implement the analytic algorithms
• Handle large scale analytics projects from various domains
• Develop intelligent decision support systems
…Syllabus
• Data definition
• Concept of data
• Data vs. Information
• Data categorization

• Descriptive Statistics
• Measures of central tendency
• Measures of location and dispersion

• Basic Analysis Techniques


• Statistical hypothesis generation and testing
• Chi-Square test
• t-Test, Analysis of variance, Correlation analysis
• Maximum likelihood test
…Syllabus
• Data Analysis Techniques
• Regression analysis
• Classification techniques
• Clustering techniques
• Association rule analysis

• Case Studies and Projects


• Understanding a few business scenarios
• Feature engineering and visualization
• Scalable and parallel computing with Hadoop and MapReduce
• Sensitivity analysis
…Study materials
1. Probability & Statistics for Engineers & Scientists (9th Edn.), Ronald E. Walpole, Raymond
H. Myers, Sharon L. Myers and Keying Ye, Prentice Hall Inc.

2. The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd Edn.),
Trevor Hastie, Robert Tibshirani, Jerome Friedman, Springer, 2014

3. An Introduction to Statistical Learning: with Applications in R, G. James, D. Witten,
T. Hastie, and R. Tibshirani, Springer, 2013

4. Software for Data Analysis: Programming with R (Statistics and Computing),
John M. Chambers, Springer, 2012

5. Mining of Massive Datasets, A. Rajaraman and J. Ullman, Cambridge University
Press, 2012

6. Advances in Complex Data Modeling and Computational Methods in
Statistics, Anna Maria Paganoni and Piercesare Secchi, Springer, 2013
…Study materials
7. Data Mining and Analysis, Mohammed J. Zaki, Wagner Meira, Cambridge
University Press, 2012

8. Hadoop: The Definitive Guide (2nd Edn.), Tom White, O'Reilly, 2014

9. MapReduce Design Patterns: Building Effective Algorithms and Analytics
for Hadoop and Other Systems, Donald Miner, Adam Shook, O'Reilly, 2014

10. Beginning R: The Statistical Programming Language, Mark Gardener, Wiley,
2013

Lecture slides and other materials are available at

http://cse.iitkgp.ac.in/~dsamanta/
…Evaluation plan
• Minimum attendance required: 75% of the total classes

• Mid-Semester evaluation: 30%


• End-Semester evaluation: 40%
• Project-based evaluation: 30% (in four phases)

Note:
Minimum attendance and presence in all evaluations are mandatory, except on
medical or emergency grounds. There will be no compensatory tests or
extended submission deadlines.
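
The weights and the attendance rule above can be sketched as a small calculation. This is only an illustration, assuming each component is reported on a 0-100 scale; the function names, the dictionary keys, and the example student's numbers are hypothetical and not part of the course rules.

```python
# Weights taken from the evaluation plan above (as fractions of the final score).
WEIGHTS = {"mid_sem": 0.30, "end_sem": 0.40, "project": 0.30}
MIN_ATTENDANCE = 0.75  # minimum fraction of the total classes


def final_score(percent_scores):
    """Combine component scores (each assumed to be on a 0-100 scale)."""
    return sum(WEIGHTS[name] * percent_scores[name] for name in WEIGHTS)


def meets_attendance(classes_attended, classes_held):
    """Return True if attendance is at least 75% of the classes held."""
    return classes_attended >= MIN_ATTENDANCE * classes_held


# Hypothetical student: every number below is invented for illustration.
example = {"mid_sem": 70.0, "end_sem": 80.0, "project": 90.0}
print(round(final_score(example), 2))
print(meets_attendance(30, 40))  # exactly 75% of 40 classes
print(meets_attendance(29, 40))  # one class short of 75%
```

The sketch covers only the arithmetic; how the four project phases are split within the 30% is not stated in the plan and is left out.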
Automated Attendance Marking …
Device Requirement
• Android 5.0 (Lollipop) or later

• Unknown sources enabled
– To install applications from outside the Google Play Store

• Internet connectivity
– Use a VPN on proxy-enabled Wi-Fi connections

• GPS enabled
Installation and Setup
• Download the application

• Install the application


– Turn on installation from “unknown sources”

• Open the application

• Click Sign Up button


• Click the Student button
Enter your details…

Enter details carefully


Note: Roll number is your UID.

Caution: Information once entered cannot be modified.
Please review before submitting.

• Password should be at least 8 characters long.

• Press Sign Up button

• Wait until the process completes
Sign In
• Enter your registered email address

• Enter password
Enroll Course
• Enter Course details
– Course ID: CS40003
– Teacher ID: DSM
– Semester: Autumn
– Slot: F

• Click Request Approval


Mark your attendance
Wait for the announcement and note down the Unique Code

• Sign in
• Click “Attendance”

• Course ID: CS40003


• Semester: Autumn
• Slot: F
• Enter Unique Code
• Click “Attend Class” button
…Submissions of projects and assignments

• Moodle Course Management System


https://10.5.18.110/moodle/

• Steps to enroll in the course on Moodle


– Create your account (if not already created)
• User Id: <Roll Number>
• Password:
• Email account:
• Enrolment Key : CS40003
• Verify your account: check the registered mailbox
– Login to Moodle with “User Id” and “Password”
– Select the course “Data Analytics” from the list of courses under the “My
Course” link
…Doubt clearance and discussions

• Please use the “Discussion Forum” under the “Moodle” link on the course web page at

http://cse.iitkgp.ac.in/~dsamanta/courses/da/index.html
While you are in the class…
Happy Learning!
