[...]r Future Leaders

[...] knowledge, for them to take powerful decisions across the board, a recently developing data-oriented thinking will help. While they may not have programming expertise, they should be able to create, set up and interact with the Data Science team.

Overview: INSOFE, a pioneer in Data Science training for both engineers and managers, has developed a 2-day program that helps high performers and future leaders become data savvy and take the organization to the next level in data.

Day 1: Developing a Data Driven Decision Makers Mindset

Duration | Topics discussed | Activities | Desired Outcomes
2 hours | | The participants define some of the problems they face on a daily basis in the analytics context. Based on their intuition, they pick one problem for the next two sessions. |
2 hours | | They assess the pain and gain of solving each problem and arrive at a priority. |
2 hours | | For the problem chosen, they define the single table view, engineer the features and come up with a broad solution architecture. |
2 hours | | They learn about estimating the data, understanding the errors, defining error metrics and building validation strategies. |

[...]hts (8 hours)

Duration | Topics discussed | Activities | Desired Outcomes
2 hours | | |
1.5 hours | Recommendation Engine | |
1.5 hours | Optimization | |
1.5 hours | Fraud Detection | |
1.5 hours | | |
Topic
Data Story Telling - The Science, ggplot, Bubble Charts with Multiple Dimensions, Gauge Charts, Treem
Probabilities, joint and conditional probabilities, simulations and estimations. Introduction to Gaussian m
Data types, basic probabilities, Probability distributions (Discrete and Continuous) - Bernoulli, Binomia
Describing the relationship between attributes: Covariance; Correlation; Chi-Square
Special emphasis on Normal distribution; Central Limit Theorem
Inferential statistics: t, F and chi-square testing
Inferential statistics: How to learn about the population from a sample and vice versa; Sampling distrib
Hypothesis Testing
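
As a rough illustration of the covariance and correlation topic listed above, here is a minimal Python sketch. The function names and the toy numbers (`hours`, `scores`) are assumptions made for this example and do not come from the course material.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equal-length lists (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical toy data: scores are exactly twice the hours.
hours = [1.0, 2.0, 3.0, 4.0]
scores = [2.0, 4.0, 6.0, 8.0]

print(covariance(hours, scores))   # depends on the units of both attributes
print(correlation(hours, scores))  # unit-free, close to 1 for a perfect linear relation
```

Covariance changes when an attribute is rescaled, while correlation does not, which is one reason the two are usually taught together.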
Statistics and Probability in Decision Modeling: Linear Regression
Approach: Model Estimation, MLE & Error Function, Optimization through Gradient Descent for finding
Constructing a Linear Regression, Diagnostics
Interpretation and Applications
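
The model estimation topic above (an error function minimised through gradient descent) can be sketched as follows. This is a minimal example under stated assumptions: the helper name `fit_line_gd`, the learning rate, the epoch count and the noise-free toy data are all chosen for illustration and are not taken from the course.

```python
def fit_line_gd(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b by minimising mean squared error with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical toy data generated from y = 2x + 1 without noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line_gd(xs, ys)
print(round(w, 3), round(b, 3))
```

For a linear model with squared error, a closed-form least-squares solution also exists; gradient descent is shown here only because it is the method the topic list names.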
Approach: Model Estimation, MLE & Error Function, Optimization through Gradient Descent for finding
Constructing Logistic Regression, Diagnostics
Interpretation and Applications
Statistics and Probability in Decision Modeling: Time Series
Regression on Time
Modeling Seasonality as Deviation
Statistician's Approach: Components of a Time Series and Estimation Methods
Smoothing: Moving Average, Weighted and Exponential Moving
Holt-Winters Method
Box-Jenkins and ARIMA
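
The smoothing methods listed above (moving average and exponential moving average) can be sketched in a few lines of Python. The function names, the window size and the smoothing factor `alpha` are assumptions for this illustration only.

```python
def moving_average(series, window):
    """Simple moving average: mean of the last `window` observations at each step."""
    return [
        sum(series[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(series))
    ]

def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: each value blends the new observation
    with the previous smoothed value, weighted by alpha."""
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical toy series.
print(moving_average([1, 2, 3, 4, 5], window=3))
print(exponential_smoothing([10, 20, 30], alpha=0.5))
```

A larger `alpha` reacts faster to recent changes, while a smaller one gives a smoother curve. Holt-Winters extends the same idea with separate trend and seasonal terms, which this sketch does not cover.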
Methods and Algorithms in Machine Learning Supervised: Decision Trees
Rule Based Knowledge: Logic of Rules, Evaluating Rules, Rule Induction and Association Rules
Construction of Decision Trees through Simplified Examples; Choosing the "Best" attribute at each Non
Gain, Gini Index, Chi Square, Regression Trees
Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with Numerical Variables; ot
Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as Rules
Oblique Decision Trees
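
To make the attribute selection measures named above more concrete (Gini Index and information gain), here is a minimal Python sketch. The function names and the tiny label lists are assumptions for illustration; it scores a single candidate split and does not build a full tree.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n) for count in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy of the parent node minus the size-weighted entropy of the child groups."""
    n = len(labels)
    weighted = sum(len(group) / n * entropy(group) for group in groups)
    return entropy(labels) - weighted

# Hypothetical node with two classes and a split that separates them perfectly.
parent = ["yes", "yes", "no", "no"]
split = [["yes", "yes"], ["no", "no"]]

print(gini(parent))
print(information_gain(parent, split))
```

At each non-leaf node, a tree learner of this kind would evaluate every candidate split and keep the one with the highest gain (or the lowest weighted impurity).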
Methods and Algorithms in Machine Learning Supervised: Instance based learning
K-NN method, Wilson editing and triangulation
K-NN in collaborative filtering, digit recognition
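
A minimal sketch of the K-NN method named above, assuming Euclidean distance and a simple majority vote. The function name `knn_predict` and the two-cluster toy data are hypothetical, and the editing and triangulation refinements are not shown.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters.
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (5.0, 4.8)))
```

K-NN keeps the whole training set and defers all work to prediction time, which is why it is grouped under instance based learning.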
Methods and Algorithms in Machine Learning Supervised: Ensembles
Methods of Ensembling (Stacking, Mixture of Experts)
Bagging and Random Forest (Logic, Practical Applications)
AdaBoost
Gradient Boosting Machines
Methods and Algorithms in Machine Learning Supervised: Neural Networks
Motivation for Neural Networks and Its Applications
Perceptron and Single Layer Neural Network, and Hand Calculations
Learning in a Multi-Layered Neural Net: Back Propagation and Conjugate Gradient Techniques
Application of Neural Net In Face and Digit Recognition
Deep Learning Basics: Restricted Boltzmann Machines
Self Organizing Maps
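
The perceptron topic above lends itself to a short worked example. The sketch below trains a single perceptron on the logical AND function using the classic error-driven update rule. The function names, the learning rate and the number of epochs are assumptions for illustration.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    return 1 if weights[0] * x[0] + weights[1] * x[1] + bias > 0 else 0

def train_perceptron(samples, lr=1.0, epochs=20):
    """Perceptron learning rule: nudge weights by lr * error * input after each sample."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights[0] += lr * error * x[0]
            weights[1] += lr * error * x[1]
            bias += lr * error
    return weights, bias

# Truth table for logical AND, which is linearly separable.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(and_gate)
print([predict(weights, bias, x) for x, _ in and_gate])  # prints [0, 0, 0, 1]
```

The same rule cannot learn XOR, because no single line separates its classes; that limitation is the usual motivation for multi-layered networks and back propagation.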
Methods and Algorithms in Machine Learning Supervised: Support Vector Machines and H
Mathematical Intuition (Kernel Methods Revisited, Quadratic Optimization and Soft Constraints)
Applications and Interpretation
Optimization and Decision Analysis
Linear Programming (Graphical Explanation)
Dual form and Sensitivity
Problem Setting and Applications
Goal Programming
Quadratic Programming: Gradient, Hessian, Lagrangian
Application of Quadratic Programming: Portfolio Allocation
Evolutionary Search (Genetic Algorithms)
Practical Applications
Text Mining, Social Network Analysis and Natural Language Processing
Taming big text, Unstructured vs. Semi-structured Data; Fundamentals of information retrieval, Proper
Creating Term-Document (TxD) Matrices; Similarity measures; Low-level processes (Sentence Splitting
Stemming; Chunking)
Handling big graphs
The purpose of it all: Finding patterns in data
Finding patterns in text: Mahout, text mining, text as a graph
Engineering Big Data with Hadoop Ecosystem
Introduction to Big Data
Data center as a computer
Storing big bytes
Rapidly ingesting & organizing unstructured data
Your key tool: Split and Merge
Querying big data
Processing big data
Statistics and Probability in Decision Modeling: Naive Bayes
Fundamentals of Probability; Random Variables, Distributions, Conditional and Marginal Probability
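
The conditional and marginal probability fundamentals above are what Bayes' rule builds on. Here is a minimal sketch for a binary hypothesis. The function name `posterior` and the numbers (a 1% prior, a 90% detection rate and a 5% false positive rate) are hypothetical values chosen for illustration. It shows Bayes' rule only, not a full Naive Bayes classifier.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) by Bayes' rule for a binary hypothesis H and observed evidence E.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # Marginal probability of the evidence, summed over both hypotheses.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: a rare event flagged by an imperfect detector.
p = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
print(round(p, 4))
```

Even with a fairly accurate detector, the posterior stays low when the prior is small, which is a standard illustration of why the marginal probability of the evidence matters.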
Lecture Time (Hours) | Hands-on Lab Time (Hours)
Total: 144 | 144 (288 hours combined)