
SE-402: Lecture 2

Machine Learning: Supervised


Learning
09-09-2013
Announcements

Done with AI Introduction
Machine Learning
Today's lecture will be FUN

Fun

AI Marathon
Motivating Example

SE-402
[Course overview diagram: AI and Agents, Machine Learning, Applications]
Where We Are

[Course overview diagram: AI and Agents, Machine Learning, Applications]
What is Machine Learning?
Volunteer

ML: The study and design of intelligent agents that make decisions based on data.

"The goal of machine learning is to build computer systems that can adapt and learn from their experience." (Tom Dietterich)

AI: The study and design of intelligent agents.
A Generic System

[Diagram: inputs x1 ... xN enter a System with internal states h1, h2, ..., hK, producing outputs y1 ... yM]

Input Variables: x = (x1, x2, ..., xN)
Hidden Variables: h = (h1, h2, ..., hK)
Output Variables: y = (y1, y2, ..., yM)
Machine Learning 15
Another definition

Machine Learning algorithms discover the relationships between the variables of a system (input, output, and hidden) from direct samples of the system.

These algorithms originate from many fields:
Statistics, mathematics, theoretical computer science, physics, neuroscience, etc.
When are ML algorithms NOT needed?

When the relationships between all system variables (input, output, and hidden) are completely understood!

This is NOT the case for almost any real system!

Past Observations

Example #   Left Hand   Right Hand   $
0           Up          Up           0
1           Up          Down         2
2           Down        Up           4
3           Up          Down         2
4           Down        Down         0
5           Down        Up           4
6           Up          Up           0
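The relationship hiding in this table can be recovered by simply memorizing the observed input/output pairs. A toy sketch (the rows below are copied from the table; treating the two hand positions as inputs and the payout as the output is an assumption):

```python
# Observed examples: (left hand, right hand, $ payout).
observations = [
    ("Up",   "Up",   0),
    ("Up",   "Down", 2),
    ("Down", "Up",   4),
    ("Up",   "Down", 2),
    ("Down", "Down", 0),
    ("Down", "Up",   4),
    ("Up",   "Up",   0),
]

# "Learn" the relationship by memorizing each (left, right) combination.
payout = {(left, right): dollars for left, right, dollars in observations}

print(payout[("Down", "Up")])  # -> 4
```

Memorization works here because every input combination was observed and the outputs are consistent; real ML algorithms must also generalize to unseen inputs.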
Most interesting thing ever? No

Exists Alongside Others
Most interesting thing ever? No

Not New
Full Focus
Most interesting thing ever? No
Machine Learning

[Taxonomy diagram, built up over several slides: Machine Learning branches into Supervised Learning, Unsupervised Learning, and Feature Learning, with Deep Learning under Feature Learning]
Learn by Doing

Theory will Help
Carpentry

Background on supervised task
Dive into three particular algorithms:
Nearest neighbor
Regression
SVM
Talk about carpentry
Machine Learning

[Taxonomy diagram: Supervised Learning (Today!), Unsupervised Learning, Feature Learning]
Supervised Learning

Given labeled data, predict the output. (Learning with a teacher)
Already have one algorithm
Add three more

Carpentry of Supervised Learning
What does the Data Look Like?

Data: M observations. For each observation i we have x(i) and y(i).
Supervised Learning

Training Set → Learning Algorithm → h

x → h → predicted y
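The pipeline above can be sketched as code. This is a hypothetical minimal example: the "learning algorithm" here just memorizes the most common label (a majority-class baseline), so the hypothesis h ignores its input entirely:

```python
from collections import Counter

def learn(training_set):
    """A toy learning algorithm: find the most common label in the
    training set and return a hypothesis h that always predicts it."""
    majority = Counter(y for _, y in training_set).most_common(1)[0][0]
    def h(x):
        return majority  # ignores x: the simplest possible hypothesis
    return h

# Training set of (x(i), y(i)) pairs: 9 "Yes" labels and 6 "No" labels.
train = [(i, "Yes") for i in range(9)] + [(i, "No") for i in range(6)]
h = learn(train)
print(h(42))  # -> "Yes"
```

Any real learning algorithm replaces `learn` with something that actually uses x; the algorithms in this lecture do exactly that.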
An Example Application

An emergency room in a hospital measures 17 variables (e.g., blood pressure, age, etc.) of newly admitted patients.
A decision is needed: whether to put a new patient in an intensive-care unit.
Due to the high cost of ICU, those patients who may survive less than a month are given higher priority.
Problem: predict high-risk patients and discriminate them from low-risk patients.

CS583, Bing Liu, UIC 40


Applications: Example 2

A credit card company receives thousands of applications for new cards. Each application contains information about an applicant:
age
marital status
annual salary
outstanding debts
credit rating
etc.
Problem: decide whether an application should be approved, i.e., classify applications into two categories: approved and not approved.



Learning the Target Function

Like human learning from past experiences.
A computer does not have experiences.
A computer system learns from data, which represent some past experiences of an application domain.
Our focus: learn a target function that can be used to predict the values of a discrete class attribute, e.g., approved or not-approved, and high-risk or low-risk.
The task is commonly called: supervised learning, classification, or inductive learning.
The Data and Goal

Data: a set of data records (also called examples, instances, or cases) described by:
k attributes: A1, A2, ..., Ak
a class: each example is labelled with a pre-defined class.
Goal: learn a classification model from the data that can be used to predict the classes of new (future, or test) cases/instances.



Unintentional

Hard work is the key to success!
Loan application example: Approved or not



The Learning Task

Learn a classification model from the data.
Use the model to classify future loan applications into:
Yes (approved) and
No (not approved)
What is the class for the following case/instance?



Supervised vs. Unsupervised

Supervised learning: classification is seen as supervised learning from examples.
Supervision: the data (observations, measurements, etc.) are labeled with pre-defined classes, as if a teacher gives the classes (supervision).
Test data are classified into these classes too.
Unsupervised learning (clustering):
Class labels of the data are unknown.
Given a set of data, the task is to establish the existence of classes or clusters in the data.



Break

https://www.youtube.com/watch?v=mq0D5kq9ePE
Steps in Supervised Learning

Learning (training): learn a model using the training data
Testing: test the model using unseen test data to assess the model accuracy

Accuracy = Number of correct classifications / Total number of test cases
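The accuracy formula translates directly to code; the example predictions and labels below are invented for illustration:

```python
def accuracy(predictions, labels):
    """Number of correct classifications / total number of test cases."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)

print(accuracy(["Yes", "No", "Yes", "Yes"],
               ["Yes", "No", "No",  "Yes"]))  # -> 0.75 (3 of 4 correct)
```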



What is Learning?

Given
a data set D,
a task T, and
a performance measure M,
a computer system is said to learn from D to perform the task T if, after learning, the system's performance on T improves as measured by M.
In other words, the learned model helps the system perform T better compared to no learning.
Example: Learning

Data: loan application data
Task: predict whether a loan should be approved or not
Performance measure: accuracy

No learning: classify all future applications (test data) to the majority class (i.e., Yes):
Accuracy = 9/15 = 60%
We can do better than 60% with learning.



Assumption in Learning

Assumption: the distribution of training examples is identical to the distribution of test examples (including future unseen examples).

In practice, this assumption is often violated to some degree.
Strong violations will clearly result in poor classification accuracy.
To achieve good accuracy on the test data, training examples must be sufficiently representative of the test data.
Discriminative

Training Set → Learning Algorithm → h

x → h → predicted y
Discriminative

1. Nearest Neighbors
2. Linear Regression
3. SVMs
Background on supervised task
Dive into three particular algorithms:
Nearest neighbor
Regression
SVM
Talk about carpentry
K Nearest Neighbors
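A minimal k-nearest-neighbors classifier, assuming Euclidean distance and a majority vote among the k closest training points (the training data here is invented for illustration):

```python
from collections import Counter
import math

def knn_predict(train, x, k=3):
    """Classify x by majority vote among its k nearest training points.
    train: list of (point, label) pairs; points are tuples of numbers."""
    neighbors = sorted(train, key=lambda ex: math.dist(ex[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Two well-separated clusters labeled "A" and "B".
train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B")]
print(knn_predict(train, (1, 1)))  # -> "A"
print(knn_predict(train, (5, 4)))  # -> "B"
```

Note that every distance computation touches every training point, which is why high-dimensional feature spaces (the curse of dimensionality, next slide) hurt this method.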
Curse of Dimensionality
Linear Regression
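A sketch of the simplest case, ordinary least squares with a single input variable (the data points are invented and lie exactly on y = 2x + 1):

```python
def fit_line(xs, ys):
    """Ordinary least squares fit for y = a*x + b (one input variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by variance of x.
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - a * mean_x
    return a, b

a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # -> 2.0 1.0
```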

Support Vector Machine

Why "Machine"?
Vladimir Vapnik

Support Vector Machine

Theoretical guarantees on performance
Correlation between features doesn't throw it off
Less prone to the curse of dimensionality
Live Example
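As a stand-in for the live example, here is a toy linear soft-margin SVM trained by sub-gradient descent on the hinge loss (Pegasos-style). This is an illustrative sketch, not LibSVM; the data, learning rate, and regularization constant are made-up assumptions:

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear soft-margin SVM via sub-gradient descent on the hinge loss.
    Labels in y must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Hinge loss is active: push the boundary away from xi.
                w = [wj + lr * (yi * xj - lam * wj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Only the regularizer pulls on w.
                w = [wj * (1 - lr * lam) for wj in w]
    return w, b

def svm_predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Two well-separated clusters, labeled -1 and +1.
X = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
print([svm_predict(w, b, xi) for xi in X])  # should recover the labels
```

Real SVM libraries solve the dual problem and support kernels for non-linear boundaries; this sketch shows only the linear primal objective.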
Background on supervised task
Dive into three particular algorithms:
Nearest neighbor
Regression
SVM
Talk about carpentry
Carpentry
LibSVM
Cross Validation

[Diagram: data split into a Train Set and a held-out Test Set]

N-Fold Cross Validation

[Diagram: data split into N folds; each fold in turn serves as the Test Set while the remaining folds form the Train Set]
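The N-fold splitting scheme pictured above can be sketched as follows (assuming the data size divides evenly by N; real libraries also handle remainders and shuffling):

```python
def n_fold_splits(data, n):
    """Yield (train, test) pairs; each of the n folds serves once as the test set."""
    fold = len(data) // n
    for i in range(n):
        test = data[i * fold:(i + 1) * fold]
        train = data[:i * fold] + data[(i + 1) * fold:]
        yield train, test

for train, test in n_fold_splits(list(range(10)), 5):
    print("test fold:", test, "| train size:", len(train))
```

Averaging a model's score over all N held-out folds gives a less noisy accuracy estimate than a single train/test split.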
Metrics

Accuracy
F1
Precision
Recall
AUC (Area Under the Curve)
ROC (Receiver Operating Characteristic)
Efficiency (time, memory)
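Precision, recall, and F1 follow directly from the confusion counts (true positives, false positives, false negatives); the example predictions below are invented:

```python
def precision_recall_f1(predictions, labels, positive="Yes"):
    """Precision, recall, and F1 for the chosen positive class."""
    tp = sum(1 for p, y in zip(predictions, labels) if p == positive and y == positive)
    fp = sum(1 for p, y in zip(predictions, labels) if p == positive and y != positive)
    fn = sum(1 for p, y in zip(predictions, labels) if p != positive and y == positive)
    precision = tp / (tp + fp)   # of the predicted positives, how many were right
    recall = tp / (tp + fn)      # of the actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

preds  = ["Yes", "Yes", "No", "Yes", "No"]
labels = ["Yes", "No",  "No", "Yes", "Yes"]
p, r, f = precision_recall_f1(preds, labels)
print(p, r, f)
```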
AUC

Overfitting?

[Plot: AUC (how good) vs. Model Complexity, shown for the Training Set and the Testing Set]
Generative vs. Discriminative?

Generative tells a better story.
In general, discriminative is more accurate.
Generative outperforms with the use of an expert.

Bayes nets are going to come back for unsupervised learning.
Generative vs. Discriminative: An Analogy

The task is to determine the language that someone is speaking.

Generative approach: learn each language and determine which language the speech belongs to.

Discriminative approach: determine the linguistic differences without learning any language. A much easier task!
ML Taxonomy

Generative Methods
Model class-conditional pdfs (probability density functions) and prior probabilities
"Generative" since sampling can generate synthetic data points
Popular models:
Gaussians, Naive Bayes, mixtures of multinomials
Mixtures of Gaussians, mixtures of experts, Hidden Markov Models (HMM)
Sigmoidal belief networks, Bayesian networks, Markov random fields

Discriminative Methods
Directly estimate posterior probabilities
No attempt to model underlying probability distributions
Focus computational resources on the given task for better performance
Popular models:
Logistic regression, SVMs
Traditional neural networks, nearest neighbor
Conditional Random Fields (CRF)
Generative Models

Examples

Nearest neighbor (Shakhnarovich, Viola, Darrell 2003; Berg, Berg, Malik 2005)
Neural networks, 10^6 examples (LeCun, Bottou, Bengio, Haffner 1998; Rowley, Baluja, Kanade 1998)
Support Vector Machines and Kernels (Guyon, Vapnik; Heisele, Serre, Poggio 2001)
Conditional Random Fields (McCallum, Freitag, Pereira 2000; Kumar, Hebert 2003)
Further Difference

A generative algorithm models how the data was generated in order to categorize a signal. It asks itself: based on my generation assumptions, which category is most likely to have generated this signal?

A discriminative algorithm does not care how the data was generated; it simply categorizes a given signal.
Naive Bayes vs SVM
[suspense]
87.8 vs 85.9 AUC
Ensemble Method?
Motivating Example
Good Features
Detect Multiple Faces?
Sliding Window

No Face ... No Face ... No Face ... Maybe Face ... Face!
The End?
