6/20/2009

Support Vector Machines and Kernel methods

by Lucian Huluta

06/15/2009
Support Vector Machine (SVM)

What is Support Vector Machine?

• A statistical tool, essentially used for NONLINEAR classification/regression.
• A SUPERVISED LEARNING mechanism like neural networks.
• A quick and adaptive method for PATTERN ANALYSIS.
• A fast and flexible approach for learning COMPLEX SYSTEMS.
Support Vector Machine (SVM)

Strengths
• Few parameters required for tuning the learning machine
• Learning involves optimisation of a convex function
• It scales relatively well to high-dimensional data

Weaknesses
• Training on large data sets is still difficult
• Need to choose a “good” kernel function
SVM: linear classification
Binary classification problem
x - input (a point in the input space)
w - weights (vector of the same dimension as x)
b - bias

More than one solution for the decision function!
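
For reference, a linear decision function is conventionally written as below. The name D, the training pairs (x_i, y_i) with labels y_i in {+1, -1}, and the count M are notation assumed for this sketch, following the usual textbook convention.

```latex
D(\mathbf{x}) = \mathbf{w}^{\top}\mathbf{x} + b,
\qquad
\text{class}(\mathbf{x}) =
\begin{cases}
+1 & \text{if } D(\mathbf{x}) > 0,\\
-1 & \text{if } D(\mathbf{x}) < 0.
\end{cases}
```

Any pair (w, b) with y_i D(x_i) > 0 for all i = 1, ..., M separates the training data, which is why the solution is not unique.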
SVM: Generalization capacity

Generalization ability

Generalization region:
SVM: Hard margin
Training data must satisfy:

with constraint:

Quadratic optimization problem

minimize

subject to:
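
In the usual textbook statement of the hard-margin problem, the objective and constraints read as follows. The indexing of the training pairs (x_i, y_i), i = 1, ..., M, with y_i in {+1, -1}, is an assumed convention.

```latex
\min_{\mathbf{w},\, b} \; \frac{1}{2}\,\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1,
\qquad i = 1, \dots, M.
```

With this scaling the margin equals 1/||w||, so minimizing ||w|| maximizes the margin.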
SVM: Primal form
Convert the constrained problem => unconstrained problem:

where the α_i are nonnegative Lagrange multipliers.

Solving the stationarity conditions for w and b,

we obtain:
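
A standard way to write this step is sketched below. The function name Q and the multipliers α_i are assumed notation; the hard-margin constraints y_i(w^T x_i + b) >= 1 are folded into the objective with one multiplier each.

```latex
Q(\mathbf{w}, b, \boldsymbol{\alpha})
  = \frac{1}{2}\,\mathbf{w}^{\top}\mathbf{w}
  - \sum_{i=1}^{M} \alpha_i \left[ y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) - 1 \right],
\qquad \alpha_i \ge 0,

\frac{\partial Q}{\partial \mathbf{w}} = \mathbf{0}
  \;\Rightarrow\; \mathbf{w} = \sum_{i=1}^{M} \alpha_i\, y_i\, \mathbf{x}_i,
\qquad
\frac{\partial Q}{\partial b} = 0
  \;\Rightarrow\; \sum_{i=1}^{M} \alpha_i\, y_i = 0.
```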

SVM: Dual form
The dual form of the cost function consists of inner products.

Solve the following QP problem:

This SVM is called a hard-margin support vector machine.
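
Under the usual conventions (multipliers α_i, training pairs (x_i, y_i), i = 1, ..., M, all assumed notation), substituting w = Σ α_i y_i x_i and Σ α_i y_i = 0 into the Lagrangian gives the dual problem sketched here. The data enter only through the inner products x_i^T x_j.

```latex
\max_{\boldsymbol{\alpha}} \; Q(\boldsymbol{\alpha})
  = \sum_{i=1}^{M} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{M} \sum_{j=1}^{M}
    \alpha_i\, \alpha_j\, y_i\, y_j\, \mathbf{x}_i^{\top}\mathbf{x}_j
\quad \text{subject to} \quad
\sum_{i=1}^{M} y_i\, \alpha_i = 0,
\qquad \alpha_i \ge 0,\; i = 1, \dots, M.
```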
SVM: L1-soft margin problem
The modified QP minimizes the following cost function:

subject to the constraints:

C: trade-off between the maximization of the margin and minimization of the classification error.
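
A common textbook form of the L1 soft-margin problem is sketched below. The slack variables ξ_i and the margin parameter C are assumed notation; each ξ_i measures how far sample i violates the margin.

```latex
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \;
  \frac{1}{2}\,\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{M} \xi_i
\quad \text{subject to} \quad
y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1 - \xi_i,
\qquad \xi_i \ge 0,\; i = 1, \dots, M.
```

A large C penalizes margin violations heavily and approaches the hard-margin solution; a small C favors a wider margin at the cost of more training errors.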
SVM: L2-soft margin problem
The modified QP minimizes the following cost function:

subject to the constraints:
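
In a common textbook form, the L2 soft-margin problem penalizes the squared slack, as sketched below. The symbols ξ_i and C, and the factor C/2, are an assumed convention.

```latex
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \;
  \frac{1}{2}\,\lVert \mathbf{w} \rVert^{2} + \frac{C}{2} \sum_{i=1}^{M} \xi_i^{2}
\quad \text{subject to} \quad
y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1 - \xi_i,
\qquad i = 1, \dots, M.
```

Because the slack is squared, a negative ξ_i can never lower the cost, so the explicit constraint ξ_i >= 0 is not needed in this variant.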

SVM: Regression
Decision function:

We assume that all the training data lie within a tube of radius ε around the decision function; the corresponding loss is called the ε-insensitive loss function.

Slack variables:

SVM: Regression
Cost function with slack variables:

subject to the constraints:

If p=1: L1 soft-margin
If p=2: L2 soft-margin
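
A common way to write the regression problem is sketched below. The linear form of f, the slack pairs ξ_i and ξ_i^*, and the factor C/p are assumptions chosen to match the p = 1 and p = 2 cases named above.

```latex
f(\mathbf{x}) = \mathbf{w}^{\top}\mathbf{x} + b,
\qquad
L_{\varepsilon}\bigl(y, f(\mathbf{x})\bigr) = \max\bigl(0,\; \lvert y - f(\mathbf{x}) \rvert - \varepsilon\bigr),

\min_{\mathbf{w},\, b,\, \boldsymbol{\xi},\, \boldsymbol{\xi}^{*}} \;
  \frac{1}{2}\,\lVert \mathbf{w} \rVert^{2}
  + \frac{C}{p} \sum_{i=1}^{M} \left( \xi_i^{\,p} + \xi_i^{*\,p} \right)
\quad \text{subject to} \quad
\begin{aligned}
  y_i - f(\mathbf{x}_i) &\le \varepsilon + \xi_i,\\
  f(\mathbf{x}_i) - y_i &\le \varepsilon + \xi_i^{*},\\
  \xi_i,\; \xi_i^{*} &\ge 0,
\end{aligned}
\qquad i = 1, \dots, M.
```

Samples inside the tube have zero slack and contribute nothing to the cost; ξ_i and ξ_i^* measure how far a sample lies above or below the tube.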
SVM: Linear inseparability

1. Data are NOT linearly separable.
2. The feature space is HIGH DIMENSIONAL, hence the QP takes a long time to solve.
3. Nonlinear function approximation problems can NOT be solved.
SVM: Linear inseparability
If the feature space is a Hilbert space, i.e., a space where an inner product is defined…

…we can simplify the optimization problem by a TRICK!!!
The Kernel “trick”
Kernel trick = a method for using a linear classifier algorithm to solve a nonlinear problem by choosing appropriate KERNEL FUNCTIONS.

The kernel trick avoids explicitly computing the inner product of two vectors in feature space: the kernel is evaluated directly on the input vectors.
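
In symbols, with φ denoting the map from input space to feature space (φ, K, α_i and D are assumed notation), the idea can be sketched as follows: every inner product in the dual problem and in the decision function is replaced by a kernel evaluation.

```latex
K(\mathbf{x}, \mathbf{x}') = \varphi(\mathbf{x})^{\top}\varphi(\mathbf{x}'),

Q(\boldsymbol{\alpha})
  = \sum_{i=1}^{M} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{M} \sum_{j=1}^{M}
    \alpha_i\, \alpha_j\, y_i\, y_j\, K(\mathbf{x}_i, \mathbf{x}_j),
\qquad
D(\mathbf{x}) = \sum_{i=1}^{M} \alpha_i\, y_i\, K(\mathbf{x}_i, \mathbf{x}) + b.
```

The map φ itself never has to be formed, which matters when the feature space is very high dimensional.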

Numerical Example
• Consider a two-dimensional input space together with the feature map:

Kernel function
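
As an illustration of such an example, the sketch below assumes the feature map φ(x) = (x1², √2·x1·x2, x2²), a common textbook choice that may differ from the one on the slide. For this map the inner product in feature space equals the homogeneous quadratic kernel K(x, z) = (x·z)², which the code checks numerically. The function names and test points are hypothetical.

```python
import math


def feature_map(x):
    """Assumed map phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2) for a 2-D input."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def dot(u, v):
    """Plain Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def kernel(x, z):
    """Quadratic kernel K(x, z) = (x . z)^2, evaluated in input space."""
    return dot(x, z) ** 2


x = (1.0, 2.0)
z = (3.0, -1.0)

explicit = dot(feature_map(x), feature_map(z))  # inner product in feature space
implicit = kernel(x, z)                         # same value without forming phi

print(explicit, implicit)
```

Both values agree up to floating-point rounding, but the kernel needs only one inner product in the two-dimensional input space.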
SVM with Kernel: Steps
Choose kernel function:

Maximize:

Compute bias term:

Classify data using decision function:
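
The last two steps can be sketched on a toy problem. The code below assumes the hard-margin case and a dual solution that is already known: for the two one-dimensional points x = +1 (class +1) and x = -1 (class -1) with a linear kernel, maximizing the dual gives α = (0.5, 0.5). The QP solver itself is not shown, and all function names are hypothetical.

```python
def linear_kernel(x, z):
    """K(x, z) = x . z"""
    return sum(a * b for a, b in zip(x, z))


def bias_term(alphas, X, y, kernel, tol=1e-12):
    """Average of y_j - sum_i alpha_i y_i K(x_i, x_j) over support vectors."""
    values = []
    for j, a_j in enumerate(alphas):
        if a_j > tol:  # support vectors have alpha_j > 0
            s = sum(a * yi * kernel(xi, X[j])
                    for a, yi, xi in zip(alphas, y, X))
            values.append(y[j] - s)
    return sum(values) / len(values)


def decision(x, alphas, X, y, b, kernel):
    """D(x) = sum_i alpha_i y_i K(x_i, x) + b"""
    return sum(a * yi * kernel(xi, x)
               for a, yi, xi in zip(alphas, y, X)) + b


X = [(1.0,), (-1.0,)]
y = [1, -1]
alphas = [0.5, 0.5]  # assumed: dual optimum for this toy problem

b = bias_term(alphas, X, y, linear_kernel)
labels = [1 if decision(x, alphas, X, y, b, linear_kernel) > 0 else -1
          for x in [(2.0,), (-0.3,)]]
print(b, labels)
```

For a soft-margin machine the bias would normally be averaged only over support vectors with 0 < α_j < C; that refinement is left out here.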

Kernels

Linear:

Polynomial:

Radial Basis Function:

Others: design kernels suitable for target applications
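
The three named kernels can be sketched in a few lines. The exact parameterizations are assumptions, since several conventions exist: here the polynomial kernel is (x·z + 1)^d and the radial basis function kernel is exp(-γ·||x - z||²), with hypothetical parameter names degree and gamma.

```python
import math


def dot(x, z):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    """K(x, z) = x . z"""
    return dot(x, z)


def polynomial_kernel(x, z, degree=2):
    """K(x, z) = (x . z + 1)^degree (one common convention)."""
    return (dot(x, z) + 1.0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    """K(x, z) = exp(-gamma * ||x - z||^2)"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


x, z = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(x, z), polynomial_kernel(x, z), rbf_kernel(x, z))
```

The RBF kernel equals 1 when both arguments coincide and decays toward 0 as the points move apart; gamma controls how fast.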

Demo

• To see the video demo, please visit this link.


Applications of SVM

Breast cancer diagnosis and prognosis


Handwritten digit recognition
On-line Handwriting Recognition
Text Categorization
3-D Object Recognition Problems
Function Approximation and Regression
Detection of Remote Protein Homologies
Gene Expression
Vast number of applications…
• Fault detection and diagnosis in chemical processes
Current developments: SVM
Application aspects of SVM – Belousov et al., 2002, Journal of Chemometrics

About Kernel latent variables approaches and SVM – Czekaj et al., 2005, Journal of Chemometrics

Kernel based orthogonal projections to latent structures – Rantalainen et al., 2007, Journal of Chemometrics

Performance assessment of a novel fault diagnosis system based on SVM – Yelamos et al., 2009, Computers and Chemical Engineering

SVM and its application in chemistry – Li et al., 2009, Chemometrics and intelligent Laboratory
Current developments: SVM
Identification of MIMO Hammerstein systems with LS-SVM, Goethals et al., 2005, Automatica

An online support vector machine for abnormal event detection, Davy et al., 2006, Signal Processing

Support vector machine for quality monitoring in a plastic injection molding process, Ribeiro, 2005, IEEE System Man and Cybernetics

Fault prediction for nonlinear system based on Hammerstein model and LS-SVM, Jiang et al., 2009, IFAC Safeprocess
My future work
Study support vector machine based classification for “one-against-one” and “one-against-all” problems

Apply an SVM-based classification algorithm to a small academic example

Apply SVM to the Tennessee Eastman benchmark, which involves 20 pre-defined faults.

Study the role of various “tuning” parameters on classification results

Finish my diploma project :)
Thank you.

Questions & Answers