Outline
Introduction
Optimization Problem
Lagrangian Formulation
Quadratic Programming
The Kernel Trick
Cover's Theorem
Introduction to SVM
Support vector machines (SVMs) are a supervised learning method for binary classification. Standard optimization packages are available for solving the resulting classification problem. Among all possible margins, a bigger margin is better: even with noisy data, the probability of a point crossing over to the wrong side of the boundary is smaller.
The separating hyperplane is w^T x + b = 0. [1][3]
Here, b denotes the bias, x \in R^d is the input feature vector, and w = \{w_1, w_2, w_3, ..., w_d\} is the weight vector. For linearly separable points, the plane does not touch any data point.
Distance Computation
For any two points x' and x'' on the plane, w^T x' + b = 0 and w^T x'' + b = 0, so w^T (x' - x'') = 0; hence w is normal to the plane. The distance from a point x_n to the plane is the projection of x_n - x, for any x on the plane, onto the unit normal w / ||w||. With the normalization |w^T x_n + b| = 1 for the nearest point, the distance (margin) is

d = \frac{1}{||w||}
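As a quick numeric sketch of this computation (the vectors and bias below are made-up illustration values, not from the lecture), the distance from a point to the plane w^T x + b = 0 is |w^T x + b| / ||w||, which reduces to 1/||w|| for a point on the margin:

```python
import numpy as np

# Hypothetical plane and point chosen so that w^T x_n + b = 1 * ||w||.
w = np.array([3.0, 4.0])          # ||w|| = 5
b = -1.0
x_n = np.array([2.0, 0.0])        # w^T x_n + b = 6 - 1 = 5

# Distance from x_n to the plane: |w^T x_n + b| / ||w||.
dist = abs(w @ x_n + b) / np.linalg.norm(w)
print(dist)                        # 1.0
```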
Optimization Problem
The goal is to find the separating hyperplane with the largest margin: minimize \frac{1}{2} w^T w subject to y_n (w^T x_n + b) \ge 1.
The constrained optimization problem is converted to an unconstrained one via the Lagrangian.
The KKT conditions are needed to solve the Lagrangian under inequality constraints.
The KKT conditions:

\nabla_x L(x, \lambda) = 0
\lambda \ge 0
\lambda \, g(x) = 0
g(x) \le 0
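A minimal worked example of these four conditions (the problem below is my own illustration, not from the lecture): minimize f(x) = x^2 subject to g(x) = 1 - x \le 0, with Lagrangian L = x^2 + \lambda (1 - x). The constraint is active at the optimum, so x = 1 and \lambda = 2:

```python
# Illustrative KKT check: minimize x^2 subject to g(x) = 1 - x <= 0.
# Lagrangian: L(x, lam) = x**2 + lam * (1 - x).
x, lam = 1.0, 2.0

assert 2 * x - lam == 0          # stationarity: dL/dx = 0
assert lam >= 0                  # dual feasibility
assert lam * (1 - x) == 0        # complementary slackness
assert 1 - x <= 0                # primal feasibility
print("KKT conditions satisfied at x =", x)
```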
Minimize, with respect to w and b, and maximize with respect to each \alpha_n \ge 0, n = 1, 2, ..., N:

L(w, b, \alpha) = \frac{1}{2} w^T w - \sum_{n=1}^{N} \alpha_n \left( y_n (w^T x_n + b) - 1 \right)

Setting the gradients to zero:

\nabla_w L = w - \sum_{n=1}^{N} \alpha_n y_n x_n = 0

\frac{\partial L}{\partial b} = -\sum_{n=1}^{N} \alpha_n y_n = 0

KKT condition 3: \alpha_n \left( y_n (w^T x_n + b) - 1 \right) = 0

Substituting these values in the original equation results in the dual formulation.
Substituting w = \sum_{n=1}^{N} \alpha_n y_n x_n and \sum_{n=1}^{N} \alpha_n y_n = 0 in the Lagrangian

L(w, b, \alpha) = \frac{1}{2} w^T w - \sum_{n=1}^{N} \alpha_n \left( y_n (w^T x_n + b) - 1 \right)

we get

L(\alpha) = \sum_{n=1}^{N} \alpha_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} \alpha_n \alpha_m y_n y_m x_n^T x_m

subject to \sum_{n=1}^{N} \alpha_n y_n = 0.
Quadratic Programming
\max_{\alpha} L(\alpha) = \sum_{n=1}^{N} \alpha_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} \alpha_n \alpha_m y_n y_m x_n^T x_m

Alternatively,

\min_{\alpha} L(\alpha) = \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} \alpha_n \alpha_m y_n y_m x_n^T x_m - \sum_{n=1}^{N} \alpha_n

subject to \sum_{n=1}^{N} \alpha_n y_n = 0.
In matrix form,

\min_{\alpha} \frac{1}{2} \alpha^T Q \alpha - 1^T \alpha, \quad Q_{nm} = y_n y_m x_n^T x_m

(for example, the second row of Q is y_2 y_1 x_2^T x_1, \; y_2 y_2 x_2^T x_2, \; ..., \; y_2 y_N x_2^T x_N), subject to the linear constraint y^T \alpha = 0 and the range 0 \le \alpha \le \infty.
The size of the matrix Q (N \times N) depends on the size of the training dataset.
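This dual QP can be handed to a general-purpose solver. Below is a minimal sketch using SciPy's SLSQP solver on a tiny made-up linearly separable dataset (the data, tolerances, and solver choice are my own illustration, not part of the lecture):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy dataset: two positive and two negative points.
X = np.array([[2.0, 2.0], [2.5, 3.0], [0.0, 0.0], [-1.0, 0.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])
N = len(y)

Yx = y[:, None] * X
Q = Yx @ Yx.T                              # Q[n, m] = y_n y_m x_n^T x_m

obj = lambda a: 0.5 * a @ Q @ a - a.sum()  # (1/2) a^T Q a - 1^T a
jac = lambda a: Q @ a - np.ones(N)

res = minimize(obj, np.zeros(N), jac=jac,
               bounds=[(0.0, None)] * N,               # 0 <= alpha
               constraints={"type": "eq",
                            "fun": lambda a: a @ y})   # y^T alpha = 0

alpha = res.x
w = (alpha * y) @ X                        # w = sum_n alpha_n y_n x_n
sv = alpha > 1e-6                          # support vectors: alpha_n > 0
b = float(np.mean(y[sv] - X[sv] @ w))      # from y_n (w^T x_n + b) = 1
print(np.sign(X @ w + b))                  # recovers the training labels
```

The recovered w and b separate the training points with the maximum margin; only the support vectors end up with \alpha_n > 0.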
QP hands back \alpha = (\alpha_1, \alpha_2, ..., \alpha_N).

w = \sum_{n=1}^{N} \alpha_n y_n x_n

\alpha is a sparse vector. Since we have the KKT condition \alpha_n (y_n (w^T x_n + b) - 1) = 0, either \alpha_n = 0, for the interior points, or y_n (w^T x_n + b) = 1, for the support vectors, the only ones where \alpha_n > 0. The x_n for which \alpha_n > 0 are called the support vectors, as only they contribute to the solution. So now,

w = \sum_{x_n \in S.V.} \alpha_n y_n x_n
Of the 2^N possible dichotomies of N points in general position, the number realizable by a linear separator in l dimensions is

O(N, l) = 2 \sum_{i=0}^{l} \binom{N-1}{i}
Kernel Function
Mercer's condition: there exists a mapping \Phi(\cdot) if and only if, for every g(x) such that \int g(x)^2 \, dx is finite,

\int \int K(x, y) \, g(x) \, g(y) \, dx \, dy \ge 0.

Any kernel which can be expressed as K(x, y) = \sum_{p=0}^{\infty} c_p (x \cdot y)^p, where the c_p are positive real coefficients and the series is convergent, satisfies the condition.

RBF kernel: K(x, x') = \exp(-\gamma ||x - x'||^2)

Infinite-dimensional z: with \gamma = 1 and scalar x,

\exp(-(x - x')^2) = e^{-x^2} \, e^{-x'^2} \sum_{k=0}^{\infty} \frac{2^k x^k (x')^k}{k!}

so the corresponding feature map is infinite-dimensional.
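The series expansion above can be checked numerically. The sketch below (with made-up sample inputs) compares the RBF kernel for scalar x, \gamma = 1 against a truncated version of the infinite series:

```python
import math

def rbf(x, xp):
    # RBF kernel with gamma = 1 for scalar inputs.
    return math.exp(-(x - xp) ** 2)

def rbf_series(x, xp, terms=40):
    # Truncated feature-space expansion:
    # exp(-x^2) exp(-x'^2) * sum_{k=0}^{terms-1} 2^k x^k x'^k / k!
    s = sum((2.0 ** k) * (x ** k) * (xp ** k) / math.factorial(k)
            for k in range(terms))
    return math.exp(-x ** 2) * math.exp(-xp ** 2) * s

print(rbf(0.7, -0.3), rbf_series(0.7, -0.3))  # the two values agree
```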
Mapping to a higher-dimensional space
Terminology
Dichotomy: a dichotomy is any splitting of a whole into two non-overlapping parts. A linear threshold function realizes a dichotomy via

f(x; w) = +1 if w \cdot x > 0, \quad -1 if w \cdot x < 0, \quad 0 if w \cdot x = 0

General position: a set of P vectors is in general position in N-space if every subset of N or fewer vectors is linearly independent.
Figure 1:
The number of linearly separable dichotomies of P points in general position in N dimensions is

C(P, N) = 2 \sum_{k=0}^{N-1} \binom{P-1}{k}
C(P + 1, N) = C(P, N) + C(P, N - 1)

Further proof using mathematical induction for C(P, N) = 2 \sum_{k=0}^{N-1} \binom{P-1}{k}:

C(P + 1, N) = 2 \sum_{k=0}^{N-1} \binom{P-1}{k} + 2 \sum_{k=0}^{N-2} \binom{P-1}{k}

= 2 \sum_{k=0}^{N-1} \binom{P-1}{k} + 2 \sum_{k=0}^{N-1} \binom{P-1}{k-1}

= 2 \sum_{k=0}^{N-1} \binom{P}{k}

(the index shift uses \binom{P-1}{-1} = 0, and the last step is Pascal's rule).
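Both the recursion and the closed form can be verified numerically. A small sketch (the ranges checked are arbitrary):

```python
from math import comb

def C(P, N):
    # Closed form: C(P, N) = 2 * sum_{k=0}^{N-1} binom(P-1, k)
    return 2 * sum(comb(P - 1, k) for k in range(N))

# Check the recursion C(P+1, N) = C(P, N) + C(P, N-1) on a grid.
for P in range(1, 10):
    for N in range(1, 8):
        assert C(P + 1, N) == C(P, N) + C(P, N - 1)

# When P <= N, all 2^P dichotomies are linearly separable.
print(C(3, 3))   # 8 == 2^3
```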
A dichotomy \{X^+, X^-\} is \varphi-separable if there exists a w such that

w \cdot \varphi(x) > 0 \quad if \; x \in X^+
w \cdot \varphi(x) < 0 \quad if \; x \in X^-
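A small sketch of this definition (the points and mapping below are my own illustration): on the line, X^+ = \{-1, +1\} and X^- = \{0\} cannot be split by a single threshold, but they become \varphi-separable under the mapping \varphi(x) = (1, x, x^2):

```python
import numpy as np

# Hypothetical mapping to 3-D: phi(x) = (1, x, x^2).
phi = lambda x: np.array([1.0, x, x * x])
w = np.array([-0.5, 0.0, 1.0])   # one separating weight vector

for x in (-1.0, 1.0):
    assert w @ phi(x) > 0        # x in X+
assert w @ phi(0.0) < 0          # x in X-
print("dichotomy is phi-separable")
```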
References
[1] Christopher J. C. Burges. A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Discov., 2(2):121-167, June 1998.
[2] T. M. Cover. Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Transactions on Electronic Computers, EC-14(3):326-334, June 1965.
[3] Vladimir N. Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag, New York, NY, USA, 1995.