Nonlinear models, Kernels
Fall 2016
Dual problem and KKT conditions

$$\underset{\alpha}{\text{maximize}} \; -\frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle + \sum_i \alpha_i$$

$$\text{subject to} \; \sum_i \alpha_i y_i = 0 \;\text{ and }\; \alpha_i \in [0, C]$$

The weight vector is expanded in terms of the training points: $w = \sum_i y_i \alpha_i x_i$.

KKT conditions: $\alpha_i \big[ y_i [\langle w, x_i \rangle + b] + \xi_i - 1 \big] = 0$ and $\xi_i \eta_i = 0$, which imply

$$\alpha_i = 0 \;\Longrightarrow\; y_i [\langle w, x_i \rangle + b] \geq 1$$
$$0 < \alpha_i < C \;\Longrightarrow\; y_i [\langle w, x_i \rangle + b] = 1$$
$$\alpha_i = C \;\Longrightarrow\; y_i [\langle w, x_i \rangle + b] \leq 1$$
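As an illustration of how these conditions split the training set into non-support vectors, "free" support vectors on the margin, and bound support vectors, here is a minimal sketch assuming scikit-learn and a small synthetic dataset (the dataset, the value of C, and the tolerance are arbitrary choices made for this example):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class data (illustrative only)
X, y = make_blobs(n_samples=60, centers=2, cluster_std=2.0, random_state=0)
y = 2 * y - 1  # labels in {-1, +1}

C = 1.0
clf = SVC(kernel="linear", C=C).fit(X, y)

# sklearn stores y_i * alpha_i for the support vectors in dual_coef_
alpha_sv = np.abs(clf.dual_coef_.ravel())
sv_idx = clf.support_

# Margins y_i (<w, x_i> + b) for all training points
margins = y * clf.decision_function(X)

tol = 1e-6
for i, a in zip(sv_idx, alpha_sv):
    if a < C - tol:
        # 0 < alpha_i < C: point lies on the margin, y_i f(x_i) ~= 1
        kind = "free SV (on margin)"
    else:
        # alpha_i = C: point is inside the margin or misclassified, y_i f(x_i) <= 1
        kind = "bound SV"
    print(f"point {i}: alpha={a:.3f}, y_i f(x_i)={margins[i]:.3f} -> {kind}")

# Non-support vectors have alpha_i = 0 and y_i f(x_i) >= 1
non_sv = np.setdiff1d(np.arange(len(y)), sv_idx)
if len(non_sv):
    print("min margin among non-SVs:", margins[non_sv].min())
```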
Dual problem

$$\underset{\alpha}{\text{maximize}} \; -\frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle + \sum_i \alpha_i$$

$$\text{subject to} \; \sum_i \alpha_i y_i = 0 \;\text{ and }\; \alpha_i \in [0, C]$$
Dual problem (kernelized)

Replacing the inner product $\langle x_i, x_j \rangle$ by a kernel $k(x_i, x_j)$ gives

$$\underset{\alpha}{\text{maximize}} \; -\frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j \, k(x_i, x_j) + \sum_i \alpha_i$$

$$\text{subject to} \; \sum_i \alpha_i y_i = 0 \;\text{ and }\; \alpha_i \in [0, C]$$
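As a sketch of what solving this kernelized dual looks like in practice, the snippet below treats it as a box-constrained QP and hands it to SciPy's SLSQP solver with a Gaussian kernel. The helper names (gaussian_kernel, solve_svm_dual), the gamma parameter, and the 1e-6 thresholds are assumptions made for this example; real implementations typically use a dedicated QP or SMO solver instead.

```python
import numpy as np
from scipy.optimize import minimize

def gaussian_kernel(X, gamma=0.5):
    # k(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def solve_svm_dual(X, y, C=1.0, gamma=0.5):
    n = len(y)
    K = gaussian_kernel(X, gamma)
    Q = np.outer(y, y) * K              # Q_ij = y_i y_j k(x_i, x_j)

    # Minimize the negative dual: 1/2 a^T Q a - sum_i a_i
    fun = lambda a: 0.5 * a @ Q @ a - a.sum()
    jac = lambda a: Q @ a - np.ones(n)

    # Equality constraint sum_i a_i y_i = 0 and box constraints a_i in [0, C]
    constraints = {"type": "eq", "fun": lambda a: a @ y, "jac": lambda a: y}
    bounds = [(0.0, C)] * n
    res = minimize(fun, np.zeros(n), jac=jac, method="SLSQP",
                   bounds=bounds, constraints=constraints)
    alpha = res.x

    # Offset b from a "free" support vector (0 < alpha_i < C), where the
    # KKT conditions force y_i [sum_j alpha_j y_j k(x_j, x_i) + b] = 1
    # (assumes at least one free support vector exists).
    free = (alpha > 1e-6) & (alpha < C - 1e-6)
    b = np.mean(y[free] - (alpha * y) @ K[:, free])
    return alpha, b
```

New points $x$ are then classified by the sign of $\sum_i \alpha_i y_i k(x_i, x) + b$.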
[Figure 7.10: 2D toy example of a binary classification problem solved using a soft margin SVC. In all cases, a Gaussian kernel (7.27) is used. From left to right, the kernel width decreases. Note that for a large width, the decision boundary is almost linear, and the data set cannot be separated without error (see text). Solid lines represent decision boundaries; dotted lines depict the edge of the margin (where (7.34) becomes an equality with $\xi_i = 0$).]

Increasing C allows for more nonlinearities
Decreases number of errors
SV boundary need not be contiguous
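These trends can be checked numerically. The sketch below assumes scikit-learn, an illustrative noisy two-moons dataset, and arbitrarily chosen grids for C and the kernel parameter gamma (which acts as an inverse squared kernel width):

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Illustrative non-separable 2D data
X, y = make_moons(n_samples=200, noise=0.3, random_state=0)

# Larger gamma = narrower Gaussian kernel; larger C = less slack allowed
for gamma in (0.1, 1.0, 10.0):
    for C in (0.1, 1.0, 100.0):
        clf = SVC(kernel="rbf", gamma=gamma, C=C).fit(X, y)
        train_err = 1.0 - clf.score(X, y)
        print(f"gamma={gamma:>5}  C={C:>6}  "
              f"train error={train_err:.3f}  #SV={len(clf.support_)}")
```

Typically, a narrower kernel (larger gamma) or a larger C drives the training error down and makes the decision boundary more irregular, matching the behaviour described in the figure.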