

Haydar Ali Ismail
Not better at statistics than some engineers and still not better at engineering than some statisticians.
4 days ago · 5 min read

Learning Data Science: Day 11 - Support Vector Machine

Illustration of Support Vector Machine

We are slowly moving into machine learning topics. In the previous story, we talked about k-Nearest Neighbors, which is categorized as supervised learning. Today, we are going to talk about another supervised learning method called Support Vector Machine (SVM).

Support Vector Machine
SVM is built to handle classification and regression problems, but it is mostly used for classification. In SVM, each data point is plotted in n-dimensional space, where the coordinates of a data point depend on its features and n is the number of features used in the classifier. To classify the data points, SVM finds a hyperplane that differentiates the two classes well.
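
To make this concrete, here is a minimal sketch (not from the original story) of fitting a linear SVM with scikit-learn's SVC on a tiny made-up two-class dataset:

```python
# Minimal sketch: a linear SVM on toy data (dataset values are made up).
from sklearn.svm import SVC

# Four points in 2-dimensional space (n = 2 features), two class labels.
X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.5], [3.0, 3.0]]
y = [-1, -1, +1, +1]

clf = SVC(kernel="linear")  # linear kernel: look for a separating hyperplane
clf.fit(X, y)

print(clf.predict([[0.5, 0.5], [2.5, 2.5]]))  # expected: [-1  1]
```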

Separating Hyperplane


When SVM creates a hyperplane, there are basically four things to consider:

x, the data points

y, the labels of the classes

w, the weight vector

b, the bias

x is the set of data points that we have. In image classification, x is the set of available images; for other cases it would be different. y, on the other hand, holds the class labels. An example in image classification would be when the SVM should decide whether a picture contains a cat or a dog: those are the labels. To define the orientation of the hyperplane, we need w, also called the weight vector. The main goal of SVM is to estimate the optimal weight vector.

So, we would have something like the image above, where the blue dotted points are one class and the yellow ones are another class. Some libraries call the class under the hyperplane function the -1 class and the class over the hyperplane function the +1 class. Yet, if we only have x, y, and w, we would get something like the image below.


Because we only have w, the hyperplane is bound to go through the origin of the coordinate system. So, to make it more flexible, we can use b, the bias, to shift the hyperplane around.

And now we get the function of the hyperplane, shown below:

f(x) = w · x + b

Hyperplane function

Now we can define the class of a particular data point based on the hyperplane function: if f(x) is less than 0, the data point goes to the -1 class; if f(x) is greater than 0, it goes to the +1 class.
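
As a small illustration, here is that decision rule in code; the values of w and b below are made up for the example, not learned from data:

```python
import numpy as np

# Hypothetical weight vector and bias, chosen just for illustration.
w = np.array([1.0, -1.0])  # orientation of the hyperplane
b = -0.5                   # shifts the hyperplane off the origin

def classify(x):
    """Assign -1 if f(x) = w . x + b < 0, else +1."""
    return -1 if np.dot(w, x) + b < 0 else +1

print(classify(np.array([0.0, 1.0])))  # f(x) = -1.5 -> class -1
print(classify(np.array([2.0, 0.0])))  # f(x) =  1.5 -> class +1
```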


Illustration of data points that go to the -1 class and the +1 class, with different colors

Maximum Margin Classification

When choosing the hyperplane, it is best to follow the maximum margin classification rule. The margin is the distance between the closest data point(s) of each class and the hyperplane. Those closest data point(s) are what we call the support vectors.

Which one is better? Left or right?

Let's try it on the picture above. Both sides show the exact same data points, only with different hyperplane functions. If we follow the maximum margin rule, we would choose the left one, as it is the better one for SVM. If we chose the right one, the margin between the yellow support vector and the hyperplane would be too small: if another data point belonging to the yellow class came in to the left of the yellow class's support vector, it would be misclassified.
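
For a linear SVM, the width of the margin works out to 2/||w||, so maximizing the margin amounts to minimizing the norm of the weight vector. Here is a quick sketch (assuming scikit-learn, with toy data) that reads the margin width and the support vectors off a fitted model:

```python
import numpy as np
from sklearn.svm import SVC

# Toy separable data, two points per class.
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 2.0]])
y = np.array([-1, -1, +1, +1])

clf = SVC(kernel="linear", C=1e6)  # very large C approximates a hard margin
clf.fit(X, y)

w = clf.coef_[0]
print(2.0 / np.linalg.norm(w))  # width of the margin between the two classes
print(clf.support_vectors_)     # the closest point(s): the support vectors
```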

Outliers
Compared to kNN, SVM is able to handle outliers pretty well, by letting the support vector machine cut some slack.


The annotation for slack variables

Basically, a slack variable measures the distance from an outlier to the margin of the side where it actually should have been placed, since the outlier sits on the wrong side.

Illustration of outliers with slack variables

With slack variables, we can effectively ignore those outliers, unlike kNN, which is sensitive to outliers.
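
In scikit-learn, for instance, how much slack the SVM is allowed to cut is controlled by the C parameter: a small C makes slack cheap, so outliers are tolerated, while a large C makes slack expensive. A rough sketch with made-up data:

```python
from sklearn.svm import SVC

# Made-up data: the last +1 point is an outlier sitting among the -1 points.
X = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [0.5, 0.5]]
y = [-1, -1, -1, +1, +1, +1]

# Small C: slack is cheap, so the outlier is absorbed as a margin violation
# and the hyperplane stays where the bulk of the data wants it.
soft = SVC(kernel="linear", C=0.1).fit(X, y)

# Large C: slack is expensive, so the fit bends toward the outlier.
hard = SVC(kernel="linear", C=1000.0).fit(X, y)

print(soft.decision_function([[0.5, 0.5]]))  # outlier left on the wrong side
print(hard.decision_function([[0.5, 0.5]]))
```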

XOR Problem
In some scenarios, you would not be able to create a single linear hyperplane that classifies the data points, such as in the picture below.


XOR Problem

We can take SVM to another level by adding an additional dimension. This is what happens if we add an additional dimension (by applying a square function).

The problem is now separable

Now that the problem is solved, we can go back down a dimension and we will have a good boundary line that classifies the classes well.
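
As a sketch of that idea (assuming scikit-learn; the figure above applies a square function, and the product feature used here is one term of the same degree-2 polynomial expansion), we can map the XOR points into 3-D and separate them with a plain linear SVM:

```python
import numpy as np
from sklearn.svm import SVC

# XOR data with +/-1 coordinates: no straight line separates it in 2-D.
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]])
y = np.array([-1, +1, +1, -1])

# Add a third dimension z = x1 * x2. In 3-D one class sits at z = +1 and
# the other at z = -1, so the plane z = 0 separates them perfectly.
X3 = np.column_stack([X, X[:, 0] * X[:, 1]])

clf = SVC(kernel="linear").fit(X3, y)
print(clf.predict(X3))  # -> [-1  1  1 -1], all four points correct
```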

Wrap Up
So, today we have covered the basic theory of SVM. However, there might be things that I got wrong, so let me know in the responses below and we can discuss them. Hopefully this story helps you, and see you in the next one.

