
Machine Learning Algorithm Cheat Sheet

This cheat sheet helps you choose the best machine learning algorithm for
your predictive analytics solution. Your decision is driven by both the nature
of your data and the goal you want to achieve with your data.
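
One way to read the sheet is as a lookup from the question you are asking to a family of algorithms. The sketch below encodes only the questions and category names printed on the poster; the dictionary and function names are hypothetical helpers, not part of any Microsoft tool.

```python
# Illustrative lookup built from the questions printed on the cheat sheet.
# CATEGORY_BY_QUESTION and suggest_category are hypothetical names.
CATEGORY_BY_QUESTION = {
    "What info is in this text?": "Text Analytics",
    "Is this A or B or C or D?": "Multiclass Classification",
    "Is this A or B?": "Two-Class Classification",
    "How much or how many?": "Regression",
    "What will they be interested in?": "Recommenders",
    "How is this organized?": "Clustering",
    "Is this weird?": "Anomaly Detection",
    "What does this image represent?": "Image Classification",
}


def suggest_category(question: str) -> str:
    """Return the cheat-sheet category for a question; raise KeyError if unknown."""
    return CATEGORY_BY_QUESTION[question]


if __name__ == "__main__":
    print(suggest_category("How much or how many?"))
```

The goal only narrows the choice to a category. Picking an algorithm inside that category still depends on the data, for example on the number of features or the training time you can afford.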

Text Analytics
Extract information from text
Derives high-quality information from text
Answers questions like: What info is in this text?

Extract N-Gram Features from Text: Creates a dictionary of n-grams from a column of free text
Feature Hashing: Converts text data to integer encoded features using the Vowpal Wabbit library
Preprocess Text: Performs cleaning operations on text, like removal of stop-words, case normalization
Word2Vector: Converts words to values for use in NLP tasks, like recommender, named entity recognition, machine translation

Multiclass Classification
Predict between several categories
Answers complex questions with multiple possible answers
Answers questions like: Is this A or B or C or D?

Multiclass Logistic Regression: Fast training times, linear model
Multiclass Neural Network: Accuracy, long training times
Multiclass Decision Forest: Accuracy, fast training times
One-vs-All Multiclass: Depends on the two-class classifier
Multiclass Boosted Decision Tree: Non-parametric, fast training times and scalable
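
To make the Extract N-Gram Features from Text and Feature Hashing entries concrete, here is a minimal plain-Python sketch of both ideas. It is not the implementation behind those modules: the function names are hypothetical, tokenization is a simple whitespace split, and `zlib.crc32` stands in for whatever hash the Vowpal Wabbit library uses.

```python
import zlib
from collections import Counter
from typing import List


def extract_ngrams(text: str, n: int) -> List[str]:
    """Return the word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_ngram_dictionary(column: List[str], n: int) -> Counter:
    """Count every n-gram across a column of free text."""
    counts = Counter()
    for text in column:
        counts.update(extract_ngrams(text, n))
    return counts


def hash_features(text: str, num_buckets: int = 16) -> List[int]:
    """Map each token to a bucket index and count hits per bucket.

    Assumption: zlib.crc32 is used only because it is deterministic and in
    the standard library; it is a stand-in, not Vowpal Wabbit's hash.
    """
    vector = [0] * num_buckets
    for token in text.lower().split():
        vector[zlib.crc32(token.encode("utf-8")) % num_buckets] += 1
    return vector
```

The n-gram dictionary grows with the vocabulary, while the hashed vector has a fixed length chosen up front. The price of that fixed length is that distinct tokens can collide in the same bucket.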
Recommenders
Generate recommendations
Predicts what someone will be interested in
Answers the question: What will they be interested in?

SVD Recommender: Collaborative filtering, better performance with lower cost by reducing dimensionality

Regression
Predict values
Makes forecasts by estimating the relationship between values
Answers questions like: How much or how many?

Fast Forest Quantile Regression: Predicts a distribution
Poisson Regression: Predicts event counts
Linear Regression: Fast training, linear model
Bayesian Linear Regression: Linear model, small data sets
Decision Forest Regression: Accurate, fast training times
Neural Network Regression: Accurate, long training times
Boosted Decision Tree Regression: Accurate, fast training times, large memory footprint

Two-Class Classification
Predict between two categories
Answers simple two-choice questions, like yes or no, true or false
Answers questions like: Is this A or B?

Two-Class Support Vector Machine: Under 100 features, linear model
Two-Class Averaged Perceptron: Fast training, linear model
Two-Class Decision Forest: Accurate, fast training
Two-Class Logistic Regression: Fast training, linear model
Two-Class Boosted Decision Tree: Accurate, fast training, large memory footprint
Two-Class Neural Network: Accurate, long training times

Clustering
Discover structure
Separates similar data points into intuitive groups
Answers questions like: How is this organized?

K-Means: Unsupervised learning

Anomaly Detection
Find unusual occurrences
Identifies and predicts rare or unusual data points
Answers the question: Is this weird?

One Class SVM: Under 100 features, aggressive boundary
PCA-Based Anomaly Detection: Fast training times

Image Classification
Classify images
Classifies images with popular networks
Answers questions like: What does this image represent?

DenseNet: High accuracy, better efficiency
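
The K-Means entry under Clustering says only that it is unsupervised. As a rough illustration of what that means, the sketch below groups unlabeled 2-D points with the standard assign-then-update loop (Lloyd's algorithm). It is an assumption-laden toy, not the module's implementation: the function names are hypothetical, and the first k points are used as seeds purely to keep the result reproducible.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def k_means(
    points: Sequence[Point], k: int, iterations: int = 10
) -> Tuple[List[Point], List[int]]:
    """Cluster 2-D points; return (centroids, label per point).

    Assumption: the first k points seed the centroids so the run is
    deterministic. Practical implementations choose seeds more carefully.
    """
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels
```

No labels are passed in: the groups come only from distances between points, which is why the sheet files K-Means under "Discover structure" rather than under prediction.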

© 2019 Microsoft Corporation. All rights reserved. Share this poster: aka.ms/mlcheatsheet
