1. What is Machine Learning

Tom Mitchell's Machine Learning:

A computer program is said to learn from experience E (what data to collect) with respect to some class of tasks T
(what decisions the software needs to make) and performance measure P (how to evaluate the results), if its
performance at tasks in T, as measured by P, improves with experience E.

We call our program a model because it has state that needs to be persisted.
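
To make the idea of a model with state concrete, here is a minimal sketch. The `MeanModel` class and its method names are hypothetical, chosen only for illustration, and JSON is just one possible way to persist the state.

```python
import json


class MeanModel:
    """Hypothetical toy model: predicts the mean of the values seen so far."""

    def __init__(self, total=0.0, count=0):
        self.total = total   # learned state: running sum of observed values
        self.count = count   # learned state: number of observations

    def learn(self, values):
        # Experience E: every new batch of data updates the state.
        for v in values:
            self.total += v
            self.count += 1

    def predict(self):
        # Task T: guess the next value (0.0 before any experience).
        return self.total / self.count if self.count else 0.0

    def save(self):
        # Persist the learned state as a JSON string.
        return json.dumps({"total": self.total, "count": self.count})

    @classmethod
    def load(cls, text):
        # Rebuild a model from a previously saved state.
        state = json.loads(text)
        return cls(state["total"], state["count"])


model = MeanModel()
model.learn([2.0, 4.0, 6.0])
saved = model.save()
restored = MeanModel.load(saved)
print(restored.predict())
```

The restored object makes the same prediction as the original because everything the program learned lives in its state.
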

2. Practical Machine Learning Problems

10 examples of ML problems:
1. Spam detection
2. Credit card fraud detection
3. Digit recognition
4. Speech understanding
5. Face detection
6. Medical diagnosis
7. Customer segmentation
8. Stock trading
9. Product recommendation
10. Shape detection


Being good at extracting the essence of a problem will allow you to think effectively about what data you need and
what types of algorithms you should try.
Classes of problems:
1. Classification
2. Regression
3. Clustering
4. Rule extraction

3. A Tour of Machine Learning Algorithms

There are two ways to think about and categorize the algorithms you may come across in the field:
1. Grouping of algorithms by learning style.
2. Grouping of algorithms by similarity in form or function (like grouping similar animals together).
The focus here is on grouping by similarity.

Algorithms grouped by learning style:

1. Supervised learning: example problems are classification and regression; example algorithms are logistic
regression and the back-propagation neural network.
2. Unsupervised learning: example problems are clustering, dimensionality reduction, and association rule
learning; example algorithms are the Apriori algorithm and k-means.
3. Semi-supervised learning: example problems are classification and regression; example algorithms are
extensions to other flexible methods that make assumptions about how to model the unlabeled data.

Algorithms grouped by similarity:

1. Regression algorithms: model the relationship between variables, which is iteratively refined using a measure
of error in the predictions made by the model. Examples:
   1. Ordinary least squares regression (OLSR)
   2. Linear regression
   3. Logistic regression
   4. Stepwise regression
   5. Multivariate adaptive regression splines (MARS)
   6. Locally estimated scatterplot smoothing (LOESS)
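
The phrase "iteratively refined using a measure of error" can be illustrated with a small sketch: a straight line fitted by gradient descent on the mean squared error. The function name, learning rate, step count, and data are assumptions made up for this example, not values from the notes.

```python
def fit_line_gd(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ slope * x + intercept by gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Error of the current model on every training example.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Gradient of the mean squared error with respect to each parameter.
        grad_slope = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2.0 / n * sum(errors)
        # Refine the model a little in the direction that reduces the error.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]   # made-up data generated from y = 2x + 1
slope, intercept = fit_line_gd(xs, ys)
print(round(slope, 3), round(intercept, 3))
```

Ordinary least squares also has a closed-form solution; the iterative version is shown here only because it mirrors the description above.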

2. Instance-based algorithms: build up a database of example data and compare new data to the database
using a similarity measure in order to find the best match and make a prediction. Examples:
   1. k-Nearest Neighbor (kNN)
   2. Learning vector quantization (LVQ)
   3. Self-organizing map (SOM)
   4. Locally weighted learning (LWL)
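
As a rough sketch of the instance-based idea, the following k-nearest-neighbor classifier keeps the examples as its "database" and uses Euclidean distance as the similarity measure. The function name and the data are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(examples, query, k=3):
    """examples: list of (point, label); point and query are equal-length tuples."""
    # Compare the new data point to every stored example.
    by_distance = sorted(examples, key=lambda ex: math.dist(ex[0], query))
    # Let the k closest examples vote on the label.
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


examples = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]
print(knn_predict(examples, (1.1, 1.0)))
print(knn_predict(examples, (5.0, 5.1)))
```

There is no separate training step: all the work happens at prediction time, which is typical of this family.
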
3. Regularization algorithms: penalize models based on their complexity, favoring simpler models that are also
better at generalizing. Examples:
   1. Ridge regression
   2. Least absolute shrinkage and selection operator (LASSO)
   3. Elastic net
   4. Least-angle regression (LARS)
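
A minimal sketch of the penalty idea, assuming a single input variable: ridge regression adds `alpha * slope**2` to the squared error, which shrinks the slope toward zero. The function name, the `alpha` values, and the data are assumptions for illustration; the intercept is left unpenalized.

```python
def fit_ridge_line(xs, ys, alpha):
    """Minimize sum((y - slope*x - intercept)^2) + alpha * slope^2 for one input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    # The penalty appears as an extra term in the denominator.
    slope = sxy / (sxx + alpha)
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]   # made-up data generated from y = 2x + 1
print(fit_ridge_line(xs, ys, alpha=0.0))   # no penalty: ordinary least squares
print(fit_ridge_line(xs, ys, alpha=5.0))   # penalty shrinks the slope
```

With `alpha=0.0` the result is the ordinary least squares line; larger values trade fit on the training data for a simpler (flatter) model.
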
4. Decision tree algorithms: construct a model of decisions based on the actual values of attributes in the
data; they are trained for classification and regression problems. Examples:
   1. Classification and regression tree (CART)
   2. Iterative dichotomiser 3 (ID3)
   3. C4.5 and C5.0 (different versions of a powerful approach)
   4. Chi-squared automatic interaction detection (CHAID)
   5. Decision stump
   6. M5
   7. Conditional decision trees
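
The simplest member of this family, the decision stump, can be sketched as a single decision on one attribute. The version below is an illustrative assumption: it handles one numeric attribute, binary labels 0 and 1, and always predicts 1 above the threshold.

```python
def fit_stump(values, labels):
    """Find the threshold on one numeric attribute with the fewest misclassifications.

    Predicts 1 when value > threshold, else 0. Returns (threshold, errors).
    """
    best_threshold, best_errors = None, len(values) + 1
    ordered = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    for threshold in candidates:
        errors = sum((v > threshold) != bool(label) for v, label in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


def stump_predict(threshold, value):
    return int(value > threshold)


values = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]   # made-up attribute values
labels = [0, 0, 0, 1, 1, 1]
threshold, errors = fit_stump(values, labels)
print(threshold, errors)
```

A full decision tree repeats this kind of split recursively on the subsets produced by each decision.
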
5. Bayesian algorithms: explicitly apply Bayes' theorem to problems such as classification and regression.
Examples:
   1. Naive Bayes
   2. Gaussian naive Bayes
   3. Multinomial naive Bayes
   4. Averaged one-dependence estimators (AODE)
   5. Bayesian belief network (BBN)
   6. Bayesian network (BN)
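
As a sketch of how Bayes' theorem is applied to classification, here is a small multinomial naive Bayes text classifier with add-one smoothing. The function names, the word lists, and the labels are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """documents: list of (list_of_words, label). Returns the counts used to predict."""
    label_counts = Counter(label for _, label in documents)
    word_counts = defaultdict(Counter)   # label -> word -> count
    vocabulary = set()
    for words, label in documents:
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def predict_naive_bayes(model, words):
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in label_counts.items():
        # log P(label) + sum of log P(word | label), with add-one smoothing.
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in words:
            if word not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


documents = [
    (["win", "money", "now"], "spam"),
    (["free", "money"], "spam"),
    (["meeting", "at", "noon"], "ham"),
    (["lunch", "at", "noon"], "ham"),
]
nb_model = train_naive_bayes(documents)
print(predict_naive_bayes(nb_model, ["free", "money", "now"]))
print(predict_naive_bayes(nb_model, ["meeting", "at", "lunch"]))
```

The "naive" part is the assumption that words are independent given the label, which is why the per-word log probabilities are simply added.
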
6. Clustering algorithms: like regression, the term describes both the class of problem and the class of methods.
These methods use the inherent structure in the data to organize it into groups of maximum commonality. Examples:
   1. K-means
   2. K-medians
   3. Expectation maximisation (EM)
   4. Hierarchical clustering
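
A minimal k-means sketch, assuming the caller supplies the starting centroids (real implementations usually pick them at random). The function name, iteration count, and points are illustrative assumptions.

```python
import math


def k_means(points, centroids, iterations=10):
    """points and centroids are lists of equal-length tuples."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
final_centroids, final_clusters = k_means(points, [(0.0, 0.0), (10.0, 10.0)])
print(final_centroids)
```

No labels are involved: the groups emerge only from how close the points are to each other.
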
7. Association rule learning algorithms: extract rules that best explain observed relationships between
variables in data. Examples:
   1. Apriori algorithm
   2. Eclat algorithm
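
The core of the Apriori algorithm, finding frequent itemsets level by level, might be sketched as follows. The function name, the shopping-basket data, and the `min_support` value are assumptions for illustration; rule generation is reduced to a single confidence calculation at the end.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    def count_support(candidates):
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    items = {item for t in transactions for item in t}
    frequent = count_support({frozenset([item]) for item in items})
    result = dict(frequent)
    size = 2
    while frequent:
        # Join step: merge frequent itemsets into candidates one item larger.
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == size}
        # Prune step: every smaller subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, size - 1))
        }
        frequent = count_support(candidates)
        result.update(frequent)
        size += 1
    return result


baskets = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}]
frequent_sets = apriori(baskets, min_support=2)

# Confidence of the rule "butter -> bread": support(both) / support(butter).
confidence = frequent_sets[frozenset({"bread", "butter"})] / frequent_sets[frozenset({"butter"})]
print(confidence)
```

The prune step is what gives the algorithm its name: an itemset can only be frequent if all of its subsets are already known to be frequent.
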
8. Artificial neural network algorithms: are inspired by the structure and/or function of biological neural
networks. Examples:
   1. Perceptron
   2. Back-propagation
   3. Hopfield network
   4. Radial basis function network (RBFN)
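
The perceptron, the first example in this list, is small enough to sketch in full. The function names, the epoch count, and the choice of the logical AND function as training data are assumptions for illustration.

```python
def perceptron_predict(weights, bias, inputs):
    # A single artificial neuron: weighted sum followed by a step function.
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """samples: list of (inputs, target) with target 0 or 1."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - perceptron_predict(weights, bias, inputs)
            # Perceptron rule: nudge the weights only when the prediction is wrong.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
print([perceptron_predict(weights, bias, inputs) for inputs, _ in and_samples])
```

A single perceptron can only separate classes with a straight line (or hyperplane), which is why multi-layer networks trained with back-propagation are needed for harder problems.
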
9. Deep learning: a modern update to artificial neural networks that exploits abundant cheap computation.
Examples:
   1. Deep Boltzmann machine (DBM)
   2. Deep belief networks (DBN)
   3. Convolutional neural network (CNN)
   4. Stacked auto-encoders
10. Dimensionality reduction algorithms: like clustering methods, these seek and exploit the inherent structure in
the data, but in this case in an unsupervised manner, in order to summarize or describe the data using less
information. Examples:
   1. Principal component analysis (PCA)
   2. Principal component regression (PCR)
   3. Partial least squares regression (PLSR)
   4. Sammon mapping
   5. Multidimensional scaling (MDS)
   6. Projection pursuit
   7. Linear discriminant analysis (LDA)
   8. Mixture discriminant analysis (MDA)
   9. Quadratic discriminant analysis (QDA)
   10. Flexible discriminant analysis (FDA)
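
To give a feel for PCA, the sketch below finds the first principal component of 2-D data, that is, the direction along which the points vary the most. It uses power iteration on the covariance matrix, a simple stand-in chosen here to avoid external libraries; practical implementations normally use an eigendecomposition or SVD. The function name and data are assumptions, and the code assumes the points are not all identical.

```python
import math


def first_principal_component(points, iterations=100):
    """Return the unit vector along which 2-D points vary the most."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 sample covariance matrix.
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    # Power iteration: repeatedly multiply by the covariance matrix and normalize.
    vx, vy = 1.0, 0.0
    for _ in range(iterations):
        nx, ny = cxx * vx + cxy * vy, cxy * vx + cyy * vy
        norm = math.hypot(nx, ny)
        vx, vy = nx / norm, ny / norm
    return vx, vy


points = [(1.0, 1.0), (2.0, 2.1), (3.0, 2.9), (4.0, 4.0)]   # made-up data near y = x
component = first_principal_component(points)
print(component)
```

Projecting each centered point onto this single direction replaces two coordinates with one while keeping most of the variation, which is the sense in which the data is described "using less information".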

11. Ensemble algorithms: are composed of multiple weaker models that are independently trained and whose
predictions are combined in some way to make the overall prediction. Examples:
   1. Boosting
   2. Bootstrapped aggregation (Bagging)
   3. AdaBoost
   4. Stacked generalization (blending)
   5. Gradient boosting machines (GBM)
   6. Gradient boosted regression trees (GBRT)
   7. Random forest
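
Bagging can be sketched in a few lines: train each weak model on its own bootstrap sample, then combine the predictions by majority vote. The weak model here is a one-threshold stump; the function names, the number of models, the seed, and the data are all assumptions for illustration.

```python
import random
from collections import Counter


def fit_weak_stump(sample):
    """Weak model: best single threshold on a 1-D input; predicts 1 above it."""
    values = sorted({x for x, _ in sample})
    # Midpoints between distinct values, or the only value if there is just one.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or [values[0]]
    return min(candidates, key=lambda t: sum((x > t) != bool(y) for x, y in sample))


def bagging_fit(data, n_models=25, seed=0):
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(data) examples with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(fit_weak_stump(sample))
    return models


def bagging_predict(models, x):
    # Combine the independently trained models by majority vote.
    votes = Counter(int(x > threshold) for threshold in models)
    return votes.most_common(1)[0][0]


data = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
ensemble = bagging_fit(data)
print(bagging_predict(ensemble, 2.0), bagging_predict(ensemble, 7.0))
```

An odd number of models avoids tied votes for two classes. Random forest follows the same pattern with decision trees as the weak models and extra randomness in the choice of attributes.
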
Other algorithms and topics that are not covered:
1. Specialty tasks in the machine learning process: feature selection algorithms, algorithm accuracy evaluation,
and performance measures.
2. Specialty subfields: computational intelligence (evolutionary algorithms, etc.), computer vision (CV), natural
language processing (NLP), recommender systems, reinforcement learning, graphical models, and more.