Notes on MLAPP
Wu Ziqing
13/07/2018
Goal: Learn a mapping from inputs to outputs given a labelled training set:
D = {(x_i, y_i)}_{i=1}^N
Goal: Find interesting patterns given only the input data (also known as
knowledge discovery):
D = {x_i}_{i=1}^N
This is often formalised as density estimation, i.e. building models of the form p(x_i | θ).
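As a toy illustration (not from the notes), fitting a density model p(x_i | θ) by maximum likelihood is straightforward for a Gaussian, where the MLE is just the sample mean and (biased) sample covariance. The synthetic data below are hypothetical:

    import numpy as np

    # Hypothetical unlabelled data: 500 points in 2 dimensions.
    rng = np.random.default_rng(0)
    X = rng.normal(loc=[1.0, -2.0], scale=1.5, size=(500, 2))

    # MLE for a Gaussian p(x | theta), theta = (mu, Sigma):
    mu = X.mean(axis=0)                          # sample mean
    Sigma = np.cov(X, rowvar=False, bias=True)   # MLE covariance (divides by N)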
For high-dimensional data, the variability of the data may be explained by
only a few latent factors. We can perform dimensionality reduction by
projecting the high-dimensional data onto a lower-dimensional subspace that
captures the essence of the data.
Advantages:
better prediction accuracy
faster nearest-neighbour searches
easier visualisation of high-dimensional data
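A minimal sketch of one standard dimensionality-reduction technique, PCA via an SVD of the centred data (assuming NumPy; the projection dimension k = 2 is an arbitrary choice, e.g. for visualisation):

    import numpy as np

    def pca(X, k):
        """Project N x D data onto its top-k principal components."""
        Xc = X - X.mean(axis=0)                      # centre the data
        U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Xc @ Vt[:k].T                         # N x k representation

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 50))                   # hypothetical high-dim data
    Z = pca(X, k=2)                                  # low-dimensional projection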
Sometimes the design matrix will have missing values such as NaN (Not
a Number). Matrix completion (imputation) infers plausible values for the
missing entries. Matrix completion can be applied to:
Image inpainting: denoising/completing an image.
Collaborative filtering: completing the missing entries of a ratings matrix.
Market basket analysis: in a binary purchase matrix, predicting which cells
will be turned on given that a few have been turned on (i.e. predicting
which items tend to be purchased together).
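A minimal sketch of one common matrix-completion approach, iterative low-rank SVD imputation (an illustration of the idea, not the specific algorithm in the book; the rank and iteration count are arbitrary assumptions):

    import numpy as np

    def complete_matrix(X, rank=2, n_iters=50):
        """Fill NaN entries of X using a rank-`rank` approximation, iteratively."""
        mask = np.isnan(X)                        # locations of missing entries
        Xf = np.where(mask, np.nanmean(X), X)     # initialise with the global mean
        for _ in range(n_iters):
            U, s, Vt = np.linalg.svd(Xf, full_matrices=False)
            low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
            Xf[mask] = low_rank[mask]             # only overwrite missing cells
        return Xf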
Parametric Models for Classification and Regression
Linear Regression
Linear regression asserts that the response is a linear function of the input:
y(x) = w^T x + ε = Σ_{j=1}^D w_j x_j + ε, where ε ~ N(µ, σ²)
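A minimal sketch of fitting the weights w by ordinary least squares with NumPy (the synthetic data and true weights below are hypothetical):

    import numpy as np

    rng = np.random.default_rng(0)
    N, D = 100, 3
    X = rng.normal(size=(N, D))
    w_true = np.array([2.0, -1.0, 0.5])
    y = X @ w_true + rng.normal(scale=0.1, size=N)   # epsilon ~ N(0, sigma^2)

    w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)    # least-squares estimate of w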
If we try to include every minor variation of the data in the model, we are
more likely to model the noise than the true signal. This is called overfitting.
We can measure the performance of a classifier f by its misclassification rate:
err(f, D) = (1/N) Σ_{i=1}^N I(f(x_i) ≠ y_i)
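In code, the misclassification rate is just the mean of the indicator (a sketch, assuming the predictions and labels are NumPy arrays):

    import numpy as np

    def err(y_pred, y_true):
        """Fraction of examples where the prediction differs from the label."""
        return np.mean(y_pred != y_true)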
If the validation set is too small to give a reliable estimate of performance,
we can use the technique of Cross Validation (CV):
1 Divide the data into K folds.
2 For each fold k ∈ {1, 2, ..., K}, train the model on the remaining K − 1
folds and test against the k-th fold.
3 Training and testing thus proceed in a round-robin fashion, and the K
error estimates are averaged (see the sketch below).
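A minimal K-fold CV sketch (assuming a classifier object with fit/predict methods; K = 5 and the use of the misclassification rate from above are assumptions for illustration):

    import numpy as np

    def cross_val_error(model, X, y, K=5):
        """Average misclassification error over K round-robin folds."""
        idx = np.random.permutation(len(y))
        folds = np.array_split(idx, K)
        errs = []
        for k in range(K):
            test = folds[k]
            train = np.concatenate([folds[j] for j in range(K) if j != k])
            model.fit(X[train], y[train])        # train on the other K-1 folds
            errs.append(np.mean(model.predict(X[test]) != y[test]))
        return np.mean(errs)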
“All models are wrong, but some are useful.” — George Box