LS-SVMlab: least squares support vector machines for nonlinear modeling
Kristiaan Pelckmans, ESAT-SCD/SISTA
J.A.K. Suykens, B. De Moor
Content
I. Overview
II. Classification
III. Regression
IV. Unsupervised Learning
V. Time-series
VI. Conclusions and Outlook
People
Contributors to LS-SVMlab:
Kristiaan Pelckmans
Johan Suykens
Tony Van Gestel
Jos De Brabanter
Lukas Lukas
Bart Hamers
Emmanuel Lambert
Supervisors:
Bart De Moor
Johan Suykens
Joos Vandewalle
Acknowledgements
Our research is supported by grants from several funding
agencies and sources: Research Council K.U.Leuven: Concerted
Research Action GOA-Mefisto 666 (Mathematical Engineering),
IDO (IOTA Oncology, Genetic networks), several PhD/postdoc
& fellow grants; Flemish Government: Fund for Scientific
Research FWO Flanders (several PhD/postdoc grants, projects
G.0407.02 (support vector machines), G.0080.01 (collective
intelligence), G.0256.97 (subspace), G.0115.01 (bio-i and
microarrays), G.0240.99 (multilinear algebra), G.0197.02 (power
islands), research communities ICCoS, ANMMM), AWI (Bil. Int.
Collaboration South Africa, Hungary and Poland), IWT (Soft4s
(softsensors), STWW-Genprom (gene promotor prediction),
GBOU McKnow (Knowledge management algorithms), Eureka-
Impact (MPC-control), Eureka-FLiTE (flutter modeling), several
PhD-grants); Belgian Federal Government: DWTC (IUAP IV-
02 (1996-2001) and IUAP V-10-29 (2002-2006): Dynamical
Systems and Control: Computation, Identification & Modelling),
Program Sustainable Development PODO-II (CP-TR-18:
Sustainability effects of Traffic Management Systems); Direct
contract research: Verhaert, Electrabel, Elia, Data4s, IPCOS. JS
is a professor at K.U.Leuven Belgium and a postdoctoral
researcher with FWO Flanders. BDM and JWDW are full
professors at K.U.Leuven Belgium.
I. Overview
Goal of the Presentation
1. Overview & Intuition
2. Demonstration of LS-SVMlab
3. Pinpoint research challenges
4. Preparation for NIPS 2002
Research results and challenges
Towards applications
Overview LS-SVMlab
I.2 Overview research
Learning, generalization, extrapolation, identification, smoothing, modeling
Prediction (black-box modeling)
Points of view: statistical learning, machine learning, neural networks, optimization, SVMs
I.2 Type, Target, Topic
I.3 Towards applications
System identification
Financial engineering
Biomedical signal processing
Data mining
Bio-informatics
Text mining
Adaptive signal processing
I.4 LS-SVMlab
I.4 LS-SVMlab (2)
Starting points:
Modularity
Object Oriented & Functional Interface
Basic building blocks for advanced research
Website and tutorial
Reproducibility (preprocessing)
II. Classification
Learn the decision function associated with a set
of labeled data points in order to predict the
labels of unseen data points
Least Squares Support Vector
Machines
Bayesian Framework
Different norms
Coding schemes
II.1 Least Squares Support Vector Machines (LS-SVM(γ, σ²))
1. Least Squares cost-function + regularization
& equality constraints
2. Non-linearity via Mercer kernels
3. Primal-Dual Interpretation (Lagrange
multipliers)
Primal parametric model:
$y_i = w^T x_i + b + e_i$
Dual non-parametric model:
$y_i = \sum_{j=1}^{n} \alpha_j K(x_i, x_j) + b + e_i$
with Lagrange multipliers $\alpha$ and Mercer kernel $K(\cdot,\cdot)$
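In the dual, training reduces to solving a single linear system in $(\alpha, b)$. A minimal NumPy sketch of this, assuming an RBF kernel with bandwidth `sigma2`; the function names are illustrative and not LS-SVMlab's own interface:

```python
import numpy as np

def rbf_kernel(X1, X2, sigma2=0.5):
    """Gaussian (RBF) Mercer kernel: K(x, z) = exp(-||x - z||^2 / sigma2)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma2)

def lssvm_fit(X, y, gamma=10.0, sigma2=0.5):
    """Solve the LS-SVM dual linear system
       [ 0   1^T         ] [b    ]   [0]
       [ 1   K + I/gamma ] [alpha] = [y]
    for the Lagrange multipliers alpha and bias b."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]   # alpha, b

def lssvm_predict(Xt, X, alpha, b, sigma2=0.5):
    """Evaluate the dual model y(x) = sum_j alpha_j K(x, x_j) + b."""
    return rbf_kernel(Xt, X, sigma2) @ alpha + b
```

With weak regularization (large γ) the model nearly interpolates the training targets; γ and σ² are exactly the hyper-parameters tuned later by cross-validation or Bayesian inference.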
II.1 LS-SVM(γ, σ²) (2)
Learning representations from relations
II.3 SVM formulations & norms
1-norm + inequality constraints: SVM
extensions to any convex cost function
2-norm + equality constraints: LS-SVM
weighted versions
II.4 Coding schemes
Multi-class classification task → (multiple) binary classifiers
Encoding: class labels (1 2 4 6 2 1 3) → ±1 codewords, one bit per binary classifier
Decoding: binary classifier outputs → nearest codeword → class labels
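One-vs-all coding is the simplest such scheme: each class gets a ±1 codeword with a single +1 bit, and decoding picks the class whose codeword is nearest in Hamming distance. A generic sketch (not the toolbox's own coding routines):

```python
import numpy as np

def encode(labels, classes):
    """One-vs-all encoding: codeword bit k is +1 iff the label equals classes[k]."""
    classes = np.asarray(classes)
    return np.array([np.where(classes == l, 1, -1) for l in labels])

def decode(outputs, classes):
    """Decode binary classifier outputs to the class with the nearest codeword."""
    codebook = encode(classes, classes)                  # one codeword per class
    dists = (codebook != np.sign(outputs)).sum(axis=1)   # Hamming distances
    return classes[int(np.argmin(dists))]

classes = [1, 2, 4, 6]
Y = encode([1, 2, 4, 6, 2, 1], classes)
print(Y[1])                                              # codeword for class 2
print(decode(np.array([-0.9, 0.8, -0.2, -0.7]), classes))  # decodes to 2
```

Error-correcting output codes follow the same encode/decode pattern with longer codewords, so misclassifications by individual binary classifiers can be corrected at decoding time.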
III. Regression
Learn the underlying function from a set of data
points and their corresponding noisy targets in
order to predict the values at unseen data points
LS-SVM(γ, σ²)
Cross-validation (CV)
Bayesian Inference
Robustness
III.1 LS-SVM(γ, σ²)
Least squares cost function +
regularization & equality constraints
Mercer kernels
Lagrange multipliers:
Primal parametric ↔ dual non-parametric
III.1 LS-SVM(γ, σ²) (2)
Regularization parameter γ:
Do not fit the noise (overfitting)!
Trade-off between noise and information
Toy example: $f(x) = \mathrm{sinc}(x) + \sin(10x)/5 + e$
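The γ trade-off can be sketched on the slide's toy function using kernel ridge regression (the LS-SVM solution without the bias term); note `np.sinc` is the normalized sinc, an assumption about the slide's convention:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-5, 5, 100)
f = np.sinc(x) + np.sin(10 * x) / 5          # toy target from the slide
y = f + 0.1 * rng.standard_normal(x.size)    # noisy observations

def fit_train_mse(gamma, sigma2=0.1):
    """Kernel ridge fit alpha = (K + I/gamma)^{-1} y; return training MSE."""
    K = np.exp(-(x[:, None] - x[None, :]) ** 2 / sigma2)
    alpha = np.linalg.solve(K + np.eye(x.size) / gamma, y)
    return np.mean((K @ alpha - y) ** 2)

for gamma in (0.1, 10.0, 1e6):
    print(gamma, fit_train_mse(gamma))       # training error shrinks as gamma grows
```

Training error alone is misleading: a huge γ fits the noise (overfitting), while a tiny γ oversmooths; the useful γ lies in between and is selected by CV or Bayesian inference.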
III.4 Robustness
How to build good models in the case of non-Gaussian noise or outliers
Influence function
Breakdown point
How:
Down-weighting the influence of large residuals
Mean → trimmed mean → median
Robust CV, GCV, AIC, ...
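The mean → trimmed mean → median progression is easy to see numerically: a single gross outlier moves the mean arbitrarily far, but not its robust counterparts. A small sketch:

```python
import numpy as np

def trimmed_mean(x, prop=0.1):
    """Discard the prop fraction of smallest and largest values, average the rest."""
    x = np.sort(np.asarray(x, float))
    k = int(len(x) * prop)
    return x[k:len(x) - k].mean()

data = np.append(np.ones(99), 1000.0)   # 99 clean points + one gross outlier

print(data.mean())            # ~10.99: the mean breaks down on one outlier
print(trimmed_mean(data))     # 1.0: outlier trimmed away
print(np.median(data))        # 1.0: median unaffected
```

The breakdown point quantifies this: the mean breaks down at a single contaminated point, the 10%-trimmed mean tolerates up to 10% contamination, and the median tolerates up to 50%.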
IV. Unsupervised Learning
Extract important features from unlabeled data
Kernel PCA and related methods
Nyström approximation
From Dual to primal
Fixed size LS-SVM
IV.1 Kernel PCA
Principal component analysis → kernel-based PCA
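A minimal kernel PCA sketch: build the RBF kernel matrix, center it in feature space, and take its leading eigenvectors as nonlinear principal components (illustrative code, not the toolbox's interface):

```python
import numpy as np

def kernel_pca(X, n_components=2, sigma2=1.0):
    """Kernel PCA: eigendecompose the feature-space-centered kernel matrix."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / sigma2)
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    Kc = J @ K @ J                               # center phi(x_i) in feature space
    w, V = np.linalg.eigh(Kc)                    # eigenvalues in ascending order
    w = w[::-1][:n_components]
    V = V[:, ::-1][:, :n_components]
    return V * np.sqrt(np.maximum(w, 0.0))       # component scores of training points

Z = kernel_pca(np.random.default_rng(0).standard_normal((50, 3)))
print(Z.shape)                                   # (50, 2)
```

The eigenvalues order the components by explained variance in feature space, just as in linear PCA; the centering step is what makes the decomposition correspond to PCA on the (implicit) feature vectors rather than on the raw kernel matrix.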
IV.2 Kernel PCA (2)
Primal-dual LS-SVM style formulations
for kernel PCA, CCA, PLS
IV.2 Nyström approximation
Sampling of the integral equation;
approximating the feature map for a Mercer kernel:
$\int K(x, y)\,\phi_i(x)\,p(x)\,dx = \lambda_i \phi_i(y)$
$\frac{1}{N}\sum_{j=1}^{N} K(x_j, y)\,\phi_i(x_j) = \lambda_i \phi_i(y)$
$\frac{1}{n}\sum_{j=1}^{n} K(x_j, y)\,\tilde{\phi}_i(x_j) = \tilde{\lambda}_i \tilde{\phi}_i(y)$
$K(x, y) \approx \tilde{\phi}(x)^T \tilde{\phi}(y)$
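In practice the sampled eigenproblem gives an explicit approximate feature map built from a subset of m landmark points, so that inner products of the approximate features reproduce the kernel. A hedged sketch (generic Nyström, not the toolbox's implementation):

```python
import numpy as np

def rbf(X1, X2, sigma2=1.0):
    """RBF kernel matrix between two point sets."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma2)

def nystrom_features(X, m=20, sigma2=1.0, rng=np.random.default_rng(0)):
    """Approximate feature map phi_tilde with phi_tilde(x)^T phi_tilde(y) ~ K(x, y)."""
    sub = rng.choice(len(X), size=m, replace=False)   # m landmark points
    Xm = X[sub]
    w, V = np.linalg.eigh(rbf(Xm, Xm, sigma2))        # eigendecompose K_mm
    M = V / np.sqrt(np.maximum(w, 1e-12))             # K_mm^{-1/2} factor
    return rbf(X, Xm, sigma2) @ M                     # rows are phi_tilde(x_i)

X = np.random.default_rng(1).standard_normal((200, 2))
Phi = nystrom_features(X, m=50)
err = np.abs(rbf(X, X) - Phi @ Phi.T).max()
print(err)   # approximation error decreases as m grows
```

With m equal to the full sample size the approximation is exact; smaller m trades accuracy for an explicit, low-dimensional feature map, which is the entry point to fixed-size LS-SVM.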
IV.3 Fixed Size LS-SVM
Primal: $y_i = w^T \phi(x_i) + b + e_i$
Dual: $y_i = \sum_{j=1}^{n} \alpha_j K(x_i, x_j) + b + e_i$
Estimate in the primal or in the dual?
V. Time-series
Learn to predict future values given a sequence of
past values
NARX
Recurrent vs. feedforward
V.1 NARX
Reducible to static regression
CV and complexity criteria
Predicting in recurrent mode
Fixed size LS-SVM (sparse representation)
$\hat{y}_t = f(y_{t-1}, y_{t-2}, \ldots, y_{t-l})$
Recurrent prediction: $\ldots, y_t, \hat{y}_{t+1}, \hat{y}_{t+2}, \hat{y}_{t+3}, \hat{y}_{t+4}, \hat{y}_{t+5}, \ldots$ by iterating $f$
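Recurrent-mode prediction simply feeds the one-step model's own outputs back into its input window. A sketch of that loop, assuming `f` maps the lag vector (most recent value first) to the next value:

```python
import numpy as np

def recurrent_predict(f, history, steps, lag):
    """Iterate a one-step NARX model y_t = f(y_{t-1}, ..., y_{t-l}) on its own outputs."""
    window = list(history[-lag:])          # most recent lag observations
    preds = []
    for _ in range(steps):
        y_next = f(np.array(window[::-1]))  # feed lags, most recent first
        preds.append(y_next)
        window = window[1:] + [y_next]      # slide the window over the prediction
    return preds

# toy "model": persistence, y_t = y_{t-1}, with lag 2
f = lambda w: w[0]
print(recurrent_predict(f, [1.0, 2.0, 3.0], steps=3, lag=2))  # [3.0, 3.0, 3.0]
```

Because predictions are fed back, errors compound over the horizon; this is why training cost (one-step) and prediction cost (recurrent) can diverge, as raised on the next slide.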
V.1 NARX (2)
Santa Fe Time-series competition
V.2 Recurrent models?
How to learn recurrent dynamical models?
Training cost = Prediction cost?
Non-parametric model class?
Convex or non-convex?
Hyper-parameters?
$\hat{y}_t = f(\hat{y}_{t-1}, \hat{y}_{t-2}, \ldots, \hat{y}_{t-l})$
VI.0 References
J.A.K. Suykens, T. Van Gestel, J. De Brabanter, B. De Moor & J. Vandewalle (2002), Least Squares Support Vector Machines, World Scientific.
V. Vapnik (1995), The Nature of Statistical Learning Theory, Springer-Verlag.
B. Schölkopf & A. Smola (2002), Learning with Kernels, MIT Press.
T. Poggio & F. Girosi (1990), "Networks for approximation and learning", Proceedings of the IEEE, 78, 1481-1497.
N. Cristianini & J. Shawe-Taylor (2000), An Introduction to Support Vector Machines, Cambridge University Press.
VI. Conclusions
Non-linear Non-parametric learning as a
generalized methodology
Non-parametric Learning
Intuition & Formulations
Hyper-parameters
LS-SVMlab
Questions?