
# A Brief Introduction

Hongbo Deng
6 Feb 2007

Some of the slides are borrowed from Derek Hoiem & Jan Sochman.
## Outline

- Background
- Theory/Interpretations
- Can be used with many different classifiers
- Commonly used in many areas
- Simple to implement
- Not prone to overfitting
## A Brief History

- Bootstrapping: resampling for estimating a statistic
- Bagging: resampling for classifier design
- Boosting (Schapire 1989): resampling for classifier design
## Bootstrap Estimation

- Repeatedly draw n samples from D
- For each set of samples, estimate a statistic
- The bootstrap estimate is the mean of the individual estimates
- Used to estimate a statistic (parameter) and its variance
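A minimal Python sketch of this procedure (the function name and example data are illustrative, not from the slides):

```python
import numpy as np

def bootstrap_estimate(data, statistic, n_boot=1000, seed=0):
    """Bootstrap: repeatedly draw n samples with replacement from the data,
    compute the statistic on each resample, and return the mean of the
    individual estimates along with their variance."""
    rng = np.random.default_rng(seed)
    n = len(data)
    estimates = np.array([
        statistic(rng.choice(data, size=n, replace=True))
        for _ in range(n_boot)
    ])
    return estimates.mean(), estimates.var(ddof=1)

# Example: estimate the median and its variance from a small sample.
data = np.array([2.1, 3.5, 1.7, 4.2, 2.9, 3.1, 2.4])
est, var = bootstrap_estimate(data, np.median, n_boot=2000)
print(f"bootstrap median: {est:.3f}, variance: {var:.4f}")
```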
## Bagging - Aggregate Bootstrapping

- For i = 1 .. M:
  - Draw n* < n samples from D with replacement
  - Learn classifier Ci
- Final classifier is a vote of C1 .. CM
- Increases classifier stability / reduces variance

(Figure: bootstrap samples D1, D2, D3 drawn from dataset D)
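A sketch of bagging in the same vein, using decision trees as the base classifiers (labels assumed to be in {-1, +1}; the subsample fraction and tree depth are illustrative choices):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, M=25, frac=0.8, seed=0):
    """Train M classifiers C1 .. CM, each on n* < n samples
    drawn from the training set with replacement."""
    rng = np.random.default_rng(seed)
    n_star = int(frac * len(X))
    classifiers = []
    for _ in range(M):
        idx = rng.choice(len(X), size=n_star, replace=True)
        clf = DecisionTreeClassifier(max_depth=3).fit(X[idx], y[idx])
        classifiers.append(clf)
    return classifiers

def bagging_predict(classifiers, X):
    """Final classifier: majority vote of C1 .. CM (M odd avoids ties)."""
    votes = np.stack([clf.predict(X) for clf in classifiers])
    return np.sign(votes.sum(axis=0))
```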
## Boosting (Schapire 1989)

Consider creating three component classifiers for a two-category problem through boosting (a sketch in code follows below):

- Randomly select n1 < n samples from D without replacement to obtain D1; train weak learner C1
- Select n2 < n samples from D, with half of the samples misclassified by C1, to obtain D2; train weak learner C2
- Select all remaining samples from D on which C1 and C2 disagree; train weak learner C3
- The final classifier is a vote of the three weak learners

(Figure: subsets D1, D2, D3 drawn from dataset D)
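A rough sketch of the three-classifier construction, assuming labels in {-1, +1} and decision stumps as weak learners (all names are illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def schapire_boost(X, y, n1, n2, seed=0):
    rng = np.random.default_rng(seed)
    stump = lambda: DecisionTreeClassifier(max_depth=1)

    # D1: n1 < n samples drawn without replacement; train C1.
    i1 = rng.choice(len(X), size=n1, replace=False)
    C1 = stump().fit(X[i1], y[i1])

    # D2: half misclassified by C1, half classified correctly; train C2.
    wrong = np.flatnonzero(C1.predict(X) != y)
    right = np.flatnonzero(C1.predict(X) == y)
    i2 = np.concatenate([
        rng.choice(wrong, size=min(n2 // 2, len(wrong)), replace=False),
        rng.choice(right, size=min(n2 // 2, len(right)), replace=False),
    ])
    C2 = stump().fit(X[i2], y[i2])

    # D3: the samples on which C1 and C2 disagree; train C3.
    i3 = np.flatnonzero(C1.predict(X) != C2.predict(X))
    C3 = stump().fit(X[i3], y[i3])  # assumes the disagreement set is non-empty

    return C1, C2, C3

def vote(classifiers, X):
    """Final classifier: majority vote of C1, C2, C3."""
    return np.sign(sum(c.predict(X) for c in classifiers))
```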
## AdaBoost

- Instead of resampling, AdaBoost re-weights the training set
- Each training sample carries a weight that determines its probability of being selected for a training set
- AdaBoost constructs a strong classifier as a linear combination of simple weak classifiers
- The final classification is based on a weighted vote of the weak classifiers
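In symbols (the notation below follows the usual AdaBoost formulation), the strong classifier is a sign-thresholded weighted sum of T weak classifiers:

$$
f(x) = \sum_{t=1}^{T} \alpha_t h_t(x), \qquad H(x) = \operatorname{sign}\bigl(f(x)\bigr)
$$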
- h_t(x): weak or basis classifier (classifier = learner = hypothesis)
- H(x): strong or final classifier
- Weak classifier: less than 50% error over any distribution
- Strong classifier: a thresholded linear combination of the weak classifier outputs
Each training sample has a weight, which determines the probability of being selected for training the component classifier.
## Find the Weak Classifier
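In standard AdaBoost, round t selects the weak classifier with the lowest error weighted by the current distribution D_t over the training samples:

$$
h_t = \arg\min_{h_j \in \mathcal{H}} \epsilon_j, \qquad
\epsilon_j = \sum_{i=1}^{n} D_t(i)\,\mathbf{1}\!\left[y_i \neq h_j(x_i)\right]
$$

Boosting can proceed only while the chosen error satisfies ε_t < 1/2, i.e. while the weak learner does better than chance.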
## The Algorithm Core
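The core quantity is the weight α_t given to the weak classifier chosen at round t (standard AdaBoost):

$$
\alpha_t = \frac{1}{2}\,\ln\frac{1-\epsilon_t}{\epsilon_t}
$$

The smaller the weighted error ε_t, the larger the vote α_t; note α_t > 0 exactly when ε_t < 1/2.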
## Reweighting

After each round the sample weights are updated: a correctly classified sample (y · h(x) = 1) has its weight decreased, while a misclassified sample (y · h(x) = -1) has its weight increased. In this way, AdaBoost focuses on the informative or difficult examples.
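Concretely, the standard AdaBoost update rescales each weight by an exponential factor and renormalizes:

$$
D_{t+1}(i) = \frac{D_t(i)\,\exp\bigl(-\alpha_t\, y_i\, h_t(x_i)\bigr)}{Z_t}
$$

where Z_t is the normalizer that makes D_{t+1} a distribution. Since α_t > 0, the factor is less than 1 when y_i h_t(x_i) = 1 (correct) and greater than 1 when y_i h_t(x_i) = -1 (mistake).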
## Algorithm Recapitulation
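The whole loop, recapitulated as a minimal Python sketch using decision stumps as weak classifiers (labels assumed in {-1, +1}; the names and the stopping rule are illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, T=50):
    """Minimal AdaBoost sketch; labels y must be in {-1, +1}."""
    n = len(X)
    D = np.full(n, 1.0 / n)              # start with uniform weights
    stumps, alphas = [], []
    for t in range(T):
        # Find the weak classifier minimizing the weighted error.
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=D)
        pred = h.predict(X)
        eps = D[pred != y].sum()
        if eps >= 0.5:                   # weak learner no better than chance
            break
        eps = max(eps, 1e-10)            # guard against log of zero
        alpha = 0.5 * np.log((1 - eps) / eps)
        # Reweight: shrink correct samples, grow the mistakes, renormalize.
        D *= np.exp(-alpha * y * pred)
        D /= D.sum()
        stumps.append(h)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    """Strong classifier: sign of the weighted vote of the weak classifiers."""
    f = sum(a * h.predict(X) for a, h in zip(alphas, stumps))
    return np.sign(f)
```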
## Pros and Cons of AdaBoost

Advantages:

- Very simple to implement
- Performs feature selection, resulting in a relatively simple classifier
- Fairly good generalization

Disadvantages:

- Suboptimal solution
- Sensitive to noisy data and outliers
## References

- Duda, Hart, et al. Pattern Classification
- Friedman, Hastie, et al. Additive Logistic Regression: A Statistical View of Boosting
- Jin, Liu, et al. (CMU) A New Boosting Algorithm Using Input-Dependent Regularizer
- Rätsch, Warmuth. Efficient Margin Maximization with Boosting
- Schapire, Freund, et al. Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods
- Zhang, Li, et al. Multi-view Face Detection with FloatBoost
## Appendix

- Bound on training error
## Bound on Training Error (Schapire)
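The bound states that the training error of the final classifier decays exponentially in the number of rounds, as long as each weak classifier keeps an edge γ_t = 1/2 - ε_t over random guessing:

$$
\frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\!\left[H(x_i) \neq y_i\right]
\;\le\; \prod_{t=1}^{T} Z_t
\;=\; \prod_{t=1}^{T} 2\sqrt{\epsilon_t (1 - \epsilon_t)}
\;\le\; \exp\!\Bigl(-2 \sum_{t=1}^{T} \gamma_t^2\Bigr)
$$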
## (Friedman's wording)
## (Freund and Schapire's wording)
## Weighted Predictions (Real AdaBoost)
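In Real AdaBoost, the weak learner outputs a real-valued confidence rather than a hard ±1 label; in Friedman's formulation the contribution fitted at round m is the half log-ratio of the weighted class probabilities:

$$
f_m(x) = \frac{1}{2}\,\log\frac{P_w(y = 1 \mid x)}{P_w(y = -1 \mid x)}
$$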
## Friedman's LogitBoost and GentleBoost

LogitBoost:

- Fits an additive logistic regression model via Newton-like steps
- Requires care to avoid numerical problems

GentleBoost:

- Update is f_m(x) = P_w(y = 1 | x) - P_w(y = -1 | x) instead of the half log-ratio above
- Bounded in [-1, 1]