Review of Lecture 2

Is learning feasible? Yes, in a probabilistic sense:

$P[\,|E_{\text{in}}(h) - E_{\text{out}}(h)| > \epsilon\,] \le 2e^{-2\epsilon^2 N}$

[Figure: the in-sample and out-of-sample errors $E_{\text{in}}(h)$ and $E_{\text{out}}(h)$ of a hypothesis $h$]

Since $g$ has to be one of $h_1, h_2, \ldots, h_M$, we conclude that if

$|E_{\text{in}}(g) - E_{\text{out}}(g)| > \epsilon$

then

$|E_{\text{in}}(h_1) - E_{\text{out}}(h_1)| > \epsilon$ or $|E_{\text{in}}(h_2) - E_{\text{out}}(h_2)| > \epsilon$ or $\cdots$ or $|E_{\text{in}}(h_M) - E_{\text{out}}(h_M)| > \epsilon$.

By the union bound, this gives us an added $M$ factor:

$P[\,|E_{\text{in}}(g) - E_{\text{out}}(g)| > \epsilon\,] \le 2M e^{-2\epsilon^2 N}$
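To get a feel for the numbers, here is a minimal sketch (not from the lecture) that evaluates this union bound for illustrative values of $M$, $N$, and $\epsilon$:

```python
import math

# Union-bound form of the Hoeffding inequality from the review:
# P[|Ein(g) - Eout(g)| > eps] <= 2 * M * exp(-2 * eps^2 * N)
def hoeffding_union_bound(M, N, eps):
    return 2 * M * math.exp(-2 * eps ** 2 * N)

# Illustrative values (not from the lecture): with M = 100 hypotheses,
# N = 1000 examples, and tolerance eps = 0.05, the bound is about 1.35,
# which exceeds 1 and is therefore vacuous; N = 5000 brings it down to ~3e-9.
print(hoeffding_union_bound(M=100, N=1000, eps=0.05))
print(hoeffding_union_bound(M=100, N=5000, eps=0.05))
```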

Learning From Data
Yaser S. Abu-Mostafa
California Institute of Technology

Lecture 3: Linear Models I

Sponsored by Caltech's Provost Office, E&AS Division, and IST
Tuesday, April 10, 2012

Outline

Input representation
Linear Classification
Linear Regression
Nonlinear Transformation

A real data set

[Figure: examples of handwritten digits from the data set]

Input representation

`raw' input $x = (x_0, x_1, x_2, \ldots, x_{256})$
linear model: $(w_0, w_1, w_2, \ldots, w_{256})$

Features: extract useful information, e.g., intensity and symmetry
$x = (x_0, x_1, x_2)$
linear model: $(w_0, w_1, w_2)$

Illustration of features

$x = (x_0, x_1, x_2)$
$x_1$: intensity
$x_2$: symmetry

[Figure: sample digit images and a plot of the data in the intensity-symmetry plane]
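As a concrete illustration, the two features could be computed along these lines; this is a sketch under assumed preprocessing (a 16x16 grayscale image with values in [0, 1]), not the course's exact code:

```python
import numpy as np

def intensity(img):
    # x1: average pixel intensity
    return img.mean()

def symmetry(img):
    # x2: (negative) asymmetry between the image and its left-right mirror;
    # a perfectly symmetric digit scores 0, asymmetric digits score lower
    return -np.abs(img - np.fliplr(img)).mean()

img = np.random.rand(16, 16)                        # stand-in for a digit image
x = np.array([1.0, intensity(img), symmetry(img)])  # x = (x0, x1, x2) with x0 = 1
```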

Evolution of $E_{\text{in}}$ and $E_{\text{out}}$

[Figure, left: $E_{\text{in}}$ and $E_{\text{out}}$ (from 1% to 50%) versus iteration number (0 to 1000)]
[Figure, right: what PLA does: the final perceptron boundary]

The `pocket' algorithm

PLA: [Figure: $E_{\text{in}}$ and $E_{\text{out}}$ (1% to 50%) versus iteration (0 to 1000)]

Pocket: [Figure: $E_{\text{in}}$ and $E_{\text{out}}$ (1% to 50%) versus iteration (0 to 1000)]
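A minimal sketch of the pocket idea (my notation, not the lecture's code): run PLA updates as usual, but keep in your `pocket' the weight vector with the best in-sample error seen so far:

```python
import numpy as np

def ein(w, X, y):
    # fraction of misclassified points; X rows are inputs with x0 = 1, y in {-1, +1}
    return np.mean(np.sign(X @ w) != y)

def pocket(X, y, iterations=1000, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    best_w, best_e = w.copy(), ein(w, X, y)
    for _ in range(iterations):
        miscl = np.flatnonzero(np.sign(X @ w) != y)
        if miscl.size == 0:
            break                       # data separated: plain PLA has converged
        n = rng.choice(miscl)           # pick a misclassified point at random
        w = w + y[n] * X[n]             # standard PLA update
        e = ein(w, X, y)
        if e < best_e:                  # pocket step: hold on to the best w so far
            best_w, best_e = w.copy(), e
    return best_w
```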

Classification boundary - PLA versus Pocket

PLA: [Figure: final PLA boundary in the feature plane]

Pocket: [Figure: final pocket boundary in the feature plane]

Outline

Input representation
Linear Classification
Linear Regression   (regression: real-valued output)
Nonlinear Transformation

Credit again

Classification: Credit approval (yes/no)
Regression: Credit line (dollar amount)

Input:
$x$ = (age: 23 years, annual salary: $30,000, years in residence: 1 year, years in job: 1 year, current debt: $15,000)

Linear regression output:

$h(x) = \sum_{i=0}^{d} w_i x_i = w^T x$
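For instance, the output is a single dot product; a tiny sketch with the slide's feature values and made-up placeholder weights:

```python
import numpy as np

# The credit input from the slide as a vector (x0 = 1 absorbs the bias term);
# the weights are placeholders, not values from the lecture.
x = np.array([1.0, 23.0, 30000.0, 1.0, 1.0, 15000.0])
w = np.array([0.1, 2.0, 0.05, 10.0, 10.0, -0.04])
credit_line = w @ x     # h(x) = w^T x, a dollar amount
```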

The data set

Credit officers decide on credit lines:

$(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$

$y_n \in \mathbb{R}$ is the credit line for customer $x_n$.

Linear regression tries to replicate that.

How to measure the error

How well does $h(x) = w^T x$ approximate $f(x)$?

In linear regression, we use squared error: $(h(x) - f(x))^2$

in-sample error: $E_{\text{in}}(h) = \frac{1}{N} \sum_{n=1}^{N} (h(x_n) - y_n)^2$
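In code, this is a one-liner; a sketch using the matrix $X$ and vector $y$ defined on the next slides:

```python
import numpy as np

# In-sample squared error: Ein(w) = (1/N) * sum_n (w^T x_n - y_n)^2
def squared_ein(w, X, y):
    return np.mean((X @ w - y) ** 2)
```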

Illustration of linear regression

[Figure, left: data points $(x, y)$ and the regression line for a one-dimensional input $x$]
[Figure, right: data points and the regression plane for a two-dimensional input $(x_1, x_2)$]

The expression for $E_{\text{in}}$

$E_{\text{in}}(w) = \frac{1}{N} \sum_{n=1}^{N} (w^T x_n - y_n)^2 = \frac{1}{N} \|Xw - y\|^2$

where

$X = \begin{bmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_N^T \end{bmatrix}, \qquad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}$

Minimizing $E_{\text{in}}$

$E_{\text{in}}(w) = \frac{1}{N} \|Xw - y\|^2$

$\nabla E_{\text{in}}(w) = \frac{2}{N} X^T (Xw - y) = 0$

$X^T X w = X^T y$

$w = X^\dagger y$, where $X^\dagger = (X^T X)^{-1} X^T$

$X^\dagger$ is the `pseudo-inverse' of $X$

The pseudo-inverse

$X^\dagger = (X^T X)^{-1} X^T$

Dimensions: $X$ is $N \times (d+1)$, so $X^T X$ and $(X^T X)^{-1}$ are $(d+1) \times (d+1)$, and $X^\dagger$ is $(d+1) \times N$.

The linear regression algorithm

1: Construct the matrix $X$ and the vector $y$ from the data set $(x_1, y_1), \ldots, (x_N, y_N)$ as follows:

$X = \begin{bmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_N^T \end{bmatrix}$ (input data matrix), $\qquad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}$ (target vector)

2: Compute the pseudo-inverse $X^\dagger = (X^T X)^{-1} X^T$.

3: Return $w = X^\dagger y$.
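Here are the three steps as a NumPy sketch; the data is a stand-in, just to show the shapes. In practice np.linalg.pinv is numerically more robust than the explicit inverse used below:

```python
import numpy as np

def linear_regression(X, y):
    # Step 2: pseudo-inverse X_dagger = (X^T X)^(-1) X^T
    X_dagger = np.linalg.inv(X.T @ X) @ X.T
    # Step 3: return w = X_dagger y
    return X_dagger @ y

# Step 1: build X (rows x_n^T, with x0 = 1) and y from the data set.
inputs = np.random.rand(100, 2)
X = np.column_stack([np.ones(len(inputs)), inputs])   # input data matrix
y = np.random.rand(100)                               # target vector
w = linear_regression(X, y)
```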

Linear regression for classification

Linear regression learns a real-valued function $y = f(x) \in \mathbb{R}$

Binary-valued functions are also real-valued! $\pm 1 \in \mathbb{R}$

Use linear regression to get $w$ where $w^T x_n \approx y_n = \pm 1$

In this case, $\text{sign}(w^T x_n)$ is likely to agree with $y_n = \pm 1$

Good initial weights for classification
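A sketch of this use: solve the regression on the $\pm 1$ labels, classify with the sign, and optionally hand $w$ to PLA or pocket as a starting point:

```python
import numpy as np

def regression_for_classification(X, y):
    # y holds -1/+1 labels, treated as real-valued regression targets
    w = np.linalg.pinv(X) @ y
    predictions = np.sign(X @ w)   # likely to agree with most labels
    return w, predictions          # w also serves as initial weights for PLA/pocket
```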

Linear regression boundary

[Figure: the classification boundary obtained by linear regression, plotted over Average Intensity (horizontal axis) and Symmetry (vertical axis)]

Outline

Input representation
Linear Classification
Linear Regression
Nonlinear Transformation

Linear is limited

Data: [Figure: a data set in the plane that is not linearly separable]

Hypothesis: [Figure: a linear boundary fit to the data]

Another example

Credit line is affected by `years in residence'
but not in a linear way!

Nonlinear $[[x_i < 1]]$ and $[[x_i > 5]]$ are better.

Can we do that with linear models?
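A sketch of such indicator features for the `years in residence' column; the thresholds 1 and 5 are the ones on the slide:

```python
import numpy as np

# Indicator features [[xi < 1]] and [[xi > 5]]: each evaluates to 1.0
# when the condition holds, else 0.0.
def residence_features(years):
    return np.column_stack([(years < 1).astype(float),
                            (years > 5).astype(float)])

# e.g. residence_features(np.array([0.5, 3.0, 7.0]))
# -> [[1., 0.], [0., 0.], [0., 1.]]
```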

Linear in what?

Linear regression implements

$\sum_{i=0}^{d} w_i x_i$

Linear classification implements

$\text{sign}\left(\sum_{i=0}^{d} w_i x_i\right)$

Algorithms work because of linearity in the weights

Transform the data nonlinearly

$(x_1, x_2) \longrightarrow (x_1^2, x_2^2)$

[Figure: the data in the original $(x_1, x_2)$ space (left) and in the transformed $(x_1^2, x_2^2)$ space (right), where a linear boundary separates it]
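A self-contained sketch of this transform followed by a linear fit in the transformed space; the circular stand-in data is my own, chosen so that it is not linearly separable in $x$:

```python
import numpy as np

# Nonlinear transform from the slide: (x1, x2) -> (x1^2, x2^2).
# X columns are assumed to be [1, x1, x2]; Z columns become [1, x1^2, x2^2].
def transform(X):
    return np.column_stack([X[:, 0], X[:, 1] ** 2, X[:, 2] ** 2])

# Stand-in data: points inside a circle of radius 0.75 labeled +1, outside -1.
rng = np.random.default_rng(0)
pts = rng.uniform(-1, 1, size=(200, 2))
X = np.column_stack([np.ones(200), pts])
y = np.where(pts[:, 0] ** 2 + pts[:, 1] ** 2 < 0.75 ** 2, 1.0, -1.0)

Z = transform(X)                 # work in the transformed space
w_z = np.linalg.pinv(Z) @ y      # the model stays linear in the weights
predictions = np.sign(Z @ w_z)   # but the boundary is nonlinear in x
```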
