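Review of Lecture 2
Is Learning feasible?
Yes, in a probabilistic sense.
$$P\bigl[\,|E_{\text{in}}(h) - E_{\text{out}}(h)| > \epsilon\,\bigr] \le 2e^{-2\epsilon^2 N}$$
Since $g$ has to be one of $h_1, h_2, \dots, h_M$, we conclude that
If: $|E_{\text{in}}(g) - E_{\text{out}}(g)| > \epsilon$
Then: $|E_{\text{in}}(h_1) - E_{\text{out}}(h_1)| > \epsilon$ or $|E_{\text{in}}(h_2) - E_{\text{out}}(h_2)| > \epsilon$ or $\dots$ or $|E_{\text{in}}(h_M) - E_{\text{out}}(h_M)| > \epsilon$
This gives us an added $M$ factor.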
Lecture 3:
Linear Models I
Outline
Input representation
Linear Regression
Nonlinear Transformation
Input representation
`raw' input $\mathbf{x} = (x_0, x_1, x_2, \dots, x_{256})$
linear model: $(w_0, w_1, w_2, \dots, w_{256})$
Features:
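The two features named in this lecture are intensity and symmetry. The slides do not spell out how they are computed, so the following is only a sketch under assumptions: the 256 raw inputs are taken to be a 16x16 grayscale image, intensity is taken to be the mean pixel value, and symmetry is taken to be the negative mean absolute difference between the image and its left-right mirror. The function names are hypothetical.

```python
import numpy as np

def intensity(img):
    """Assumed definition: average pixel value of the image."""
    return float(img.mean())

def symmetry(img):
    """Assumed definition: negative mean absolute difference between the
    image and its left-right mirror. A perfectly symmetric image scores 0."""
    return float(-np.abs(img - img[:, ::-1]).mean())

def features(raw):
    """Map 256 raw pixel values to x = (x0, x1, x2) with x0 = 1."""
    img = np.asarray(raw, dtype=float).reshape(16, 16)
    return np.array([1.0, intensity(img), symmetry(img)])

# Hypothetical input: a two-pixel-wide vertical bar centered in the image.
img = np.zeros((16, 16))
img[:, 7:9] = 1.0
x = features(img.ravel())
print(x)
```

The linear model then needs only three weights $(w_0, w_1, w_2)$ instead of 257.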
Illustration of features
$\mathbf{x} = (x_0, x_1, x_2)$
$x_1$: intensity
$x_2$: symmetry
[Figure: sample images and a plot of the two features.]
Learning From Data - Lecture 3
What PLA does
[Figure: $E_{\text{in}}$ and $E_{\text{out}}$ (1% to 50%) against iteration (0 to 1000), and the final perceptron boundary.]
Pocket:
[Figure: two panels of $E_{\text{in}}$ and $E_{\text{out}}$ (1% to 50%) against iteration (0 to 1000).]
Classification boundary - PLA versus Pocket
PLA:
Pocket:
[Figure: one boundary plot for each algorithm.]
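The comparison between PLA and Pocket can be made concrete with a short sketch. It assumes the usual form of the pocket idea: run ordinary PLA updates, but keep the weight vector with the lowest in-sample error seen so far. The function name, the random choice of misclassified point, and the toy data are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def pocket(X, y, max_iters=1000, seed=0):
    """PLA updates, keeping ('pocketing') the weights with the lowest E_in so far.
    X: (N, d+1) array with x0 = 1 in the first column; y: (N,) array of +1/-1."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    best_w, best_err = w.copy(), np.mean(np.sign(X @ w) != y)
    for _ in range(max_iters):
        wrong = np.flatnonzero(np.sign(X @ w) != y)
        if wrong.size == 0:
            break  # nothing misclassified, plain PLA would stop here too
        n = rng.choice(wrong)          # pick one misclassified point
        w = w + y[n] * X[n]            # standard PLA update
        err = np.mean(np.sign(X @ w) != y)
        if err < best_err:             # keep the best weights seen so far
            best_w, best_err = w.copy(), err
    return best_w, best_err

# Hypothetical toy data: a linear target with a few labels flipped.
rng = np.random.default_rng(1)
pts = rng.uniform(-1, 1, size=(100, 2))
X = np.column_stack([np.ones(100), pts])
y = np.where(pts[:, 0] + pts[:, 1] > 0, 1.0, -1.0)
y[:5] *= -1
w, err = pocket(X, y)
print(w, err)
```

Plain PLA returns whatever weights it holds at the last iteration, which on non-separable data can be much worse than the best weights it passed through. The pocket variant pays for one extra evaluation of $E_{\text{in}}$ per update.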
Outline
Input representation
Linear Regression
Nonlinear Transformation
regression $\equiv$ real-valued output
Credit again
Input: $\mathbf{x}$ =
    age                  23 years
    annual salary        $30,000
    years in residence   1 year
    years in job         1 year
    current debt         $15,000
$$h(\mathbf{x}) = \sum_{i=0}^{d} w_i x_i = \mathbf{w}^{\mathsf T}\mathbf{x}$$
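As a small worked instance of $h(\mathbf{x}) = \mathbf{w}^{\mathsf T}\mathbf{x}$, the applicant above can be turned into a vector and scored. The weights here are hypothetical, chosen only to show the arithmetic; they are not learned from any data.

```python
import numpy as np

# Applicant from the slide, with x0 = 1 prepended for the bias weight w0.
# Order: x0, age, annual salary, years in residence, years in job, current debt.
x = np.array([1.0, 23.0, 30000.0, 1.0, 1.0, 15000.0])

# Hypothetical weights, for illustration only.
w = np.array([1000.0, 10.0, 0.2, 50.0, 100.0, -0.1])

h = w @ x  # sum over i of w_i * x_i
print(h)   # about 5880 (1000 + 230 + 6000 + 50 + 100 - 1500)
```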
in-sample error:
$$E_{\text{in}}(h) = \frac{1}{N}\sum_{n=1}^{N}\bigl(h(\mathbf{x}_n) - y_n\bigr)^2$$
Illustration of linear regression
[Figure: a fit with a single input $x$ and a fit with two inputs $x_1$, $x_2$.]
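$$E_{\text{in}}(\mathbf{w}) = \frac{1}{N}\sum_{n=1}^{N}\bigl(\mathbf{w}^{\mathsf T}\mathbf{x}_n - y_n\bigr)^2 = \frac{1}{N}\lVert X\mathbf{w} - \mathbf{y}\rVert^2$$
where
$$X = \begin{bmatrix} \mathbf{x}_1^{\mathsf T} \\ \mathbf{x}_2^{\mathsf T} \\ \vdots \\ \mathbf{x}_N^{\mathsf T} \end{bmatrix}, \qquad \mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}$$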
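The equality between the sum over data points and the matrix-norm form can be checked numerically. The sketch below uses random data purely as a stand-in; the sizes `N` and `d` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 50, 3
# Rows of X are x_n^T, with x0 = 1 in the first column.
X = np.column_stack([np.ones(N), rng.normal(size=(N, d))])
y = rng.normal(size=N)
w = rng.normal(size=d + 1)

# Sum form: (1/N) * sum_n (w^T x_n - y_n)^2
e_sum = sum((w @ X[n] - y[n]) ** 2 for n in range(N)) / N

# Matrix form: (1/N) * ||Xw - y||^2
e_mat = np.linalg.norm(X @ w - y) ** 2 / N

print(e_sum, e_mat)
```

The two values agree up to floating-point rounding, because the $n$-th entry of $X\mathbf{w} - \mathbf{y}$ is exactly $\mathbf{w}^{\mathsf T}\mathbf{x}_n - y_n$.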
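Minimizing $E_{\text{in}}$
$$E_{\text{in}}(\mathbf{w}) = \frac{1}{N}\lVert X\mathbf{w} - \mathbf{y}\rVert^2$$
$$\nabla E_{\text{in}}(\mathbf{w}) = \frac{2}{N}X^{\mathsf T}(X\mathbf{w} - \mathbf{y}) = \mathbf{0}$$
$$X^{\mathsf T}X\mathbf{w} = X^{\mathsf T}\mathbf{y}$$
$\mathbf{w} = X^{\dagger}\mathbf{y}$ where $X^{\dagger} = (X^{\mathsf T}X)^{-1}X^{\mathsf T}$
$X^{\dagger}$ is the `pseudo-inverse' of $X$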
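The gradient used when minimizing $E_{\text{in}}$ follows from expanding the squared norm. The intermediate step is written out here as a sketch, and the final solution assumes that $X^{\mathsf T}X$ is invertible.

```latex
\begin{aligned}
E_{\text{in}}(\mathbf{w})
  &= \frac{1}{N}(X\mathbf{w}-\mathbf{y})^{\mathsf T}(X\mathbf{w}-\mathbf{y})
   = \frac{1}{N}\left(\mathbf{w}^{\mathsf T}X^{\mathsf T}X\mathbf{w}
     - 2\,\mathbf{w}^{\mathsf T}X^{\mathsf T}\mathbf{y}
     + \mathbf{y}^{\mathsf T}\mathbf{y}\right) \\
\nabla E_{\text{in}}(\mathbf{w})
  &= \frac{1}{N}\left(2X^{\mathsf T}X\mathbf{w} - 2X^{\mathsf T}\mathbf{y}\right)
   = \frac{2}{N}X^{\mathsf T}(X\mathbf{w}-\mathbf{y})
\end{aligned}
```

Setting the gradient to zero gives $X^{\mathsf T}X\mathbf{w} = X^{\mathsf T}\mathbf{y}$, and multiplying both sides by $(X^{\mathsf T}X)^{-1}$ gives $\mathbf{w} = (X^{\mathsf T}X)^{-1}X^{\mathsf T}\mathbf{y}$.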
The pseudo-inverse
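$$X^{\dagger} = \Bigl(\underbrace{X^{\mathsf T}}_{(d+1)\times N}\;\underbrace{X}_{N\times(d+1)}\Bigr)^{-1}\underbrace{X^{\mathsf T}}_{(d+1)\times N}$$
Here $X^{\mathsf T}X$ is $(d+1)\times(d+1)$, so $X^{\dagger}$ is $(d+1)\times N$.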
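The dimensions of each factor can be checked directly. This sketch uses random data with arbitrary sizes and assumes $X$ has full column rank, in which case the explicit formula agrees with NumPy's `np.linalg.pinv` (which is computed via the SVD rather than by inverting $X^{\mathsf T}X$).

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 20, 2
X = np.column_stack([np.ones(N), rng.normal(size=(N, d))])  # N x (d+1)

XtX = X.T @ X                         # (d+1) x (d+1)
X_dagger = np.linalg.inv(XtX) @ X.T   # (d+1) x N

print(X.shape, XtX.shape, X_dagger.shape)
print(np.allclose(X_dagger, np.linalg.pinv(X)))
```

Because $d+1$ is typically much smaller than $N$, the matrix that actually gets inverted is small even when the data set is large.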
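1: Construct the matrix $X$ and the target vector $\mathbf{y}$ from the data set:
$$X = \begin{bmatrix} \mathbf{x}_1^{\mathsf T} \\ \mathbf{x}_2^{\mathsf T} \\ \vdots \\ \mathbf{x}_N^{\mathsf T} \end{bmatrix}, \qquad \mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}$$
2: Compute the pseudo-inverse $X^{\dagger} = (X^{\mathsf T}X)^{-1}X^{\mathsf T}$.
3: Return $\mathbf{w} = X^{\dagger}\mathbf{y}$.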
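A minimal sketch of the whole procedure, from data to weights via $\mathbf{w} = X^{\dagger}\mathbf{y}$, might look like the following. The function name and the synthetic data are hypothetical, and `np.linalg.pinv` stands in for the explicit formula $(X^{\mathsf T}X)^{-1}X^{\mathsf T}$ (the two agree when $X^{\mathsf T}X$ is invertible).

```python
import numpy as np

def linear_regression(inputs, targets):
    """Return w = X^dagger y.
    inputs: (N, d) array of raw inputs; targets: (N,) array of real values."""
    inputs = np.asarray(inputs, dtype=float)
    # Build X with x0 = 1 in the first column, and the target vector y.
    X = np.column_stack([np.ones(len(inputs)), inputs])
    y = np.asarray(targets, dtype=float)
    # Compute the pseudo-inverse of X.
    X_dagger = np.linalg.pinv(X)
    # Return the weight vector.
    return X_dagger @ y

# Hypothetical noiseless data generated from y = 1 + 2*x1 - 3*x2.
rng = np.random.default_rng(0)
inputs = rng.uniform(-1, 1, size=(30, 2))
targets = 1 + 2 * inputs[:, 0] - 3 * inputs[:, 1]
w = linear_regression(inputs, targets)
print(np.round(w, 6))  # close to (1, 2, -3)
```

With noiseless data the weights used to generate the targets are recovered up to rounding. With noisy targets the result is the least-squares fit instead.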
[Figure: data plotted with Average Intensity and Symmetry as axes.]
Outline
Input representation
Linear Regression
Nonlinear Transformation
Linear is limited
[Figure: a two-dimensional data set, with a panel labeled `Hypothesis:'.]
Another example
Credit line is affected by `years in residence'
but not in a linear way!
Nonlinear $[[x_i < 1]]$ and $[[x_i > 5]]$ are better.
Can we do that with linear models?
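One way to see that a linear model can use the indicators $[[x_i < 1]]$ and $[[x_i > 5]]$ is to compute them first and then fit the weights as usual. The sketch below does this for `years in residence' alone. The credit-line figures are hypothetical, made up to follow a step pattern, and the function name is an assumption.

```python
import numpy as np

def threshold_features(years):
    """Map years in residence x_i to z = (1, [[x_i < 1]], [[x_i > 5]])."""
    years = np.asarray(years, dtype=float)
    return np.column_stack([np.ones_like(years), years < 1, years > 5])

years = np.array([0.5, 2.0, 3.0, 6.0, 10.0, 0.2])
# Hypothetical credit lines: lower for very new residents, higher for long-term ones.
credit = np.array([1000.0, 3000.0, 3000.0, 5000.0, 5000.0, 1000.0])

Z = threshold_features(years)
w = np.linalg.pinv(Z) @ credit   # ordinary linear regression on the new features
print(Z)
print(np.round(w, 2))            # close to (3000, -2000, 2000)
```

The model is nonlinear in the original input but still linear in the weights, so the same one-step solution applies.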
Linear in what?
Linear regression: $\sum_{i=0}^{d} w_i x_i$
Linear classification: $\operatorname{sign}\Bigl(\sum_{i=0}^{d} w_i x_i\Bigr)$
Transform the data nonlinearly
$$(x_1, x_2) \longrightarrow (x_1^2, x_2^2)$$
[Figure: the data before and after the transformation.]
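The effect of the transform $(x_1, x_2) \to (x_1^2, x_2^2)$ can be illustrated on synthetic data. This is a sketch under assumptions: the circular target is hypothetical, and the weights are fit with the pseudo-inverse and then thresholded with sign, which is one possible choice of linear classifier rather than the only one.

```python
import numpy as np

def phi(points):
    """Map (x1, x2) to (1, x1^2, x2^2), with the constant coordinate added."""
    return np.column_stack([np.ones(len(points)),
                            points[:, 0] ** 2,
                            points[:, 1] ** 2])

def fit_and_score(X, y):
    """Fit w with the pseudo-inverse; return the fraction where sign(Xw) == y."""
    w = np.linalg.pinv(X) @ y
    return np.mean(np.sign(X @ w) == y)

rng = np.random.default_rng(0)
pts = rng.uniform(-1, 1, size=(1000, 2))
# Hypothetical target: +1 inside a circle of squared radius 0.6, -1 outside.
y = np.where(pts[:, 0] ** 2 + pts[:, 1] ** 2 < 0.6, 1.0, -1.0)

acc_raw = fit_and_score(np.column_stack([np.ones(len(pts)), pts]), y)
acc_phi = fit_and_score(phi(pts), y)
print(acc_raw, acc_phi)
```

In the original coordinates no straight line separates the inside of the circle from the outside, so the linear fit does little better than guessing. In the transformed coordinates the target boundary $x_1^2 + x_2^2 = 0.6$ becomes a straight line, and the same linear machinery classifies most points correctly.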