Models
A model is a representation of a given phenomenon.
Mathematical model: a mathematical representation of a phenomenon.
Most often a model describes the relationships between two or more variables.
In general, there are two classes of models:
Deterministic models
Probabilistic models
Deterministic models
They express an exact relationship between variables.
In theory, the prediction error is zero.
Example: Newton's second law of motion:
$F = m \cdot a$
Probabilistic models
Deterministic component
Random component
The prediction error is nonzero.
The random component may be due to objective factors that are not included in the model.
Example: Sales volume = 10 * Advertising expenditure + Random component
Other models
$Y = f(X_1, \ldots, X_n) + \varepsilon$
$Y$: the dependent variable (endogenous variable)
$X_1, \ldots, X_n$: the independent variables (exogenous/explanatory variables)
$\varepsilon$: the residual variable
[Formula fragment: the derivative $dC/dV$ compared with 1]
Functional form
The linearity assumption is not as restrictive as it seems. It refers to the way the parameters enter the equation, not necessarily to the relationship between the variables x and y.
In general, models can be linearized:
$y = a + bx$
$y = a + bz$, with $z = e^x$
$y = a + br$, with $r = 1/x$
$y = a + bq$, with $q = \ln(x)$
$y = \alpha x^{\beta}$, which gives $\ln(y) = \ln(\alpha) + \beta \ln(x)$
Counterexample: $y = \alpha + \dfrac{1}{\beta + x}$ cannot be transformed into a linear model.
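As an illustration of linearization, the sketch below fits a power model $y = \alpha x^{\beta}$ by regressing $\ln(y)$ on $\ln(x)$. The data are hypothetical and noise-free, generated from $\alpha = 2$ and $\beta = 1.5$; the helper name `ols` is an assumption made for this example, not something taken from the slides.

```python
import math

def ols(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

# Hypothetical noise-free data generated from y = 2 * x**1.5
x = [1.0, 2.0, 4.0, 8.0]
y = [2.0 * xi ** 1.5 for xi in x]

# Regress ln(y) on ln(x): the model is linear in ln(alpha) and beta
log_alpha, beta = ols([math.log(v) for v in x], [math.log(v) for v in y])
alpha = math.exp(log_alpha)
print(round(alpha, 6), round(beta, 6))
```

The model is nonlinear in x but linear in the parameters after the log transform, which is the sense of linearity the slide describes.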
[Figure: graphs of the functional forms $a + b e^x$, $a + b/x$, $a + bx$ and $a + b \ln x$; vertical axis from -400 to 1000]
[Figure: scatter plot of consumption (consum, axis 0 to 800) against income (venit, axis 200 to 1000)]
Estimators)
The normality and homoscedasticity assumptions
[Figure: the error density f(e) drawn at the values $X_1$ and $X_2$ along the regression line; axes X and Y]
Regression models
Simple: linear or nonlinear
Multiple (2+ explanatory variables): linear or nonlinear
Nonlinear relationship
Practical example
Is there a relationship between the floor area of apartments in the central area and their rental price?
We randomly select 25 such apartments and record the values of the two variables: X, the floor area (m2), and Y, the monthly rent (RON).
Scatter plot
The graph of the points with coordinates $(X_i, Y_i)$, $i = 1, \ldots, n$.
The simple linear regression model
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
$Y_i$: the dependent (response) variable
$X_i$: the independent (explanatory) variable
$\beta_1$: the slope of the regression line
$\varepsilon_i$: the disturbance variable
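The regression line
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
[Figure: an observed value $Y_i$, the line $\mu_{Y|X} = E(Y) = \beta_0 + \beta_1 X_i$, and the error $\varepsilon_i$ between the observed value and the line; horizontal axis X]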
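$\hat{Y}_i = \hat\beta_0 + \hat\beta_1 X_i$
[Figure: the fitted line through the points $(X_i, Y_i)$]
$\hat\beta_1$ = the estimator of the slope $\beta_1$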
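The least squares criterion minimizes $\sum_{i=1}^{n} \varepsilon_i^2$.
Recall that $\varepsilon_i = Y_i - \beta_0 - \beta_1 x_i$,
so the quantity to minimize is $\sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 x_i)^2$.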
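Graphical illustration
LS minimizes $\sum_{i=1}^{n} \hat\varepsilon_i^2 = \hat\varepsilon_1^2 + \hat\varepsilon_2^2 + \hat\varepsilon_3^2 + \hat\varepsilon_4^2$
[Figure: four observed points and the fitted line $\hat{Y}_i = \hat\beta_0 + \hat\beta_1 X_i$, with the residuals $\hat\varepsilon_1, \ldots, \hat\varepsilon_4$ marked; for example $Y_2 = \hat\beta_0 + \hat\beta_1 X_2 + \hat\varepsilon_2$]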
Conditions for a minimum:
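For reference, the standard first-order conditions of the least-squares problem can be sketched as follows. The symbol $S(\beta_0, \beta_1)$ for the sum of squared errors and the shorthand $S_{xy}$ are introduced here for convenience and are assumptions, not notation recovered from the slide.

```latex
S(\beta_0,\beta_1) = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 x_i)^2

\frac{\partial S}{\partial \beta_0} = -2 \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 x_i) = 0
\qquad
\frac{\partial S}{\partial \beta_1} = -2 \sum_{i=1}^{n} x_i (Y_i - \beta_0 - \beta_1 x_i) = 0

\hat\beta_1 = \frac{S_{xy}}{S_{xx}}
            = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat\beta_0 = \bar{Y} - \hat\beta_1 \bar{x}
```

Solving the two equations simultaneously gives the estimators in the last line.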
Notation
Fitted value: $\hat{Y}_i = \hat\beta_0 + \hat\beta_1 x_i$
Residual: $e_i = \hat\varepsilon_i = Y_i - \hat{Y}_i$
$V(\hat\beta_0) = \sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}} \right)$
$V(\hat\beta_1) = \dfrac{\sigma^2}{S_{xx}}$
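[Figure: the sampling distribution of the sample slopes $\hat\beta_1$, with standard deviation $S_{\hat\beta_1}$; the population line is shown together with the lines of all possible samples]
Sample slopes: Sample 1: 2.5; Sample 2: 1.6; Sample 3: 1.8; Sample 4: 2.1; ... (a very large number of sample slopes)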
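$\hat\sigma^2 = \dfrac{\sum_{i=1}^{n} e_i^2}{n-2}$
$\hat{V}(\hat\beta_1) = \dfrac{\hat\sigma^2}{S_{xx}}$, so $SE(\hat\beta_1) = \sqrt{\dfrac{\hat\sigma^2}{S_{xx}}}$, with df = n - 2
$\hat{V}(\hat\beta_0) = \hat\sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}} \right)$, so $SE(\hat\beta_0) = \sqrt{\hat\sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}} \right)}$, with df = n - 2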
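$H_0: \beta_1 = \beta_{10}$
$H_A: \beta_1 \neq \beta_{10}$
$t = \dfrac{\hat\beta_1 - \beta_{10}}{SE(\hat\beta_1)} = \dfrac{(\hat\beta_1 - \beta_{10}) \sqrt{S_{xx}}}{\sqrt{\sum_{i=1}^{n} e_i^2 / (n-2)}} = \dfrac{\hat\beta_1 - \beta_{10}}{\sqrt{\dfrac{\sum_{i=1}^{n} e_i^2 / (n-2)}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}}$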
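A minimal sketch of the slope test, using the five-point advertising data set that appears in the worked example later in these slides. The hypothesised value $\beta_{10} = 0$ and the tabulated critical value 3.182 for $t_{0.025;\,3}$ are assumptions chosen for illustration.

```python
import math

x = [1, 2, 3, 4, 5]   # advertising spend ($)
y = [1, 1, 2, 2, 4]   # sales (units)
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

# Residual variance estimate with n - 2 degrees of freedom
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sigma2_hat = sum(e ** 2 for e in residuals) / (n - 2)
se_b1 = math.sqrt(sigma2_hat / s_xx)

beta_10 = 0.0     # hypothesised slope under H0 (assumption)
T_CRIT = 3.182    # t_{0.025; 3} taken from a t table (alpha = 0.05)
t_stat = (b1 - beta_10) / se_b1
reject_h0 = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```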
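$H_0: \beta_0 = \beta_{00}$
$H_A: \beta_0 \neq \beta_{00}$
$t = \dfrac{\hat\beta_0 - \beta_{00}}{SE(\hat\beta_0)} = \dfrac{\hat\beta_0 - \beta_{00}}{\sqrt{\dfrac{\sum_{i=1}^{n} e_i^2}{n-2} \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}} \right)}} = \dfrac{\hat\beta_0 - \beta_{00}}{\sqrt{\dfrac{\sum_{i=1}^{n} e_i^2}{n-2} \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}}$
Critical value: $t_{\alpha/2;\, n-2}$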
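$\hat\beta_0 - t_{\alpha/2,\, n-2} \sqrt{\hat\sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}} \right)} \le \beta_0 \le \hat\beta_0 + t_{\alpha/2,\, n-2} \sqrt{\hat\sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}} \right)}$
$\hat\beta_1 - t_{\alpha/2,\, n-2}\, SE(\hat\beta_1) \le \beta_1 \le \hat\beta_1 + t_{\alpha/2,\, n-2}\, SE(\hat\beta_1)$, that is,
$\hat\beta_1 - t_{\alpha/2,\, n-2} \sqrt{\dfrac{\hat\sigma^2}{S_{xx}}} \le \beta_1 \le \hat\beta_1 + t_{\alpha/2,\, n-2} \sqrt{\dfrac{\hat\sigma^2}{S_{xx}}}$
where $\hat\sigma^2 = \dfrac{\sum_{i=1}^{n} e_i^2}{n-2}$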
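The two intervals can be sketched numerically on the five-point advertising data set used in the worked example later in these slides. The function name `conf_interval` and the tabulated value 3.182 for $t_{0.025;\,3}$ are assumptions made for this illustration.

```python
import math

def conf_interval(estimate, std_error, t_crit):
    """Return (lower, upper) = estimate -/+ t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width

x = [1, 2, 3, 4, 5]   # advertising spend ($)
y = [1, 1, 2, 2, 4]   # sales (units)
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar
sigma2_hat = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)

se_b0 = math.sqrt(sigma2_hat * (1 / n + x_bar ** 2 / s_xx))
se_b1 = math.sqrt(sigma2_hat / s_xx)

T_CRIT = 3.182        # assumed table value of t_{0.025; 3}
ci_b0 = conf_interval(b0, se_b0, T_CRIT)
ci_b1 = conf_interval(b1, se_b1, T_CRIT)
print("95% CI for beta_0:", ci_b0)
print("95% CI for beta_1:", ci_b1)
```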
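Gauss-Markov theorem
The OLS slope estimator is linear in the observations $y_i$:
$\hat\beta_1 = \dfrac{\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \dfrac{\sum_{i=1}^{n} y_i (x_i - \bar{x}) - \bar{y} \sum_{i=1}^{n} (x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})\, y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \sum_{i=1}^{n} w_i y_i$, where $w_i = \dfrac{x_i - \bar{x}}{S_{xx}}$.
Let $\beta_1' = \sum_{i=1}^{n} q_i y_i$ be any other linear estimator.
For $E(\beta_1') = \beta_1$, it is necessary that $\sum_{i=1}^{n} q_i = 0$ and $\sum_{i=1}^{n} q_i x_i = 1$.
Write $q_i = w_i + v_i$. Since $\sum w_i = 0$ and $\sum w_i x_i = 1$, it follows that $\sum v_i = 0$ and $\sum v_i x_i = 0$, hence $\sum w_i v_i = 0$. Then
$V(\beta_1') = \sigma^2 \sum_{i=1}^{n} q_i^2 = \sigma^2 \sum_{i=1}^{n} (w_i^2 + 2 w_i v_i + v_i^2) = \sigma^2 \left( \sum_{i=1}^{n} w_i^2 + \sum_{i=1}^{n} v_i^2 \right) \ge \sigma^2 \sum_{i=1}^{n} w_i^2 = V(\hat\beta_1)$. QED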
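Decomposition of the variation
$SST = \sum (Y_i - \bar{Y})^2$
$SSR = \sum (\hat{Y}_i - \bar{Y})^2$
$SSE = \sum (Y_i - \hat{Y}_i)^2$
[Figure: for a point $(X_i, Y_i)$, the three deviations are shown relative to the fitted line $\hat{Y}_i = \hat\beta_0 + \hat\beta_1 X_i$ and the mean $\bar{Y}$]
$SST = SSR + SSE$
SST measures the variation of the observed values $Y_i$ around the mean $\bar{Y}$.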
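The coefficient of determination $R^2$
It is a measure of the proportion of variance explained by the model:
$R^2 = \dfrac{SSR}{SST} = \dfrac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = 1 - \dfrac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \in [0, 1]$
Standard Error: $\hat\sigma = \sqrt{\dfrac{\sum_{i=1}^{n} e_i^2}{n-2}}$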
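A small numerical check of the decomposition and of $R^2$, again on the five-point advertising data set from the worked example later in these slides. The variable names are assumptions made for this sketch.

```python
import math

x = [1, 2, 3, 4, 5]   # advertising spend ($)
y = [1, 1, 2, 2, 4]   # sales (units)
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # explained variation
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual variation

r2 = ssr / sst
std_error = math.sqrt(sse / (n - 2))
print(f"SST={sst:.2f} SSR={ssr:.2f} SSE={sse:.2f} R2={r2:.4f} s={std_error:.5f}")
```

With an intercept in the model, SSR/SST and 1 - SSE/SST coincide, which is why either form of the formula can be used.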
Remarks
$R^2$ is often used to choose the best model in terms of explained variance.
Comparisons of this kind must be made between models of the same nature.
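Very important!
For regression models without an intercept, of the type $y = \beta x + \varepsilon$, $R^2$ no longer has the meaning of a proportion of explained variance.
Example: consider two such models,
$y_1 = \beta_1 x_1 + \varepsilon_1$
$y_2 = \beta_2 x_2 + \varepsilon_2$
where $y_{2i}$ [...] $y_{1i}$ and $x_{2i}$ [...] $x_{1i}$.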
The coefficient of determination and the linear correlation coefficient
[Figure: four scatter plots of Y against X, each with the fitted line $\hat{Y}_i = b_0 + b_1 X_i$: (1) $R^2 = 1$, $r = +1$; (2) $R^2 = 1$, $r = -1$; (3) $R^2 = 0.8$, $r = +0.9$; (4) $R^2 = 0$, $r = 0$]
The ANOVA table
The F test:
$F = \dfrac{SSR / (k-1)}{SSE / (n-k)} \sim F_{k-1,\, n-k}$
k: the number of parameters of the model
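As a quick check of the F formula, the sketch below plugs in the sums of squares from the hotel-margin SUMMARY OUTPUT that appears later in these slides (n = 100 observations, k = 7 parameters including the intercept). The function name `f_statistic` is an assumption made for this example.

```python
def f_statistic(ssr, sse, n, k):
    """F = (SSR / (k - 1)) / (SSE / (n - k)), with k counting the intercept."""
    msr = ssr / (k - 1)   # mean square due to regression
    mse = sse / (n - k)   # mean square error
    return msr / mse

# Values copied from the regression output for the hotel example
SSR, SSE = 3123.832, 2825.626
N_OBS, K_PARAMS = 100, 7

f_value = f_statistic(SSR, SSE, N_OBS, K_PARAMS)
print(f"F = {f_value:.4f} on ({K_PARAMS - 1}, {N_OBS - K_PARAMS}) degrees of freedom")
```

The result agrees with the F value reported in that output (17.13581) up to rounding.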
What we predict
[Figure: at $X_P$, an individual value of Y, the mean $E(Y)$, and the prediction $\hat{Y}$ on the fitted line $\hat{Y} = \hat\beta_0 + \hat\beta_1 X$]
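$\hat{Y} - t_{\alpha/2,\, n-2}\, S_{\hat{Y}} \le E(Y) \le \hat{Y} + t_{\alpha/2,\, n-2}\, S_{\hat{Y}}$
where
$S_{\hat{Y}} = \hat\sigma \sqrt{\dfrac{1}{n} + \dfrac{(x_P - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$, with $\hat\sigma^2 = \dfrac{\sum_{i=1}^{n} e_i^2}{n-2}$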
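[Figure: the interval around the fitted line at $X_1$ and at $X_2$; the dispersion at $X_2$ is greater than at $X_1$. $\bar{Y}$ is marked on the Y axis.]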
Example
A marketing analyst establishes that sales volume depends linearly on advertising expenditure. The analyst estimates a regression model and obtains $\hat\beta_0 = -0.1$, $\hat\beta_1 = 0.7$ and $s = 0.60553$.

Advertising spend ($)   Sales (units)
1                       1
2                       1
3                       2
4                       2
5                       4

What will mean sales be if $4 is spent on advertising? Use alpha = 0.05.
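Solution
$\hat{Y} - t_{\alpha/2,\, n-2}\, S_{\hat{Y}} \le E(Y) \le \hat{Y} + t_{\alpha/2,\, n-2}\, S_{\hat{Y}}$
$\hat{Y} = -0.1 + 0.7 \cdot 4 = 2.7$ (4 is the particular value of X)
$S_{\hat{Y}} = 0.60553 \sqrt{\dfrac{1}{5} + \dfrac{(4 - 3)^2}{10}} = 0.3316$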
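The computation can be reproduced with a short sketch that refits the line to the five data points and then evaluates the interval for the mean at X = 4. The critical value 3.182 for $t_{0.025;\,3}$ is taken from a standard t table and is an assumption here, since the slide text does not show it.

```python
import math

x = [1, 2, 3, 4, 5]   # advertising spend ($), from the example
y = [1, 1, 2, 2, 4]   # sales (units), from the example
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar
s = math.sqrt(sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2))

x_p = 4               # the particular value of X
y_hat = b0 + b1 * x_p
s_y_hat = s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / s_xx)

T_CRIT = 3.182        # assumed table value of t_{0.025; 3}
lower, upper = y_hat - T_CRIT * s_y_hat, y_hat + T_CRIT * s_y_hat
print(f"b0={b0:.1f}, b1={b1:.1f}, s={s:.5f}")
print(f"Y_hat={y_hat:.1f}, S_Yhat={s_y_hat:.4f}, CI=({lower:.3f}, {upper:.3f})")
```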
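$\hat{Y} - t_{\alpha/2,\, n-2}\, S_{Y - \hat{Y}} \le Y_P \le \hat{Y} + t_{\alpha/2,\, n-2}\, S_{Y - \hat{Y}}$
where
$S_{Y - \hat{Y}} = \hat\sigma \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_P - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$, with $\hat\sigma^2 = \dfrac{\sum_{i=1}^{n} e_i^2}{n-2}$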
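For comparison with the interval for the mean, here is a sketch of the prediction interval for an individual value at X = 4 on the same advertising data. The function name `prediction_std_error` and the tabulated value 3.182 for $t_{0.025;\,3}$ are assumptions made for illustration.

```python
import math

def prediction_std_error(s, n, x_p, x_bar, s_xx):
    """Standard error of an individual prediction: note the extra '1 +' term."""
    return s * math.sqrt(1 + 1 / n + (x_p - x_bar) ** 2 / s_xx)

x = [1, 2, 3, 4, 5]   # advertising spend ($)
y = [1, 1, 2, 2, 4]   # sales (units)
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar
s = math.sqrt(sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2))

x_p = 4
y_hat = b0 + b1 * x_p
s_pred = prediction_std_error(s, n, x_p, x_bar, s_xx)
s_mean = s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / s_xx)

T_CRIT = 3.182        # assumed table value of t_{0.025; 3}
pred_interval = (y_hat - T_CRIT * s_pred, y_hat + T_CRIT * s_pred)
print(f"S_pred={s_pred:.4f} (vs S_mean={s_mean:.4f}), PI={pred_interval}")
```

The prediction interval is always wider than the interval for the mean, because it also accounts for the variability of an individual observation around the line.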
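Prediction
[Figure: the individual value Y we're trying to predict, the expected (mean) value $E(Y) = \beta_0 + \beta_1 X$, and the prediction $\hat{Y}$ at $X_P$; a second panel marks $\bar{X}$ and $X_P$ on the X axis]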
h statistic: $h = \dfrac{1}{n} + \dfrac{(X_p - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$
$s_{r_i} = s \sqrt{1 - h_i}$, where
$h_i = \dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}$ and $s^2 = \dfrac{\sum_{i=1}^{n} e_i^2}{n-2}$
Standardized residual i = Residual i / Standard deviation
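A minimal sketch of the standardized residuals, computed on the five-point advertising data set from the earlier example. The list names `leverages` and `std_residuals` are assumptions made for this illustration.

```python
import math

x = [1, 2, 3, 4, 5]   # advertising spend ($)
y = [1, 1, 2, 2, 4]   # sales (units)
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# h_i = 1/n + (x_i - x_bar)^2 / S_xx
leverages = [1 / n + (xi - x_bar) ** 2 / s_xx for xi in x]

# Each residual is divided by its own standard deviation s * sqrt(1 - h_i)
std_residuals = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverages)]

for xi, e, h, r in zip(x, residuals, leverages, std_residuals):
    print(f"x={xi}  residual={e:+.2f}  h={h:.2f}  standardized={r:+.3f}")
```

In simple regression the leverages sum to 2, the number of estimated parameters, which gives a quick sanity check on the computation.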
[Figures: residual plots. Two slides show pairs of panels with the residuals plotted against the fitted values $\hat{y}$; a third shows the residuals plotted against time, around the zero line.]
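MULTIPLE REGRESSION
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k + \varepsilon$
$\beta_0, \ldots, \beta_k$: the regression coefficients
$X_1, \ldots, X_k$: the independent variables
$\varepsilon$: the error variable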
Competition
Market awareness
Demand generators
Demographics
Physical quality
Profitability: Margin
Competition: Rooms, the number of hotel/motel rooms within 3 miles from the site.
Market awareness: Nearest, the distance to the nearest La Quinta inn.
Customers: Office space; College enrollment.
Community: Income, the median household income.
Physical: Disttwn, the distance to downtown.
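$\text{Margin} = \beta_0 + \beta_1 \text{Rooms} + \beta_2 \text{Nearest} + \beta_3 \text{Office} + \beta_4 \text{College} + \beta_5 \text{Income} + \beta_6 \text{Disttwn} + \varepsilon$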
INN  MARGIN  ROOMS  NEAREST  OFFICE  COLLEGE  INCOME  DISTTWN
1    55.5    3203   0.1      549     8        37      12.1
2    33.8    2810   1.5      496     17.5     39      0.4
3    49      2890   1.9      254     20       39      12.2
4    31.9    3422   1        434     15.5     36      2.7
5    57.4    2687   3.4      678     15.5     32      7.9
6    49      3759   1.4      635     19       41      4
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.724611
R Square            0.525062
Adjusted R Square   0.49442
Standard Error      5.512084
Observations        100

ANOVA
             df   SS         MS         F          Significance F
Regression   6    3123.832   520.6387   17.13581   3.03E-13
Residual     93   2825.626   30.38307
Total        99   5949.458

            Coefficients   Standard Error   t Stat
Intercept   72.45461       7.893104         9.179483
ROOMS       -0.00762       0.001255         -6.06871
NEAREST     -1.64624       0.632837         -2.60136
OFFICE      0.019766       0.00341          5.795594
COLLEGE     0.211783       0.133428         1.587246
INCOME      -0.41312       0.139552         -2.96034
DISTTWN     0.225258       0.178709         1.260475
Using the model
Prediction for a hotel with the following characteristics:
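The characteristics of the hotel did not survive in this text, so the sketch below uses a purely hypothetical site to show how a point prediction is formed from the estimated coefficients in the SUMMARY OUTPUT. The function name `predict_margin` and every value in `hypothetical_site` are assumptions, not data from the slides.

```python
# Estimated coefficients copied from the regression output above
INTERCEPT = 72.45461
COEFFS = {
    "ROOMS": -0.00762,
    "NEAREST": -1.64624,
    "OFFICE": 0.019766,
    "COLLEGE": 0.211783,
    "INCOME": -0.41312,
    "DISTTWN": 0.225258,
}

def predict_margin(site):
    """Point prediction: intercept plus the sum of coefficient * value."""
    return INTERCEPT + sum(COEFFS[name] * site[name] for name in COEFFS)

# Hypothetical site characteristics (illustrative values only)
hypothetical_site = {
    "ROOMS": 3000, "NEAREST": 1.0, "OFFICE": 500,
    "COLLEGE": 15.0, "INCOME": 38, "DISTTWN": 5.0,
}
print(f"Predicted margin: {predict_margin(hypothetical_site):.2f}")
```

An interval around this point prediction would additionally need the standard error of the prediction, which statistical software reports for the multiple regression case.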
$d = \dfrac{\sum_{i=2}^{n} (r_i - r_{i-1})^2}{\sum_{i=1}^{n} r_i^2}$
The range of d is $0 \le d \le 4$.
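This d is the statistic usually called Durbin-Watson. A minimal sketch of its computation follows; the function name `durbin_watson` and the two residual series are hypothetical and chosen only to show the extremes of the range.

```python
def durbin_watson(residuals):
    """d = sum_{i=2..n} (r_i - r_{i-1})^2 / sum_{i=1..n} r_i^2."""
    numerator = sum(
        (curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:])
    )
    denominator = sum(r ** 2 for r in residuals)
    return numerator / denominator

# Hypothetical residual series
alternating = [1.0, -1.0, 1.0, -1.0]   # sign flips every period
persistent = [1.0, 1.0, 1.0, 1.0]      # no change from period to period

print(durbin_watson(alternating))   # large d: negative autocorrelation pattern
print(durbin_watson(persistent))    # d near 0: positive autocorrelation pattern
```

Values of d near 2 are what one expects when consecutive residuals are uncorrelated.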
[Figure: residuals plotted against time, around the zero line]
ONE-SIDED TEST
If d < dL, there is positive first-order autocorrelation.
If d > dU, there is no positive first-order autocorrelation.
If d lies between dL and dU, the test is inconclusive.
Decision regions on the d axis:
0 to dL: autocorrelation
dL to dU: inconclusive
dU to 4-dU: independence
4-dU to 4-dL: inconclusive
4-dL to 4: autocorrelation
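The one-sided rule for positive autocorrelation can be written as a small function. The name `dw_decision` and the bounds dL = 1.5 and dU = 1.7 used in the usage lines are hypothetical; in practice the bounds come from a Durbin-Watson table for the given sample size and number of regressors.

```python
def dw_decision(d, d_lower, d_upper):
    """One-sided Durbin-Watson test for positive first-order autocorrelation."""
    if not 0.0 <= d <= 4.0:
        raise ValueError("d must lie between 0 and 4")
    if d < d_lower:
        return "positive first-order autocorrelation"
    if d > d_upper:
        return "no positive first-order autocorrelation"
    return "inconclusive"

# Hypothetical bounds, for illustration only
D_L, D_U = 1.5, 1.7
for d in (1.0, 1.6, 2.0):
    print(d, "->", dw_decision(d, D_L, D_U))
```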
Qualitative variables
In many real-life situations one or more independent variables are qualitative.
Qualitative variables are included in a regression model via indicator variables.
An indicator variable (I) can assume one of two values, zero or one:
I = 1 if a first condition out of two is met
I = 0 if a second condition out of two is met
Examples:
I = 1 if data were collected before 1980; I = 0 if data were collected after 1980
I = 1 if the temperature was below 50°; I = 0 if the temperature was 50° or more
I = 1 if a degree earned is in Finance; I = 0 if a degree earned is not in Finance
Qualitative variables
We assume that the price is also determined by the color of the car.
We consider three colors:
White
Silver
Other colors
We use the model
$y = \beta_0 + \beta_1 (\text{Odometer}) + \beta_2 I_1 + \beta_3 I_2 + \varepsilon$

Price   Odometer   I-1   I-2
5318    37388      1     0     White car
5061    44758      1     0     White car
5008    45833      0     0     Other color
5795    30862      0     0     Other color
5784    31705      0     1     Silver color
5359    34010      0     1     Silver color
.       .          .     .
.       .          .     .
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.835482
R Square            0.69803
Adjusted R Square   0.688594
Standard Error      142.271
Observations        100

ANOVA
             df   SS        MS         F          Significance F
Regression   3    4491749   1497250    73.97095   7.22E-25
Residual     96   1943141   20241.05
Total        99   6434890

            Coefficients   Standard Error   t Stat
Intercept   6350.323       92.16653         68.90053
Odometer    -0.02777       0.002369         -11.7242
I-1         45.24098       34.08443         1.327321
I-2         147.738        38.18499         3.869007
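To show how the indicator coefficients are read, the sketch below encodes a car's color into (I-1, I-2) and evaluates the fitted equation from the output. The function names `indicators` and `predict_price`, and the odometer reading of 40000, are assumptions made for this illustration.

```python
# Estimated coefficients copied from the regression output
INTERCEPT = 6350.323
B_ODOMETER = -0.02777
B_I1 = 45.24098    # I-1 = 1 for white cars
B_I2 = 147.738     # I-2 = 1 for silver cars

def indicators(color):
    """Encode a color as (I-1, I-2); any other color is the baseline (0, 0)."""
    c = color.strip().lower()
    return (1 if c == "white" else 0, 1 if c == "silver" else 0)

def predict_price(odometer, color):
    """Fitted price for a given odometer reading and color."""
    i1, i2 = indicators(color)
    return INTERCEPT + B_ODOMETER * odometer + B_I1 * i1 + B_I2 * i2

# Hypothetical odometer reading, identical for the three cars
for color in ("white", "silver", "red"):
    print(color, round(predict_price(40000, color), 2))
```

With the same odometer reading, a white car's fitted price exceeds that of an "other color" car by the I-1 coefficient and a silver car's by the I-2 coefficient. Two indicators are enough for three colors because the third color serves as the baseline.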