
Logistic regression

The dependent variable can take only 2 alternatives: 0/1, yes/no.

We find the probability of something happening.
Probability is given by e to the power z divided by (1 + e to the power z): p = e^z / (1 + e^z)
"Odds for" is given by e to the power z: odds = p / (1 - p) = e^z
Log of "odds for" (the logit) is Z itself: ln(odds) = Z
Z = b0 + b1x1 + b2x2 + ...
The log of the likelihood is LL. A likelihood is a product of probabilities, so it is never greater than 1, and LL is always negative (or zero).
So we take minus LL, or minus 2LL (the standard), which is positive.
The value of -2LL should decrease as the model fit improves.
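
A minimal Python sketch of the formulas above, to check that the log of the odds really comes back to Z and that -2LL is positive and smaller for a better fit. The function names, coefficients and sample probabilities are made up for illustration; they are not from the class data.

```python
import math

def probability(z):
    """Logistic function: p = e^z / (1 + e^z)."""
    return math.exp(z) / (1.0 + math.exp(z))

def odds(z):
    """Odds in favour: p / (1 - p), which simplifies to e^z."""
    p = probability(z)
    return p / (1.0 - p)

def minus_2ll(observed, predicted):
    """-2 * log-likelihood for 0/1 outcomes and their predicted probabilities."""
    ll = 0.0
    for y, p in zip(observed, predicted):
        ll += math.log(p) if y == 1 else math.log(1.0 - p)
    return -2.0 * ll

# Hypothetical coefficients and one case, for illustration only.
b0, b1 = -1.0, 0.5
z = b0 + b1 * 4  # z = 1.0
print(probability(z), odds(z), math.log(odds(z)))

# Predictions closer to the observed 0/1 values give a smaller -2LL.
print(minus_2ll([1, 0, 1], [0.8, 0.3, 0.6]))
print(minus_2ll([1, 0, 1], [0.9, 0.1, 0.9]))
```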
Forward LR method
Step 0: Z = b0
Find the difference between predicted and observed values (the error).
Then we enter the first independent variable.
Step 1: Z = b0 + b1x1
Again find the difference between observed and predicted values (the error).
Compare the error in step 0 with the error in step 1.
If the decrease in error is significant, we reject the step 0 model and accept the step 1 model.
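
One way to picture the step 0 versus step 1 comparison is a likelihood-ratio test on the drop in -2LL. The sketch below assumes exactly one variable is added (1 degree of freedom) and uses hypothetical -2LL values; SPSS's own entry and removal tests are not reproduced here.

```python
import math

def lr_test_one_variable(minus_2ll_before, minus_2ll_after):
    """Likelihood-ratio test for adding ONE variable (1 degree of freedom).

    Returns (chi_square, p_value). For 1 df the chi-square tail
    probability equals erfc(sqrt(x / 2)).
    """
    chi_square = minus_2ll_before - minus_2ll_after
    if chi_square <= 0:
        return chi_square, 1.0  # no improvement at all
    return chi_square, math.erfc(math.sqrt(chi_square / 2.0))

# Hypothetical -2LL values for step 0 and step 1.
chi_square, p_value = lr_test_one_variable(450.0, 430.0)
accept_step_1 = p_value < 0.05
print(chi_square, p_value, accept_step_1)
```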

Microfinance: Analyze > Regression > Binary Logistic

Method = Forward: LR
Save - Probabilities
Options - Hosmer-Lemeshow goodness of fit
Model Summary - check that -2LL is going down at each step

Variables taken: duration, supportgroup, gender, age


See PRE_1, i.e. the newly created probabilities column in the data sheet.
Based on this, we divide people into ranks 1, 2, 3, 4, 5.
Analyze > Descriptive Statistics > Frequencies
Use the predicted probability as the variable.
Statistics - Cut points for 5 equal groups
The output looked something like this:
Statistics: Predicted probability
N            Valid      337
             Missing    172
Percentiles  20         0.316628
             40         0.506517
             60         0.591726
             80         0.746565
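
For reference, the same kind of cut points (20th, 40th, 60th, 80th percentiles) can be computed outside SPSS. This is only a sketch using Python's standard library on made-up probabilities; the interpolation rule may not match SPSS exactly, so small differences in the decimals are possible.

```python
import statistics

def quintile_cut_points(values):
    """Return the 20th, 40th, 60th and 80th percentiles of the values."""
    return statistics.quantiles(values, n=5, method="exclusive")

# Hypothetical predicted probabilities, for illustration only.
probabilities = [0.12, 0.25, 0.33, 0.41, 0.52, 0.58, 0.66, 0.74, 0.91]
print(quintile_cut_points(probabilities))
```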

Transform > Recode into Different Variables (use the percentiles above to define the ranges)

Highest probability = 1, lowest = 5 (it can also be coded the other way round). In this case a higher probability means default, i.e. a lower rating.
See the new column "rating" in the data sheet.
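
The recode step can be sketched in Python as below. It uses the four percentile values from the Frequencies output and assumes the reading "highest probability band gets 1, lowest gets 5"; the function name and the handling of values exactly on a cut point are my own choices, not SPSS's.

```python
from bisect import bisect_left

# Percentile cut points from the Frequencies output in these notes.
CUT_POINTS = [0.316628, 0.506517, 0.591726, 0.746565]

def rating(probability, cut_points=CUT_POINTS):
    """Map a predicted probability to a rating 1-5 (highest probability -> 1)."""
    band = bisect_left(cut_points, probability)  # 0 = lowest band, 4 = highest
    return 5 - band

for p in (0.10, 0.55, 0.90):
    print(p, rating(p))
```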
Analyze > Regression > Linear
Take D as the dependent variable and the significant independent variables (duration, supportgroup, gender, age).
The output looked something like this:

Coefficients (a)
                  Unstandardized          Standardized
Model             B         Std. Error    Beta            t         Sig.
1  (Constant)     0.636     0.115                         5.52      0
   age           -0.054     0.02          -0.134         -2.705     0.007
   duration      -0.164     0.024         -0.336         -6.825     0
   gender         0.166     0.051          0.159          3.243     0.001
   supportgro     0.145     0.05           0.145          2.915     0.004
a. Dependent Variable: D

This gives a new equation (the equation of credit rating), on an interval scale!

#NAME?
Transform > Compute Variable > Score: write the above equation there.
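
The equation itself is not readable in these notes, so the sketch below is only an assumption: it rebuilds a score from the constant and the B values in the coefficients output (dependent variable D). The variable codes in the example case are hypothetical.

```python
# B values copied from the coefficients output (dependent variable D).
# Treating them as the credit score equation is an assumption.
CONSTANT = 0.636
COEFFICIENTS = {
    "age": -0.054,
    "duration": -0.164,
    "gender": 0.166,
    "supportgroup": 0.145,
}

def score(case):
    """Score = constant + sum of B * coded value for each variable."""
    return CONSTANT + sum(b * case[name] for name, b in COEFFICIENTS.items())

# Hypothetical coded values for one borrower.
example = {"age": 2, "duration": 3, "gender": 1, "supportgroup": 1}
print(round(score(example), 3))
```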
Find cut points for Score.
Analyze > Compare Means > One-Way ANOVA
The one-way ANOVA shows which variables are significant for the credit score.
Run it with "Descriptive" statistics.
The output looked something like this:

Descriptives
                                                                 95% Confidence Interval for Mean
Variable     Group   N     Mean     Std. Deviation   Std. Error   Lower Bound   Upper Bound
age          1       77    1.7013   0.97421          0.11102      1.4802        1.9224
             2       62    2.4839   1.09757          0.13939      2.2051        2.7626
             3       63    2.4286   1.41095          0.17776      2.0732        2.7839
             4       69    2.4493   1.05072          0.12649      2.1969        2.7017
             5       66    3.2424   1.13762          0.14003      2.9628        3.5221
             Total   337   2.4362   1.23549          0.0673       2.3038        2.5686
experince    1       77    2.6104   1.18272          0.13478      2.3419        2.8788
             2       62    2.5484   1.15485          0.14667      2.2551        2.8417
             3       63    2.4762   1.20292          0.15155      2.1732        2.7791
             4       69    2.3478   1.04073          0.12529      2.0978        2.5978
             5       66    2.4697   1.11244          0.13693      2.1962        2.7432
             Total   337   2.4926   1.13682          0.06193      2.3708        2.6144
netearning   1       77    1.5455   0.6795           0.07744      1.3912        1.6997
             2       62    1.6613   0.76702          0.09741      1.4665        1.8561
             3       63    1.5714   0.75593          0.09524      1.3811        1.7618
             4       69    1.5942   0.6928           0.0834       1.4278        1.7606
             5       66    1.5455   0.63686          0.07839      1.3889        1.702
             Total   337   1.5816   0.7029           0.03829      1.5063        1.6569
household    1       77    2.6494   1.06086          0.1209       2.4086        2.8901
             2       62    2.5968   1.0474           0.13302      2.3308        2.8628
             3       63    2.3175   0.96429          0.12149      2.0746        2.5603
             4       69    2.6957   1.00447          0.12092      2.4544        2.937
             5       66    2.6667   0.9337           0.11493      2.4371        2.8962
             Total   337   2.5905   1.00814          0.05492      2.4825        2.6985
purpose      1       77    2.4026   1.67222          0.19057      2.0231        2.7821
             2       62    1.9194   1.56077          0.19822      1.523         2.3157
             3       63    2.4286   1.70118          0.21433      2.0001        2.857
             4       69    1.913    1.52179          0.1832       1.5475        2.2786
             5       66    2.1061   1.61844          0.19922      1.7082        2.5039
             Total   337   2.1602   1.62326          0.08842      1.9863        2.3342
amount       1       77    2.2727   0.64147          0.0731       2.1271        2.4183
             2       62    2.0968   0.69447          0.0882       1.9204        2.2731
             3       63    2.2222   0.72833          0.09176      2.0388        2.4056
             4       69    2.2174   0.68319          0.08225      2.0533        2.3815
             5       66    2.2273   0.69715          0.08581      2.0559        2.3987
             Total   337   2.2107   0.68583          0.03736      2.1372        2.2842
duration     1       77    1.2208   0.41749          0.04758      1.126         1.3155
             2       62    1.1774   0.38514          0.04891      1.0796        1.2752
             3       63    1.7619   0.68895          0.0868       1.5884        1.9354
             4       69    2.3333   0.77964          0.09386      2.146         2.5206
             5       66    3.3636   0.7773           0.09568      3.1726        3.5547
             Total   337   1.9614   1.02716          0.05595      1.8514        2.0715
gender       1       77
             2       62    1.6774   0.47128          0.05985      1.5577        1.7971
             3       63    1.6825   0.46923          0.05912      1.5644        1.8007
             4       69    1.3913   0.49162          0.05918      1.2732        1.5094
             5       66    1.4091   0.49543          0.06098      1.2873        1.5309
             Total   337   1.6409   0.48044          0.02617      1.5895        1.6924
supportgro   1       77    0.8442   0.36509          0.04161      0.7613        0.927
             2       62    0.5484   0.50172          0.06372      0.421         0.6758
             3       63    0.3492   0.48055          0.06054      0.2282        0.4702
             4       69    0.4638   0.50234          0.06047      0.3431        0.5844
             5       66    0.2121   0.41194          0.05071      0.1109        0.3134
             Total   337   0.4955   0.50072          0.02728      0.4419        0.5492
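
To see where the ANOVA verdict for one variable comes from, the F ratio can be rebuilt from the N, mean and standard deviation of each group. The sketch below does this for age, using the five group rows of the Descriptives output; the function name is my own, and it returns only F and the degrees of freedom, not the Sig. value.

```python
def one_way_anova_f(groups):
    """groups: list of (n, mean, sd). Returns (F, df_between, df_within)."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-groups: how far each group mean is from the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups: spread inside each group, recovered from its sd.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# N, mean and standard deviation of age in groups 1-5, from the Descriptives output.
age_groups = [
    (77, 1.7013, 0.97421),
    (62, 2.4839, 1.09757),
    (63, 2.4286, 1.41095),
    (69, 2.4493, 1.05072),
    (66, 3.2424, 1.13762),
]
f_value, df_between, df_within = one_way_anova_f(age_groups)
print(f_value, df_between, df_within)
```

A large F with these degrees of freedom is what makes a variable show up as significant in the ANOVA table.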

Now look at age: the high credit category (group 1) has a mean of 1.7, and in the questionnaire code 2 is the 25-34 band, which means the age is something like 32.

See the Score column there!

Try LR for the arjun file, question 7.

=========================================================================

RFM practice -- harit5

purchcum = frequency
monthlas = recency
amtspent = monetary
Analyze > Descriptive Statistics > Frequencies > Statistics > Cut points for 5 equal groups
Select recency.

The output looked something like this:


Statistics: Recency
N            Valid      506
             Missing    71
Percentiles  20         2.044
             40         3.174
             60         5.4
             80         8.696

Now go to Transform > Recode into Different Variables and enter the ranges as above, with the highest value as 5 and the lowest value as 1.
Analyze > Descriptive Statistics > Frequencies > Statistics > Cut points for 5 equal groups
Select purchcum (frequency).
Statistics: total no purchased last 5 years
N            Valid      506
             Missing    71
Percentiles  20         15
             40         19
             60         22
             80         28
Now go to Transform > Recode into Different Variables and enter the ranges as above, with the highest value as 5 and the lowest value as 1.

Statistics: total amt spent on all ayurvedic
N            Valid      506
             Missing    71
Percentiles  20         200.89
             40         416.05
             60         565.938
             80         741.692
Now go to Transform > Recode into Different Variables and enter the ranges as above, with the highest value as 5 and the lowest value as 1.

Now make the composite score.

Numeric expression: 100*R + 10*F + M (Compute Variable)
Analyze > Descriptive Statistics > Frequencies > Statistics > Cut points for 5 equal groups
Analyze > Compare Means > One-Way ANOVA (do it with whichever variable ma'am asks for)

Put RFM (composite) in the Dependent List and the other variable in Factor, and check whether it is significant or not.
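
The whole RFM scoring can be sketched in a few lines of Python. The cut points are the percentile values from the three Frequencies outputs, and the scoring follows the notes (highest value = 5, lowest = 1). The function names and the `reverse` option are my own additions: many RFM schemes give the most recent customers the highest recency score, which would need `reverse=True` for monthlas.

```python
from bisect import bisect_left

# 20/40/60/80 percentile cut points from the Frequencies outputs in these notes.
RECENCY_CUTS = [2.044, 3.174, 5.4, 8.696]            # monthlas
FREQUENCY_CUTS = [15, 19, 22, 28]                    # purchcum
MONETARY_CUTS = [200.89, 416.05, 565.938, 741.692]   # amtspent

def quintile_score(value, cut_points, reverse=False):
    """Score 1-5 by band; highest values get 5 unless reverse=True."""
    score = bisect_left(cut_points, value) + 1
    return 6 - score if reverse else score

def rfm_composite(monthlas, purchcum, amtspent):
    """Composite score 100*R + 10*F + M, as in the Compute Variable step."""
    r = quintile_score(monthlas, RECENCY_CUTS)
    f = quintile_score(purchcum, FREQUENCY_CUTS)
    m = quintile_score(amtspent, MONETARY_CUTS)
    return 100 * r + 10 * f + m

# Hypothetical customer: 1 month since last purchase, 30 purchases, 800 spent.
print(rfm_composite(1.0, 30, 800))
```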

===============================================
Structural equation modelling
Every variable may depend on the others.

Create new > draw circles and name them.

Create latent variables (just click as many times as needed) and name them (exactly the same as in the Excel file).
Load the data file.
List Variables in Data Set
Drag and drop the names into the boxes.
Use "Touch up a variable" (works like a magic wand)!
Add latent variable: 3rd column, 2nd row.
Draw one-way arrows from overall to the others.
Plugins > Name Unobserved Variables
!! Requires the table from the book.
View > Analysis Properties > Output > Modification Indices
View Text > Modification Indices > pick the highest > join the pair with a double-headed arrow, run again and repeat.
Compute Cronbach's alpha and put into the latent variables only those whose value is above 0.6.
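
Cronbach's alpha for a set of items can be checked by hand with the usual formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The sketch below uses made-up responses; the 0.6 threshold is the one from these notes.

```python
import statistics

def cronbach_alpha(items):
    """items: list of k lists, one per item, each with one score per respondent."""
    k = len(items)
    item_variances = sum(statistics.variance(item) for item in items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical responses: 3 items answered by 5 respondents.
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 5],
    [4, 2, 5, 3, 4],
]
alpha = cronbach_alpha(items)
keep_in_latent_variable = alpha > 0.6
print(alpha, keep_in_latent_variable)
```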

Model Fit / Estimates

GFI, AGFI, PGFI, RFI, NFI, CFI
Check the P value: for the model chi-square it should be greater than 0.05; in Estimates, paths with P below 0.05 are significant.

===========================
