
Logistic Regression II

(Independent Variables: Categorical)

Dr. Kusman Sadik, M.Si


Master's Program (S2)
Department of Statistics, IPB, 2017/2018
• In logistic regression, the response variable is binary
(dichotomous): it can take only one of two possible values.
• Case: logistic regression models in which the predictors
are categorical (qualitative) variables, such as gender,
location, and socioeconomic status.
• The logistic regression modeling material stays the same,
but the coding of the predictors (dummy coding) and the
interpretation of the regression coefficients change
because the predictors are categorical.
• The interpretation of the model parameters (intercept,
slope) discussed for continuous predictors does not
change fundamentally for categorical predictors.
• The main difference between quantitative (continuous)
and qualitative (categorical) predictors is that the
latter must be coded: representing a total of C
categories requires (C − 1) indicator variables.

• In the dummy coding used here, the last category of the
variable serves as the reference category.
• The parameter associated with the last category is
therefore set to zero, and each remaining parameter of
the model is interpreted relative to the last category.
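
A minimal R sketch of this coding, assuming a hypothetical three-category factor named ses (not taken from the slides). Note that R's default treatment coding uses the first level as the reference, so relevel() is used to make the last category the reference, which matches the convention described here.

```r
# Hypothetical factor with C = 3 categories (illustrative data only)
ses <- factor(c("low", "medium", "high", "medium", "low"),
              levels = c("low", "medium", "high"))

# R's default reference is the FIRST level; move it to the last category
ses_ref_last <- relevel(ses, ref = "high")

# Design matrix: an intercept plus C - 1 = 2 indicator columns
X <- model.matrix(~ ses_ref_last)
X
```

Rows belonging to the reference category ("high") have zeros in both indicator columns, which is why its parameter is effectively fixed at zero.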

Inference

Note: the G² test is the same as the deviance test.
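
As a sketch of this test, the G² statistic for the horseshoe crab model reported in these slides (y ~ color + width) is the drop from the null deviance to the residual deviance, with degrees of freedom equal to the number of added parameters. The numbers are copied from the summary(model) output in these slides; the variable names are illustrative.

```r
# Deviances and degrees of freedom copied from the summary(model) output
null_dev  <- 225.76; null_df  <- 172
resid_dev <- 187.46; resid_df <- 168

# G^2 = null deviance - residual deviance; df = number of added parameters
g2 <- null_dev - resid_dev
df <- null_df - resid_df

# Upper-tail chi-square p-value for the overall model test
p_value <- pchisq(g2, df = df, lower.tail = FALSE)
c(G2 = g2, df = df, p_value = p_value)
```

A small p-value suggests that color and width together improve on the intercept-only model.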

Interaction Effects

Gender SES Interaction

* Logistic Model for the Horseshoe Crab Data (Agresti, 5.4.4) *

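# Read the horseshoe crab data (Agresti, Section 5.4.4)
dataku <- read.csv(file = "Data-Horseshoe.Crab-Agresti.csv")

# Columns are taken by position; their meaning is inferred from the
# variable names (color, spine, width, weight, satellites)
color <- relevel(factor(dataku[, 1]), ref = "4")  # color 4 is the reference
s     <- factor(dataku[, 2])
width <- dataku[, 3]
wt    <- dataku[, 4]
sa    <- dataku[, 5]

# Binary response: 1 if sa > 0, otherwise 0
y <- as.integer(sa > 0)

data.frame(color, s, width, wt, sa, y)

# Logistic regression with a categorical (color) and a continuous (width) predictor
model <- glm(y ~ color + width, family = binomial(link = "logit"))
summary(model)

# Fitted probabilities (dugaan = estimate), rounded to two decimals
dugaan <- round(fitted(model), 2)
data.frame(color, width, y, dugaan)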

Call:
glm(formula = y ~ color+width, family = binomial(link = logit))

Coefficients:
           Estimate Std. Error z value Pr(>|z|)
Intercept  -12.7151     2.7617  -4.604 4.14e-06 ***
color1       1.3299     0.8525   1.560   0.1188
color2       1.4023     0.5484   2.557   0.0106 *
color3       1.1061     0.5921   1.868   0.0617 .
width        0.4680     0.1055   4.434 9.26e-06 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’

Null deviance: 225.76 on 172 degrees of freedom
Residual deviance: 187.46 on 168 degrees of freedom
AIC: 197.46
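
Because color 4 is the reference category, each color coefficient compares that color with color 4 at a fixed width. A short sketch of the corresponding odds ratios, using estimates copied from the output above (the vector name b is illustrative):

```r
# Estimates copied from the coefficient table above
b <- c(color1 = 1.3299, color2 = 1.4023, color3 = 1.1061, width = 0.4680)

# exp(coefficient) is an odds ratio:
#   color k vs. color 4 at the same width, or per one-unit increase in width
odds_ratios <- exp(b)
round(odds_ratios, 2)
```

For example, exp(1.4023) is roughly 4, so the estimated odds of y = 1 for color 2 are about four times those for color 4 at the same width.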

     color  width  y  dugaan
1        2   28.3  1    0.87
2        3   26.0  1    0.64
3        3   25.6  0    0.59
4        4   21.0  0    0.05
5        2   29.0  1    0.91
6        1   25.0  1    0.58
7        4   26.2  0    0.39
8        2   24.9  0    0.58
.
.
.
171      2   26.5  1    0.75
172      3   26.1  1    0.65
173      2   24.5  0    0.54
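
The fitted values in this table can be reproduced by hand from the reported coefficients. A sketch for row 1 (color 2, width 28.3) and row 4 (reference color 4, width 21.0); the variable names are illustrative:

```r
# Estimates copied from the model summary in these slides
b0       <- -12.7151
b_color2 <-   1.4023
b_width  <-   0.4680

# Row 1: color 2, width 28.3 (the color2 indicator equals 1)
eta_1 <- b0 + b_color2 + b_width * 28.3
p_1   <- plogis(eta_1)  # inverse logit

# Row 4: color 4 is the reference, so no color term enters
eta_4 <- b0 + b_width * 21.0
p_4   <- plogis(eta_4)

round(c(row1 = p_1, row4 = p_4), 2)
```

Rounded to two decimals, these agree with the 0.87 and 0.05 shown in the dugaan column.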

1. Use R with the Horseshoe Crabs Revisited data
(Agresti, Section 5.4.4).
a. Fit a logistic regression model with Width (x) and Color (c)
as the predictors. Compare the R output with the SAS output
in Agresti's book.
b. Fit a logistic regression model with Width (x), Color (c),
and Spine (s) as the predictors. Is Spine significant?
Use the deviance test at α = 0.05.
c. Compare the models from parts (a) and (b) using the
deviance test at α = 0.05. Which model is better?
Explain.
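
A hedged sketch of how two nested logistic models can be compared with a deviance test in R. It runs on simulated stand-in data, not the horseshoe crab data, and every variable name here (x, g, fit_a, fit_b) is hypothetical; the same pattern would apply once the real data are loaded.

```r
# Simulated stand-in data (NOT the horseshoe crab data)
set.seed(1)
n <- 200
x <- rnorm(n, mean = 26, sd = 2)             # continuous predictor
g <- factor(sample(1:3, n, replace = TRUE))  # extra categorical predictor
y <- rbinom(n, size = 1, prob = plogis(-13 + 0.5 * x))

# Reduced model and full model (the reduced model is nested in the full one)
fit_a <- glm(y ~ x,     family = binomial(link = "logit"))
fit_b <- glm(y ~ x + g, family = binomial(link = "logit"))

# G^2 = deviance(reduced) - deviance(full); df = difference in residual df
g2 <- deviance(fit_a) - deviance(fit_b)
df <- df.residual(fit_a) - df.residual(fit_b)
p_value <- pchisq(g2, df = df, lower.tail = FALSE)

# anova() reports the same test in one table
anova(fit_a, fit_b, test = "Chisq")
```

If p_value falls below the chosen α, the added factor improves the fit significantly; otherwise the simpler model is preferred.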

2. Use R to solve Problems 9.5 (Azen, p. 241).

References

1. Azen, R. and Walker, C.R. (2011). Categorical Data
Analysis for the Behavioral and Social Sciences.
Routledge, Taylor and Francis Group, New York.
2. Agresti, A. (2002). Categorical Data Analysis, 2nd ed.
New York: Wiley.
3. Other relevant references.

Available for download at

kusmansadik.wordpress.com

Thank You
