# REGRESSION

by
Dr. Ir. Ika Amalia Kartika, MT.

Departemen Teknologi Industri Pertanian
Fakultas Teknologi Pertanian
Institut Pertanian Bogor
2018
## Specific Instructional Objectives

After attending this lecture, students will be able to:

1. model experimental results (prediction, optimization, control)
2. analyze experimental results
3. draw conclusions from experimental results
## Regression Analysis

Regression analysis is the part of statistics that deals with investigating the relationship between two or more variables related in a nondeterministic fashion ...

## Linear Model

$$y = \beta_0 + \beta_1 x$$

The pairs $(x, y)$ that satisfy $y = \beta_0 + \beta_1 x$ form a straight line with slope $\beta_1$ and intercept $\beta_0$. Here $x$ is the independent variable and $y$ is the dependent variable.

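Assumption I: There exist parameters $\beta_0$ and $\beta_1$ such that, for any fixed value of the independent variable $x$,

$$Y = \beta_0 + \beta_1 x + \varepsilon$$

where the expected value of $Y$ is a linear function of $x$, and $\varepsilon$ is a random variable with $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$.

Assumption II: For fixed values $x_1, x_2, \ldots, x_n$, the observations $y_1, y_2, \ldots, y_n$ are observed values of random variables $Y_1, Y_2, \ldots, Y_n$ generated independently by the model of Assumption I:

$$
\begin{aligned}
Y_1 &= \beta_0 + \beta_1 x_1 + \varepsilon_1 \\
Y_2 &= \beta_0 + \beta_1 x_2 + \varepsilon_2 \\
&\;\;\vdots \\
Y_n &= \beta_0 + \beta_1 x_n + \varepsilon_n
\end{aligned}
$$

where $\varepsilon_1, \ldots, \varepsilon_n$ are $n$ independent random deviations, each following a probability distribution with mean 0 and variance $\sigma^2$.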

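The sum of squared errors is

$$SSE = \sum \varepsilon_i^2 = \sum (y_i - \beta_0 - \beta_1 x_i)^2$$

Minimize SSE to obtain the values of $\beta_0$ and $\beta_1$. Setting the partial derivatives to zero gives

$$\frac{\partial SSE}{\partial \beta_0} = -2 \sum (y_i - \beta_0 - \beta_1 x_i) = 0$$

$$\frac{\partial SSE}{\partial \beta_1} = -2 \sum x_i (y_i - \beta_0 - \beta_1 x_i) = 0$$

which lead to the normal equations

$$n\beta_0 + \beta_1 \sum x_i = \sum y_i$$

$$\beta_0 \sum x_i + \beta_1 \sum x_i^2 = \sum x_i y_i$$

with solution

$$\beta_1 = \frac{n \sum x_i y_i - (\sum x_i)(\sum y_i)}{n \sum x_i^2 - (\sum x_i)^2}, \qquad \beta_0 = \frac{\sum y_i - \beta_1 \sum x_i}{n}$$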
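$\sigma^2$ is a measure of the amount of variability inherent in the regression model. The smaller $\sigma^2$ is, the better the linear regression model. It is estimated from the residuals, the differences between each $y_i$ and its fitted value $Y_i = \beta_0 + \beta_1 x_i$:

$$\sigma^2 = \frac{SSE}{n - 2} = \frac{\sum (y_i - Y_i)^2}{n - 2}$$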
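The closed-form estimates and the error-variance estimate above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own implementation: the function name `fit_simple_linear` and the toy data are assumptions introduced here.

```python
def fit_simple_linear(x, y):
    """Least-squares estimates for y = b0 + b1*x and the variance estimate SSE/(n-2)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    denom = n * sxx - sx ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b1 = (n * sxy - sx * sy) / denom          # slope
    b0 = (sy - b1 * sx) / n                   # intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)                        # estimate of sigma^2
    return b0, b1, s2


# Hypothetical data lying exactly on y = 1 + 2x, so SSE should be zero.
b0, b1, s2 = fit_simple_linear([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1, s2)
```

Working from the raw sums keeps the code term-for-term identical to the formulas; for data with very large values a centered formulation would be numerically safer.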

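Assumption III: $E(Y_i) = \beta_0 + \beta_1 x_i$ and $\mathrm{Var}(Y_i) = \sigma^2$ for all $i$, and each $Y_i$ has a normal distribution.

For small $n$, use the t test.

Null hypothesis: $H_0: \beta_1 = \beta_{10}$

| Alternative hypothesis | Rejection region for a level $\alpha$ test |
| --- | --- |
| $H_a: \beta_1 > \beta_{10}$ | $T \ge t_{\alpha,\, n-2}$ |
| $H_a: \beta_1 < \beta_{10}$ | $T \le -t_{\alpha,\, n-2}$ |
| $H_a: \beta_1 \ne \beta_{10}$ | $T \ge t_{\alpha/2,\, n-2}$ or $T \le -t_{\alpha/2,\, n-2}$ |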
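The rejection regions are stated in terms of a statistic $T$ that the slides do not write out. Under the normality assumption, the usual choice (assumed here, not given in the source) is the standardized slope estimate, with $s$ the square root of the error-variance estimate $SSE/(n-2)$:

$$
T = \frac{\beta_1 - \beta_{10}}{s_{\beta_1}}, \qquad
s_{\beta_1} = \frac{s}{\sqrt{\sum x_i^2 - (\sum x_i)^2 / n}}, \qquad
s = \sqrt{\frac{SSE}{n - 2}}
$$

Here $\beta_1$ denotes the least-squares estimate computed from the data, and $T$ is compared with the critical value of the t distribution with $n - 2$ degrees of freedom.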
Correlation coefficient, $r$: a measure of how strongly two variables $x$ and $y$ are related, that is, the degree of linear relationship between the variables.

$$r = \frac{n \sum x_i y_i - (\sum x_i)(\sum y_i)}{\sqrt{n \sum x_i^2 - (\sum x_i)^2}\,\sqrt{n \sum y_i^2 - (\sum y_i)^2}}$$
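As a rough illustration of the correlation formula, the sketch below evaluates it directly from the raw sums. The function name `correlation` and the sample values are assumptions made for this example.

```python
import math


def correlation(x, y):
    """Sample correlation coefficient r computed from raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    syy = sum(v * v for v in y)
    sxy = sum(a * b for a, b in zip(x, y))
    numerator = n * sxy - sx * sy
    denominator = math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2)
    if denominator == 0:
        raise ValueError("r is undefined when x or y is constant")
    return numerator / denominator


# Hypothetical data: a perfect increasing line and a perfect decreasing line.
print(correlation([1, 2, 3], [2, 4, 6]))
print(correlation([1, 2, 3], [6, 4, 2]))
```

Values of r near +1 or -1 indicate a strong linear relationship; values near 0 indicate a weak one.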

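## Analysis of Variance

| Source of variation | Degrees of freedom ($\nu$) | Sum of squares (SS) | Mean square (MS) | $F_0$ |
| --- | --- | --- | --- | --- |
| Regression | 1 | SSR | MSR = SSR/1 | $F_{0R}$ = MSR/MSE |
| Error | n - 2 | SSE | MSE = SSE/(n - 2) | - |
| Total | n - 1 | SST | - | - |

Reject $H_0$ if $F_{0R} > F_{\alpha,\, 1,\, n-2}$.

The sums of squares are computed as follows:

| Source of variation | Sum of squares (SS) |
| --- | --- |
| Regression | $SSR = \beta_1 \left[\sum x_i y_i - (\sum x_i)(\sum y_i)/n\right]$ |
| Error | $SSE = SST - \beta_1 \left[\sum x_i y_i - (\sum x_i)(\sum y_i)/n\right]$ |
| Total | $SST = \sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2 / n$ |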

Coefficient of determination, $r^2$: how much of the variability in the observed $y_i$'s is due to variation in the independent variable.

$$r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

The larger $r^2$ is, the better the linear regression model.
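The analysis-of-variance quantities and the coefficient of determination can be computed together. The sketch below is one possible arrangement: the function name `anova_simple_linear`, the dictionary keys, and the sample data are assumptions for illustration. It leaves the comparison with the critical F value (degrees of freedom 1 and n - 2) to a table lookup.

```python
def anova_simple_linear(x, y):
    """Sums of squares, mean squares, F0 and r^2 for simple linear regression."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    syy = sum(v * v for v in y)
    sxy = sum(a * b for a, b in zip(x, y))

    b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # least-squares slope
    sst = syy - sy ** 2 / n                          # total sum of squares
    ssr = b1 * (sxy - sx * sy / n)                   # regression sum of squares
    sse = sst - ssr                                  # error sum of squares
    msr = ssr / 1                                    # 1 degree of freedom
    mse = sse / (n - 2)                              # n - 2 degrees of freedom
    f0 = msr / mse if mse > 0 else float("inf")      # perfect fit gives MSE = 0
    return {"SST": sst, "SSR": ssr, "SSE": sse,
            "MSR": msr, "MSE": mse, "F0": f0, "r2": ssr / sst}


# Hypothetical data, chosen only so the arithmetic is easy to follow by hand.
table = anova_simple_linear([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in table.items():
    print(f"{name}: {value:.4f}")
```

For these five points the hand calculation gives SST = 6, SSR = 3.6 and SSE = 2.4, so r2 = 0.6 and F0 = 4.5.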
## Case Example I

The following data are representative of those reported in the article "An experimental correlation of oxides of nitrogen emissions from power boilers based on field data", with $x$ = burner area liberation rate (MBtu/hr-ft²) and $y$ = NOx emission rate (ppm).

| Observation | x | y | Observation | x | y |
| --- | --- | --- | --- | --- | --- |
| 1 | 100 | 150 | 8 | 250 | 400 |
| 2 | 125 | 140 | 9 | 250 | 430 |
| 3 | 125 | 180 | 10 | 300 | 440 |
| 4 | 150 | 210 | 11 | 300 | 390 |
| 5 | 150 | 190 | 12 | 350 | 600 |
| 6 | 200 | 320 | 13 | 400 | 610 |
| 7 | 200 | 280 | 14 | 400 | 670 |

1. Assuming that the simple linear regression model is valid, obtain the parameters $\beta_0$ and $\beta_1$.
2. What is the estimate of the expected NOx emission rate when the burner area liberation rate equals 225?
3. For each $x_i$ used in the experiment, compute the predicted value $Y_i = \beta_0 + \beta_1 x_i$ ($i = 1, 2, \ldots, 14$). Does the plot indicate that the prediction relationship is effective for the observed data?
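A possible way to work through questions 1 to 3 numerically is sketched below. It assumes the first y value is 150 (an emission rate cannot be negative), and the variable names are illustrative only. The plot asked for in question 3 is left to the reader; the script prints the predicted values and residuals that would go into it.

```python
# Case I data: burner area liberation rate (x) and NOx emission rate (y).
# Assumption: observation 1 has y = 150, since a negative ppm value is not meaningful.
x = [100, 125, 125, 150, 150, 200, 200, 250, 250, 300, 300, 350, 400, 400]
y = [150, 140, 180, 210, 190, 320, 280, 400, 430, 440, 390, 600, 610, 670]

n = len(x)
sx, sy = sum(x), sum(y)
sxx = sum(v * v for v in x)
sxy = sum(a * b for a, b in zip(x, y))

# Question 1: least-squares estimates of the slope and intercept.
b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
b0 = (sy - b1 * sx) / n
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")

# Question 2: estimated expected emission rate at x = 225.
y_at_225 = b0 + b1 * 225
print(f"estimate at x = 225: {y_at_225:.2f} ppm")

# Question 3: predicted values and residuals for every observed x.
predicted = [b0 + b1 * xi for xi in x]
for xi, yi, pi in zip(x, y, predicted):
    print(f"x = {xi:3d}  y = {yi:3d}  predicted = {pi:7.2f}  residual = {yi - pi:7.2f}")
```

With these data the slope comes out at about 1.71 and the intercept at about -45.6, giving an estimate of roughly 339.5 ppm at x = 225.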
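## Nonlinear Model

A nonlinear model is handled by linearization: $y' = \beta_0 + \beta_1 x'$.

Definition: A function relating $y$ to $x$ is intrinsically linear if, by means of a transformation on $x$ and/or $y$, the function can be expressed as $y' = \beta_0 + \beta_1 x'$, where $x'$ is the transformed independent variable and $y'$ is the transformed dependent variable ...

| Function | Probabilistic model | Transformation | Linearized form | Parameters |
| --- | --- | --- | --- | --- |
| $y = \alpha e^{\beta x}$ | $Y = \alpha e^{\beta x} \cdot \varepsilon$ | $Y' = \ln Y$, $x' = x$ | $\ln Y = \ln \alpha + \beta x + \ln \varepsilon$ | $\beta_0 = \ln \alpha$, $\beta_1 = \beta$, $\varepsilon' = \ln \varepsilon$ |
| $y = \alpha x^{\beta}$ | $Y = \alpha x^{\beta} \cdot \varepsilon$ | $Y' = \log Y$, $x' = \log x$ | $\log Y = \log \alpha + \beta \log x + \log \varepsilon$ | $\beta_0 = \log \alpha$, $\beta_1 = \beta$, $\varepsilon' = \log \varepsilon$ |
| $y = \alpha + \beta \log x$ | $Y = \alpha + \beta \log x + \varepsilon$ | $Y' = Y$, $x' = \log x$ | $Y = \alpha + \beta \log x + \varepsilon$ | $\beta_0 = \alpha$, $\beta_1 = \beta$, $\varepsilon' = \varepsilon$ |
| $y = \alpha + \beta \cdot 1/x$ | $Y = \alpha + \beta \cdot 1/x + \varepsilon$ | $Y' = Y$, $x' = 1/x$ | $Y = \alpha + \beta \cdot 1/x + \varepsilon$ | $\beta_0 = \alpha$, $\beta_1 = \beta$, $\varepsilon' = \varepsilon$ |

In every case the linearized model is $Y' = \beta_0 + \beta_1 x' + \varepsilon'$, and the parameters are estimated from the transformed data:

$$\beta_1 = \frac{n \sum x'_i y'_i - (\sum x'_i)(\sum y'_i)}{n \sum {x'_i}^2 - (\sum x'_i)^2}, \qquad \beta_0 = \frac{\sum y'_i - \beta_1 \sum x'_i}{n}$$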
## Case Example II

The following data are representative of the mass rate of burning $x$ and the flame length $y$ of filter paper.

| Observation | x | y | Observation | x | y |
| --- | --- | --- | --- | --- | --- |
| 1 | 1.7 | 1.3 | 8 | 3.3 | 2.6 |
| 2 | 2.2 | 1.8 | 9 | 4.1 | 4.1 |
| 3 | 2.3 | 1.6 | 10 | 4.3 | 3.7 |
| 4 | 2.6 | 2.0 | 11 | 4.6 | 5.0 |
| 5 | 2.7 | 2.1 | 12 | 5.7 | 5.8 |
| 6 | 3.0 | 2.2 | 13 | 6.1 | 5.3 |
| 7 | 3.2 | 3.0 | | | |

1. Estimate the parameters of a power function model.
2. Construct a diagnostic plot to check whether a power function is an appropriate model choice.
3. Test $H_0: \beta = 4/3$ versus $H_a: \beta < 4/3$ using a level 0.05 test.
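Questions 1 and 3 could be approached as in the following sketch. It rests on several assumptions that are not in the slides: all y values are taken as positive (a flame length cannot be negative and the logarithm must exist), the test statistic is the usual standardized slope estimate, and the critical value 1.796 is read from a standard t table for 11 degrees of freedom at level 0.05.

```python
import math

# Case II data; y values are assumed positive so that log y is defined.
x = [1.7, 2.2, 2.3, 2.6, 2.7, 3.0, 3.2, 3.3, 4.1, 4.3, 4.6, 5.7, 6.1]
y = [1.3, 1.8, 1.6, 2.0, 2.1, 2.2, 3.0, 2.6, 4.1, 3.7, 5.0, 5.8, 5.3]

# Linearize the power function y = alpha * x**beta with base-10 logarithms.
xt = [math.log10(v) for v in x]
yt = [math.log10(v) for v in y]
n = len(xt)
sx, sy = sum(xt), sum(yt)
sxx = sum(v * v for v in xt)
sxy = sum(a * b for a, b in zip(xt, yt))

# Question 1: estimates on the transformed scale, then back-transformed.
b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
b0 = (sy - b1 * sx) / n
alpha, beta = 10 ** b0, b1
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")

# Question 3: T = (b1 - 4/3) / s_b1, an assumed standard form of the statistic.
sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(xt, yt))
s = math.sqrt(sse / (n - 2))
se_b1 = s / math.sqrt(sxx - sx ** 2 / n)
t_stat = (b1 - 4 / 3) / se_b1

T_CRIT = 1.796  # t table value for alpha = 0.05 and n - 2 = 11 degrees of freedom
print(f"T = {t_stat:.3f}")
print("reject H0" if t_stat <= -T_CRIT else "fail to reject H0")
```

For question 2, plotting `yt` against `xt` (or the residuals against `xt`) shows whether the transformed points scatter around a straight line.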
## Multiple Linear Regression Model

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$

The model describes a hyperplane in the $k$-dimensional space of the independent variables $x_j$. Here $y$ is the dependent variable, the $x_j$ are the independent variables ($j = 0, 1, \ldots, k$; $x_0 = 1$), and the $\beta_j$ are the regression coefficients ($j = 0, 1, \ldots, k$).

With two independent variables:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

The model describes a plane in the two-dimensional $x_1$, $x_2$ space. Here $y$ is the dependent variable, $x_1$ and $x_2$ are the independent variables, $\beta_0$ is the intercept of the plane, and $\beta_1$ and $\beta_2$ are the regression coefficients.
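## Curvilinear (Polynomial) Regression Model

$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k$$

with probabilistic form

$$Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k + \varepsilon$$

For $k = 2$ the model is quadratic; for $k = 3$ it is cubic.

$$SSE = \sum \varepsilon_i^2 = \sum \left(y_i - (\beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_k x_i^k)\right)^2$$

Minimize SSE to obtain the values of $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$. This gives the normal equations

$$
\begin{aligned}
n\beta_0 + \beta_1 \sum x_i + \beta_2 \sum x_i^2 + \cdots + \beta_k \sum x_i^k &= \sum y_i \\
\beta_0 \sum x_i + \beta_1 \sum x_i^2 + \beta_2 \sum x_i^3 + \cdots + \beta_k \sum x_i^{k+1} &= \sum x_i y_i \\
&\;\;\vdots \\
\beta_0 \sum x_i^k + \beta_1 \sum x_i^{k+1} + \beta_2 \sum x_i^{k+2} + \cdots + \beta_k \sum x_i^{2k} &= \sum x_i^k y_i
\end{aligned}
$$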

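In matrix form, the normal equations are

$$
\begin{bmatrix}
n & \sum x_i & \sum x_i^2 & \cdots & \sum x_i^k \\
\sum x_i & \sum x_i^2 & \sum x_i^3 & \cdots & \sum x_i^{k+1} \\
\vdots & \vdots & \vdots & & \vdots \\
\sum x_i^k & \sum x_i^{k+1} & \sum x_i^{k+2} & \cdots & \sum x_i^{2k}
\end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \vdots \\ \sum x_i^k y_i \end{bmatrix}
$$

Writing the model equations as $X\beta = y$, where $X$ is the $n \times (k+1)$ matrix whose $i$-th row is $(1, x_i, x_i^2, \ldots, x_i^k)$, the system above is $X'X\beta = X'y$. It can be solved by Gaussian elimination, or as

$$\beta = (X'X)^{-1} X'y$$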
Error variance, $\sigma^2$:

$$\sigma^2 = \frac{SSE}{n - (k + 1)} = \frac{\sum (y_i - \hat{y}_i)^2}{n - (k + 1)}$$

Coefficient of multiple determination, $R^2$:

$$R^2 = 1 - \frac{SSE}{SST}, \qquad SST = \sum y_i^2 - \frac{(\sum y_i)^2}{n}$$
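The polynomial fit, the error variance and the coefficient of multiple determination can be put together as in the sketch below. It builds the normal equations from power sums and solves them by Gaussian elimination with partial pivoting. The function names `solve_gauss` and `fit_polynomial` and the test data are assumptions introduced for illustration.

```python
def solve_gauss(a, b):
    """Solve the square system a * z = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]     # augmented matrix
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * m
    for r in range(m - 1, -1, -1):                        # back substitution
        tail = sum(aug[r][c] * z[c] for c in range(r + 1, m))
        z[r] = (aug[r][m] - tail) / aug[r][r]
    return z


def fit_polynomial(x, y, k):
    """Degree-k least-squares fit; returns (coefficients, error variance, R^2)."""
    n = len(x)
    if n <= k + 1:
        raise ValueError("need more than k + 1 observations")
    power_sums = [sum(xi ** p for xi in x) for p in range(2 * k + 1)]
    xtx = [[power_sums[r + c] for c in range(k + 1)] for r in range(k + 1)]
    xty = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(k + 1)]
    beta = solve_gauss(xtx, xty)
    fitted = [sum(b * xi ** j for j, b in enumerate(beta)) for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum(yi * yi for yi in y) - sum(y) ** 2 / n
    return beta, sse / (n - (k + 1)), 1 - sse / sst


# Hypothetical noise-free data from y = 1 + 2x + 3x^2.
beta, s2, r2 = fit_polynomial([-2, -1, 0, 1, 2], [9, 2, 1, 6, 17], k=2)
print(beta, s2, r2)
```

Power sums grow quickly with the degree, so for larger k a library routine based on orthogonal decompositions is usually the more stable choice.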
## Case Example III

The article "A simulation-based evaluation of three cropping systems on cracking clay soils in a summer rainfall environment" proposed a quadratic model for the relationship between water supply index ($x$) and farm wheat yield ($y$). Representative data appear below.

| Observation | x | y | Observation | x | y |
| --- | --- | --- | --- | --- | --- |
| 1 | 1.2 | 790 | 8 | 2.9 | 1420 |
| 2 | 1.3 | 950 | 9 | 3.1 | 1625 |
| 3 | 1.5 | 740 | 10 | 3.2 | 1600 |
| 4 | 1.8 | 1230 | 11 | 3.3 | 1720 |
| 5 | 2.1 | 1000 | 12 | 3.9 | 1500 |
| 6 | 2.3 | 1465 | 13 | 4.0 | 1550 |
| 7 | 2.5 | 1370 | 14 | 4.3 | 1560 |

1. Estimate $\beta_0$, $\beta_1$ and $\beta_2$ and the quadratic regression function $y = \beta_0 + \beta_1 x + \beta_2 x^2$.
2. Compute the predicted values and residuals. Does the plot indicate that the quadratic model is correct?
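One way to carry out both questions is sketched below. For brevity it solves the 3 x 3 normal equations with Cramer's rule, a technique swapped in here in place of Gaussian elimination; the helper `det3` and all variable names are assumptions for illustration. The residual plot itself is left to the reader.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Case III data: water supply index (x) and farm wheat yield (y).
x = [1.2, 1.3, 1.5, 1.8, 2.1, 2.3, 2.5, 2.9, 3.1, 3.2, 3.3, 3.9, 4.0, 4.3]
y = [790, 950, 740, 1230, 1000, 1465, 1370, 1420, 1625, 1600, 1720, 1500, 1550, 1560]

# Normal equations for the quadratic model: power sums on the left, sums of x^p * y on the right.
s = [sum(xi ** p for xi in x) for p in range(5)]
t = [sum(xi ** p * yi for xi, yi in zip(x, y)) for p in range(3)]
a = [[s[r + c] for c in range(3)] for r in range(3)]

# Cramer's rule: replace column j of the matrix by the right-hand side.
d = det3(a)
coefficients = []
for j in range(3):
    aj = [row[:] for row in a]
    for r in range(3):
        aj[r][j] = t[r]
    coefficients.append(det3(aj) / d)
b0, b1, b2 = coefficients
print(f"y = {b0:.2f} + {b1:.2f} x + {b2:.2f} x^2")

# Question 2: predicted values and residuals for each observation.
predicted = [b0 + b1 * xi + b2 * xi ** 2 for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]
for xi, yi, pi, ei in zip(x, y, predicted, residuals):
    print(f"x = {xi:.1f}  y = {yi:4d}  predicted = {pi:8.2f}  residual = {ei:8.2f}")
```

A residual plot with no systematic pattern against x supports the quadratic model; a curved or funnel-shaped pattern suggests it is inadequate.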

