Documente Academic
Documente Profesional
Documente Cultură
Causal
Forecasting
Models
Disposable Diapers
~ f(births, household income)
Car Repair Parts
~ f(weather/snow)
Promoted Item
~f(discount, placement, advertisements)
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 2
Agenda
• Simple Linear Regression
• Regression in Spreadsheets
• Multiple Linear Regression
• Model Transformations
• Model Fit and Validity
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 3
Simple Linear Regression
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 4
Example: Simple Linear Regression
• Recall from earlier lecture on exponential smoothing
• Estimating initial parameters for Holt-Winter (level, trend, seasonality)
• Removed seasonality in order to estimate initial level and trend
yi = β0 + β1 xi
220
Deseasoned Daily Bagel Demand
160
140
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 5
Simple Linear Regression
• The relationship is described in terms of a linear model
• The data (xi, yi) are the observed pairs from which we try to estimate
the Beta coefficients to find the ‘best fit’
• The error term, ε, is the ‘unaccounted’ or ‘unexplained’ portion
• The error terms are assumed to be iid ~N(0,σ)
Observed demand for period 97
220 = y97 = 204
y = -213.16 + 4.14x
Deseasoned Daily Bagel Demand
200
100
80 85 90 95 100
Time Period (Days)
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 6
Simple Linear Regression
• Residuals or Error Terms
n Residuals, ei, are the difference of actual minus predicted values
n Find the b’s that “minimize the residuals”
n n 2 n 2
∑i=1 (ei ) = ∑i=1 ( yi − yˆi ) = ∑i=1 ( yi − b0 − b1xi )
2
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 7
Simple Linear Regression
• Ordinary Least Squares (OLS) Regression
n Finds the coefficients (b0 and b1) that minimize the sum of the squared error terms.
n We can use partial derivatives to find the first order optimality condition with respect to
each variable.
n n 2 n 2
∑i=1 (ei ) = ∑i=1 ( yi − yˆi ) = ∑i=1 ( yi − b0 − b1xi )
2
220
y = -213.16 + 4.14x
Deseasoned Daily Bagel Demand
n 200
∑ (xi − x )( yi − y)
b =
1
i=1
n 180
2
∑ (xi − x )
i=1
160
b0 = y − b1 x 140
120
We know from the data:
x = 89.5 y = 157.4 100
80 85 90 95 100
Time Period (Days)
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 8
OLS Regression in Spreadsheet
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 9
n
∑ (xi − x )( yi − y)
Regression – By Hand b = i=1
1 n
∑ (xi − x ) 2
i=1
=B7-$B$22
=A7-$A$22
=C7*D7
=C7^2
=SUM(C2:C21) =SUM(D2:D21)
=SUM(E2:E21) =SUM(F2:F21)
=LINEST(A2:A21,B2:B21,1,1)
LINEST(known_y's, known_x's, constant, statistics)
n Then, in the 'Insert Function' area, type the formula and press
the keyboard combination Ctrl+Shift+Enter (for Windows &
Linux) or command+shift+return (for Mac OS X).
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 11
Regression – Using LINEST b1
sb1
b0
sb0
n = number of observations R2 se
k = number of explanatory variables (NOT intercept)
F df
df = degrees of freedom (n-k-1)
SSR SSE
b0 = estimate of the intercept b0 = y − b1 x
n
∑ (xi − x )( yi − y)
b1 = estimate of the slope (explanatory variable 1) b =
1
i=1
n
∑ (xi − x ) 2
i=1
n n n
2 2 2
Total Sum of ∑( y − y ) = ∑( ŷ − y ) + ∑( y − ŷ )
i i i
Squares (SST) i=1 i=1 i=1
R2 = Coefficient of Determination: the ratio of “explained” to total sum of squares where 0≤R2≤1
SSR SSR
R2 = =
SST SSR + SSE
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 12
Regression – Using LINEST b1
sb1
b0
sb0
R2 se
se = standard error of estimate: an estimate of variance of
the error term around the regression line. F df
SSR SSE
n n 2
∑ ei2 ∑ ( y − ŷ )i i
se = i=1
= i=1
n − k −1 n − k −1
sb0 = standard error of intercept sb1 = standard error of slope
1 x2 1
sb0 = se + n 2
sb1 = se n 2
n ∑ (x − x )
∑ (x − x )
i=1 i i=1 i
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 13
b1 b0
=LINEST(A2:A21,B2:B21,1,1)
LINEST(known_y's, known_x's, constant, statistics)
=D2/D3
=TDIST(D9,$E$5,2)
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 14
Multiple Linear Regression
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 15
Example: Monthly Iced Coffee Sales
• Develop Forecasting Model #1
n Level, trend, & avg. historical temperature
n Develop OLS regression model
Yi = β0 + β1 x1i + β2 x2i + εi
DEMAND = LEVEL + TREND(period) + TEMP_EFFECT(temp)
Output b2 b1 b0
(0.27) 52.65 3,254.81 sb2 sb1 sb0
2.75 6.24 174.85
0.78 208.97 #N/A R2 se
36.36 21 #N/A
F df
3,175,996 917,074 #N/A
SSR SSE
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 16
Example: Monthly Iced Coffee Sales
b2 b1 b0
(0.27)
52.65
3,254.81
n= 24 observations
sb2 sb1 sb0
2.75
6.24
174.85
k = 2 variables
0.78
208.97
#N/A
R2 se df = n-k-1 = 24-2-1= 21
36.36
21
#N/A
F df
3,175,996
917,074
#N/A
SSR SSE
Yi = β0 + β1 x1i + εi
DEMAND = LEVEL + TREND(period)
52.55 3239.89 b1 b0
6.02 86.05 sb1 sb0
0.78 204.22 R2 se
76.14 22 F df
3175570 917500 SSR SSE
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 18
Example: Monthly Iced Coffee Sales
• Compare the goodness of fit between models
n Model 1:
w DEMAND = LEVEL + TREND(period) + TEMP_EFFECT(temp)
w R2 = 0.77594
n Model 2:
w DEMAND = LEVEL + TREND(period)
w R2 = 0.77584
" n −1 %
2
adj R = 1− 1− R $
# n
(− k −1
'
&
2
)
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 19
Transforming Variables
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 20
Example: Monthly Iced Coffee Sales
• Develop Forecasting Model #3
n Level, trend, & school being open
n Develop OLS regression model
Yi = β0 + β1 x1i + β3 x3i + εi
DEMAND = LEVEL + TREND(period) + OPEN_EFFECT(open)
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 21
Example: Monthly Iced Coffee Sales (#3)
144.50 51.54 3156.12 b3 b1 b0
85.35 5.81 96.30 sb3 sb1 sb0 n= 24 observations
k = 2 variables
0.80276 196.07 #N/A R2 se
df = n-k-1 = 24-2-1= 21
42.74 21 #N/A F df
3285765 807305 #N/A SSR SSE
• How is the overall fit of the model?
n R2 = 0.80276 with adj R2 ≅ 0.78398 (better than #1 or #2)
• Are the individual variables statistically significant?
n Run t-tests for each variable and the intercept
n Intercept and trend coefficients are strongly significant, school flag is borderline
intercept trend school in session
b0 3156 b1 51.54 b3 144.50
tb0 = = = 32.77 tb1 = = = 8.87 tb3 = = = 1.69
sb0 96.3 sb1 5.81 sb3 85.35
P-value =TDIST(32.77, 21, 2) < 0.0001 P-value =TDIST(8.87, 21, 2) < 0.0001 P-value =TDIST(1.69, 21, 2) = 0.105
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 22
Model & Variable Transformations
• We are using linear regression, so how can we use dummy variables?
n The model just needs to be linear in the parameters
n For model #3: y = β0+β1(period)+β3(open_flag)
n Many transformations can be used:
y = β0 + β1 x1
y = β0 + β1 x1 + β2 x12
y = β0 + β1 ln(x1 )
y = ax b ⇒ ln( y) = ln(a) + bln(x)
y = ax1b1 x2b2 ⇒ ln( y) = ln(a) + b1 ln(x1 ) + b2 ln(x2 )
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models
Model Validation
5
Number of Months
4
3
2
• Basic Checks 1
0
n Goodness of Fit – look at the R2 values
-250
-200
-150
-100
-50
50
100
150
200
250
n Individual coefficients – t-tests for p-value Residuals
Residuals
0
residuals -100 35 45 55 65 75 85
w Does the standard deviation of the error terms differ -200
for different values of the independent variables? -300
-400
n Autocorrelation – is there a pattern over time Temperature (F)
Residuals
w Make sure dummy variables were not over specified 100
-
• Statistics Software (100) 0 2 4 6 8 10 12 14 16 18 20 22 24
(200)
n Most packages check for all of these (300)
(400)
n More sophisticated tests and remedies Time Periods
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 25
Modeling Results – which is best?
8
7 Model: Linear Equation
6 8
5
Sales
4 6
y = 0.4x + 4.5
Sales
3
4 R² = 0.16
2
1 2
0
0
0 2 4 6 8
0 2 4 6 8
Time
Time
6 6
Sales
Sales
4 4
y = 0.5x2 - 2.1x + 7 y = 1.3333x3 - 9.5x2 + 20.167x - 7
2 R² = 0.36 2
R² = 1
0 0
0 2 4 6 8 0 2 4 6 8
Time Time
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 27
Key Points Yi = β0 + β1 x1i + β2 x2i + εi
• Regression finds correlations between
n A single dependent variable (y)
n One or more independent variables (x1, x2, …)
• Coefficients are estimates by minimizing the sum of the squares of
the errors
• Always test your model:
n Goodness of fit (R2)
n Statistical significance of coefficients (p-value)
• Some Warnings:
n Correlation is not causation
n Avoid over-fitting of data
• Why not use this instead of exponential smoothing?
n All data treated the same
n Amount of data required to store
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 28
CTL.SC1x -Supply Chain & Logistics Fundamentals
“Casey”
Photo courtesy Yankee Golden
Retriever Rescue (www.ygrr.org)
MIT Center for
Transportation & Logistics caplice@mit.edu
Image Sources
• "Pampers packages (2718297564)" by Elizabeth from Lansing, MI, USA - Pampers packagesUploaded by Dolovis. Licensed under
Creative Commons Attribution 2.0 via Wikimedia Commons -
http://commons.wikimedia.org/wiki/File:Pampers_packages_(2718297564).jpg#mediaviewer/
File:Pampers_packages_(2718297564).jpg
• "Kraft Coupon" by Julie & Heidi from West Linn & Gillette, USA - Grocery Coupons - Tearpad shelf display of coupons for Kraft
Macaroni & Cheese. Licensed under Creative Commons Attribution-Share Alike 2.0 via Wikimedia Commons -
http://commons.wikimedia.org/wiki/File:Kraft_Coupon.jpg#mediaviewer/File:Kraft_Coupon.jpg
• "Damaged car door" by Garitzko - Own work. Licensed under Public domain via Wikimedia Commons -
http://commons.wikimedia.org/wiki/File:Damaged_car_door.jpg#mediaviewer/File:Damaged_car_door.jpg
CTL.SC1x - Supply Chain and Logistics Fundamentals Lesson: Causal Forecasting Models 30