Michael Bailey
Economist, Data Scientist @Facebook
Successful Forecasts
Source: Baseball-Almanac
Failed Forecasts
"I predict the Internet will soon go spectacularly supernova and in 1996
catastrophically collapse." Robert Metcalfe, founder of 3Com and
inventor of Ethernet, writing in a 1995 InfoWorld column.
Outline
What methods should be used to construct a forecast?
What distinguishes a good forecast from a bad one? What are the best
practices and common mistakes of forecasters?
How can I learn more?
Qualitative Methods
Ad-hoc / make stuff up: used more often than you would think, often for
new products/markets where there isn't much data available.
Delphi Method: iterative forecasts by a room of experts. The panel sees
the results from the previous round and reforecasts. This method has
several problems: bias from outliers, group psychology effects like herding, etc.
Quantitative Methods
Time Series: predict future values based upon past values. Some models
include other regressors (ARMAX), but usually the forecast is based solely
upon observed values of the response.
Moving Average -
The key to time series analysis is to transform the data into a stable
(stationary) time series.
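One common way to stabilize a trending series is differencing, that is, modeling the change between periods rather than the level. The slide does not say which transformation was shown, so the sketch below is an assumption; the function name `difference` and the sample data are hypothetical.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical series with a steady upward trend of 3 per period.
trend = [10, 13, 16, 19, 22]
print(difference(trend))  # [3, 3, 3, 3]
```

The differenced series has a constant level, which is easier to forecast than the trending original. A seasonal pattern can be treated the same way by setting `lag` to the season length.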
Once you have a stable time series, you can forecast forward using
exponential smoothing models. Moving Averages are a special
type of ES.
ES models take in all past data, but put different weights on how
recent data should predict the next period.
One ES model is Holt-Winters (HoltWinters() in R), which selects its
smoothing parameters automatically.
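A minimal sketch of simple exponential smoothing may make the weighting idea concrete. It is plain Python, not the R function named above. The helper names and the grid search over `alpha` are assumptions for illustration, a stand-in for choosing the parameter by minimizing squared one-step-ahead error.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast after smoothing the whole series.

    Each new level is alpha * latest value + (1 - alpha) * previous level,
    so older observations receive geometrically smaller weights.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def sse_one_step(series, alpha):
    """Sum of squared one-step-ahead forecast errors for a given alpha."""
    level = series[0]
    sse = 0.0
    for y in series[1:]:
        sse += (y - level) ** 2
        level = alpha * y + (1 - alpha) * level
    return sse


def pick_alpha(series, grid=None):
    """Pick the alpha on a coarse grid that minimizes the in-sample SSE."""
    if grid is None:
        grid = [i / 10 for i in range(1, 11)]
    return min(grid, key=lambda a: sse_one_step(series, a))


print(simple_exponential_smoothing([0, 10], alpha=0.5))  # 5.0
```

With `alpha` close to 1 the forecast follows the most recent value. With `alpha` close to 0 it behaves like a long-run average.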
Apr  132,671,221
May  141,424,018
Jun  150,134,416
Jul  158,826,902
Make sure you are fitting your model using training data, and validating
your model using test data.
The two most common model validation metrics are:
Mean Absolute Error = mean(|error|)
Root Mean Squared Error = sqrt(mean(error^2))
There is vociferous debate in the forecasting journals about the best
metric to use; the choice is very domain specific.
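The two metrics can be computed directly from their definitions. The sketch below uses hypothetical data in which two forecasts have the same MAE but different RMSE, to show why the choice of metric matters.

```python
import math


def mean_absolute_error(actual, forecast):
    """mean(|error|), where error = actual - forecast."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def root_mean_squared_error(actual, forecast):
    """sqrt(mean(error^2)); large errors are penalized more heavily."""
    return math.sqrt(
        sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    )


# Hypothetical test-set values, for illustration only.
actual = [0, 0, 0, 0]
steady_miss = [1, 1, 1, 1]   # off by 1 every period
one_big_miss = [0, 0, 0, 4]  # perfect except for one large error

print(mean_absolute_error(actual, steady_miss))       # 1.0
print(mean_absolute_error(actual, one_big_miss))      # 1.0
print(root_mean_squared_error(actual, steady_miss))   # 1.0
print(root_mean_squared_error(actual, one_big_miss))  # 2.0
```

MAE treats the two forecasts as equally good, while RMSE prefers the one without the large miss. Which behavior is preferable depends on how costly large errors are in the domain.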
Forecasting Process:
1) Define the problem, gather contextual knowledge.
2) Plot like crazy to learn data structure (correlation matrices,
autocorrelation plots, etc.)
3) Split data into train/test set.
4) (Time Series) Transform the data to obtain stationarity.
5) Fit appropriate models, test that errors look like white noise.
6) Apply models to test data set, evaluate using error deviation metrics.
Make forecasts.
7) Re-evaluate next period.
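Steps 3 and 5 of the process above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the data are hypothetical, the model is a naive "next value equals last value" forecast, and lag-1 autocorrelation stands in for a fuller white-noise check.

```python
def chronological_split(series, test_size):
    """Hold out the last `test_size` observations; never shuffle a time series."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    return series[:-test_size], series[-test_size:]


def lag1_autocorrelation(values):
    """Lag-1 sample autocorrelation; values near 0 are consistent with white noise."""
    mean = sum(values) / len(values)
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        return 0.0
    num = sum(
        (values[t] - mean) * (values[t - 1] - mean)
        for t in range(1, len(values))
    )
    return num / denom


# Hypothetical data, for illustration only.
series = [12, 15, 14, 18, 17, 21, 20, 24, 23, 27]
train, test = chronological_split(series, test_size=3)

# Residuals of the naive model on the training set.
residuals = [train[t] - train[t - 1] for t in range(1, len(train))]
print(train, test)
print(round(lag1_autocorrelation(residuals), 3))
```

An autocorrelation far from zero suggests the model has left predictable structure in the errors, so a richer model is worth trying before moving on to the test set.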
Best Practices
Gather as much contextual knowledge (experts) as possible.
Embrace Uncertainty.
Avoid overfitting.
Always train your model and perform model selection on a subset of your
data.
Don't necessarily select the model that best fits your current data.
Don't necessarily select the model that best fits the next n periods.
Remember, you can always attain a perfect fit to the data with enough
parameters.
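The point about perfect fits can be illustrated with polynomial interpolation: five parameters reproduce five observations exactly, yet the resulting forecast can be far off. The data below are hypothetical, and `lagrange_predict` is an illustrative helper, not a recommended forecasting method.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate at x the unique polynomial passing through every (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Hypothetical noisy observations around a flat level of about 10.
periods = [0, 1, 2, 3, 4]
values = [10, 12, 9, 11, 10]

# In-sample, the degree-4 polynomial reproduces every observation.
fitted = [lagrange_predict(periods, values, p) for p in periods]
print(fitted)

# Out-of-sample, the same polynomial forecasts roughly -20 for period 5.
print(lagrange_predict(periods, values, 5))

# A one-parameter model (the training mean) forecasts 10.4.
print(sum(values) / len(values))
```

The model with zero training error gives the least plausible forecast, which is why model selection should be judged on held-out data rather than on fit.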
Make Lots of Forecasts and Calibrate.
Continually assess the probability statements of your model to see how
far they deviate from the truth.
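One simple calibration check is to compare the stated confidence of past prediction intervals with how often the actual values fell inside them. The sketch below assumes hypothetical 80% intervals; the function name and data are illustrative.

```python
def interval_coverage(actuals, lowers, uppers):
    """Fraction of actual values that fell inside their forecast interval."""
    hits = sum(
        lower <= actual <= upper
        for actual, lower, upper in zip(actuals, lowers, uppers)
    )
    return hits / len(actuals)


# Hypothetical history of five forecasts, each issued with an 80% interval.
actuals = [10, 12, 15, 9, 20]
lowers = [8, 11, 10, 10, 15]
uppers = [12, 13, 14, 12, 25]

print(interval_coverage(actuals, lowers, uppers))  # 0.6
```

Coverage of 0.6 against a stated 80% would suggest the intervals are too narrow, although five forecasts are far too few to conclude much. This is why making many forecasts matters.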
Beware the Lucas Critique.
When your forecast might affect the outcome, calibration is incredibly
difficult.
Platform Forecasting
At Facebook, we face a challenging problem: we need a platform
forecast.
Our revenue is dependent on our users using the site, and our advertisers
wanting to serve them ads. We control neither supply nor demand, and
thus we need to make forecasts for both.
We also need to understand how the composition of supply and demand
turns into revenue, which depends heavily on the ad-serving
mechanisms and optimizations we employ, and these are continuously
changing.
We use a combination of simulation techniques, experiments, machine
learning and cross-section models, and time series models to estimate
the demand and supply curves we are facing.
Resources
Don't google "forecasting"; instead, search on methods (time series,
prediction market, neural networks, etc.)
R
Forecasting in R
R zoo package, useful for dates
R forecast package
A Little R Time Series Book
Python/Pandas
Time Series in Pandas
Time Series in Pandas (Video)
statsmodels
Texts
Forecasting Methods and Applications
Time Series Analysis
Forecasting: Principles and Practice
The Signal and the Noise: Why So Many Predictions Fail but Some
Don't