
2nd INTERNATIONAL CONFERENCE ON SUPPLY CHAINS

Forecasting the consumption and the purchase of a drug

Angeliki Papana (1), Dimitris Folinas (2) and Anestis Fotiadis (3)
(1) University of Macedonia, Greece
(2) Department of Logistics, Alexander TEI of Thessaloniki, Greece
(3) I-Shou University, Taiwan

(1) angeliki.papana@gmail.com, (2) dfolinas@gmail.com, (3) anesfot@gmail.com

Abstract

In this study, we demonstrate the usefulness of time series forecasting methods on
very short data sets. Specifically, we apply some of the basic time series forecasting
methods in order to predict the future consumption and purchase of the drug RAPILYSIN
LYPDINJ 2X1.16G/VIAL (RL). The available data are monthly measurements of the
consumption and purchase of the drug RL from the General Hospital of Katerini and
cover the period 2009-2011, i.e. three years. Tools from univariate time series
analysis and forecasting are introduced, discussed and applied based on the type of
the available data. Judged by the accuracy of the forecasts, the best-performing
method is a simple seasonal exponential smoothing model.

Keywords: logistics, purchase, demand forecasting, drug, Greece.

1. Introduction
A synchronized and responsive flow of products and services is the goal of
supply chain planning, while demand planning is the first step of supply chain
planning that determines the effectiveness of manufacturing and logistics
operations in the chain. A demand forecast is the prediction of the quantity of
a product or service that will be purchased. Demand forecasting is essential
for corporations and organizations such as hospitals in order to assess future
capacity requirements.
There are two approaches to determining a demand forecast, i.e. the
qualitative approach and the quantitative one. Qualitative methods are usually
used in ambiguous situations and when little data exists, and require the
intuition and experience of experts. Quantitative methods of demand
forecasting involve many techniques that incorporate the information from
past or current data, e.g. regression methods, extrapolation methods, neural
networks and data mining techniques. The statistical methods tend to be
superior in general, although there are occasions when model-based methods
are not practical. The best demand forecast may be determined using a
multi-functional approach that considers previous sales of the product as well
as factors based on marketing and finance. The choice of the proper
forecast method is based on the available data, the size of the data and the
type of the data.
To choose the demand forecast method, one should first
determine the use of the forecast, e.g. ensuring stock availability. The time
horizon of the forecast (short-term, medium-term or long-term predictions) is
also important in deciding which forecast method should be used.
Finally, one should always validate the results.

Figure 1: Pharmaceutical decision framework
[Influential factors: Doctors; Hospital President; Patient Allergies or Characteristics; Ministerial Decisions; Hospital Scientific Committee. Limiting factors: Legislations; Financial Problems; Limited IT Knowledge; Limited Managerial Knowledge; Ministerial Decisions.]

Concerning the monthly orders of drugs in the General Hospital of
Katerini, the main influential factor is the doctors and their medical decisions.
Usually the heads of the departments (Director of Pathology, Cardiology, etc.)
advise the Director of Pharmacy on the more 'effective' drugs within a
category of similar drugs. The President of the institution influences the
decision as the chairman of the board and is informed by the Director of
Pharmacy on the monthly cost of purchasing drugs. Financial targets set by
the Ministry of Health may impose lower costs. The fact that some patients
cannot receive a given treatment because of allergies or special
characteristics can influence the final decision after consultation with the
attending doctors. The decisions of the Ministry of Health decisively affect the
decisions of the pharmacy. A scientific committee of doctors, which is
informed on the needs of the patients, also operates in each hospital and
affects the final decision.
Limiting factors that affect the final decision on the monthly orders of
drugs include legislative decisions that determine which option is preferred
for an order placed through the procurement system. In parallel, the
Ministry may impose further limiting factors. The first online auctions for
active substances recently took place in Greece. As a result, the entire drug
supply system is changing, as most managers will start ordering the active
substances of the drugs rather than specific drugs by name. The economic
problems that Greece is currently facing clearly influence all these decisions.
The continued need to reduce costs leads to the supply of cheaper drugs
of dubious quality. Over the last two years there has been a sustained effort
to introduce electronic data processing in all pharmacies and to provide
statistics to the Ministry of Health; unfortunately, the reluctance of older
pharmacy employees and their fear of technology hinder the electronic
operation of the pharmacy. Figure 1 summarizes the influential and limiting
factors for the final decision of the hospital concerning the monthly orders of drugs.
In this study, we will introduce time series methods, which are suitable
for short term predictions. These methods search for patterns in the time
series and extrapolate these patterns into the future. Time-series forecasting
is a form of extrapolation in that it involves fitting a model to a set of data and
then using that model outside the range of data to which it has been fitted.
Forecasting of time series is a very difficult task as it is hard to recognize the
underlying patterns and relationships due to noise and random and
unexpected changes.

2. Time Series Forecasting Methods


A time series is a set of evenly spaced, continuous, numerical
data obtained at regular time periods. In time series forecasting methods,
the forecast is based only on past values and assumes that the factors that
influenced the past and the present will continue to influence the future. If future
values of a time series can be predicted exactly from its past values, then the series
is deterministic. If the future of a time series can only be partly determined by
past values, then the time series is stochastic or random.
Some basic univariate linear time series methods are the moving
average method, exponential smoothing (Brown, 1956), Auto-Regressive
Moving Average (Huang and Yang, 1995), Auto-Regressive Integrated
Moving Average (Box and Jenkins, 1976), the Random Walk model, time series
decomposition and the Z-Chart. The simple moving average is a series of
arithmetic means and is applicable when the data present no trend. Exponential
smoothing is an averaging method that reacts more strongly to recent
changes in demand by assigning larger weights to recent observations.
Extended exponential smoothing methods have been developed in order to take
into account trend and seasonality.
Auto-Regressive Moving Average (ARMA) is appropriate for stationary
data, when a system is a function of a series of unobserved shocks as well as
its own past behavior. Auto-Regressive Integrated Moving Average (ARIMA) is a
more complex method that handles non-stationary data with trend and
seasonality, but requires larger data sets. The Random Walk model assumes that
from one time period to the next, the original time series merely takes a
random "step" away from its previous value. It is usually used when data
present an irregular behaviour, e.g. irregular growth. Time series
decomposition adjusts for seasonality by multiplying the normal forecast by a
seasonal factor. Another method of short-term forecasting is the use of a
Z-Chart. It assumes that the basic principles that govern the data do not
change, or change along an anticipated course, and that any current underlying
trends will continue. More complex nonlinear methods have also been developed
for time series prediction; however, these methods usually require larger data
sets as they have more free parameters to estimate.
The minimum time series length one needs to make ‘good’ forecasts
using a statistical model depends on the type of the model and the amount of
random variation in the data. From a purely statistical point of view, it is
always necessary to have more observations than parameters. The minimum
requirements apply when the amount of random variation in the data is very
small. Real data often contain a lot of random variation and therefore sample
size requirements increase accordingly. Therefore, the number of available
data affects the choice of the corresponding forecasting method.
In order to be able to decide which forecast method is the best for each
data set, one should know and understand the different types of methods and
recognize the different components in the data. However, one should always
validate the forecasts. In order to check the accuracy of the forecasts and
whether they are unbiased and efficient, we need to measure the prediction error,
i.e. the difference between the actual time series and the forecasts. For this
purpose, many statistical measures have been developed such as the mean
square error, root mean squared error, cumulative forecast error, mean
absolute percent error, etc. In addition, we display the original and forecasted
values of each method for the three years in order to see their performance.
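The error measures listed above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `forecast_errors` and the sample numbers are assumptions made for the example and are not taken from the hospital data.

```python
import math


def forecast_errors(actual, forecast):
    """Return common accuracy measures for paired actual/forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    # Forecast error e_t = X_t - F_t
    errors = [x - f for x, f in zip(actual, forecast)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    # The percentage error is undefined when an actual value is zero,
    # so those periods are skipped (an assumption of this sketch).
    pct = [abs(e / x) for e, x in zip(errors, actual) if x != 0]
    return {
        "mse": mse,                       # mean square error
        "rmse": math.sqrt(mse),           # root mean squared error
        "cfe": sum(errors),               # cumulative forecast error
        "mape": 100.0 * sum(pct) / len(pct) if pct else float("nan"),
    }


# Hypothetical values, used only to exercise the function
measures = forecast_errors([100, 200, 400], [110, 190, 400])
```

A cumulative forecast error close to zero suggests that the forecasts are not systematically too high or too low, while the MSE and RMSE penalize large individual errors.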

3. Forecasting the consumption and the purchase of the drug RAPILYSIN LYPDINJ 2X1.16G/VIAL
This study is part of a larger ongoing research project which has been conducted
during the last 2 years. The purpose of the main survey was to investigate the
influence of the economic crisis on the logistics services sector in Greece. The
process of analysis revealed that the 3PLs have been significantly affected by
the crisis and these effects have influenced all the main functional areas of
the logistics management (procurement, warehousing, inventory
management, transportation and distribution) as well as the main logistics
philosophies and practices. These findings gave birth to two central questions:
The data examined here are the time series of the consumption and
purchase of the drug RAPILYSIN LYPDINJ 2X1.16G/VIAL (RL), which is a
drug from the cardiology department indicated for the thrombolytic treatment
of suspected strokes. The data are for the years 2009-2011. The first step in
any time series analysis is to plot the observations versus time, in order to
observe any important features of the data such as trend, seasonality,
outliers, discontinuities etc. The time plots of our data are displayed in Figure
2.

Figure 2: The time plots of (a) the consumption and (b) the purchase of the drug RL
[Panel (a): consumption versus months. Panel (b): purchase (×10^4) versus months.]
 
An important issue of time series analysis is the stationarity of the data.
A stationary time series is one whose statistical properties such as mean,
variance, autocorrelation, etc. are constant over time. Most statistical
forecasting methods are based on the assumption that the time series are
stationary or can be made stationary using mathematical transformations. In
order to test the stationarity of the data, we implemented the Augmented
Dickey-Fuller test (Dickey & Fuller, 1979), which rejected the
unit-root null hypothesis in favour of the alternative one, i.e. suggested that
both time series are stationary. Therefore, we do not need to transform the
original time series.
Let us denote as $\{x_t\}$, $t = 1, \dots, N$, the observed time series and as
$\bar{x}$ its sample mean. The sample autocovariance coefficient at lag
$k = 0, 1, 2, \dots$ is

$$c_k = \frac{1}{N} \sum_{t=1}^{N-k} (x_t - \bar{x})(x_{t+k} - \bar{x})$$

and the sample autocorrelation coefficient at lag k is $r_k = c_k / c_0$. We
proceed by estimating the sample autocorrelation coefficients and the
correlogram, i.e. the graph of $r_k$ versus k. The correlogram provides
information on the type of the data, i.e. whether data are deterministic,
stochastic or random. For example, several significant coefficients at low lags
provide evidence that the data do not come from a purely random process.
Values of $r_k$ outside the range $\pm 2/\sqrt{N}$ are regarded as significantly
different from zero.
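As an illustration of these definitions, the following Python sketch computes the sample autocorrelation coefficients and flags the lags that fall outside the range of two standard errors, 2/√N. The function names and the alternating demo series are assumptions made for the example; they are not the hospital data.

```python
import math


def sample_autocorrelation(x, max_lag):
    """Return [r_0, r_1, ..., r_max_lag] with r_k = c_k / c_0."""
    n = len(x)
    mean = sum(x) / n

    def autocovariance(k):
        # c_k = (1/N) * sum over t of (x_t - mean) * (x_{t+k} - mean)
        return sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n

    c0 = autocovariance(0)
    if c0 == 0:
        raise ValueError("a constant series has no defined autocorrelation")
    return [autocovariance(k) / c0 for k in range(max_lag + 1)]


def significant_lags(x, max_lag):
    """Lags k >= 1 whose |r_k| exceeds the bound 2 / sqrt(N)."""
    bound = 2.0 / math.sqrt(len(x))
    r = sample_autocorrelation(x, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(r[k]) > bound]


# Hypothetical 36-month series that alternates every month
series = [1, -1] * 18
r = sample_autocorrelation(series, 2)
```

For a monthly series with N = 36, the bound is 2/6 ≈ 0.33, so only fairly strong autocorrelations are flagged.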
The correlograms of the consumption and the purchase of the drug RL
are displayed in Figure 3. We can observe that for the
consumption of the drug, two significant sample autocorrelation coefficients $r_k$
exist, at lag 6 and lag 8, while for the purchase of the drug $r_k$ is
significantly different from zero only at lag 3. From the two correlograms we can
conclude that the data are stationary and present no trend. The two time
series may be random, as only 2 and 1 significant $r_k$ values exist, respectively;
however, the time series may also be characterized by seasonal fluctuations, in
which case the correlogram also exhibits oscillations at the same frequency.
If the series are truly random, then only an occasional autocorrelation should
be larger than two standard errors in magnitude. The interpretation of a
correlogram is a difficult task, especially when N is so small.

Figure 3: The correlogram of (a) the consumption and (b) the purchase of the drug RL
[Panels (a) and (b): sample autocorrelation versus lags 0 to 10.]
 
In order to decide whether there is a cyclic component in our data, we
use the seasonal subseries plots (Cleveland, 1993), which is a tool for
detecting seasonality in a time series. This plot is only useful if the period of
the seasonality is already known. Since our data are monthly, the period is
considered to be 12. The seasonal subseries plots of the consumption and
the purchase of the drug are displayed in Figure 4. From the plots, no
apparent seasonality is observed for the two variables.
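The grouping behind a seasonal subseries plot can be sketched as follows: the observations are split by calendar position within the period, and the mean of each group is compared. The function name and the demo values are assumptions made for this illustration.

```python
def seasonal_subseries(x, period=12):
    """Group observations by position within the period and average each group."""
    if len(x) < period:
        raise ValueError("need at least one full period of data")
    # groups[i] holds every observation that falls on position i (e.g. every January)
    groups = [x[i::period] for i in range(period)]
    means = [sum(g) / len(g) for g in groups]
    return groups, means


# Hypothetical two-year monthly series, used only to show the grouping
demo_groups, demo_means = seasonal_subseries(list(range(24)), period=12)
```

If the group means differ clearly from one another relative to the spread within each group, a seasonal component is likely; similar means across all months point to no apparent seasonality.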
 
Figure 4: The seasonal subseries plots of (a) the consumption and (b) the purchase
of the drug RL
[Panel (a): consumption by month, Jan to Dec. Panel (b): purchase (×10^4) by month, Jan to Dec.]

After examining the data for trend and seasonal components,
we introduce some basic forecasting methods that can be used
for the available data and identify the most appropriate one by comparing
their forecast accuracy. Let us consider that we have data up to time N and
make forecasts about the future by fitting a model to the data. The accuracy of
the forecasts is then tested by a statistical measure. Let us denote by $X_t$
the real value of the time series at period t, by $F_t$ its forecast for period t, and
by $e_t$ the forecast error, i.e. $e_t = X_t - F_t$. The mean square error (MSE) of the
forecast is defined as

$$\mathrm{MSE} = \frac{1}{N} \sum_{t=1}^{N} e_t^2 .$$

The first time series forecasting method introduced here is the simple
moving average. This method is suitable for data that present no trend,
seasonality or cyclic components. The forecasts are estimated as the mean of
the K previous values of the time series

$$F_t = \frac{1}{K} \sum_{i=t-K}^{t-1} X_i .$$

The larger K is, the stronger the smoothing of the random
fluctuations of the variable and the smaller the effect of
possible extreme values of the time series.
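A minimal Python sketch of the moving average forecast and its MSE is given below. The function names and demo values are assumptions made for the illustration; the MSE here is averaged over the periods for which a forecast exists, which may differ from the normalization used for the reported values.

```python
def moving_average_forecasts(x, k):
    """One-step-ahead forecasts: each is the mean of the k preceding values.

    The first forecast is for x[k] (0-based), since k past values are needed.
    """
    if k < 1 or k >= len(x):
        raise ValueError("k must satisfy 1 <= k < len(x)")
    return [sum(x[t - k:t]) / k for t in range(k, len(x))]


def moving_average_mse(x, k):
    """MSE over the periods that have a moving average forecast."""
    forecasts = moving_average_forecasts(x, k)
    errors = [actual - f for actual, f in zip(x[k:], forecasts)]
    return sum(e * e for e in errors) / len(errors)


# Hypothetical values with a steady increase, used only as a check
demo = [2, 4, 6, 8, 10]
demo_forecasts = moving_average_forecasts(demo, 3)
demo_mse = moving_average_mse(demo, 3)
```

On the increasing demo series the forecasts lag behind the data, which illustrates why the method is suited only to data without trend.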
 
Table 1: MSE from the simple moving average method for the consumption and the
purchase of the drug RL for K=3, 4, 5, respectively

MSE            K=3           K=4           K=5
Consumption    7.6957×10^6   7.5640×10^6   7.9501×10^6
Purchase       8.9091×10^7   8.7103×10^7   9.2444×10^7
 
From the MSE values, it is evident that the simple moving average method
does not forecast the two variables accurately. The MSE values are large
partly because the values of the time series themselves are large. In Figure 5,
the original time series and their fitted values are displayed for K=3.

Figure 5: Plots of (a) the consumption and (b) the purchase of the drug RL and their
fitted values from simple moving average method for K=3
[Panel (a): consumption and forecast versus month. Panel (b): purchase and forecast (×10^4) versus month.]
 
       
The simple exponential smoothing method takes into account all
previous observations but gives greater weight to more recent observations.
This method is again suitable for data with no trend or seasonality. The
forecast is estimated from the equation

$$F_{t+1} = \alpha X_t + \alpha(1-\alpha) X_{t-1} + \alpha(1-\alpha)^2 X_{t-2} + \dots + \alpha(1-\alpha)^m X_{t-m} + (1-\alpha)^{m+1} F_{t-m},$$

where α is a smoothing constant which takes values in [0,1]. The
parameter α reflects the weight given to each observation of the time series.
Values of α closer to 1 give greater weight to the most recent data values.
This equation is difficult to use; however, it can be rewritten in the recursive form

$$F_{t+1} = \alpha X_t + (1-\alpha) F_t,$$

where $F_{t+1}$ is given as a linear combination of the current real value $X_t$
and the previous exponentially smoothed value $F_t$. The value of α is
selected in order to result in the smallest MSE.
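The recursive form lends itself to a short Python sketch. The function names, the choice of the first observation as the initial forecast, and the demo values are assumptions of this illustration; the grid of α values matches the values examined in Table 2.

```python
def ses_forecasts(x, alpha, initial=None):
    """Simple exponential smoothing via F[t+1] = alpha * x[t] + (1 - alpha) * F[t].

    Returns a list F of length len(x) + 1, where F[t] is the forecast of x[t]
    and the last element is the forecast for the next, unobserved period.
    The initial forecast defaults to the first observation (an assumption).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    f = [x[0] if initial is None else initial]
    for value in x:
        f.append(alpha * value + (1 - alpha) * f[-1])
    return f


def ses_mse(x, alpha):
    """MSE of the one-step forecasts, skipping the first period,
    where the forecast equals the observation by construction."""
    f = ses_forecasts(x, alpha)
    errors = [x[t] - f[t] for t in range(1, len(x))]
    return sum(e * e for e in errors) / len(errors)


def best_alpha(x, grid=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Pick the alpha from the grid with the smallest MSE."""
    return min(grid, key=lambda a: ses_mse(x, a))


# Hypothetical values, used only to trace the recursion
demo_f = ses_forecasts([10, 20, 30], 0.5)
```

A finer grid or a numerical optimizer could replace the coarse grid, at the risk of overfitting on a series this short.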
The MSE values from the simple exponential smoothing method for the
consumption and the purchase of the drug for α=0.1, 0.3, 0.5, 0.7, 0.9,
respectively, are displayed in Table 2. Although the MSE values are again
large, we can see that the simple exponential smoothing method for large α
simulates the oscillations of the original series but with some slight lag. In
Figure 6, the plots for the consumption of the drug RL and its fitted values are
displayed for α=0.3 and α=0.9. 
 
Table 2: MSE from the simple exponential smoothing method for the consumption
and the purchase of the drug RL for α=0.1, 0.3, 0.5, 0.7, 0.9, respectively

MSE            α=0.1         α=0.3         α=0.5         α=0.7         α=0.9
Consumption    7.8859×10^6   7.3903×10^6   8.3950×10^6   9.8519×10^6   1.1794×10^7
Purchase       7.5347×10^7   8.4867×10^7   1.0061×10^8   1.2102×10^8   1.4600×10^8
 
Figure 6: Plots of the consumption of the drug RL and their fitted values from simple
exponential smoothing method (a) for α=0.3 and (b) α=0.9, respectively
[Panels (a) and (b): consumption and forecast versus months.]
 
 
The Random Walk model, $Y_t = Y_{t-1} + \varepsilon_t$, predicts that the value at time
t will be equal to the last period's value plus a stochastic (non-systematic)
component that is white noise, which means that $\varepsilon_t$ is independent and
identically distributed with mean zero and variance $\sigma^2$. The forecasting model
suggested is $Y_t - Y_{t-1} = \varepsilon_t$ or $Y_t - Y_{t-1} = \alpha$, where α (here a
drift constant, not the smoothing constant of exponential smoothing) is the mean of the first
differences, i.e. the average change from one period to the next. If we
rearrange this equation, we get $Y_t = Y_{t-1} + \alpha$. In other words, we predict that
this period's value will equal last period's value plus a constant representing
the average change between periods. Therefore, the random walk model
assumes that from one period to the next, the original time series merely
takes a random "step" away from its last recorded position. When α = 0, the 'best'
forecast of the next value is the same as the most recent value. This is a very simple
method, but is often quite sensible, and has been widely applied to economic
data even though one may expect that more complicated methods will
generally be superior. The means of the first differences of the consumption
and of the purchase of RL are 79.1003 and 0, respectively. Therefore, the two
forecast models are $Y_t = Y_{t-1} + 79.1003$ and $Y_t = Y_{t-1}$,
respectively. The MSE values for the two data sets are $1.2992 \times 10^{7}$ and
$1.6029 \times 10^{8}$, respectively. The original and fitted values from the random
walk model are displayed in Figure 7.
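The random walk forecast with a constant step can be sketched in Python as follows. The function name and demo values are assumptions of this illustration, and the MSE is averaged over the periods that have a predecessor.

```python
def random_walk_fit(y):
    """Fit Y_t = Y_{t-1} + drift, with drift equal to the mean first difference.

    Returns (drift, fitted, mse), where fitted[i] is the forecast of y[i + 1].
    """
    if len(y) < 2:
        raise ValueError("need at least two observations")
    diffs = [b - a for a, b in zip(y, y[1:])]
    drift = sum(diffs) / len(diffs)
    # Each forecast is the previous observation plus the average change
    fitted = [previous + drift for previous in y[:-1]]
    errors = [actual - f for actual, f in zip(y[1:], fitted)]
    mse = sum(e * e for e in errors) / len(errors)
    return drift, fitted, mse


# Hypothetical values, used only to trace the computation
demo_drift, demo_fitted, demo_mse = random_walk_fit([10, 12, 11, 16])
```

Because the first differences telescope, the drift equals the last value minus the first, divided by the number of steps; a drift of zero therefore means the series ends where it started.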
 
Figure 7: Plots of the (a) consumption and (b) purchase of the drug RL and their
fitted values from the random walk model, respectively
[Panel (a): consumption and forecast versus months. Panel (b): purchase and forecast (×10^4) versus months.]
 
The next method is to fit to the data an Auto-Regressive model of order
p, denoted as AR(p). The general form of the AR(p) model is
$X_t = \varphi_0 + \varphi_1 X_{t-1} + \varphi_2 X_{t-2} + \dots + \varphi_p X_{t-p} + Z_t$,
where $Z_t$ is a random error term. Thus the value at time t depends linearly on the
last p values and the model looks like a regression model. The order of the
AR model is selected using the partial auto-correlation function or the Akaike
information criterion (Akaike, 1974). We will implement here the simplest
example of an AR(p) process, i.e. the AR(1) model $X_t = \varphi_0 + \varphi_1 X_{t-1} + Z_t$. The
AR model is fitted by least squares regression to find the values of the
parameters for each data set which minimize the sum of squared errors. The estimated
coefficients of the AR(1) models from fitting the data are $\varphi_0 = 368.202$,
$\varphi_1 = 0.171$ for the consumption and $\varphi_0 = 1103.91$, $\varphi_1 = 0.168$ for the
purchase. The MSE values for the two data sets are $5.7821 \times 10^{6}$ and
$6.1048 \times 10^{7}$, respectively. The original and fitted values
from the AR(1) model are displayed in Figure 8.
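A least squares fit of the AR(1) model amounts to a simple linear regression of each value on its predecessor. The Python sketch below uses the closed-form regression formulas; the function names and the demo series are assumptions of this illustration, not the fitting procedure or software used for the reported coefficients.

```python
def fit_ar1(x):
    """Least squares estimates (phi0, phi1) for X_t = phi0 + phi1 * X_{t-1} + Z_t."""
    if len(x) < 3:
        raise ValueError("need at least three observations")
    previous, current = x[:-1], x[1:]
    n = len(previous)
    mean_prev = sum(previous) / n
    mean_curr = sum(current) / n
    sxx = sum((p - mean_prev) ** 2 for p in previous)
    if sxx == 0:
        raise ValueError("lagged values are constant; slope is undefined")
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(previous, current))
    phi1 = sxy / sxx                         # slope of the regression
    phi0 = mean_curr - phi1 * mean_prev      # intercept
    return phi0, phi1


def ar1_forecast(last_value, phi0, phi1, steps):
    """Iterate the fitted equation forward, with the error term set to zero."""
    out, value = [], last_value
    for _ in range(steps):
        value = phi0 + phi1 * value
        out.append(value)
    return out


# Hypothetical noise-free series generated by X_t = 2 + 0.5 * X_{t-1}
demo_phi0, demo_phi1 = fit_ar1([0, 2, 3, 3.5, 3.75])
```

With |φ1| < 1, repeated forecasts converge to the long-run mean φ0 / (1 − φ1), so multi-step AR(1) forecasts flatten out quickly.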
 
Figure 8: Plots of the (a) consumption and (b) purchase of the drug RL and their
fitted values from the AR(1) model, respectively
[Panel (a): consumption and forecast versus month. Panel (b): purchase and forecast (×10^4) versus month.]
 
 
Finally, the best model to fit the data is found to be the simple
seasonal exponential smoothing model. This model is appropriate for series
with no trend and a seasonal effect that is constant over time. As the data are
monthly, the number of periods in a seasonal interval is s = 12. Simple
seasonal exponential smoothing has two components, the level L(t)
and the seasonal component S(t):

$$L(t) = \alpha \, (X(t) - S(t-s)) + (1-\alpha) \, L(t-1)$$
$$S(t) = \delta \, (X(t) - L(t)) + (1-\delta) \, S(t-s)$$
$$\hat{X}_t(k) = L(t) + S(t+k-s)$$

where α is the level smoothing weight, δ is the season smoothing
weight and $\hat{X}_t(k)$ is the forecast made at time t for k periods ahead.
By fitting the data to the simple seasonal exponential smoothing
model, we estimated for the consumption the parameters α=0.1, δ=0 and for
the purchase α=0.1, δ=$1.762 \times 10^{-5}$, respectively. The corresponding MSE
values for these models are $4.3245 \times 10^{6}$ and $4.1437 \times 10^{7}$. Therefore,
the simple seasonal exponential smoothing model gives the smallest MSE, and these
are the best forecast models for the available data. In Figure 9, we display the
observed (original) values of the data, the fitted values from the simple
seasonal exponential smoothing model, and the forecasted values for the next
six months.
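The level and season recursions can be sketched in Python as follows. The initialization (the level set to the mean of the first season, and the seasonal components set to the deviations of the first season from that mean) is an assumption of this sketch, as are the function name and the demo series; the paper does not state how the model was initialized or which software was used.

```python
def simple_seasonal_smoothing(x, s, alpha, delta, horizon):
    """Simple seasonal exponential smoothing (no trend, additive season).

    Returns (fitted, forecasts): fitted[i] is the one-step forecast of x[s + i],
    and forecasts[k] is the forecast for the (k + 1)-th period after the data.
    """
    if len(x) <= s:
        raise ValueError("need more than one full season of data")
    # Assumed initialization from the first season
    level = sum(x[:s]) / s
    seasonal = [value - level for value in x[:s]]   # latest S for each position
    fitted = []
    for t in range(s, len(x)):
        pos = t % s
        # Forecast made before x[t] is observed: L(t-1) + S(t-s)
        fitted.append(level + seasonal[pos])
        # L(t) = alpha * (X(t) - S(t-s)) + (1 - alpha) * L(t-1)
        level = alpha * (x[t] - seasonal[pos]) + (1 - alpha) * level
        # S(t) = delta * (X(t) - L(t)) + (1 - delta) * S(t-s)
        seasonal[pos] = delta * (x[t] - level) + (1 - delta) * seasonal[pos]
    n = len(x)
    forecasts = [level + seasonal[(n + k) % s] for k in range(horizon)]
    return fitted, forecasts


# Hypothetical series with an exactly repeating season of length 4
pattern = [10, 20, 30, 20]
demo_fitted, demo_forecasts = simple_seasonal_smoothing(pattern * 3, 4, 0.1, 0.2, 4)
```

With δ close to zero, as estimated for both series, the seasonal components barely move from their initial values, so the result depends strongly on how the first season is used for initialization.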
 
Figure 9: Plots of the (a) consumption and (b) purchase of the drug RL, their fitted
values from the simple seasonal exponential smoothing model and their forecasts for
the next six months, respectively.

4. Conclusions
This work concentrates on finding the 'best' point forecasts using MSE. Although
the available data are so few, we could still find a model that seems able to
simulate the oscillations of the original data. More advanced forecast methods
cannot be used when the available data are so few. However, simple methods
have in some cases proved to be better than more advanced ones. In order to
evaluate and compare the forecast methods, the easiest way is to compare only
the accuracy of each method/ model, based on its fitted values.
In practice, different statistical measures of forecast accuracy may
give different results. Therefore, it is important to check which method each
statistical measure suggests and whether there is a 'significant' difference
among the methods. Moreover, in practice we often need to produce interval
forecasts rather than point forecasts, in order to better assess future uncertainty.
In conclusion, there is no method/ model suitable for all types of data
and all contexts. For any forecasting problem, one should put in a reasonable
amount of effort to get a good forecast. The analyst should get appropriate
background information and carefully define the objectives, i.e. the type of
forecast required. The following steps are essential in order to decide on the
forecasting method. Make a time plot of the data and inspect it carefully in
order to look for trend, seasonal variation, outliers, etc. Pre-process the data
if necessary by correcting obvious errors, adjusting outliers and imputing
missing observations. Whatever method/ model is selected to make forecasts,
the analyst needs to carry out post-fitting checks to verify the adequacy of the
fit. Then the method/ model selected can be used to actually make
forecasts. Finally, plot the forecasts on a time plot of the data and check that
they look intuitively reasonable.

References
Akaike H. A new look at the statistical model identification. IEEE Transactions on Automatic Control 19 (6), 716–723, 1974.
Box G.E.P. and G. Jenkins. Time Series Analysis: Forecasting and Control. Holden-Day, 1976.
Brown R.G. Exponential Smoothing for Predicting Demand. Cambridge, Massachusetts: Arthur D. Little Inc., p. 15, 1956.
Cleveland W.S. Visualizing Data. Hobart Press, 1993.
Dickey D.A. and W.A. Fuller. Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association 74, 427–431, 1979.
Huang C. and H. Yang. A Time Series Approach to Short Term Load Forecasting through Evolutionary Programming Structures. Proceedings of the International Conference on Energy Management and Power Delivery (EMPD'95), Vol. 2, 583-588, 1995.
