Abstract: Indonesia has the highest number of dengue fever cases in Southeast Asia. Early detection of the disease is required in order to prepare preventive measures against dengue fever. Previous research has shown that certain search queries related to communicable diseases on Google Trends are highly correlated with the number of communicable disease cases in South Korea. Based on that research, the Google Trends search index shows potential for inclusion as an external variable in a multivariate quantitative forecasting model. Using a time series model, the role of Google Trends in the epidemiology of dengue fever transmission in Surabaya is analyzed. This research uses two data sources: (1) the number of dengue fever cases obtained from the general local hospital Dr. Soetomo, and (2) the Google Trends search index of certain queries related to dengue fever. All of the data span December 2010 to August 2015. Interpolation and extrapolation techniques are used to handle missing data. ARIMA and ARIMAX models with Google Trends data are implemented to forecast the number of dengue fever cases. The research shows that adding Google Trends to the ARIMAX model improves forecasting performance. The best ARIMAX model with Google Trends improves the MAPE value by 3%.

Keywords: Forecasting; ARIMA; ARIMAX; Google Trends; Dengue Fever.

I. INTRODUCTION

According to the World Health Organization, Indonesia ranks second in the world in the number of dengue fever cases [1]. The incidence rate of dengue fever in Indonesia has been increasing steadily since 1968. Based on data obtained from the Indonesian Ministry of Health, the incidence rate of dengue fever was 41.25 per 100,000 citizens in 2013 [2]. Controlling dengue fever transmission is part of the Indonesian Ministry of Health's strategic plan [3]. Forecasting the impact of the disease is crucial for planning an effective response strategy [4].

Previous studies have shown the potential of trace data on the internet for monitoring and forecasting the transmission of communicable diseases. Wikipedia access log data have been used to monitor and predict several communicable diseases in Haiti, Uganda, China, Japan, Poland, the United States of America and Norway [5, 4]. A limitation of these studies is that using article language as a location proxy is considered weak, because it cannot be used at scales smaller than country level [5].

Google launched Google Flu Trends and Google Dengue Trends in 2008. These services model their data from search queries submitted to Google. Previous studies have used Google Flu Trends [6] and Google Dengue Trends [7] as alternative surveillance systems and to forecast the number of dengue fever and influenza cases [8, 9, 10]. Both services were available online for free, yet Google's scientists never revealed which search terms were used to track the diseases. These services were discontinued in 2015; however, they remain available on a limited basis for academic research purposes.

Search query data on Google are publicly available through a web service called Google Trends, which allows people to examine the trends of certain search queries. A previous study found that Google Trends data for certain influenza queries, chosen using a survey, are correlated with national surveillance data in South Korea [11]. Although research on the correlation between dengue fever cases and dengue fever search queries in Indonesian has never been conducted, that study has shown that Google Trends can be considered as an independent variable for a multivariate forecasting model. Google Trends data can be broken down geographically to city level, which makes them more favorable than Wikipedia access log data.

Autoregressive Integrated Moving Average with exogenous variables (ARIMAX) is chosen to develop the forecasting model of dengue fever cases. Previous studies have shown that ARIMAX models perform better than univariate ARIMA. For example, the forecasting performance of an ARIMAX method that includes calendar-effect variation is better than that of an ARIMA model [12]. ARIMAX has been used with various exogenous variables in previous studies, including school absenteeism [13], pharmacy drug sales [14], and climatic factors (temperature, relative humidity, rainfall, and air pressure) [15]. ARIMA models have been widely used in epidemiology to monitor infectious diseases [16]. We present a comparative analysis of ARIMA and ARIMAX with Google Trends forecasting models for forecasting dengue fever cases in Surabaya, Indonesia.

II. DATA AND METHODOLOGY

A. Data

The dependent variable is the number of dengue fever cases, obtained from the general local hospital RSUD Dr. Soetomo in Surabaya, Indonesia. The Google Trends search index provides the external (independent) variables for the model. The Google Trends search queries related to dengue fever are determined based on [...]
parameters, there are 16 ARIMA models. These models are
going to be fitted.
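The fitting code itself is not part of this text. As a rough illustration of what fitting one candidate model involves, the sketch below covers only the purely autoregressive special case with a single lagged external regressor (the shape of an ARIMAX(2,0,0) model), estimated by ordinary least squares. The function names `solve_linear_system` and `fit_arx` are hypothetical, the synthetic series are invented for the demonstration, and models with moving-average terms (q > 0) would need maximum-likelihood estimation from a statistics package rather than this shortcut.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: regressors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_arx(y: Sequence[float], x: Sequence[float],
            p: int = 2, x_lag: int = 1) -> List[float]:
    """Least-squares fit of
        y[t] = c + phi_1*y[t-1] + ... + phi_p*y[t-p] + beta*x[t-x_lag].
    Returns [c, phi_1, ..., phi_p, beta].
    """
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    start = max(p, x_lag)
    rows = [[1.0] + [y[t - i] for i in range(1, p + 1)] + [x[t - x_lag]]
            for t in range(start, len(y))]
    targets = [y[t] for t in range(start, len(y))]
    k = p + 2  # intercept + p autoregressive terms + one external regressor
    if len(rows) < k:
        raise ValueError("not enough observations for the requested lags")
    # Normal equations: (X'X) coef = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


if __name__ == "__main__":
    # Invented, noise-free data generated from known coefficients.
    search_index = [float((7 * t) % 11) for t in range(40)]
    cases = [10.0, 12.0]
    for t in range(2, 40):
        cases.append(5.0 + 0.5 * cases[t - 1] - 0.2 * cases[t - 2]
                     + 2.0 * search_index[t - 1])
    print(fit_arx(cases, search_index, p=2, x_lag=1))
```

On noise-free synthetic data like this, the recovered coefficients should match the generating ones closely; on real case counts the residuals would also need checking before such a model is trusted.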
TABLE 4 CORRELATIONS BETWEEN GOOGLE TREND QUERIES AND DENGUE FEVER CASES
Google Trend Queries Lag 0 Lag 1 Lag 2 Lag 3 Lag 4 Lag 5 Lag 6 Lag 7 Lag 8 Lag 9 Lag 10
0.373* 0.555* 0.541* -0.318*
dengue 0.265 0.278 -0.018 -0.058 -0.172 -0.1582 -0.059
(3) (2) (2) (3)
0.307* -0.386* -0.331*
demam berdarah 0.0942 0.202 0.166 0.089 -0.076 -0.227 -0.221 -0.0594
(3) (3) (3)
0.350* -0.321*
demam 0.098 0.264 0.283 0.227 0.0287 -0.172 -0.187 -0.217 -0.0281
(3) (3)
0.423* 0.407* -0.432* -0.535* -0.474*
dbd -0.0203 0.182 0.263 0.031 -0.216 -0.320
(3) (3) (3) (2) (3)
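Table 4 reports correlations between each query's search index and the case counts at lags 0 to 10. A minimal sketch of how such lagged correlations could be computed follows. It assumes a Pearson correlation and two equal-length series on the same time grid; the text does not state which coefficient or significance test was used, so the asterisks and parenthesized numbers of Table 4 are not reproduced. The names `pearson` and `lagged_correlations` are hypothetical.

```python
from math import sqrt
from typing import Dict, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_a * var_b)


def lagged_correlations(cases: Sequence[float], index: Sequence[float],
                        max_lag: int = 10) -> Dict[int, float]:
    """Correlate cases[t] with index[t - lag] for lag = 0..max_lag."""
    if len(cases) != len(index):
        raise ValueError("series must have the same length")
    n = len(cases)
    return {lag: pearson(cases[lag:], index[:n - lag])
            for lag in range(max_lag + 1)}
```

A positive lag here means the search index leads the case counts, which is the direction that matters when the index is used as a predictor.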
IV. DISCUSSIONS

The results of this research show that there is a correlation between certain Google Trends queries related to dengue fever and the number of dengue fever cases. The best-fitting univariate ARIMA model for dengue fever cases is ARIMA(2,0,1), with a MAPE of 32.34%. The best-fitting ARIMAX model is ARIMAX(2,0,0) with dengue at lag 1 and demam berdarah at lag 2 as external variables, with a MAPE of 29.11%. Comparing the best ARIMA and ARIMAX models, we found that including Google Trends external variables improves the MAPE by 3.23 percentage points. On average, the ARIMAX models with Google Trends perform better than the univariate ARIMA models by 2%. The best ARIMA and ARIMAX models are then applied to the testing set. Table 5 shows the performance comparison between these models on the training set and the testing set. As shown in Table 5, ARIMAX performs better in predicting the number of dengue fever cases.

TABLE 5 PERFORMANCE COMPARISON OF ARIMA AND ARIMAX

Model          External variables                   MAPE, training set  MAPE, testing set
ARIMA(2,0,1)   -                                    32.34%              32.34%
ARIMAX(2,0,0)  dengue_log2_1, demamberdarah_log2_2  29.11%              19%

The fitted values of ARIMA(2,0,1) and ARIMAX(2,0,0) are plotted in Figure 5 and Figure 6. These models are within the 95% confidence interval [11].

[Figure 5: x-axis: time period; series: Actual, Forecast, lower 95%, upper 95%]

[Figure 6: ARIMAX(2,0,0) with GoogleTrend; y-axis: # of cases; x-axis: time period; series: Actual, Forecast, L95, U95]

FIGURE 6. PLOT OF ARIMAX(2,0,0) WITH GOOGLE TREND FITTED MODEL

The addition of certain Google Trends queries does improve the performance of the forecasting model statistically. However, the use of Google Trends in this study has a limitation. The queries related to dengue fever in Indonesian were selected from the related search results shown on Google Trends. Although there is a significant correlation between these related searches and the number of dengue fever cases, this method has a weakness. The related search results are selected based on the frequency correlation of search queries in a certain period of time [19]. Hence, they can change as time goes on, resulting in inconsistent suggestions. For further studies using Google Trends search queries, the queries can be chosen based on a survey of a population sample asking, "What comes to your mind when you're searching for dengue fever?"

[...] selection method of search queries on Google Trends has to be refined in further studies.
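The model comparison in this section rests on MAPE (mean absolute percentage error). The text does not spell out the formula, so the sketch below assumes the standard definition, the mean of |actual - forecast| / |actual| expressed in percent. The helper name `mape` is hypothetical.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (standard definition assumed)."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # The percentage error is undefined when an actual value is zero.
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)
```

Under this definition, the gap between the two reported training-set values, 32.34% and 29.11%, is a difference of 3.23 percentage points. Months with zero recorded cases would need separate handling, since the ratio is undefined there.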
REFERENCES

[1] World Health Organization, "Prevention and control of dengue and dengue haemorrhagic fever," World Health Organization, India, 2003.
[2] Central of Data and Information, Indonesian Ministry of Health, "Dengue Fever situation in Indonesia," Indonesian Ministry of Health, Jakarta, 2014.
[3] Indonesian Ministry of Health, Strategic Plan of Indonesian Ministry of Health 2015-2019, Jakarta: Indonesian Ministry of Health, 2015.
[4] K. S. Hickmann, G. Fairchild, R. Priedhorsky, N. Generous and J. M. H. et al., "Forecasting the 2013-2014 Influenza Season Using Wikipedia," PLoS Computational Biology, vol. 11, no. 5, 2015.
[5] N. Generous, G. Fairchild, A. Deshpande, S. Y. D. Valle and R. Priedhorsky, "Global Disease Monitoring and Forecasting with Wikipedia," PLOS ONE, vol. 10, no. 11, 2014.
[6] Google, "Google Flu Trends," Google, 2015. [Online]. Available: http://www.google.org/flutrends. [Accessed 29 September 2015].
[7] Google, "Google Dengue Trends," Google, 2015. [Online]. Available: http://www.google.org/denguetrends. [Accessed 29 September 2015].
[8] O. M. Araz, D. Bentley and R. L. Muellman, "Using Google Flu Trends data in forecasting influenza-like-illness related ED visits in Omaha, Nebraska," American Journal of Emergency Medicine, vol. 32, 2014.
[9] A. F. Dugas, M. Jalalpour, Y. Gel, S. Levin, F. Torcaso, T. Igusa and R. E. Rothman, "Influenza Forecasting with Google Flu Trends," PLOS ONE, vol. 8, no. 2, 2013.
[10] J. Ginsberg, M. H. Mohebbi, R. S. Patel and L. B. et al., "Detecting influenza epidemics using search engine query data," 19 February 2009. [Online]. Available: http://static.googleusercontent.com/media/research.google.com/en//archive/papers/detecting-influenza-epidemics.pdf. [Accessed 29 September 2015].
[11] S. Cho et al., "Correlation between National Influenza Surveillance Data and Google Trends in South Korea," PLOS ONE, vol. 8, no. 12, 2013.
[12] W. Anggraeni, R. A. Viniarti and Y. D. Kurniawati, "Performance Comparisons Between ARIMA and ARIMAX Method in Moslem Kids Clothes Demand Forecasting: Case Study," Procedia Computer Science, vol. 72, pp. 630-637, 2015.
[13] E. JR, H. AG, B. JS, B. DL and O. D. et al., "Usefulness of school absenteeism data for predicting influenza outbreaks, United States," Emerging Infectious Diseases, vol. 18, no. 8, 2012.
[14] P. A, "Comparison: Flu Prescription Sales Data from a Retail Pharmacy in the US with Google Flu Trends and US ILINet (CDC) Data as Flu Activity Indicator," PLOS ONE, vol. 7, 2012.
[15] R. P. Soebiyanto, F. Adimi and R. K. Kiang, "Modelling and Predicting Seasonal Influenza Transmission in Warm Regions Using Climatological Parameters," PLOS ONE, vol. 5, no. 3, 2010.
[16] S. Chadsuthi, C. Modchang, Y. Lenbury, S. Iamsirithaworn and W. Triampo, "Modelling seasonal leptospirosis transmission and its association with rainfall and temperature in Thailand using time-series and ARIMAX analyses," Asian Pacific Journal of Tropical Medicine, 2012.
[17] "Dickey-Fuller Unit Root Test (Stationarity Test)," [Online]. Available: http://staffweb.hkbu.edu.hk/billhung/econ3600/application/app01/app01.html. [Accessed 6 October 2015].
[18] C. Chatfield, "Basic of Time Series Analysis," in Time-Series Forecasting, Florida, CRC Press LLC, 2000, pp. 20-42.
[19] D. E. Bowman, R. E. Ortega, M. L. Hamrick, J. R. Spiegel and T. R. Kohn, "Refining search queries by the suggestion of correlated terms from prior searches". United States Patent US6006225 A, 21 December 1999.