
Proceeding of 2018 IEEE International Conference on Current Trends toward Converging Technologies, Coimbatore, India

Comparative Study of Short-Term Wind Speed Forecasting Techniques Using Artificial Neural Networks

Rohitha B. Kurdikeri, A. B. Raju
Department of Electrical and Electronics Engineering
B.V.B. College of Engineering and Technology, Hubballi, India.
Email: rohitha.kurdikeri@gmail.com

Abstract—This paper focuses on the importance of wind forecasting and compares two different forecasting schemes based on the artificial neural network approach. The types of forecasting considered include feed-forward network models using the standard back-propagation technique and recurrent neural network models with inherent memory for any given data. This study shows how local memory and relevant inputs make recurrent neural networks more suitable for time-series prediction than ordinary feed-forward networks. In addition, accurate forecasting and better energy trading require fine tuning of present techniques; therefore, LSTM models, which belong to the family of recurrent neural networks, are implemented. Finally, the results are measured in terms of the mean-squared error, an error function that calculates the difference between the actual and model outputs. It was found that LSTM models are more suitable for short-term as well as long-term time-series forecasting compared with the RNN model.

Index Terms—fully recurrent, standard back-propagation, time-series data, future forecast models, RNN, time stamp.

I. INTRODUCTION

In recent decades, there has been a shift from statistical models to artificial neural network models for forecasting, so as to combat the complexities observed in improving the generation of wind energy. Randomness in the wind speed prevents the supply of predefined power to the grid, resulting in a mismatch between the power that was promised and the power received at the other end. Therefore, to reduce unreliability in the electricity supply, forecast models are being developed for accurate prediction of wind speed and power generation [1][2].

Recurrent neural networks are used when patterns in data change with time. This deep learning model has a simple structure with a built-in feedback loop, allowing it to act as a forecasting engine. Taking a closer look at the structure, these networks differ considerably from other neural networks: in feed-forward networks, signals flow in one direction, from input to output, one layer at a time, whereas in a recurrent neural network (RNN) the output of a layer is added to the next input and fed back into the same layer, which is typically the only layer in the entire network. This process can be termed passage through time. Unlike feed-forward networks, an RNN can receive a sequence of values as input and produce a sequence of values as output. The ability of these models to work with sequences of data opens up a wide variety of applications, of which forecasting is one.

II. RECURRENT NEURAL NETWORK MODEL

Recurrent neural networks are deep learning models designed to recognise patterns in a sequence of data. They are capable of learning local as well as long-term dependencies in the data and can accommodate sequences of variable length. Recurrent neural networks are therefore a general paradigm for handling variably sized data that allows different types of use-case setups to be captured.

A. RNN Architecture

In feed-forward neural networks, the outputs of one layer are fed as inputs into the subsequent layer, and each unit performs a relatively simple computation. The first layer takes the inputs x_j, multiplies them by a weight matrix w_{ij}, performs a sum and passes the result through an activation function g to yield the output y_i, i.e.,

y_i = g\left( \sum_{j=0}^{n} w_{ij} x_j + b_i \right)   (1)

In order to train these models, we use a cost function, take its derivative with respect to the weights, and use this derivative to move back through the nested layers of computation until the gradient reaches zero [6].
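As a concrete illustration of Equation 1 (a minimal sketch only; the layer size, tanh activation and random values are assumptions for illustration, not taken from the paper):

```python
import numpy as np

def dense_layer(x, w, b, g=np.tanh):
    """Single feed-forward layer, Equation (1): y_i = g(sum_j w_ij x_j + b_i)."""
    return g(w @ x + b)

# Hypothetical sizes: 3 inputs, 2 output units
rng = np.random.default_rng(0)
x = rng.normal(size=3)        # inputs x_j
w = rng.normal(size=(2, 3))   # weight matrix w_ij
b = np.zeros(2)               # biases b_i
print(dense_layer(x, w, b))   # outputs y_i
```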
Fig. 1: Traditional feed-forward network

The major assumptions behind feed-forward networks are fixed input length and independence. From Fig. 1, it can be observed that the outputs are independent of one another, i.e., the output at time t is independent of the output at t-1. This assumption of independence does not hold for sequences such as time-series data, which contain short- and long-term temporal dependencies [7][8] that should also be taken into account. To overcome this challenge, we use recurrent neural networks.

Fig. 2: Architecture of RNN model

For any given sequential data, the output depends on the previous output as well as on the current input. Looking at the structure in Fig. 2, the input at t-1 is fed to the network to obtain the output at time t-1. At the next time stamp, the input at time t is given to the network along with the information from the previous time stamp, i.e., t-1, to obtain the output at t. Similarly, for t+1 the new input is fed to the network, and so on. A generalised figure is shown in which the information from the previous time stamp flows in a loop.

B. Structure

The mathematics behind RNNs is explained as follows.

Fig. 3: Structure of RNN model

From Fig. 2, consider the following equations:

h^{(t)} = g_h\left( w_i x^{(t)} + w_R h^{(t-1)} + b_h \right)   (2)

y^{(t)} = g_y\left( w_y h^{(t)} + b_y \right)   (3)

where
h^{(t)} = output of the hidden layer at one time stamp,
y^{(t)} = output at one time stamp,
w = weight matrix,
x^{(t)} = input,
b = bias,
g_h, g_y = activation functions.

The quantities above are the parameters of the model. At time t = 0, we have the input x_0. To find the value of h_0, Equation 2 is used; to calculate y_0, Equation 3 is used. Similarly, the output at every later time stamp, for example y_2 at the input x_2, is obtained at its respective time stamp. This is how the RNN model works, as illustrated in the short sketch below.

C. Training of Recurrent Neural Network

The RNN uses the standard back-propagation algorithm, but it is applied at every time stamp; this is commonly called back-propagation through time (BPTT). When training the network, two problem cases arise: 1) vanishing gradients and 2) exploding gradients. The algorithm consists of the following steps:

Step 1: Computing the new weights,

w = w + \Delta w   (4)

Step 2: Change in weights,

\Delta w = \eta \times \frac{de}{dw}   (5)

where \eta is the learning rate and e is the error.

Step 3: Calculating the error,

error = (\text{actual output} - \text{model output})^2   (6)

When the gradient de/dw is much less than one, the change in weight \Delta w is negligible. Many iterations can then pass with the new weight remaining almost equal to the old weight, so effectively no weight update takes place. This is called the vanishing gradient problem. One solution is to use the ReLU activation function, whose gradient is one for positive inputs. Other solutions include clipping the gradient when it rises above a threshold, using an adaptive optimiser such as RMSprop, or building a different network architecture such as an LSTM or GRU. Conversely, when the gradient is very large, i.e., when there are long-term dependencies, the new weights differ greatly from the old weights. This is called the exploding gradient problem. Solutions include applying truncated BPTT, clipping gradients at a threshold, or using RMSprop to adapt the learning rate [8]. A small example of gradient clipping is sketched below.
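As a sketch only (the optimiser choice, learning rates and clipping thresholds are assumptions, not taken from the paper), gradient clipping can be requested directly when an optimiser is created in Keras:

```python
from keras.optimizers import SGD, RMSprop

# Clip each gradient component to [-1, 1] before the weight update,
# a common guard against exploding gradients during BPTT.
sgd = SGD(lr=0.01, clipvalue=1.0)

# Or clip by the global gradient norm and let RMSprop adapt the step size.
rmsprop = RMSprop(lr=0.001, clipnorm=1.0)
```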

D. Algorithm

The proposed algorithm used to verify the network model is shown in Fig. 4.

Fig. 4: Process of training the given data

E. Implementation

The neural network models were implemented using the deep learning libraries Keras 2.0.6 and TensorFlow 1.2.1 [4]. The other Python packages used include:
• pandas 0.20.3
• pyflux 0.4.15
• numpy 1.13.1
• matplotlib 2.0.2
• plotly 2.0.15
• scikit-learn 0.18.2
• scipy 0.19.1
• statsmodels 0.8.0
• seaborn 0.8

Better results are obtained when the computations are performed using the GPU build of TensorFlow, a convenient tool for implementing deep learning models. A sketch of how a simple recurrent model can be assembled with this stack is shown below.
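The paper does not state its exact network configuration, so the following is only a minimal sketch of a simple recurrent model built with the stack above (the layer size, look-back window and training settings are assumptions):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

look_back = 24                    # assumed window: previous 24 hourly wind speeds
model = Sequential()
model.add(SimpleRNN(16, input_shape=(look_back, 1)))   # single recurrent layer
model.add(Dense(1))                                     # next-hour wind speed
model.compile(loss='mean_squared_error', optimizer='sgd')

# X: windows of past wind speed, shape (samples, look_back, 1); y: next value
X = np.random.rand(100, look_back, 1)
y = np.random.rand(100, 1)
model.fit(X, y, epochs=5, batch_size=16, verbose=0)
```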
F. Results

On executing the model, the following results are obtained. Fig. 5 shows the wind speed data in m/s; the x-axis represents hourly data from 2010 to 2014.

Fig. 5: Time series data of wind speed

A comparison of predicted and actual values is shown in Fig. 6.

Fig. 6: Forecasted wind speed compared to actual wind speed

Fig. 7: Evaluating the prediction performance of recurrent neural networks

Referring to Fig. 7, the model performance is evaluated in terms of the mean-squared error (MSE), which is minimised by gradient-descent optimisation. The MSE is reduced to 12.36.

III. LSTM NETWORKS

Long short-term memory (LSTM) networks are a special kind of recurrent neural network. They are capable of learning long-term dependencies. LSTMs have a chain-like structure similar to that of RNNs; their repeating module is shown in Fig. 8.

A. Structure

Compared with the RNN structure in Fig. 3, LSTMs also have a chain-like structure, but the recurrent module contains four interacting neural network layers, whereas the RNN module has a single layer with one activation function. The key to the LSTM is the cell state c_t. The following are the computational steps involved in time-series prediction using the LSTM network model of Fig. 8:

Fig. 8: The repeating module in LSTM Network model

Step 1: The first step in the LSTM is to identify the information that is not required and will be discarded from the cell state. This decision is made by a sigmoid layer called the forget gate layer.

f_t = \sigma\left( w_f [h_{t-1}, x_t] + b_f \right)   (7)

Here, w_f = weight, h_{t-1} = output from the previous time stamp, x_t = new input, and b_f = bias.

Step 2: The next step is to decide what new information is to be stored in the cell state. This comprises two parts: a sigmoid layer called the input gate layer decides which values will be updated, and a tanh layer creates a vector of new candidate values, \tilde{c}_t, that could be added to the state.

i_t = \sigma\left( w_i [h_{t-1}, x_t] + b_i \right)   (8)

\tilde{c}_t = \tanh\left( w_c [h_{t-1}, x_t] + b_c \right)   (9)

Step 3: To update the old cell state c_{t-1} into the new cell state c_t, multiply the old state c_{t-1} by f_t and add i_t \times \tilde{c}_t:

c_t = (f_t \times c_{t-1}) + (i_t \times \tilde{c}_t)   (10)

Step 4: Finally, a sigmoid layer decides which parts of the cell state contribute to the output; the cell state is passed through tanh and multiplied by the output of this sigmoid gate.

o_t = \sigma\left( w_o [h_{t-1}, x_t] + b_o \right)   (11)

h_t = o_t \times \tanh(c_t)   (12)

A compact sketch of these four steps is given below.
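As a minimal sketch of Equations 7–12 (the dimensions, random initialisation and sigmoid helper are illustrative assumptions), one LSTM cell update can be written as:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, w, b):
    """One LSTM time step, Equations (7)-(12); w and b hold the per-gate parameters."""
    z = np.concatenate([h_prev, x_t])       # [h_{t-1}, x_t]
    f_t = sigmoid(w['f'] @ z + b['f'])      # forget gate, Eq. (7)
    i_t = sigmoid(w['i'] @ z + b['i'])      # input gate, Eq. (8)
    c_hat = np.tanh(w['c'] @ z + b['c'])    # candidate values, Eq. (9)
    c_t = f_t * c_prev + i_t * c_hat        # new cell state, Eq. (10)
    o_t = sigmoid(w['o'] @ z + b['o'])      # output gate, Eq. (11)
    h_t = o_t * np.tanh(c_t)                # new hidden state, Eq. (12)
    return h_t, c_t

# Hypothetical sizes: 1 input feature, 4 hidden units
rng = np.random.default_rng(2)
w = {k: rng.normal(size=(4, 5)) for k in 'fico'}
b = {k: np.zeros(4) for k in 'fico'}
h, c = np.zeros(4), np.zeros(4)
h, c = lstm_step(np.array([0.7]), h, c, w, b)
```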
B. Results

The wind speed data used previously is divided into 90 percent for training and 10 percent for testing, and the comparison of actual and predicted values from the two sets, i.e., the train and test sets, is shown in Fig. 9.

Fig. 9: Comparison of actual and predicted values of wind speed

Fig. 10: Evaluating the prediction performance of LSTM networks

Referring to Fig. 10, the model performance is evaluated in terms of the root mean-squared error (RMSE) and uses Adam optimisation, which is more efficient than gradient-descent optimisation. The model performs fairly well and trains at greater speed, with train and test scores of 0.46 and 0.55 RMSE, respectively. A sketch of this evaluation procedure is shown below.
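The evaluation code is not given in the paper; the following is only a sketch of the 90/10 split and RMSE computation described above (the look_back window and the fitted model are assumptions):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

def evaluate_split(series, model, look_back=24):
    """Split a 1-D wind speed series 90/10 and report train/test RMSE."""
    split = int(len(series) * 0.9)
    scores = {}
    for name, seg in (('train', series[:split]), ('test', series[split:])):
        X = np.array([seg[i:i + look_back] for i in range(len(seg) - look_back)])
        y = seg[look_back:]
        pred = model.predict(X.reshape(-1, look_back, 1)).ravel()
        scores[name] = np.sqrt(mean_squared_error(y, pred))
    return scores
```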

IV. MULTIVARIATE FORECAST MODEL USING LSTM NETWORKS

In comparison with classical linear time-series forecasting models, LSTM networks adapt easily to problems with multiple input variables. Let us consider a multivariate LSTM model for forecasting wind speed. The following steps are followed to build the forecasting model (an end-to-end sketch is given after Fig. 11 below):

Step 1: Pre-processing the data, transforming the raw data set into meaningful inputs that can be fed to the time-series model.

Step 2: Building an RNN LSTM model to fit the data.

Step 3: Normalising the given data so that the model converges at a greater speed.

Step 4: Finally, making a forecast and scaling the predicted values back into the original units.

A. Training Algorithm

Fig. 11: Building the multivariate LSTM forecast model
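The paper does not list the pipeline code, so the following is only a sketch of Steps 1–4 for a multivariate LSTM (the file name, column names, layer size and training settings are assumptions):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense

# Steps 1 and 3: load the multivariate records and normalise them to [0, 1]
df = pd.read_csv('wind_farm.csv')                 # assumed file name
values = df[['u', 'v', 'wd', 'ws']].values        # assumed column order
scaler = MinMaxScaler()
scaled = scaler.fit_transform(values)

# Frame as supervised learning: all variables at t-1 predict wind speed at t
X = scaled[:-1].reshape(-1, 1, scaled.shape[1])
y = scaled[1:, -1]                                # 'ws' is the last column

# Step 2: build and fit the LSTM model
model = Sequential()
model.add(LSTM(50, input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam')
model.fit(X, y, epochs=20, batch_size=72, verbose=0)

# Step 4: forecast and scale the prediction back into m/s
yhat = model.predict(X)
inverted = scaler.inverse_transform(np.column_stack([scaled[1:, :-1], yhat]))
wind_speed_forecast = inverted[:, -1]
```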
B. Implementation

This forecasting model was developed using the Keras deep learning library with TensorFlow [4] as the back-end. The graph below shows the data recorded from a wind farm near Managuli village in the southern part of Karnataka, which includes the date-time, the zonal and meridional components of the wind, the wind direction and the wind speed. The complete list of features [5] available in the raw data is as follows (a short loading sketch follows the list):
1. No: row number
2. Year: 2010-2014
3. Month: 1-12
4. Day: 1-30/31
5. Hour: 1-24
6. v: zonal component of wind in degrees
7. u: meridional component of wind in degrees
8. ws: wind speed in m/s
9. wd: wind direction in azimuth degrees
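As a sketch only (the file name and parsing details are assumptions; the column names come from the feature list above), the raw records could be loaded with pandas as follows:

```python
import pandas as pd

cols = ['No', 'Year', 'Month', 'Day', 'Hour', 'v', 'u', 'ws', 'wd']
df = pd.read_csv('managuli_wind.csv', names=cols, header=0)   # assumed file name

print(df[['ws', 'wd']].describe())   # summary of wind speed (m/s) and direction
wind_speed = df['ws']                # hourly series used by the models above
```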
C. Results

Referring to Fig. 12, it can be seen that the model achieves a score of 3.061 RMSE and a loss of 0.078, which is quite acceptable for a multivariate forecast model.

Fig. 12: Evaluating the prediction performance of LSTM networks

V. CONCLUSION

The literature reports mixed results regarding the performance of neural networks compared with other forecasting methods. Neural network models prove very useful for larger and higher-frequency data compared with other statistical models. Feed-forward networks are commonly not used for implementing forecast models because the current output is independent of the previous output, whereas RNNs are well suited to time-series forecasting. Evaluating the two network models considered above, the results have shown that LSTMs perform better and faster when the given sequence has long-term dependencies, whereas RNNs perform better with short-term dependencies. The time taken by the LSTM network for time-series prediction is much smaller than that of the RNN model on the same data. With multiple data inputs, LSTMs can easily adapt to changes in the patterns of the data, learn at a greater speed, and take less time to converge.

REFERENCES

[1] Douglas C. Montgomery, Cheryl L. Jennings, Murat Kulahci, Introduction to Time Series Analysis and Forecasting, New York: Wiley, 2015.
[2] Mohammad Shahidehpour, Hatim Yamin, Zuyi Li, Market Operations in Electric Power Systems, New York: John Wiley and Sons, 2002.
[3] Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning, New York: Springer, 2001.
[4] Sam Abrahams, Danijar Hafner, Erik Erwitt, Ariel Scarpinelli, TensorFlow for Machine Intelligence, Bleeding Edge Press, 2016.
[5] Song Li, Peng Wang, Lalit Goel, "Wind Power Forecasting Using Neural Network Ensembles With Feature Selection," IEEE Transactions on Sustainable Energy, vol. 6, no. 4, October 2015.
[6] Hao Quan, Dipti Srinivasan, Abbas Khosravi, "Short-Term Load and Wind Power Forecasting Using Neural Network-Based Prediction Intervals," IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 2, February 2014.
[7] Qianyao Xu, Dawei He, Ning Zhang, Chongqing Kang, Qing Xia, Jianhua Bai, Junhui Huang, "A Short-Term Wind Power Forecasting Approach With Adjustment of Numerical Weather Prediction Input by Data Mining," IEEE Transactions on Sustainable Energy, vol. 6, no. 4, October 2015.
[8] Thanasis G. Barbounis, John B. Theocharis, Minas C. Alexiadis, Petros S. Dokopoulos, "Long-Term Wind Speed and Power Forecasting Using Local Recurrent Neural Network Models," IEEE Transactions on Energy Conversion, vol. 21, no. 1, March 2006.
