
Energy Conversion and Management 49 (2008) 1989–1998
doi:10.1016/j.enconman.2008.03.011

Estimation of furnace exit gas temperature (FEGT) using optimized radial basis and back-propagation neural networks

J.S. Chandok a, I.N. Kar b,*, Suneet Tuli c

a Energy Technologies, NTPC Limited, New Delhi 110003, India
b Department of Electrical Engineering, Indian Institute of Technology, Delhi, New Delhi 110016, India
c Center for Applied Research in Electronics, Indian Institute of Technology, Delhi, New Delhi 110016, India

* Corresponding author. E-mail addresses: jschandok@gmail.com (J.S. Chandok), ink@ee.iitd.ac.in (I.N. Kar).

Article info

Article history: Received 30 October 2006; received in revised form 25 July 2007; accepted 5 March 2008; available online 28 April 2008.

Keywords: Boiler; Furnace exit gas temperature (FEGT); Prediction; Neural network; RBF

Abstract

The boiler is a very important component of a thermal power plant, and its efficient operation requires continuous online information of various relevant parameters. Furnace exit gas temperature (FEGT) is one such important design/operating parameter. Knowledge of FEGT is not only useful for the design of convective heating surfaces but also helpful for operating actions and decision making; its online availability improves the economic benefit of the power plant, while its absence from the operator desk greatly limits efficient operation. In this study, a novel method of estimating FEGT using neural networks is presented. The training data are first generated by calculating FEGT from heat balances across the various heat exchangers. Prediction accuracy and fast response are the major advantages of using a neural network to estimate FEGT for operator information. Two types of feed forward neural networks, a radial basis function network and a back-propagation network, were applied and compared on network simplicity, model building and prediction accuracy. Results are verified on practical data obtained from a 210 MW boiler of a thermal power plant.

© 2008 Elsevier Ltd. All rights reserved.

Nomenclature

h_go    flue gas enthalpy at economizer outlet (BTU/lb)
T_go    flue gas temperature at economizer outlet (°F)
T_gi    flue gas temperature at economizer inlet (°F)
h_gi    enthalpy of the flue gas at the economizer inlet
h_fwi   feed water enthalpy at economizer inlet
h_fwo   feed water enthalpy at economizer outlet
y_j(k)  output of neuron j
d_j(k)  desired value for neuron j
e_j(k)  error function of neuron j
w^l_ij  connecting weight for the ith neuron in the lth layer
E(k)    instantaneous sum squared error
η       learning rate
φ′      derivative of the activation function
y^L_1   final output of the first neuron in the Lth layer
c_j     center of the jth RBF neuron in the hidden layer
x_i     training data
σ       width of the Gaussian function
λ       regularization parameter

1. Introduction

In conventional two pass pulverized fuel (PF) boilers, combustion takes place in the furnace, and the circulating fluid inside the evaporative furnace tubes absorbs less than half of the total fuel heat. Furnace tubes viewed by the flame are treated as radiant heat transfer surfaces, whereas other boiler tubes downstream of the furnace are assumed convective or a combination of the two types [1]. At the furnace exit, the flue gases produced by combustion attain a certain temperature, termed the furnace exit gas temperature (FEGT). The FEGT is an important design and operating parameter: it fixes the split of heat absorption between the radiant heating surfaces in the furnace and the convective heating surfaces downstream of the furnace [2].

A high value of FEGT makes the furnace compact but the convective sections larger. The FEGT is chosen to be below the ash deformation temperature, to avoid severe fouling of the back-pass tubes by molten ash. Similarly, special provision of over fire air is made in some large furnace boilers to reduce the peak furnace temperature and CO formation and to improve furnace safety. Further, firing of inferior coal leads to excessive soot formation and ash deposits on the furnace water wall tubes, causing high FEGT values. Knowledge of FEGT is useful for the design of convective heating surfaces and for plant operating actions, which ensures improvement in the economic benefit of the power plant; non-availability of FEGT on the operator desk puts a great limitation on efficient operation. In a PF boiler, the predominant design and operating considerations that govern FEGT are the size of the convective sections, the ash deformation temperature, NOx formation and pollution control.

Conventionally, thermocouples, furnace temperature probes, etc., mounted on the left and right sides of the furnace are employed to measure FEGT; however, their performance is compromised by their inability to sustain such high temperatures. Recently, in a few power plants, methods such as acoustic and optical pyrometry have been used, but they have not become popular owing to their limitations of high cost, lens fouling, combustion noise and frequent calibration. Alternatively, FEGT can be calculated analytically by carrying out heat balances over the various heat exchangers in the flue gas path. Since FEGT depends on various direct and indirect boiler parameters, it is difficult to build a precise mathematical model.

A novel approach to estimating FEGT using an artificial neural network is proposed in this paper. A neural network can be used as a model free estimator, capable of estimating from available historic data, called training data, consisting of both inputs and output. In this study, FEGT, which is not directly measurable, is taken as the output of the neural network. Hence, to obtain this output, an


analytical calculation of FEGT is first carried out using heat balances over each heat exchanger in the flue gas path; this requires knowledge of the pressure, temperature and flow of the working fluid at the inlet and outlet of each heat exchanger. A total of nine input parameters (Section 3), chosen on the basis of operator experience and theoretical studies, are selected as inputs to the neural network. These inputs may or may not have a direct mathematical relation with FEGT, but they certainly have great influence on it.

Prediction accuracy and fast response are major concerns in using a neural network to provide a reliable and meaningful value of FEGT for operator information. This work also evaluates the prediction ability of two important neural network types, the radial basis function (RBF) network and the multilayer perceptron (MLP) back-propagation network. Measures such as the correlation coefficient, prediction error bar charts and speed are adopted to compare the performance of the two networks. These neural network modeling methods are becoming very popular because of their applications in various disciplines, e.g. prediction of coal ash fusion temperature [11], turbogenerator control [14], solar array modeling and maximum power point prediction [15], and power plant condenser performance [18]. Amoudi and Zhang [15] compared BP and RBF neural networks and concluded that BP takes a longer time to train but requires less information than RBF. Optimized design of neural networks, depending on their application, has also been reported; Prieto et al. [18] applied an artificial neural network to a power plant condenser and reported that the design of the network could be enhanced by knowledge of the physical variables. FEGT, though a very important measurement for plant operation, has received little attention in the literature; no reference was found in the open literature where FEGT is estimated using an ANN.

This paper first describes an analytical calculation of FEGT using convective heat transfer in the backward path. Important considerations for the selection of plant parameters as inputs to the neural network are then elaborated. The architecture and training of the feed forward network, for both the MLP and the RBF network, are explained, and the results and performance of the two networks are compared, discussing their advantages and disadvantages.

2. Process description and FEGT calculation

Heat transfer in a steam generator is a complex and inexact phenomenon owing to its geometry, absorbing gas volumes, furnace walls, etc., and there are therefore varied opinions regarding which correlation to apply in a particular situation. There are three mechanisms of heat transfer, conduction, convection and radiation, similar in some aspects but quite different in others.

When fuel burns in the boiler it releases a large amount of energy, which heats the product of combustion (flue gas) to a very high temperature, ranging from 1500 °C to 1600 °C in the flame core. Though the flue gas is cooled by the evaporator and platen superheater in the furnace, the temperature at the exit of the furnace is still in the range of 1000–1250 °C. Since the flue gas flows through the furnace at a low velocity, only a small fraction of the total heat transferred to the walls is through convection.

After leaving the furnace, the flue gas enters the convective section of the boiler at the furnace exit gas temperature, as shown in Fig. 1, where it cools further by transferring heat to water, steam and, in some cases, combustion air. The principal mode of heat transfer in this section being forced convection, it is called the convection section. Here, the gas enters at the FEGT and leaves at slightly above the stack temperature [2]. The heat exchangers located in the convective section include the reheater (RHTR), final superheater (FNSH), low temperature superheater (LTSH), economizer and air preheater (APH), all placed in series.

2.1. FEGT calculation

The method employed in the present work is the lumped analysis approach, in which a heat exchanger is described by average characteristics. The analytical method is based on a series of heat balances beginning with the average flue gas temperature measured by left and right thermocouples at the economizer outlet. By working upstream toward the furnace outlet, the average gas temperature entering each tube bank is determined using a series of heat transfer calculations: the heat gained by the working fluid (water or steam) equals the heat lost by the flue gas. The last heat transfer section in this series of calculations is the reheater, and the reheater inlet gas temperature is the FEGT [3].

Fig. 1. Overview of boiler and location of furnace exit gas plane.

The flue gas temperature at the inlet of any heat exchanger can be calculated by first computing the flue gas enthalpy from the known temperature at the outlet of that heat exchanger, using a quadratic relation [4]. The flue gas enthalpy at the economizer outlet is given by

h_{go} = a T_{go}^2 + b T_{go} + c    (1)

where h_{go} is in BTU/lb and T_{go} is the flue gas temperature at the economizer outlet in °F. The values of the scalar constants a, b, c in Eq. (1) for the relevant temperature range are

a = 1.725460 \times 10^{-5}, \quad b = 0.2336275, \quad c = -18.58662

The total energy of the flue gas at the inlet of the economizer (h_{gi}) is the sum of the flue gas energy at the outlet of the economizer, calculated from Eq. (1), and the heat energy gained by the feed water from the flue gas:

h_{gi} = h_{go} + (h_{fwo} - h_{fwi})    (2)

where h_{fwi} and h_{fwo} are the specific enthalpies of the feed water at the economizer inlet and outlet, respectively. With the known value of the flue gas specific enthalpy at the economizer inlet h_{gi}, Eq. (1) can be used again, this time to calculate the economizer inlet temperature T_{gi}:

a T_{gi}^2 + b T_{gi} + (c - h_{gi}) = 0, \qquad T_{gi} = \frac{-b + \sqrt{b^2 - 4a(c - h_{gi})}}{2a}    (3)

This economizer inlet flue gas temperature is then taken as the average outlet temperature of the low temperature superheater (LTSH), the next element of the series in the backward path, and the temperature at the inlet of the LTSH is calculated in the same way; this in turn gives the average temperature at the final superheater inlet. Likewise, the final superheater inlet temperature is treated as the average reheater outlet temperature and finally used to calculate the reheater inlet temperature, the FEGT. Fig. 2 shows the variation of calculated FEGT, load and feed water (FW) flow.

Figs. 3 and 4 show the convective heat transfer and the flue gas temperature at the inlet of the various boiler components at full load and part load, respectively. They confirm that the heat transfer in the reheater is the maximum among the individually considered components, because the reheater, being nearest to the furnace, is subjected to very high temperature, so that some radiative heat transfer is also present. Further, as per the design, the reheater is a single component, whereas the superheater is divided into three subcomponents, each sharing some amount of heat transfer; the total heat transfer in the superheater (platen + LTSH + final SH) is much more than that in the reheater. These results also give the flue gas temperatures at the inlet of the various components, which are useful to know the intermediate process conditions.
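To make the backward calculation of Section 2.1 concrete, the following Python sketch chains Eqs. (1)–(3) through the convective components (the authors' implementation was in MATLAB; this is an illustrative re-expression). The enthalpy values are placeholders rather than plant data, and the optional flow_ratio argument is our addition, standing in for the fluid-to-gas mass flow weighting that the lumped form of Eq. (2) absorbs into its enthalpy terms.

```python
import math

# Coefficients of the quadratic enthalpy fit, Eq. (1):
# h = a*T^2 + b*T + c, with h in BTU/lb and T in deg F.
A, B, C = 1.725460e-5, 0.2336275, -18.58662

def gas_enthalpy(t_f):
    """Eq. (1): flue gas enthalpy from a known gas temperature."""
    return A * t_f**2 + B * t_f + C

def gas_temperature(h):
    """Eq. (3): gas temperature from enthalpy (positive root of Eq. (1))."""
    return (-B + math.sqrt(B**2 - 4.0 * A * (C - h))) / (2.0 * A)

def backward_step(t_gas_out_f, h_fluid_in, h_fluid_out, flow_ratio=1.0):
    """One heat-balance step, Eq. (2): gas enthalpy at the inlet equals the
    outlet enthalpy plus the heat picked up by the working fluid.
    flow_ratio is our added knob; the paper's lumped form is flow_ratio = 1."""
    h_gi = gas_enthalpy(t_gas_out_f) + flow_ratio * (h_fluid_out - h_fluid_in)
    return gas_temperature(h_gi)

# Illustrative walk from the economizer outlet back toward the furnace exit.
# All temperatures and enthalpies below are placeholders, not paper data.
t_gas = 650.0  # measured economizer outlet gas temperature, deg F
for name, h_in, h_out in [("economizer", 355.0, 430.0),
                          ("LTSH", 1190.0, 1280.0),
                          ("final superheater", 1280.0, 1400.0),
                          ("reheater", 1300.0, 1520.0)]:
    t_gas = backward_step(t_gas, h_in, h_out)
    print(f"gas temperature entering {name}: {t_gas:.0f} F")
# The last value, the reheater inlet gas temperature, is the FEGT.
```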

3. Inputs for neural network modeling

The main objective of this work is to make use of the potential of neural networks to estimate the FEGT. This requires proper selection of the input parameters for the NN model, based on the rationale that all parameters having a direct or indirect effect on FEGT ought to be included. As shown in Fig. 5, the important input parameters selected for FEGT neural network modeling are:

1. Feed water flow.
2. Total coal flow.
3. Secondary airflow.
4. Secondary air temperature.
5. Primary air to coal flow ratio.
6. Flue gas O2%.
7. Burner tilt.
8. Mill combination.
9. Cleanliness factor.

Fig. 2. Variation of load, FW flow and calculated FEGT for a set of operation.

Fig. 3. Heat transfer and temperature in various components of boiler (full load).

Fig. 4. Heat transfer and temperature in various components of boiler (part load).

Of these parameters, feed water flow, coal flow, secondary airflow and secondary air temperature are basically input conditions to the boiler; their variations directly reflect the various plant load conditions and hence different FEGT values. The remaining parameters, i.e. PA/coal flow ratio, O2%, burner tilt, mill combination and cleanliness factor, though not mathematically related to the FEGT, have a great influence on it at a particular operating load.

The velocity of the primary stream must exceed the speed of flame propagation so as to avoid flashback, and on leaving the burner the velocity of the mixture must also be low enough for stable ignition [5]. The PA/coal ratio therefore leads to variation in actual combustion and in turn in FEGT. Generally, the air–coal ratio of the primary stream is maintained at 2:1.

Flue gas O2%, CO2 and CO are products of combustion and characterize the quality of combustion, and hence FEGT. As only the O2% measurement was available in the chosen plant, it alone was taken as an input parameter for FEGT estimation. Burner tilt also has a great impact on FEGT, as the temperature rises when the tilt is in the upward position and falls when it is in the downward position.

The last two inputs, mill combination and cleanliness factor, are calculated parameters; unlike the other seven, they are not taken directly from the plant data communication system. There are six coal mills in the selected 210 MW plant, and at any time generally four mills are in service. The particular combination of four running mills, i.e. a lower, middle or upper combination, affects FEGT at a given load condition. A representative value corresponding to each combination can be obtained by taking the weighted average of the mills in service; upper mills are given more weightage than lower mills, as FEGT is higher when upper mills are in service. The input mill combination in service is computed by the weighted average method as

\frac{0.2\,(\text{Mill A}) + 0.4\,(\text{Mill B}) + 0.6\,(\text{Mill C}) + 0.8\,(\text{Mill D}) + 1.0\,(\text{Mill E}) + 1.2\,(\text{Mill F})}{4.2}

with (Mill X) taken as '1' if mill X is in service and '0' otherwise.

Similarly, the input cleanliness factor, the ratio of the actual operating heat of combustion to the design heat of combustion [6,7], also influences the FEGT at a particular load condition. A high cleanliness factor indicates a clean boiler, leading to better heat transfer and thus a comparatively low temperature at the furnace exit:

\text{Cleanliness factor} = \text{Operating heat of combustion} / \text{Design heat of combustion}

where the heat of combustion is given by the sum (main steam heat energy + reheater steam heat energy + blow down water heat energy − SH spray heat energy + RH spray heat energy).
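As a small illustration, the two calculated inputs can be derived from mill status flags and heat duties as in the Python sketch below; the function names and example values are hypothetical, and only the weighting scheme and the ratio definition come from the text above.

```python
# Weighted-average mill combination, per the formula above:
# in-service mills contribute their weight, and upper mills weigh more.
MILL_WEIGHTS = {"A": 0.2, "B": 0.4, "C": 0.6, "D": 0.8, "E": 1.0, "F": 1.2}

def mill_combination(in_service):
    """in_service: set of mill names currently running, e.g. {"B","C","D","E"}."""
    return sum(MILL_WEIGHTS[m] for m in in_service) / sum(MILL_WEIGHTS.values())

def cleanliness_factor(operating_heat, design_heat):
    """Ratio of operating heat of combustion to design heat of combustion."""
    return operating_heat / design_heat

# Hypothetical example: four middle-to-upper mills in service.
print(mill_combination({"B", "C", "D", "E"}))   # ~0.667
print(cleanliness_factor(492.0, 510.0))         # ~0.965, a slightly fouled boiler
```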

4. Neural network implementation

A feed forward network is a special architecture of neural network in which the neurons are organized in layers. In this work, two types of feed forward network, namely the multilayer perceptron and the radial basis function network, are adopted to estimate FEGT.

The general architecture of a feed forward network is depicted in Fig. 6, a standard architecture with one hidden layer. MLP and RBF networks mainly differ in the type of neuron employed in the hidden layer. The input layer consists of the descriptors, and the output layer employed in this study for both networks uses a linear output function. The goal in neural modeling is to generate weights, w, and biases, b, which model the data and act as coefficients to predict an accurate output value [8–10].

Fig. 5. Neural network input/output model.

Fig. 6. General architecture of single hidden layer feed forward network.

4.1. Training data generation

Input data samples used for training and testing of the neural network model were collected from a 210 MW thermal power plant. All required data points are averaged over one minute in the communication system itself. In this way, around 3200 samples (known as exemplars), spanning three different days, were collected. The inputs directly available from the plant and the other, calculated parameters, along with the target FEGT, are extracted from the whole data set for neural network training.

To avoid duplication of input data, all data points with no variation in the parameter values between successive samples are removed from the training data. Data containing extreme peculiarities are also filtered out; otherwise the network ends up memorizing input peculiarities and its generalization capability becomes poor. In this manner, a total of 1489 data samples are selected for training, testing and validation.

Neural network training can be made more efficient if certain preprocessing steps are performed on the training data set. The input data applied to the network and the target data for training and testing are normalized to the range of the activation function. Care is also taken that the normalized input and target values do not fall in the saturation regions of the activation function characteristic curve, to avoid unrealistic network response [11]. Hence, all data samples are normalized to the range −0.9 to +0.9, as the range of the tan-sigmoid activation function is from −1 to +1. For a variable with maximum and minimum values V_{max} and V_{min}, respectively, each value V is scaled to its normalized value A using

A = \frac{V - V_{min}}{V_{max} - V_{min}} \times 1.8 - 0.9    (4)
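A minimal sketch of this preprocessing step (the helper names are ours, and the data are random placeholders):

```python
import numpy as np

def normalize(v, v_min, v_max):
    """Eq. (4): scale raw values into [-0.9, +0.9]."""
    return (v - v_min) / (v_max - v_min) * 1.8 - 0.9

def denormalize(a, v_min, v_max):
    """Inverse of Eq. (4): recover engineering units from network outputs."""
    return (a + 0.9) / 1.8 * (v_max - v_min) + v_min

# Per-column scaling of a (samples x 9) input matrix; the extrema come from
# the training set so that test data are scaled consistently.
x_train = np.random.uniform(100.0, 700.0, size=(900, 9))   # placeholder data
lo, hi = x_train.min(axis=0), x_train.max(axis=0)
x_scaled = normalize(x_train, lo, hi)
assert np.allclose(denormalize(x_scaled, lo, hi), x_train)
```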
4.2. MLP network with back-propagation algorithm

MLPs have been applied successfully to difficult and diverse problems by training them in a supervised manner with the back-propagation algorithm, which is based on the error-correction learning rule. Back-propagation learning consists of two passes through the different layers of the network: a forward pass and a backward pass. In the forward pass, an input vector is applied to the sensory nodes of the network and its effect propagates through the network, layer by layer, finally producing a set of outputs as the actual response of the network. During the forward pass all the synaptic weights of the network are fixed, while during the backward pass the synaptic weights are adjusted in accordance with the error correction rule [11].

The error signal at the output of neuron j at iteration k is defined by

e_j(k) = d_j(k) - y_j(k)

where y_j(k) is the output of neuron j and d_j(k) is the desired value for neuron j. The instantaneous sum squared error E(k) is given by

E(k) = \frac{1}{2} \sum_j e_j^2(k)    (5)

and the mean squared error (MSE) is obtained by averaging E(k) over the set size N:

\mathrm{MSE} = E_{av} = \frac{1}{N} \sum_{k=1}^{N} E(k)    (6)

For layers l = 1, 2, ..., L, the final output of the first neuron in the output layer, y_1^L, is a function of the net input to that neuron. The net input to the ith neuron in the lth layer is given by

net_i^l = \sum_{j=1}^{n_{l-1}} \left( w_{ij}^l \, y_j^{l-1} + w_{i0}^l \right)    (7)

y_i^l = \phi(net_i^l)    (8)

The nonlinear activation function \phi(\cdot) employed in this study is the tan-sigmoid (hyperbolic tangent) function

\phi(net) = \tanh(net/2) = \frac{1 - e^{-net}}{1 + e^{-net}}    (9)

The connecting weights w_{ij}^l are updated according to

w_{ij}^l(t + \Delta t) = w_{ij}^l(t) + \eta \left[ \frac{1}{P} \sum_{p=1}^{P} (d_{ip} - y_{ip}^l) \, \phi'(net_{ip}) \right] y_j^{l-1}    (10)

where η is the learning rate and φ′ is the derivative of the activation function.
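As a concrete, hedged rendering of Eqs. (9) and (10), the batch-averaged update for one weight vector can be written as follows; the array shapes are our choice, since the paper does not fix them.

```python
import numpy as np

def phi(net):
    """Eq. (9): tan-sigmoid activation, tanh(net/2)."""
    return np.tanh(net / 2.0)

def phi_prime(net):
    """Derivative of Eq. (9): (1 - tanh^2(net/2)) / 2."""
    return (1.0 - np.tanh(net / 2.0) ** 2) / 2.0

def weight_update(w, y_prev, net, d, eta=0.01):
    """Eq. (10): move w by eta times the batch-averaged error term,
    scaled by the previous-layer outputs y_prev."""
    delta = np.mean((d - phi(net)) * phi_prime(net))
    return w + eta * delta * y_prev

# Tiny hypothetical batch: two patterns, three previous-layer outputs.
w = np.zeros(3)
y_prev = np.array([0.2, -0.1, 0.5])
net = np.array([0.3, -0.2])      # net inputs of the neuron for each pattern
d = np.array([0.5, -0.4])        # desired outputs for each pattern
print(weight_update(w, y_prev, net, d))
```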

The iteration process continues until the total number of iterations (epochs) is reached or the specified training error level is attained. The speed of training a network depends largely on the method of updating the weights w and biases b in the hidden layer, and also on the size of the training data matrix. In standard back-propagation, w and b are updated by gradient descent, being moved in the direction opposite to the error gradient; each step down the gradient results in smaller errors, until the error minimum is reached. Normally, momentum and learning rate terms are incorporated in the training scheme, which makes the changes proportional to the running average of the gradient, thereby speeding up convergence and reducing the risk of trapping in a local minimum. In this work, the Levenberg–Marquardt approximation [12] is employed, which is several orders of magnitude faster than the standard gradient descent method. The Levenberg–Marquardt rule states that the change in weights ΔW is

\Delta W = (J^T J + \mu I)^{-1} J^T e    (11)

where J is the Jacobian matrix of derivatives of the errors with respect to each weight, μ is a scalar, e is the error vector and I is the identity matrix [9,12].

4.2.1. Initialization of the MLP network

The back-propagation learning algorithm is sensitive to initial conditions. If the synaptic weights are assigned large initial values, the activation function of the neurons (the sigmoid function) will very possibly reach saturation, and the MLP network will get stuck in a local minimum [11]. On the other hand, if the synaptic weights are assigned small initial values, the back-propagation algorithm may operate on a very flat area around the origin of the error surface. Many initialization methods have been put forward for the back-propagation learning algorithm. In this work, each layer's weights and biases are initialized according to the Nguyen–Widrow initialization algorithm [12], which chooses values so as to distribute the active region of each neuron evenly across the layer's input space. Compared to purely random weights and biases, this algorithm has the advantages that fewer neurons are wasted (since all the neurons are in the input space) and training works faster (since each area of the input space has neurons).

4.2.2. Selection of MLP neural network architecture

The total of 1489 preprocessed data samples used in the present work was divided suitably into three parts: around 900 samples for training, 200 for validation and 300 for testing.

In principle, it has been proved that a neural network model with only one hidden layer can uniformly approximate any continuous function [8,11]. In the present work, the FEGT approximation over the nine dimensional input space is carried out with a single hidden layer network, with the number of neurons varied from 2 to 9. MLP networks with different numbers of hidden neurons are trained with 900 patterns over 500 iterations. The energy function (MSE) of the different models on the 900 training patterns is plotted against the number of hidden neurons in Fig. 7. It indicates substantial improvement in the performance function up to 5 neurons in the hidden layer, but only marginal improvement as the number of neurons increases further. In an actual engineering system the training data are usually erroneous, so too high a training precision will overfit the training patterns and impede the generalization of the MLP network [11,13]. From this experiment and analysis, five neurons are finally employed in the MLP network.

4.2.3. Selection of training precision

To avoid overfitting due to the large capacity of the network, and for good generalization, the cross-validation method with early stopping [8,11] is used in training the network. Early stopping ensures that the network will generalize well to unseen data. To apply cross-validation, the total data set is first partitioned into a training set and a testing set. The training set is further partitioned into two disjoint subsets:

- an estimation subset, used to fit the model;
- a validation subset, used to validate the model.

The estimation subset is used to adapt the weights by Eqs. (5)–(10). After each epoch, the network is queried with the input vectors in the validation subset and the mean squared validation error is calculated. The energy function for the training patterns usually decreases as training progresses, while that for the validation patterns decreases in the initial stage of training and then increases as training continues, as shown in Fig. 8. The opinion that the smaller the preset training precision, the better the generalization of the MLP network, is not always true, especially in actual applications, because data from almost all engineering systems are inevitably corrupted by noise.

If the validation error increases with continued training, training is terminated due to the potential for overfitting; if the validation error remains the same for more than 10 successive epochs, the network is assumed to have converged. It can be seen from Fig. 8 that training stopped after only 18 iterations because the validation error increased, meaning that further training would impede the generalization capability of the network.
Fig. 7. Comparison of energy value for the MLP with different hidden neurons.

Fig. 8. MSE vs epoch for training and validation data set.

4.3. Radial basis function network

The design of a supervised neural network may be pursued in a variety of ways. The back-propagation algorithm for the design of a multilayer perceptron, as described in the previous section, may be viewed as the application of a recursive technique; another approach, the radial basis function network, can be viewed as the design of a neural network as a curve fitting problem in a high dimensional space. The RBF network has a number of advantages over the MLP with regard to training and locality of approximation, making it an attractive alternative for online applications [14].

4.3.1. Basic features of the RBF network

The proposed RBF network for FEGT estimation is shown in Fig. 6. It comprises three layers: the input layer, the hidden layer and the output layer. The input layer consists of a nine dimensional vector; the output layer has only one element, the FEGT. The hidden layer is composed of m RBFs \phi_j (j = 1, ..., m), connected directly to all the elements of the input layer. For a data set consisting of N input vectors together with the corresponding output FEGT, there are N such candidate hidden units, each corresponding to one data point.

The activation function of the hidden units is symmetric in the input space, and the output of each hidden unit depends only on the radial distance between the input vector and the center of that hidden unit. The output of each hidden unit h_j, j = 1, 2, ..., m, is given by

h_j(x_i) = \phi(\lVert x_i - c_j \rVert)

where \phi(\cdot) is the Gaussian activation function

\phi_j(x_i) = \exp\left[ -\frac{\lVert x_i - c_j \rVert^2}{2\sigma^2} \right]

x_i is the training data, c_j is the center of the jth hidden neuron, \sigma is the width of the Gaussian function and \lVert \cdot \rVert is the Euclidean norm.

An RBF network is nonlinear if the basis functions can move or change size, or if there is more than one hidden layer. Here we focus on a single hidden layer network with functions fixed in position and size. When such a linear model is applied to supervised learning, the least squares principle leads to a particularly easy optimization problem. If the model is

f(x) = \sum_{j=1}^{m} w_j h_j(x)    (12)

and the training set is \{(x_i, \hat{y}_i)\}_{i=1}^{p}, then the least squares recipe is to minimize the sum squared error

S = \sum_{i=1}^{p} (\hat{y}_i - f(x_i))^2    (13)

with respect to the weights of the model. The minimization of this cost function leads to a set of m simultaneous linear equations in the m unknown weights, which can be solved to obtain the optimal weight vector

\hat{w} = A^{-1} H^T \hat{y}    (14)

where the design matrix H is

H = \begin{bmatrix} h_1(x_1) & h_2(x_1) & h_3(x_1) & \cdots & h_m(x_1) \\ h_1(x_2) & h_2(x_2) & h_3(x_2) & \cdots & h_m(x_2) \\ \vdots & \vdots & \vdots & & \vdots \\ h_1(x_p) & h_2(x_p) & h_3(x_p) & \cdots & h_m(x_p) \end{bmatrix}

and A^{-1} = (H^T H)^{-1}.

In practical situations, training based on the available data (which is also contaminated by noise) is an ill-posed problem [12], in that large data sets may contain a surprisingly small amount of information about the desired solution, and no unique solution exists. In such situations it is necessary to supply extra information (or assumptions); the mathematical technique for this is known as regularization [10]. In the linear model (12), model complexity can be controlled by the addition of a penalty term. When the combined error

E = \sum_{i=1}^{p} (y_i - f(x_i))^2 + \lambda \sum_{j=1}^{m} w_j^2    (15)

is optimized, large components in the weight vector w are inhibited. This kind of penalty is known as ridge regression or weight decay, and the parameter λ, which controls the amount of penalty, is known as the regularization parameter. A small value of λ means the data can be fitted tightly without incurring a large penalty; a large value of λ means a tight fit has to be sacrificed if it requires large weights.

4.3.2. Selection of the RBF network architecture

The selection of the RBF network architecture consists of selecting its centers, spread and optimal weights. An intractable problem often met in RBF network applications is the choice of centers, which greatly affects the complexity and the performance of the network. If too few centers are used, the network may not be capable of generating a good approximation to the target function; with too many centers, the network may overfit the data and fit misleading variations due to imprecise or noisy data. Different learning strategies can be followed in the design of an RBF network, depending on how the centers of the radial basis functions are specified.

The learning method employed in the present work uses the forward selection approach to determine the centers of the RBF functions. Forward selection [10,16] is a direct approach to controlling model complexity: it selects a subset of centers from a larger set consisting of all the input samples. The input data vectors in the training set are used as the candidate centers. The method starts with an empty network and adds one neuron at a time to the hidden layer, an incremental operation.

Using the forward selection approach, the candidate unit that decreases the sum squared error (SSE) the most, and has not already been selected, is added to the current network at each step. In a conventional RBF network, the procedure of adding hidden neurons stops when the error of the network output reaches the preset error goal. With the improved method, the generalized cross-validation (GCV) is used as the model selection criterion (MSC) to decide when to stop adding neurons; it estimates the prediction error during the training procedure as

\hat{\sigma}^2_{GCV} = \frac{p \, \hat{y}^T P^2 \hat{y}}{(\mathrm{trace}\, P)^2}    (16)

where P is the projection matrix and p is the number of data samples used in training the network. This quantity estimates how well the trained network will perform in the future.

The learning process undertaken by the RBF network consists of supervised adjustment of the nonlinear parameters, i.e. the centers and the shape of the receptive fields, as well as of the linear parameters, i.e. the weights of the output layer.

A total of 900 data samples were used for training and 300 were kept for testing the network. Initially, training is performed without any subset selection, i.e. with a simple RBF network, and the MSE is found to be very high. With forward selection and a small regularization λ = 5 × 10⁻⁶, a much smaller mean squared error, equal to 0.0021, is found, and only 12 basis functions are selected.

The next step in the training process is the computation of the Gaussian function width (radius). The radius reflects the size of the RBF unit and thus directly affects the response of the network to an input. The effect of the radius on RBF network performance was investigated in the present work. With small radii, the responses of the network are weak for all samples, because the hidden neurons are so narrow that most of the samples lie far from the centers; with relatively large radii, the network gives strong responses to some of the samples in order to differentiate them.

There are various heuristics for calculating the radius of an RBF; in the present work, the MSE is used as the criterion for investigating the effect of the radius of the hidden neurons. The MSE does not decrease or increase monotonically with the radius: there is a value or range of radii at which the MSE is minimum [17]. It can be seen from Table 1 that the mean squared error on the test data rises and falls over the radius range 2–6.5, while over the radius range 7–9 the MSE falls near its minimum. From this analysis, 12 radial basis functions of radius 7, with the regularization parameter equal to 5 × 10⁻⁶, are selected in this application. With this network configuration, the optimal weights are obtained by minimizing the performance function given by Eq. (15).

Table 1
Selection of the radius of the basis functions (λ = 5 × 10⁻⁶)

Serial no. | Radius | No. of basis functions selected | MSE on test data
1          | 2      | 67                              | 0.1988
2          | 3      | 24                              | 0.1356
3          | 4      | 20                              | 0.1278
4          | 5      | 16                              | 0.0322
5          | 6      | 16                              | 0.0968
6          | 6.5    | 25                              | 0.19
7          | 7      | 12                              | 0.0021
8          | 7.5    | 11                              | 0.0024
9          | 8      | 17                              | 0.073
10         | 9      | 20                              | 0.073

5. Results and discussion

The FEGT is predicted for 300 test data samples using the MLP and RBF type neural networks implemented in MATLAB. The test data are first normalized between −0.9 and +0.9 and later denormalized back to their actual values for comparison with the actual FEGT values.

In the back-propagation case, a neural network with nine inputs, one hidden layer with five hidden neurons and one output neuron is employed for predicting FEGT. The tan-sigmoid activation function is selected for the hidden neurons and a linear function for the output neuron. Back-propagation training with the Levenberg–Marquardt algorithm and a learning rate of 0.01 is used to train the network. The MLP network predicted and actual values of FEGT are plotted for the set of 300 data samples in Fig. 9.
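For concreteness, the forward pass of the selected 9–5–1 architecture (tan-sigmoid hidden layer, linear output) is shown below; the weights are random stand-ins, since the trained values are not published.

```python
import numpy as np

rng = np.random.default_rng(1)
w1, b1 = rng.normal(size=(5, 9)), np.zeros(5)   # hidden layer: 5 tan-sigmoid neurons
w2, b2 = rng.normal(size=(1, 5)), np.zeros(1)   # output layer: 1 linear neuron

def forward(x9):
    """Forward pass of the 9-5-1 MLP described above (normalized units)."""
    hidden = np.tanh(w1 @ x9 + b1)               # Eq. (8) with tan-sigmoid activation
    return (w2 @ hidden + b2)[0]                 # linear output: normalized FEGT

x = rng.uniform(-0.9, 0.9, size=9)               # one normalized 9-input vector
print(forward(x))
```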

Fig. 9. Predicted FEGT using back-propagation algorithm.

Fig. 10. Predicted FEGT using radial basis function network.



Similarly, in the case of the RBF network, the number of basis functions selected by forward selection supervised learning is 12. The radius of the basis functions and the regularization parameter are important for RBF performance and were selected experimentally as 7 and 5 × 10⁻⁶, respectively. The RBF network predicted and actual values of FEGT are plotted for the set of 300 data samples in Fig. 10.

5.1. Comparison of the two networks

As MLP and RBF type neural networks initialize and train in different ways, a direct comparison is not straightforward. However, they can be compared on ease of model building, prediction accuracy and network simplicity.

The major parameters influencing the training and prediction performance of MLP networks include the training time and the number of neurons in the hidden layer. The choice of these two parameters is crucial and complex in MLP model building. Methods such as cross-validation have been used to monitor and reduce the training time by examination of the mean squared error of prediction (MSEP); models with minimum MSEP are considered adequate. Also, because of the random initialization of the weights and biases in an MLP network, repeated runs will not necessarily yield the same result, and it is very difficult to conclude which configuration is better. In contrast, initialization of the weights in the RBF network with forward selection by minimizing the MSEP always results in the same performance for a particular set of parameters. Model building in the RBF network is therefore easier than in the MLP network in this case.

The prediction accuracy of the two feed forward networks is compared by computing the mean squared error (MSE) on the test data set. The predicted FEGT using the MLP network and the RBF network, as shown in Figs. 9 and 10, results in an MSE of 0.0030 and 0.0021, respectively. The correlation coefficient is also calculated for both cases and found to be 0.982 and 0.985 for the MLP and RBF networks, respectively, as shown in Figs. 11 and 12. This indicates that the RBF network is comparatively better at approximating the actual FEGT.

Fig. 11. Correlation of actual and predicted FEGT using MLP network.

Fig. 12. Correlation of actual and predicted FEGT using RBF network.

Another comparison between the RBF and MLP networks was made by plotting FEGT prediction errors against counts. The bar charts of Figs. 13 and 14 indicate that the FEGT prediction error is within −7 to +7 for almost all counts in the case of the RBF network, whereas for the MLP network it is wider, in the range −10 to +10.

In terms of network simplicity, the number of processing units in the RBF network is more than twice that of the MLP network. However, the RBF network trains much more rapidly than the MLP network. A proper selection of the spread parameter is crucial to the generation of a good global model for the type of data used.
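The comparison metrics used here are worth stating precisely; the following NumPy sketch (with placeholder inputs) computes the MSE, the correlation coefficient of Figs. 11 and 12, and the error histogram underlying Figs. 13 and 14.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error on the test set."""
    return np.mean((y_true - y_pred) ** 2)

def correlation(y_true, y_pred):
    """Pearson correlation of actual vs. predicted FEGT (Figs. 11 and 12)."""
    return np.corrcoef(y_true, y_pred)[0, 1]

def error_counts(y_true, y_pred, bin_width=1.0):
    """Histogram of prediction errors, as in the bar charts of Figs. 13 and 14."""
    e = y_pred - y_true
    bins = np.arange(np.floor(e.min()), np.ceil(e.max()) + bin_width, bin_width)
    return np.histogram(e, bins=bins)

# Placeholder usage with synthetic predictions, not the paper's test data.
rng = np.random.default_rng(2)
actual = rng.uniform(1000.0, 1250.0, size=300)      # FEGT range quoted in Section 2
predicted = actual + rng.normal(0.0, 3.0, size=300)
print(mse(actual, predicted), correlation(actual, predicted))
```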

Fig. 13. FEGT error using MLP network with back-propagation.



Fig. 14. FEGT error using radial basis function network.

6. Conclusions

This study presents a novel neural network based approach to estimating FEGT for operator information. The major emphasis is on how to make the FEGT estimate more reliable and useful to the operator. Various important parameters, directly or indirectly related to FEGT, have been logically selected as input variables based on operator experience and process knowledge to ensure reliable estimation. Since prediction accuracy and fast response are the major concerns in using a neural network to estimate FEGT for useful operator information, two types of feed forward neural networks, the radial basis function network and the back-propagation network, were applied and compared on network simplicity, model building and prediction accuracy.

It has been shown that, for this application, RBF networks train very rapidly (about 10 times faster than MLP networks using the back-propagation algorithm). A proper selection of the spread parameter is crucial to the generation of a good global model for the type of data used in this study, but both the spread parameter and the number of neurons in the hidden layer were easily optimized by the forward selection method. In general, the FEGT predictions of the MLP and RBF networks are very similar, i.e. both networks converged to the same performance in terms of prediction output, provided that care was taken to ensure that the network was optimized.

The inferences and conclusions drawn from this study will contribute to the development of useful soft sensors for FEGT. This online measurement of FEGT can be directly linked with existing distributed digital control and information systems, for control and information in existing power plants. The FEGT obtained by this means will also be useful for various boiler optimization software, viz. boiler plant optimization systems, intelligent soot blowing systems and heat rate improvement studies.

References

[1] British Electricity International. Modern power station practice – boiler & ancillary plant. Oxford: Pergamon Press; 1991. p. 1–75.
[2] Basu P, Cen K, Jestin L. Boilers and burners. New York: Springer-Verlag; 2000.
[3] Babcock & Wilcox, R&D Division. Full scale demonstration of low-NOx cell burner retrofit. <http://www.babcock.com/pgg/tt/tech-utility.html>.
[4] Babcock & Wilcox Co. Steam – its generation and use. 37th ed.; 1995. p. 9-17–9-27.
[5] Lawn CJ. Principles of combustion engineering for boilers. Academic Press; 1987. p. 9–15 and 260–2.
[6] Davidson I, Carter HR. A fully intelligent sootblowing system. In: International conference on thermal power generation: best practices and future technologies, part I. NTPC and USAID, session II, New Delhi, India; 2003. p. 17–26.
[7] Nasal RJ, Richard RD, Deaver R. Expert system support of a heat transfer model to optimize soot blowing – a case study at Delmarva's Edge Moor unit #5. In: Proceedings of the heat rate improvement conference, EPRI TR-106529; May 1996. p. 23-1–14.
[8] Haykin S. Neural networks: a comprehensive foundation. 2nd ed. Prentice-Hall; 1999.
[9] Jang JSR, Sun CT, Mizutani E. Neuro-fuzzy and soft computing. Pearson Education; 2004.
[10] Orr MJ. Regularization in the selection of RBF centres. Neural Comput 1995;7(3):606–23.
[11] Yin C, Luo Z, Ni M, Cen K. Predicting coal ash fusion temperature with a back-propagation neural network model. Fuel 1998;77(15):1777–82.
[12] Demuth H, Beale M. Neural network toolbox for use with MATLAB: user's guide. The MathWorks Inc.; 2002.
[13] Nasr GE, Badr EA, Joun C. Backpropagation neural networks for modeling gasoline consumption. Energy Convers Manage 2003;44:893–905.
[14] Flynn D, McLoone S, Irwin GW, Brown MD, Swidenbank E, Hogg BW. Neural control of turbogenerator systems. Automatica 1997;33(11):1961–73.
[15] Amoudi A, Zhang L. Application of radial basis function networks for solar array modeling and maximum power-point prediction. IEE Proc Gener Transm Distrib 2000;147(5).
[16] Chang F-J, Liang J-M, Chen Y-C. Flood forecasting using radial basis function neural networks. IEEE Trans Syst Man Cybern C 2001;31(4):530–5.
[17] Zhang Z, Wang D, Harrington P de B, Voorhees KJ, Rees J. Forward selection radial basis function networks applied to bacterial classification based on MALDI-TOF-MS. Talanta 2004;63(3):527–32.
[18] Prieto MM, Montañés E, Menéndez O. Power plant condenser performance forecasting using a non-fully connected artificial neural network. Energy 2001;26:65–79.
