
International Journal of Rock Mechanics & Mining Sciences 55 (2012) 33–44


Application of artificial intelligence algorithms in predicting tunnel convergence to avoid TBM jamming phenomenon

Satar Mahdevari a,*, Seyed Rahman Torabi a,b, Masoud Monjezi a

a Department of Mining Engineering, Tarbiat Modares University, Tehran, Iran
b Department of Mining, Petroleum and Geophysics, Shahrood University of Technology, Shahrood, Iran

Article history: Received 29 December 2011; received in revised form 13 May 2012; accepted 14 June 2012; available online 13 July 2012.

Keywords: TBM jamming; Artificial intelligence; SVM; ANN; Rock squeezing

Abstract

One of the most important issues in TBM excavated tunnels is the exact estimation of ground squeezing. Prediction of the ground behavior ahead of the tunnel face is essential to avoid project setbacks such as the jamming phenomenon due to squeezing conditions. Artificial intelligence (AI) algorithms have proved to be suitable tools when the relationship between dependent and independent variables cannot easily be understood. In this paper, two well-known AI based methods, support vector machines (SVM) and artificial neural networks (ANN), were employed to predict the ground conditions of a tunneling project. The Ghomroud water conveyance tunnel, excavated in rocks vulnerable to squeezing, was selected as the case study. Training of the AI models was performed using previous practical experience in the form of a database. The tunnel convergence due to squeezing was considered as the models' output. According to the obtained results, AI based methods can effectively be implemented for the prediction of rock conditions in tunneling projects. Moreover, the performance of the SVM model was found to be better than that of the ANN model, with a high conformity observed between the predicted and measured convergence for the SVM model.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

Ground squeezing is an important phenomenon in tunneling through rock, frequently at great depths. When such conditions are not recognized prior to excavation of the tunnel, construction delays and increased costs may result. One major effect of squeezing behavior in mechanized tunneling is deformation of the tunnel walls, which can result in stoppage of the TBM.

Rock squeezing in many cases slows down or obstructs TBM operation and sometimes even calls into question the feasibility of a TBM drive [1]. Therefore, reliable prediction of the ground conditions ahead of the tunnel face is essential to avoid project setbacks.

Despite improvements made in the theoretical assessment of the squeezing phenomenon and the experience gained with different construction methods, there is still no reliable and unique method of prediction available [2]. This research aims to present the results of applying two well-known AI based methods to predicting the convergence of TBM excavated tunnels. In this respect, two AI algorithms, namely SVM and ANN, are implemented. By enabling prediction of tunnel convergence in various geological conditions, proper remedial measures can be employed before facing undesirable phenomena such as TBM jamming due to rock squeezing.

AI is a scientific discipline concerned with the design and development of algorithms used to evolve behaviors based on empirical data. There are several AI algorithms, amongst which SVM and ANN are the most applicable for the prediction of non-linear phenomena in engineering problems.

Since the early 1990s, ANNs have been applied successfully to almost every problem in engineering. The literature reveals that ANNs have extensively been used to solve geotechnical problems such as modeling TBM performance [3], rock failure criteria [4], prediction of stability of underground openings [5], prediction of ground surface settlements due to tunneling [6,7], identifying probable failure modes for underground openings [8], prediction of tunnel support stability [9] and tunneling performance prediction [10].

Recently, there have been many studies of the application of SVMs in geotechnics, for example prediction of subsidence due to mining [11], prediction of blast induced ground vibration [12], slope reliability analysis [13], prediction of gas leakage in coal mines [14], non-linear displacement time series modeling [15], calculation of subsidence coefficient [16], design of tunnel shotcrete-bolting support [17] and prediction of tunnel surrounding rock displacement [18].

* Corresponding author. Tel./fax: +98 21 82884324. E-mail address: satar.mahdevari@modares.ac.ir (S. Mahdevari).

doi: http://dx.doi.org/10.1016/j.ijrmms.2012.06.005

It is obvious that one of the most important issues in mechanized tunneling is the ability to recognize squeezing conditions in order to avoid

TBM jamming phenomenon. A high rate of tunnel convergence can result in a variety of problems; e.g. in mechanized tunneling, rapidly converging ground may exert such a high pressure on the shield that the thrust force is no longer sufficient to overcome the shield skin friction, resulting in jamming of the TBM [19]. In soft rock, clogging and/or sagging of the cutter head may also occur, and TBM bracing by the gripper plates may become impossible [20].

Researchers have proposed a number of approaches for the assessment and prediction of the convergence and support requirement to cope with the squeezing phenomenon in underground constructions. For example, Aydan et al. [21] proposed an empirical method using the tangential strain of tunnels as a parameter to assess the degree of rock squeezing, with a threshold value of 1% for the recognition of squeezing. One of the semi-analytical methods is the Hoek and Marinos approach [22], which predicts tunnel squeezing based on the ratio between the rock mass strength σcm and the lithostatic stress σ0 = γH. Another method is a reliable analytical approach, namely the convergence–confinement method proposed by Carranza-Torres and Fairhurst [23]. This method requires knowledge of the deformation characteristics of the ground and support system.

Although prediction of tunnel convergence has been investigated by some researchers using empirical and analytical methods [21–23], these approaches cannot present a reliable and unique solution for all conditions. In this research, our goal is to apply AI based methods to this problem. For this purpose, the Ghomroud water conveyance tunnel, which has experienced convergence problems, is selected as a case study for testing the proposed AI models.

2. Squeezing in mechanized tunneling

The rock squeezing phenomenon has been studied by many researchers, such as Deere [24], Kovari [25], Aydan [21], Barla [26] and Panet [27]. According to the definition given by the International Society for Rock Mechanics (ISRM), rock squeezing is a time dependent phenomenon which occurs around the tunnel and is essentially associated with creep caused by exceeding a limiting shear stress; it may terminate during construction or continue over a long period of time.

Due to the fixed geometry and the limited flexibility of the TBM, the room (gap) allowed for ground deformations is restricted. Convergences which exceed 5% of the tunnel radius are considered problematical [28]. The consequences of squeezing can range from tunnel closures and high pressures exerted on the shield of the TBM to more extreme conditions in which the friction produced by the ground cannot be counteracted by the available thrust and the TBM is jammed.

Fig. 1 depicts a longitudinal section of a working TBM. As seen in the figure, the machine is trapped when the space (gap) between the shield and the tunnel perimeter is reduced or closed due to the high magnitude of convergence.

As TBM types differ with respect to the thrusting system, the type of support and the existence or lack of a shield, different hazard scenarios arising from the squeezing phenomenon have to be considered. A length increase in double shielded TBMs leads to higher friction forces because of the larger contact of the rock mass with the shield, which results in a higher risk of TBM jamming. In weak ground prone to squeezing, bracing by the gripper may be impossible, and additional problems may also occur with the extension and compression of the telescopic joints. Hence, instrumentation and monitoring of the deformations during and after tunnel construction is the best way to reach a realistic judgment.

3. Methodology

3.1. Theory of support vector machines

SVM is a universal approach for solving problems of multi-dimensional function estimation. It is based on the Vapnik–Chervonenkis (VC) theory [30,31]. In a nutshell, the VC theory characterizes properties of learning machines which enable them to generalize properly to unseen data. The SVM also implements the structural risk minimization (SRM) inductive principle for obtaining strong generalization ability on a limited number of learning patterns. The SRM involves a simultaneous attempt to minimize the empirical risk and the VC dimension [32].

Support vector regression (SVR) employs SVMs to tackle problems of function approximation and regression estimation by introducing an alternative loss function.

In general, the approximating function in SVR for a given sample set $S = \{S_1, S_2, \ldots, S_i, \ldots, S_N\}$, with $S_i = \{x_i, y_i\}$, takes a linear form:

$$y_i = f(x) = \langle w, x \rangle + b, \qquad \langle w, x \rangle = w\,\varphi(x) \qquad (1)$$

where $x_i$ ($x_i \in R^n$) is the input vector, $y_i$ ($y_i \in R$) is the output, $N$ is the total number of training data points, $w$ ($w \in R^n$) and $b$ ($b \in R$) are the weight matrix and bias, respectively, and $\varphi(x)$ is the high-dimensional feature space which is non-linearly mapped from the input space $R^n$.

In SVR the main objective is to find a function $f(x)$ that has at most ε deviation from the actual targets $y_i$ given by the training data [33]. In other words, there is no concern for errors as long as they are less than ε, but any deviation larger than this is not accepted. This penalty function is used in SVR as follows:

$$|y_i - (w\,\varphi(x_i) + b)| \le \varepsilon: \ \text{no penalty is allocated}$$
$$|y_i - (w\,\varphi(x_i) + b)| > \varepsilon: \ \text{a penalty is allocated}$$

The concept of the ε-insensitive loss function is depicted graphically in Fig. 2.
Fig. 1. Schematic section of shielded TBM (left) [29] and TBM jamming (right).
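The penalty scheme above can be sketched in a few lines of code. This is our own illustrative sketch, not part of the original study, and the function name is ours: the loss is zero inside the ε-tube and grows linearly with the excess deviation outside it.

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps):
    """epsilon-insensitive loss: zero inside the tube of radius eps,
    linear in the excess deviation outside it."""
    return np.maximum(0.0, np.abs(y_pred - y_true) - eps)

# Deviations of 0.05 and 0.30 against a tube radius eps = 0.1:
# the first falls inside the tube (loss 0), the second is
# penalized only for the 0.2 by which it exceeds the tube.
loss = eps_insensitive_loss(np.array([1.0, 1.0]),
                            np.array([1.05, 1.30]), eps=0.1)
```

The linear growth outside the tube is what makes the resulting optimization problem sparse: only points on or outside the tube become support vectors.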

Fig. 2. The ε-insensitive loss setting corresponding to a linear SVR [33].

Only the samples outside the ±ε margin (the shaded region, known as the ε-insensitive tube) have a nonzero slack variable. If the predicted value is inside the region the loss is zero, while if the predicted point is outside the tube the loss is the magnitude of the difference between the predicted value and the radius ε of the tube.

The goal of this problem can be written as below, where the coefficients $w$ and $b$ are determined from the training data by minimizing the regression risk $R_{reg}$ as follows [34]:

$$R_{reg} = \min \left\{ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} L_\varepsilon(x_i, y_i, f) \right\} \qquad (2)$$

In the regularized risk function given by Eq. (2), the first term, $\frac{1}{2}\|w\|^2$, is the structural risk, used to control the smoothness or complexity of the function (regularization term), and the second term, $C \sum_{i=1}^{N} L_\varepsilon(x_i, y_i, f)$, is the empirical risk. It is calculated by the ε-insensitive loss function according to Eq. (3). Thus, the SVR is formulated as minimization of both the structural and empirical risk:

$$L_\varepsilon(x_i, y_i, f) = \max\left(0, \; |f(x_i) - y_i| - \varepsilon\right) \qquad (3)$$

The above loss function provides the advantage of enabling one to use sparse data points to represent the decision function given by Eq. (1). In Eq. (2), $C$ is referred to as the regularization constant and it determines the trade-off between the empirical risk and the regularization term; increasing the value of $C$ increases the relative importance of the empirical risk. In Eq. (3), the parameter ε is equivalent to the approximation accuracy placed on the training data points; data points within the ±ε range do not contribute to the empirical error.

To obtain the estimates of $w$ and $b$, Eq. (2) is transformed into the primal function given by Eq. (4) by introducing the positive slack variables $\xi_i^{+}$ and $\xi_i^{-}$ as follows [34]:

$$R_{reg} = \min \left\{ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} (\xi_i^{+} + \xi_i^{-}) \right\}$$

subject to:

$$(w\,\varphi(x_i) + b) - y_i \le \varepsilon + \xi_i^{+}$$
$$y_i - (w\,\varphi(x_i) + b) \le \varepsilon + \xi_i^{-}$$
$$\xi_i^{+}, \; \xi_i^{-} \ge 0 \qquad (4)$$

where $\xi_i^{+}$ and $\xi_i^{-}$ represent the upper and lower training errors, respectively, subject to the ε-insensitive tube.

The optimization problem can easily be solved in its dual formulation. Moreover, the dual formulation provides the key for extending SVMs to non-linear functions. Finally, by introducing Lagrange multipliers and exploiting the optimality constraints, the decision function given by Eq. (1) takes the following explicit form [34]:

$$f(x, \alpha_i^{+}, \alpha_i^{-}) = \sum_{i=1}^{N} (\alpha_i^{+} - \alpha_i^{-})\, K(x_i, x) + b \qquad (5)$$

where $\alpha_i^{+}$ and $\alpha_i^{-}$ are the Lagrange multipliers, satisfying the equalities $\alpha_i^{+}\,\alpha_i^{-} = 0$, $\alpha_i^{+} \ge 0$ and $\alpha_i^{-} \ge 0$, for $i = 1, 2, \ldots, N$.

In the above equation, the bias $b$ can be calculated by considering the Karush–Kuhn–Tucker conditions for regression. The Lagrange multipliers can be obtained by maximizing the dual function of Eq. (4).

The basic idea in SVMs is to map the data $x \in R^n$ into a high-dimensional feature space via a non-linear mapping [33]. Kernel functions perform the non-linear mapping between the input space and a feature space; in other words, the kernel approach is employed to address the curse of dimensionality. The value of the kernel is equal to the inner product of the two vectors $x_i$ and $x_j$ in the feature space, i.e. $K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)$. Various kernel functions produce different support vectors. Because the Gaussian kernel, a popular radial basis function (RBF), can approximate any non-linear function, it is selected in our study as the kernel function of the SVMs:

$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right) = \exp\left(-\gamma \|x_i - x_j\|^2\right) \qquad (6)$$

where $\|\cdot\|$ is the Euclidean norm and $\sigma^2$ denotes the variance of the Gaussian kernel, determining the geometrical structure of the mapped samples in the kernel space.

Based on the Karush–Kuhn–Tucker conditions of quadratic programming, only a certain number of the coefficients $(\alpha_i^{+} - \alpha_i^{-})$ in Eq. (5) assume non-zero values. The data points associated with them have approximation errors equal to or larger than ε and are referred to as support vectors; these are the data points lying on or outside the ε-bound of the decision function. In a sense, the complexity of a function's representation by support vectors is independent of the dimensionality of the input space and depends only on the number of support vectors.

3.2. Theory of artificial neural networks

An ANN model is a mathematical model inspired by the structure and/or functional aspects of biological neural networks and is in fact an emulation of a biological neural system. Neural network analysis can be used to handle non-linear problems. An ANN is a computing system consisting of a highly interconnected set of simple information processing elements called units or neurons. The arrangement of these units determines the neural network architecture. One of the most commonly implemented ANNs is the multi-layer perceptron (MLP).

The MLP is a universal function approximator, as proved by the Cybenko theorem [35]; it is a kind of feed-forward ANN model and employs a supervised learning technique called back-propagation (BP) for training the network [36,37].

This model maps a set of input data into a set of expected outputs. An MLP consists of several layers of nodes in a directed graph, completely connected from one layer to the next. Except for the input nodes, each node is a neuron or processing element with a non-linear activation function. An MLP is a modification of the standard linear perceptron and can differentiate data that are not linearly separable [35]. BP is the generalization of the Widrow–Hoff learning rule to multiple-layer networks and non-linear differentiable transfer functions [38]. The simplest implementation of BP learning updates the network weights and biases in the direction in which the error calculated by the performance function decreases most rapidly. One iteration of this algorithm can be written as

$$x_{k+1} = x_k - a_k g_k \qquad (7)$$

where $x_{k+1}$ is the new vector of weights and biases, $x_k$ is the previous vector of weights and biases, $a_k$ is the learning rate and $g_k$ is the gradient.

There are several training algorithms for function approximation problems in ANNs, amongst which the Levenberg–Marquardt algorithm has the fastest convergence. This advantage is especially noticeable if very accurate training is required. In addition, the Levenberg–Marquardt algorithm is able to obtain a lower error [38]. The direction in which the search is made is described by

$$x_{k+1} = x_k - A_k^{-1} g_k \qquad (8)$$

where $A_k$ is the Hessian matrix (second derivatives) of the error function at the current values of the weights and biases and $g_k$ is the gradient of the error function. Since the error function has the form of a sum of squares, the Hessian matrix and the gradient can be approximated as [38]

$$A = J^T J \qquad (9)$$

$$g = J^T Z \qquad (10)$$

where $J$ is the Jacobian matrix, which contains the first derivatives of the network errors with respect to the weights and biases, and $Z$ is a vector of network errors.

Finally, the search direction is given by

$$x_{k+1} = x_k - [J^T J + \mu I]^{-1} J^T Z \qquad (11)$$

When the scalar $\mu$ is large, this becomes gradient descent with a small step size.

4. Case study

The Ghomroud tunnel, part of a water conveyance system, is located in the Lorestan and Isfahan provinces, Iran, between the cities of Aligoudarz and Golpayegan (Fig. 3).

The system is expected to convey some 120 × 10⁶ m³ of water per year (23 m³/s) from Dez river branches to the Golpayegan dam. The project includes five diversion dams, a water conveyance channel and two tunnels with lengths of 36 and 9 km, respectively, which are divided into five parcels. Excavation of the tunnel in question (lots III and IV) was started in 2004 using a double shield TBM with an excavation diameter of 4.5 m and a final diameter of 3.8 m. Fig. 4 shows the geological section of the tunnel alignment considered in this study.

The area in question is located in the Sanandaj–Sirjan geological formation. This formation consists of a series of asymmetric folds and faults which have been subjected to mild to high metamorphism and includes massive limestone, dolomite, sandstone, slate, phyllite, schist and metamorphic shale. The dominant rocks in this tunnel are of metamorphic and sedimentary types with weak to fair quality, causing some problems due to convergence and instability. As shown in Fig. 5, the TBM has had several stoppages, including a few major delays related to being trapped in squeezing ground.

Due to the existing weak rock formations and high overburden, the tunnel is susceptible to squeezing conditions and therefore to the convergence phenomenon. Squeezing conditions cause high ground pressure, which in turn decreases the TBM advance rate. In this research, a new procedure to predict squeezing conditions, providing the possibility of taking proper remedial measures to tackle the problem, is presented.

4.1. Dataset

For this study, a database was established based on data collected from boreholes and geological reports related to different sections of the tunnel.
Fig. 3. Schematic location of Ghomroud water conveyance tunnel.



Fig. 4. Geological longitudinal section of the studied parcels of the tunnel [39].

Fig. 5. TBM jamming under squeezing conditions in Ghomroud tunnel.

Some necessary geotechnical data of intact rock were obtained by conducting rock mechanics tests. In addition, a monitoring program was arranged to record deformations.

The parameters applied for the model development are given in Table 1; they include the uniaxial compressive strength (UCS), rock quality designation (RQD), height of overburden (H), geological strength index (GSI), cohesion (c), angle of internal friction (phi), Young's modulus (E), unconfined compressive strength of the rock mass (σcm) and uniaxial tensile strength of the rock mass (σtm). Among these parameters, RQD, H and GSI are geological components, and UCS is an intact rock parameter. The other parameters, c, phi, E, σcm and σtm, are rock mass parameters and are outputs of the RocLab program according to the Hoek–Brown and Mohr–Coulomb criteria.

Three further parameters, namely Poisson's ratio (ν), dry unit weight (γdry) and saturated unit weight (γsat), were also available, but after normalization and principal component analysis it appeared that these three parameters do not change considerably, so they were omitted.

Table 1
Parameters considered for the model development [39].

Type of data   Symbol   Unit   Ave.     Std deviation   Min.     Max.
Inputs         H        m      377.85   72.95           148.00   490.00
               GSI      –      46.93    9.84            29.00    68.00
               RQD      %      88.43    17.84           35.00    100.00
               σcm      MPa    7.05     4.07            1.15     13.21
               σtm      MPa    0.160    0.093           0.064    0.477
               c        MPa    2.40     0.96            0.69     4.73
               phi      deg.   32.05    6.99            22.13    49.51
               E        GPa    5.87     3.82            1.64     13.89
               UCS      MPa    50.39    17.07           10.00    82.35
Output         δmax     mm     30.96    14.12           6.50     62.10

4.2. Data normalization

Before training and modeling, the data had to be normalized to keep them in the prescribed range of 0 to +1. Dimensionless input data are necessary in order to improve the learning speed and the stability of the models. In addition, because the input parameters have different units, normalization renders them dimensionless. Normalization of the data was performed using the following equation:

$$X_{ij}^{Norm} = \frac{X_{ij} - X_j^{min}}{X_j^{max} - X_j^{min}} \qquad (12)$$

where $X_{ij}^{Norm}$ is the scaled value, $X_{ij}$ is the original datum in the ith row and jth column, and $X_j^{max}$ and $X_j^{min}$ are the respective maximum and minimum values of the related jth column.

5. Results

5.1. SVM model design

Learning the parameters of a prediction function and testing it on the same data yields a methodological bias. To avoid over-fitting, it is necessary to define two different sets: a training set and a test set, used for learning and testing the prediction function, respectively. Cross-validation is a solution for splitting the whole data into different training and test sets.

5.1.1. Cross-validation

Cross-validation is a standard technique for adjusting the hyper-parameters of predictive models and a popular technique for estimating the generalization error; there are several types, e.g. K-fold, which is used in this study. In K-fold cross-validation, the training data are randomly split into K mutually exclusive subsets (the folds) of equal size. The SVM decision rule is obtained using K−1 of the subsets and then tested on the subset left out. This procedure is repeated K times, so that each subset is used for validation once. Averaging the validation error over the K trials gives an estimate of the expected generalization error.

5.1.2. SVM parameter optimization using cross-validation

There are many parameters that must be set for prediction with SVMs. Choosing optimal parameters is an important step in SVR design and strongly affects the performance of SVMs. In this research, three parameters control SVR quality: the penalty parameter C, the insensitivity zone ε and the kernel function parameter γ. As mentioned, the parameter C determines the trade-off between the training error and the VC dimension. The parameter ε is the insensitivity zone of the ε-insensitive loss function. The parameter γ, which is the width of the Gaussian kernel function, defines the non-linear mapping from the input space to some multi-dimensional feature space and controls the sensitivity of the kernel function.

There is no single accepted procedure for estimating these parameters, and the selection of appropriate values for C, ε and γ has been addressed by various researchers. For example, Cherkassky and Mulier [40] suggested employing cross-validation for the choice of the SVM parameters. Mattera and Haykin [41] proposed that the parameter C be within the output range of max–min values, and that the ε value be such that the percentage of support vectors in the SVR model is around 50% of the number of samples. Smola and Scholkopf [34] assigned optimal ε values as proportional to the noise variance, which is in agreement with the general sources on SVMs. Cherkassky and Ma [42] proposed the selection of ε based on the estimated noise. An increase in ε means a reduction in the requirements for the accuracy of approximation; it also decreases the number of support vectors, leading to data compression [43]. If ε is zero, over-fitting is expected.

In practice, the parameters C and γ are varied through a wide range of values and the optimal performance is assessed using a separate validation set or a technique such as cross-validation, verifying performance using only the training set [44].

5.1.3. SVR training

After selecting a suitable kernel function, one of the primary aspects of training an SVR model is an appropriate selection of the loss function parameter ε, the penalty term C and the Gaussian kernel parameter γ. In the first step, a 10-fold cross-validation was used to find the optimum condition. In addition, the SVM implementation known as "ε-SVR" in the LIBSVM 2.91.1 software library [45] was utilized to train the SVR model. LIBSVM is a library for support vector machine regression and classification. The LIBSVM package utilizes a fast and efficient method known as sequential minimal optimization (SMO) for solving large quadratic programming problems and thereby estimating the function parameters $\alpha_i^{+}$, $\alpha_i^{-}$ and $b$ in Eq. (5).

A dataset of 60 data points was used to train and test the SVR model. In order to find the optimum values of the parameters and prevent over-fitting, 70% of the total data was selected randomly for the training set and the rest was kept for testing the model. The 10-fold cross-validation described above was performed on the whole training set. We investigated all combinations of the parameters C, ε and γ over a log2 range of values, as suggested in [11,16,46]: in the range [2⁻⁵, 2¹⁵] for C with step sizes of 2⁰·¹, in the range [0.01, 1.0] for ε with step sizes of 0.01, and in the range [2⁻¹⁵, 2⁵] for γ with step sizes of 2⁰·¹. This procedure is repeated 4,040,100 times (201 × 201 × 100), and finally the optimum C, ε and γ with the lowest cross-validation error and the highest squared correlation coefficient are chosen according to the following fitness function:

$$\mathrm{Fitness} = \max\left(R^2 + \frac{1}{MSE}\right) \qquad (13)$$

where $R^2$ is the squared correlation coefficient and MSE is the mean squared error, given as

$$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{Y_i - Y_i^{*}}{Y_i}\right)^2 \qquad (14)$$

where $Y_i$ is the observed value, $Y_i^{*}$ is the predicted value and $N$ is the number of input–output data pairs. This fitness function is evaluated in three loops over each combination of C, ε and γ, and the optimum C, ε and γ are selected in each fold. The averages of the R² and MSE are obtained as 0.908 and 0.047, respectively, which represent a suitable accuracy. The results of this operation are listed in Table 2.

Fig. 6 shows the results of cross-validation as 3D surface plots, used to check for local minima. As can be seen, at each step C and γ have been calculated such that the corresponding MSE is located at a saddle point. This indicates that C and γ are optimal, because the MSE in each fold is calculated under global minimization conditions and the model does not over-fit.

Figs. 7 and 8 illustrate the MSE and R² during training of the SVR model for C and γ, respectively. As seen in these figures, the maximum R² and minimum MSE (Eq. (13)) are obtained when γ falls approximately in the range (0, 5). In the case of C, when this parameter is too large there is a high penalty for non-separable points, many support vectors may be stored, and over-fitting may be encountered; this parameter controls the trade-off between margin maximization and error minimization, and if it is too small, under-fitting may occur.

Table 2
Results of 10-fold cross-validation and optimal values of C, ε and γ.

Step   Best ε   Best C       Best γ   MSE_Opt.   R²_Opt.
1st    0.102    0.031        0.063    0.097      0.908
2nd    0.105    0.189        0.933    0.068      0.752
3rd    0.100    55.715       0.287    0.021      0.975
4th    0.100    9.190        4.287    0.018      0.930
5th    0.100    0.287        13.929   0.089      0.870
6th    0.100    147.033      0.117    0.015      0.965
7th    0.104    0.031        2.639    0.015      0.876
8th    0.100    26615.887    0.000    0.083      0.834
9th    0.109    0.330        0.082    0.054      0.981
10th   0.102    14263.100    0.002    0.008      0.993
Average of the best results in the SVM model     0.047      0.908
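The grid search described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the paper drives LIBSVM directly over the full 201 × 201 × 100 grid, whereas the sketch below uses scikit-learn (whose SVR class wraps LIBSVM) with a much coarser log2-spaced grid and synthetic stand-in data, since the original 60-sample Ghomroud dataset is not reproduced here. Folds are scored by mean squared error rather than the paper's composite fitness of Eq. (13).

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)

# Synthetic stand-in for the 60-sample, 9-feature dataset.
X = rng.random((60, 9))
y = np.sin(X.sum(axis=1)) + 0.05 * rng.standard_normal(60)

# Coarse grid in the spirit of the paper's search over
# C in [2^-5, 2^15], gamma in [2^-15, 2^5], epsilon in [0.01, 1.0].
param_grid = {
    "C": 2.0 ** np.arange(-5, 16, 4),
    "gamma": 2.0 ** np.arange(-15, 6, 4),
    "epsilon": [0.01, 0.1, 0.5, 1.0],
}

# 10-fold cross-validation on the training data, scored by MSE
# (negated, since scikit-learn maximizes scores).
search = GridSearchCV(SVR(kernel="rbf"), param_grid,
                      scoring="neg_mean_squared_error", cv=10)
search.fit(X, y)
best = search.best_params_  # C, gamma, epsilon with lowest CV error
```

Refining the grid around `best` would approximate the much finer 2⁰·¹-stepped search used in the study.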

Fig. 6. 3D view of MSE versus log₂C and log₂γ in each fold.

5.2. ANN model design

In the ANN algorithm, computations are structured in terms of an interconnected group of artificial neurons for processing information. The development process of BP networks involves three steps: defining the network architecture, training, and testing the network [47]. Feed-forward networks often have one or two hidden layers of sigmoid neurons followed by an output

Fig. 7. MSE for C and γ during training of the SVR model.

Fig. 8. R² for C and γ during training of the SVR model.

layer of linear neurons. The linear output layer lets the network Table 3
produce values outside the range 0 to þ1. Multiple layers of Network architecture settings.
neurons with non-linear transfer functions allow the network to
Parameter Function or value
learn non-linear relationships between input and output vectors.
The neural network toolbox of the MATLAB software has been Transformation functions TanSig and LogSig
used for building the ANN code to construct the model. In this Performance function MSE
way, relationship between the dependent variable of convergence Learning rate 0.05
Momentum rate 0.01
and the independent variables of geomechanical parameters are Goal 1e-3
established using MLP method. Epochs 300
For ANN model development, the same datasets which were used for SVR analysis are employed to train and test the model. From this dataset, 60% of the total data was randomly selected for the network training, 20% was used for validation and the remaining 20% was employed for testing the network.

Finally, each data point is a vector of nine input values, namely H, GSI, RQD, UCS, c, φ, E, σcm and σtm, as described earlier. In our final model, the input layer of the network receives input data at nine nodes and the network generates an output at the final layer.

For designing the model, the ANN architecture is tested with various numbers of hidden layers and nodes per hidden layer, and the ANN parameters are checked with various learning rules, transformation functions, learning rates and momentum rates to find better values and architecture.

Table 4
Comparison between the best results of some MLP models for training data.

No.  Transfer function              Model architecture  R2     MSE
1    LogSig-PureLin                 9-21-1              0.684  1.546
2    TanSig-PureLin                 9-16-1              0.732  0.932
3    TanSig-LogSig-LogSig-PureLin   9-32-14-25-1        0.765  0.854
4    TanSig-LogSig-TanSig-PureLin   9-12-38-19-1        0.825  0.538
5    TanSig-TanSig-PureLin          9-23-16-1           0.880  0.329
6    LogSig-LogSig-PureLin          9-38-22-1           0.893  0.357
7    LogSig-TanSig-PureLin          9-26-13-1           0.845  0.560
8    TanSig-LogSig-PureLin          9-35-28-1           0.939  0.128
Average of the best results in ANN model                0.820  0.656
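The 60/20/20 random partition described above can be sketched as follows. This is an illustrative Python snippet, not the paper's code (the original work used MATLAB's neural network toolbox), and the placeholder integers stand in for the real data points, each of which would be a nine-element input vector plus a convergence value:

```python
import random

def split_dataset(data, seed=0):
    """Randomly partition a dataset into 60% training,
    20% validation and 20% testing subsets."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.6 * n)
    n_val = int(0.2 * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# 100 placeholder data points; in practice each entry is one
# monitored tunnel section with its geomechanical parameters
train, val, test = split_dataset(list(range(100)))
```

Shuffling before slicing keeps the three subsets statistically similar, which matters when convergence values vary systematically along the tunnel alignment.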
The Levenberg–Marquardt algorithm is used for training the network, and the transformation functions used are the hyperbolic tangent sigmoid (TanSig) according to Eq. (15) and the logistic sigmoid (LogSig) according to Eq. (16). The details of the network architecture settings are given in Table 3.

a = (e^n − e^(−n)) / (e^n + e^(−n))    (15)

a = 1 / (1 + e^(−n))    (16)

where e is the Neperian number, a is the output and n is given as

n = Σ_{i=1}^{p} p_i w_{l,i} + b    (17)

where p, w and b are the input vector, weight matrix and bias, respectively.

After developing several MLP models based on trial and error, the best result of each model is listed in Table 4. The averages of R2 and MSE are 0.82 and 0.66 respectively, which represent a reasonable accuracy.
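As a hedged illustration (the paper itself used MATLAB's built-in TanSig and LogSig functions), Eqs. (15)–(17) translate directly into a few lines of Python; the weights and bias below are made-up example values:

```python
import math

def tansig(n):
    """Hyperbolic tangent sigmoid, Eq. (15): (e^n - e^-n)/(e^n + e^-n)."""
    return (math.exp(n) - math.exp(-n)) / (math.exp(n) + math.exp(-n))

def logsig(n):
    """Logistic sigmoid, Eq. (16): 1/(1 + e^-n)."""
    return 1.0 / (1.0 + math.exp(-n))

def net_input(p, w, b):
    """Weighted sum of a neuron's inputs, Eq. (17): sum(p_i * w_i) + b."""
    return sum(pi * wi for pi, wi in zip(p, w)) + b

# One neuron with illustrative (made-up) weights and bias
n = net_input([0.5, -1.0, 2.0], [0.1, 0.4, 0.3], 0.2)
a = tansig(n)  # TanSig maps to (-1, 1); logsig would map to (0, 1)
```

The bounded output ranges are why a final PureLin (linear) layer is needed: convergence values in millimetres fall well outside (−1, 1).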

Fig. 9 illustrates the graphical plot between observed and predicted values of the training dataset by post-regression analysis for the best MLP model. The network outputs are plotted versus the targets as open circles. The best linear fit is indicated by a solid red line. The perfect fit (predicted equal to measured) is indicated by the dotted line. As shown in Fig. 9, the maximum R2 obtained is 0.94 for the best ANN model.

Since MSE is the typical performance function used for training feed-forward neural networks, the change in the error level during the iterations is illustrated in Fig. 10. This figure shows the MSE of the network for the best ANN model starting at a large value and decreasing to a smaller value. In other words, it shows that the ANN model is learning.

5.3. Comparison between SVM and ANN results

The graphical plots between measured and predicted values of the testing dataset by regression analysis for the best AI models are depicted in Fig. 11. As shown in this figure, R2 is obtained as 0.965 and 0.872 for the SVM and ANN models, respectively.
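The two performance measures used throughout this comparison can be computed from any set of measured and predicted convergence values. A minimal Python sketch follows, with made-up values standing in for the real monitoring data:

```python
def mse(measured, predicted):
    """Mean squared error between measured and predicted values."""
    n = len(measured)
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n

def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

# Illustrative convergence values in mm; not the paper's data.
measured = [12.0, 25.0, 40.0, 55.0]
predicted = [13.0, 24.0, 41.0, 54.0]
# Every prediction is off by 1 mm, so the MSE is 1.0 mm^2
error = mse(measured, predicted)
fit = r_squared(measured, predicted)
```

Note that R2 compares the residuals against the spread of the measured data, so a high R2 on a wide-ranging convergence series can coexist with a non-trivial MSE.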
Fig. 9. Post-regression for the training data in the best MLP model. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

In the SVM model, a non-linear mapping is used to map the data into a high-dimensional feature space where linear regression is performed. Compared to other neural network regressors, there are three distinct characteristics when SVMs are used to estimate the regression function. First of all, SVMs estimate the regression using a set of linear functions that are defined in a high-dimensional space. Secondly, SVMs carry out the regression estimation by risk minimization, where the risk is measured using a loss function. Thirdly, SVMs use a risk function consisting of the empirical error and a regularization term which is derived from the SRM principle.

Fig. 10. MSE of the best MLP model during learning.

Fig. 11. Testing the AI models with 30% of data.



Fig. 12. Comparison of the predicted versus measured values.

Fig. 13. Comparison between AI Results and measured values.

Table 5
R2 and MSE in the best AI models.

AI models   Type of data   R2     MSE
ANN model   Train data     0.94   0.128
            Test data      0.87   0.631
SVM model   Train data     0.99   0.008
            Test data      0.97   0.143

These significant features result in high generalization performance, global extrema, and avoidance of falling into local extrema.

Fig. 12 shows two graphs comparing the measured and predicted data for the ANN and preferred SVM models. Results of these two models are compared in Fig. 13. It appears that the proposed models have predicted values close to the measured ones and that the agreement between measured and predicted values is acceptable. In addition, it is concluded that the performance of the SVM model is better than that of the ANN model.

The R2 for the two methods of MLP and SVR are 0.94 and 0.99 for the training set, and 0.87 and 0.97 for the testing set, respectively. Also, the MSE for MLP and SVR are 0.128 and 0.008 for the training set and 0.631 and 0.143 for the testing set, respectively. These results show a good level of accuracy for prediction of tunnel convergence by AI methods. Summaries of the results of the AI models on the training and testing data are listed in Table 5.

6. Discussion

Since the easiest and the most reliable parameter recorded in the field is certainly the tunnel convergence [48], this research can be applied in stability analysis by predicting tunnel convergence. In practice, convergence can be considered as a primary field measurement, because it is not only a readily recordable indicator of the overall ground response, but its magnitude also constitutes a very useful parameter for the evaluation of stability [49].

Rate of squeeze and rock loads are somewhat dependent on tunnel size and rate of advance. It is essential in squeezing conditions to establish a program of convergence point installation which will be routinely used to monitor the amount and rate of movement of the tunnel walls. This information, collected over time and collated with the behavior of the tunnel support system, will provide the information needed both to predict and to install the appropriate amount of support as tunneling progresses.

Weak rocks such as slate, phyllite, shale and schist, and rock masses consisting of a series of folds and faults, are incapable of sustaining high tangential stress. Severe tunnel squeezing is therefore common in the Ghomroud tunnel and is one of the major concerns regarding stability.

Due to the uncertainties in the geotechnical model, the heterogeneity of the rock mass and the deficiencies in modeling of the rock mass support interaction prior to construction, convergence monitoring is an important issue for optimization of the tunnel construction while simultaneously observing the safety requirements.

The empirical and analytical approaches cannot be used in all geological situations and, as they predict convergence using only a limited number of geomechanical parameters and applying simplifications, cannot yield realistic results. In addition, these approaches consider a circular opening in homogeneous rock material with a hydrostatic stress state to estimate deformation, and none of them considers time-dependent deformation; they consider only instantaneous squeezing deformation. Thus, these approaches cannot present a reliable and unique solution for all conditions.

In this research, using artificial intelligence based methods, a simple and dynamic procedure has been established for estimating ground behavior based on geomechanical and monitoring data obtained from the same case. According to the results obtained from this research, it can be said that, unlike the traditional methods, the AI models are useful means to predict tunnel convergence reliably in all geological environments.

7. Conclusions

In this paper, an attempt was made to apply AI algorithms for prediction of the ground condition in a TBM excavated tunnel, by which preventative measures can be considered to avoid undesirable events such as machine trapping. As such, the most common AI based methods, SVM and ANN, were selected and utilized to predict tunnel convergence in relation to the effective parameters (i.e., H, GSI, RQD, UCS, c, φ, E, σcm and σtm). For this purpose, a database of the Ghomroud tunnel, which experienced squeezing problems, was prepared for establishment of the AI models. Running the SVM and ANN models, it was observed that the SVM model gives more accurate predictions than the ANN model. The SVM approach also demands the optimal selection of only a few control parameters, such as C, ε and γ, whereas the ANN model involves a large number of such parameters whose optimal combination is a tedious process.

Performance evaluation of the developed models was fulfilled by calculating R2 and MSE between the models' outputs and the actual convergence that occurred. Accordingly, R2 was determined as 0.965 and 0.872 for SVM and ANN respectively, whereas MSE was calculated as 0.143 and 0.631 for SVM and ANN, respectively. These results show an acceptable accuracy of the proposed models.

Continuous monitoring of displacements along the tunnel face is a reliable method for checking the efficiency of the proposed models. It is worth mentioning that such methods can also be established for the prediction of the convergence of underground spaces other than tunnels, such as mine openings and caverns.

Acknowledgments

The authors wish to extend their sincere thanks to the authorities of the Ghomroud conveyance tunnel project, especially Sahel Consulting Engineers, for providing facilities and access to the data.

References

[1] Ramoni M, Anagnostou G. Tunnel boring machines under squeezing conditions. Tunnelling Underground Space Technol 2010;25:139–57.
[2] Cantieni L, Anagnostou G, Hug R. Interpretation of core extrusion measurements when tunneling through squeezing ground. Rock Mech Rock Eng 2011;44:641–70.
[3] Benardos AG, Kaliampakos DC. Modeling TBM performance with artificial neural networks. Tunnelling Underground Space Technol 2004;19(6):597–605.
[4] Rafiai H, Jafari A. Artificial neural networks as a basis for new generation of rock failure criteria. Int J Rock Mech Min Sci 2011;48:1153–9.
[5] Yang Y, Zhang Q. A hierarchical analysis for rock engineering using artificial neural networks. Rock Mech Rock Eng 1997;30(4):207–22.
[6] Kim CY, Bae GJ, Hong SW, Park CH, Moon HK, Shin HS. Neural network based prediction of ground surface settlements due to tunneling. Comput Geotech 2001;28:517–47.
[7] Suwansawat S, Einstein HH. Artificial neural networks for predicting the maximum surface settlement caused by EPB shield tunneling. Tunnelling Underground Space Technol 2006;21(2):133–50.
[8] Lee C, Sterling R. Identifying probable failure modes for underground openings using a neural network. Int J Rock Mech Min Sci Geomech Abstr 1992;29(1):49–67.
[9] Leu SS, Chen CN, Chang SL. Data mining for tunnel support stability: neural network approach. Autom Constr 2001;10(4):429–41.
[10] Yoo C, Kim J. Tunneling performance prediction using an integrated GIS and neural network. Comput Geotech 2007;34(1):19–30.
[11] Zhang H, Wang YJ, Li YF. SVM model for estimating the parameters of the probability-integral method of predicting mining subsidence. Min Sci Technol 2009;19:385–8.
[12] Khandelwal M. Evaluation and prediction of blast-induced ground vibration using support vector machine. Int J Rock Mech Min Sci 2010;47:509–16.
[13] Zhao HB. Slope reliability analysis using a support vector machine. Comput Geotech 2008;35:459–67.
[14] Zhao XH, Wang G, Zhao KK, Tan DJ. On-line least squares support vector machine algorithm in gas prediction. Min Sci Technol 2009;19:194–8.
[15] Feng XT, Zhao H, Li S. Modeling non-linear displacement time series of geomaterials using evolutionary support vector machines. Int J Rock Mech Min Sci 2004;41(7):1087–107.
[16] Tan ZX, Li PX, Yan LL, Deng KZ. Study of the method to calculate subsidence coefficient based on SVM. Procedia Earth Planet Sci 2009;1:970–6.
[17] Liu KY, Qiao CS, Tian SF. Design of tunnel shotcrete-bolting support based on a support vector machine approach. Int J Rock Mech Min Sci 2004;41:510–1.
[18] Yao BZ, Yang CY, Yu B, Jia FF, Yu B. Applying support vector machines to predict tunnel surrounding rock displacement. Appl Mech Mater 2010;29–32:1717–21.
[19] Ramoni M, Anagnostou G. Thrust force requirements for TBMs in squeezing ground. Tunnelling Underground Space Technol 2010;25:433–55.
[20] Ramoni M, Anagnostou G. On the feasibility of TBM drives in squeezing ground. Tunnelling Underground Space Technol 2006;21:262.
[21] Aydan Ö, Akagi T, Kawamoto T. The squeezing potential of rocks around tunnels: theory and prediction. Rock Mech Rock Eng 1993;26(2):137–63.
[22] Hoek E, Marinos P. Predicting tunnel squeezing problems in weak heterogeneous rock masses. Tunnels Tunnelling Int 2000;32(11):45–51.
[23] Carranza-Torres C, Fairhurst C. Application of the convergence-confinement method of tunnel design to rock masses that satisfy the Hoek–Brown failure criterion. Tunnelling Underground Space Technol 2000;15(2):187–213.
[24] Deere DU. Adverse geology and TBM tunneling problems. In: Proceedings of the rapid excavation and tunneling conference. San Francisco: University of Florida; 1981. p. 574–85.
[25] Kovari K, Staus J. Basic considerations on tunneling in squeezing rock. Rock Mech Rock Eng 1996;29(4):203–10.
[26] Barla G. Squeezing rocks in tunnels. Int Soc Rock Mech (ISRM) News J 1995;2(3–4):44–9.
[27] Panet M. Two case histories of tunnels through squeezing rocks. Rock Mech Rock Eng 1996;29(3):155–64.
[28] Steiner W. Tunneling in squeezing rocks: case histories. Rock Mech Rock Eng 1996;29(4):211–46.
[29] Broere W. Tunnel face stability and new CPT applications. PhD thesis. Delft: Delft University Press; 2001. ISBN 90-407-2215-3.
[30] Vapnik V, Golowich S, Smola A. Support vector method for function approximation, regression estimation and signal processing. In: Mozer MC, Jordan MI, Petsche T, editors. Advances in neural information processing systems. Cambridge, MA: MIT Press; 1997. p. 281–7.
[31] Cortes C, Vapnik V. Support vector network. Mach Learn 1995;20:273–97.
[32] Basak D, Pal S, Patranabis DC. Support vector regression. Neural Inf Process—Lett Rev 2007;11(10):203–24.
[33] Vapnik V. The nature of statistical learning theory. 2nd ed. New York: Springer-Verlag; 1995.
[34] Smola A, Scholkopf B. A tutorial on support vector regression. Stat Comput 2004;14:199–222.
[35] Cybenko G. Approximation by superpositions of a sigmoidal function. Math Control Signal Syst 1989;2(4):303–14.
[36] Rosenblatt F. Principles of neurodynamics: perceptrons and the theory of brain mechanisms. Washington DC: Spartan Books; 1961.
[37] Rumelhart DE, Hinton GE, Williams RJ. Learning internal representations by error propagation. In: Rumelhart DE, McClelland JL, editors. Parallel distributed processing: explorations in the microstructure of cognition. Cambridge: MIT Press; 1986. p. 318–62.
[38] Demuth H, Beale M. Neural network toolbox for use with MATLAB. 4th ed. Cambridge, MA: The MathWorks Inc; 2002.
[39] Sahel Consulting Engineers. Report of geological study and engineering geotechnical of III and IV parcels of long Ghomroud tunnel. Tehran; 2003.
[40] Cherkassky V, Mulier F. Learning from data: concepts, theory and methods. New York: Wiley; 1998.
[41] Mattera D, Haykin S. Support vector machines for dynamic reconstruction of a chaotic system. In: Scholkopf B, Burges CJC, Smola A, editors. Advances in kernel methods: support vector machine. Cambridge, MA: MIT Press; 1999. p. 169–84.
[42] Cherkassky V, Ma Y. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Networks 2004;17:113–26.
[43] Kecman V. Learning and soft computing: support vector machines, neural networks and fuzzy logic models. Cambridge, MA: MIT Press; 2001.

[44] Shawe-Taylor J, Cristianini N. Kernel methods for pattern analysis. Cambridge: Cambridge University Press; 2004.
[45] Chang CC, Lin CJ. LIBSVM: a library for support vector machines. Taipei: Department of Computer Science and Information Engineering, National Taiwan University. Software available at: http://www.csie.ntu.edu.tw/~cjlin/libsvm; 2011.
[46] Hsu CW, Chang CC, Lin CJ. A practical guide to support vector classification. Taipei: Department of Computer Science and Information Engineering, National Taiwan University. Technical report available at: http://www.csie.ntu.edu.tw/~cjlin/libsvm; 2010.
[47] Lawrence J, Stanley J. Introduction to neural networks. 3rd ed. Grass Valley, CA: Scientific Software; 1991.
[48] Sulem J, Panet M, Guenot A. Closure analysis in deep tunnels. Int J Rock Mech Min Sci Geomech Abstr 1987;24:145–54.
[49] Indraratna B, Kaiser PK. Design for grouted rock bolts based on the convergence control method. Int J Rock Mech Min Sci Geomech Abstr 1990;27(4):269–81.
