
Journal of Loss Prevention in the Process Industries 29 (2014) 262-266

Contents lists available at ScienceDirect

Journal of Loss Prevention in the Process Industries


journal homepage: www.elsevier.com/locate/jlp

A simplified Markov-based approach for safety integrity level verification
Yidan Shu, Jinsong Zhao*
State Key Laboratory of Chemical Engineering, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China

Article info

Article history:
Received 3 April 2013
Received in revised form 4 August 2013
Accepted 30 March 2014

Keywords:
Safety integrity level
Markov analysis

Abstract

Safety integrity level (SIL) verification is a critical step in the life cycle of safety instrumented systems
(SIS). For the chemical process industry (CPI), SIL verification often means calculating the average
probability of failure on demand (PFDavg) of a SIS. Markov analysis covers most aspects of quantitative
safety evaluation and shows great flexibility. However, for a complex SIS, model building is so
cumbersome that industrial users often try to avoid it. This paper proposes a simplified approach that
conducts Markov analysis on each channel of a SIS with any type of architecture and then combines the
results. A case study is performed, and the results show that the simplified approach can simplify
Markov modeling without loss of accuracy if a proper common cause failure (CCF) model is adopted.
© 2014 Elsevier Ltd. All rights reserved.

1. Introduction
It is of great importance to ensure the safety of industrial processes. A safety instrumented system (SIS)
is an important layer of protection for reducing process safety risk.
To categorize the risk reduction factor, the SIS-related standard, IEC 61508, puts forward an important
concept, the safety integrity level (SIL). SIL, which expresses the order of magnitude of risk reduction,
is a critical parameter in most phases of the safety lifecycle of a SIS. Despite being built to achieve
high availability, a SIS may still fail. According to the standards, the ability of the SIS to achieve a
specific SIL must be validated at each stage of design and prior to any changes made to the design after
commissioning (Wang, West, & Mannan, 2004).
IEC 61508 defines two modes of operation for a SIS. One is the low demand mode, and the other is the
high demand or continuous mode. For the low demand mode, which is the most commonly used mode in the
CPI, SIL is related to both the hardware fault tolerance (HFT) and the average probability of failure on
demand (PFDavg). HFT expresses architectural constraints, i.e. the requirements to achieve a sufficiently
robust architecture (Lundteigen & Rausand, 2009). PFDavg, on the other hand, represents the
unavailability of a SIS and is more difficult to determine. Table 1 shows the relation between SIL and
PFDavg.

* Corresponding author. Tel.: +86 010 62783109.
E-mail address: jinsongzhao@tsinghua.edu.cn (J. Zhao).
http://dx.doi.org/10.1016/j.jlp.2014.03.013
0950-4230/© 2014 Elsevier Ltd. All rights reserved.

Commonly, each part of a SIS (sensor, logic controller and actuator) consists of a MooN architecture. A
MooN architecture is defined as a system with N units, where M ≤ N, in which M out of N units are
sufficient to initiate an action. The modeling of MooN systems has been discussed in the literature
(Lu & Lewis, 2008; Vaurio, 2011). The reliabilities of the parts of a SIS are often assumed to be
independent of each other, so their PFDavg values can be calculated separately and then added up to get
the PFDavg of the whole SIS. That is to say, the key issue in determining the PFDavg of a SIS is how to
calculate the PFDavg of a certain MooN architecture. In the rest of this paper, we discuss the PFDavg of
a MooN architecture rather than that of a whole SIS.
So far a number of techniques have been proposed for the calculation of PFDavg of MooN architectures.
Among them are reliability block diagram (RBD) (Guo & Yang, 2007), fault tree analysis (FTA)
(Summers, 2000), Markov analysis (Bukowski & Goble, 1995), and simplified equation (SE)
(Oliveira & Abramovitch, 2010). Rouvroye and Brombacher compared some of these techniques and concluded
that different analysis techniques may lead to different calculated SILs and that Markov analysis covers
most aspects of quantitative safety evaluation (Rouvroye & Brombacher, 1999). Guo and Yang also pointed
out that Markov analysis shows more flexibility and is the only technique that can describe dynamic
transitions among different system states (Guo & Yang, 2008). Jin et al. utilized Markov analysis to
calculate hazardous event frequency (HEF), which also relates to the reliability of SIS
(Jin, Lundteigen, & Rausand, 2011).

Table 1
SIL vs. PFDavg for low demand mode.

SIL       SIL1              SIL2              SIL3              SIL4
PFDavg    10^-2 to 10^-1    10^-3 to 10^-2    10^-4 to 10^-3    10^-5 to 10^-4
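
The bands in Table 1, together with the additive rule for the three parts of a SIS, can be sketched in a few lines of Python. The function names, the made-up PFDavg values of the parts, and the choice to treat each band as closed at its lower bound and open at its upper bound are assumptions made for illustration; they are not taken from the paper.

```python
# Table 1 bands for low demand mode: (SIL, lower bound, upper bound) on PFDavg.
SIL_BANDS = [
    (1, 1e-2, 1e-1),
    (2, 1e-3, 1e-2),
    (3, 1e-4, 1e-3),
    (4, 1e-5, 1e-4),
]


def sil_from_pfd_avg(pfd_avg):
    """Return the SIL whose Table 1 band contains pfd_avg, or None if no band does.

    Assumption: a band includes its lower bound and excludes its upper bound.
    """
    for sil, low, high in SIL_BANDS:
        if low <= pfd_avg < high:
            return sil
    return None


def sis_pfd_avg(sensor, logic, actuator):
    """PFDavg of the whole SIS as the sum of its parts (independence assumed)."""
    return sensor + logic + actuator


if __name__ == "__main__":
    # Hypothetical PFDavg values for the three parts, for illustration only.
    total = sis_pfd_avg(sensor=4.0e-4, logic=5.0e-5, actuator=9.0e-4)
    print(total, sil_from_pfd_avg(total))
```

Summing first and classifying afterwards matters: in this made-up example each part on its own falls in the SIL3 band or better, while the sum falls in the SIL2 band.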

Although Markov analysis has the advantages described above, it is very time-consuming to build the
Markov model for a large and complex SIS manually because of the large number of system states that need
to be considered. It has therefore been widely recognized that the design of Markov models for a complex
SIS architecture is difficult and error prone (Dutuit, Innal, Rauzy, & Signoret, 2008).
Torres-Echeverria et al. developed a time-dependent PFD algorithm to model SIS with MooN architectures
(Torres-Echeverria, Martorell, & Thompson, 2009, 2011). Their algorithm avoided state/transition
approaches by calculating the probability of each type of failure separately, and thus did not suffer
from excessive growth in complexity. Section 2 discusses how it differs from Markov analysis.
To make use of the advantages of Markov analysis while avoiding its great time consumption, Guo and Yang
developed a technique that creates Markov models automatically with a computer program
(Guo & Yang, 2008). However, the large size of the model caused by the large number of states makes it
difficult to read and to revise manually whenever necessary.
Differing from the methods above, we present an approach in this paper that simplifies the Markov
modeling, based on the idea that the Markov modeling of each component of the whole system can be
performed individually. With the simplified approach, building huge Markov models can be avoided.
Meanwhile, the results of our simplified approach show good agreement with those of the full Markov
models.
2. Conventional Markov analysis
Markov analysis is based on the process of transitions between system states. The probability of each
particular transition is assumed to be constant, so a Markov process has the characteristic that the
future status of the system depends directly on its current status rather than on its history. In Markov
analysis the transitions between system states are considered simultaneously. As a result, when the rate
of undetected safe failures is large, it leads to PFD results that differ significantly from those of
Torres-Echeverria's approach (Torres-Echeverria et al., 2009, 2011), which didn't consider the safe
failure rates in calculating PFD.
For example, suppose there is a 1oo1 architecture where safe and dangerous failures cannot be detected,
i.e. λDD = λSD = 0. Without loss of generality, we may assume there are three 1oo1 architectures with
λDU = 1e-6 h^-1, and λSU = 1e-6, 1e-5 and 1e-4 h^-1 respectively. By using Torres' algorithm
(Torres-Echeverria et al., 2011) and the Markov analysis algorithm for the 1oo1 architecture in the
appendix, the time-dependent PFDs during the first 12 months before the first proof test can be obtained
for the above three 1oo1 architectures. Fig. 1 shows the PFD comparisons between Torres' algorithm and
the Markov analysis algorithm for the three architectures.
Since Torres' algorithm didn't consider safe failures in calculating PFD, its PFD calculation results
don't change with λSU. From Fig. 1, it can be concluded that for architectures with small λSU, the PFD
calculation results from Torres' algorithm and Markov analysis are similar, while for those with large
λSU, their PFD calculation results are significantly different.
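
The effect can be illustrated with two closed-form expressions. This is a minimal sketch and not Torres-Echeverria's algorithm: the first function simply ignores safe failures, and the second assumes that the undetected safe and undetected dangerous states are both absorbing until the first proof test and that the hourly chain can be approximated by a continuous-time one. The function names and the 744-hour month (the convention of the appendix) are choices made for the example.

```python
import math

HOURS_PER_MONTH = 744  # 31-day month, as assumed in the appendix


def pfd_ignoring_safe_failures(l_du, hours):
    """PFD of a 1oo1 channel when only undetected dangerous failures are modeled."""
    return 1.0 - math.exp(-l_du * hours)


def pfd_with_safe_failures(l_du, l_su, hours):
    """PFD when undetected safe and dangerous failures compete for the normal state.

    Assumption: both failed states are absorbing until the next proof test.
    """
    total = l_du + l_su
    return l_du / total * (1.0 - math.exp(-total * hours))


if __name__ == "__main__":
    hours = 12 * HOURS_PER_MONTH
    for l_su in (1e-6, 1e-5, 1e-4):
        without = pfd_ignoring_safe_failures(1e-6, hours)
        competing = pfd_with_safe_failures(1e-6, l_su, hours)
        print(f"lambda_SU={l_su:.0e}: ignoring safe={without:.3e}, competing={competing:.3e}")
```

Under these assumptions a channel that has already failed safe can no longer fail dangerously, so a large λSU drains the normal state and lowers the probability of sitting in the dangerous state; the first expression cannot show this because λSU does not appear in it.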
Markov analysis takes account of the state transition processes not only of failures, but also of repairs
and periodic proof tests. This feature inherently makes Markov analysis more precise in analyzing the
time-dependent process of the state transitions and more flexible in evaluating different safety indices,
including PFD, probability of failing safely (PFS), mean time before failure (MTBF) and so on.

Fig. 1. PFD calculation result comparison between Torres' algorithm and the Markov analysis (MA)
algorithm. [Plot of PFD, with the corresponding SIL, against t/month over the first 12 months; one curve
for Torres' algorithm and one MA curve per λSU value.]

However, as conventional Markov analysis always conducts a full Markov analysis (FMA), which includes all
states of the whole SIS in one transition matrix, it suffers from great difficulty in matrix building
when the system becomes complex. For example, for a MooN architecture, which is commonly used to improve
the reliability of a SIS, the number of possible intermediate states grows quickly as N, the number of
channels, increases. As a result, the matrix model becomes quite big when N is large. Table 2 shows how
the size of the matrix model derived from the literature (Guo & Yang, 2008) increases explosively as the
system becomes more complex.

3. Simplified Markov analysis (SMA)

3.1. Introduction to the simplified approach
Conventional Markov analysis tries to include the whole voting group and suffers from the exploding size
of the model as the number of channels in the group increases. To overcome this difficulty, we assume
that each channel can be treated separately and that the failure probabilities of all channels can then
be combined. Based on this assumption, the Markov analysis can be simplified. A critical step of our
simplified approach is the consideration of common cause failure (CCF). A CCF can be a systematic failure
or a random hardware failure. It leads to a dependent failure in the MooN architecture and makes the
whole architecture less reliable (Boercsoek & Holub, 2008).
Table 2
The size of the matrix model in reference (Guo & Yang, 2008).

Type of MooN    Size of matrix model
1oo1            4-by-4
1oo2            7-by-7
2oo2*           13-by-13
2oo3            23-by-23
3oo4*           54-by-54
6oo6*           204-by-204

*Deduced based on reference (Guo & Yang, 2008) by the authors of this paper.

Knegtering and Brombacher developed the micro Markov model, which is similar to ours
(Knegtering & Brombacher, 1999). Their model divided the whole Markov model into small ones according to
the reliability block diagram and combined the results on the assumption of independent events. Although
their method was viable in reducing the model size, the impact of CCF was ignored, which makes the
accuracy of their modeling results questionable.

3.2. Procedure of the simplified approach

In our simplified approach, to avoid building the complex transition matrix, the time-dependent PFD
calculation of each part of a SIS is decomposed into two steps: Step 1, conducting analysis on each
channel as a 1oo1 system, and Step 2, calculating the time-dependent PFD of the whole MooN architecture.
Our simplified approach to the calculation of PFDavg therefore consists of the following three steps.

3.2.1. Step 1: conducting conventional Markov analysis on each channel
Each channel of the MooN architecture is treated as a 1oo1 architecture, and Markov analysis is conducted
on it to generate its time-dependent failure process. Hence only a 5-by-5 matrix needs to be built no
matter how large M and N are. Details of Markov analysis on a 1oo1 architecture are given in the appendix
of this paper. From this simple Markov analysis of the 1oo1 architecture, λ(t), the time-dependent
failure probability of each channel, can be obtained.

3.2.2. Step 2: calculating the time-dependent failure probability for the whole MooN architecture
To take into account the effect of CCF, a CCF model is necessary to combine the failure probabilities of
all the channels in a MooN configuration. In our proposed approach, the CCF model is applied directly to
the combination of the time-dependent PFDs obtained in Step 1.
In IEC 61508, the single β-factor model was introduced to handle CCF. In the single β-factor model, the
factor β represents the ratio of the CCF rate to the total failure rate, and it can be estimated as
suggested in IEC 61508. However, the single β-factor model doesn't distinguish between different voting
architectures, and the same result is obtained, e.g. for 1oo2, 1oo3 and 2oo3
(Hokstad & Corneliussen, 2004). For CCF occurring in different voting architectures, the multi-β factor
model gives a more accurate description. In the multi-β factor model, CMooN, a modification factor of the
β-factor for different MooN architectures, has to be considered. Based on Hokstad's work
(Hokstad & Corneliussen, 2004; Hokstad, Maria, & Tomis, 2006), values of CMooN for a series of MooN
architectures are shown in Table 3.
Then with the multi-β factor model, our simplified approach gives the time-dependent PFD of any MooN
architecture as:

\[
\mathrm{PFD}(t) = C_{MooN} \cdot \beta \cdot \lambda(t) \tag{1}
\]

Table 3
Values of CMooN of different MooN voting architectures.

        M=1     M=2        M=3          M=4           M=5          M=6
N=2     1       2/β - 1    -            -             -            -
N=3     0.3     2.4        3/β - 2.7    -             -            -
N=4     0.15    0.75       4.0          4/β - 4.95    -            -
N=5     0.08    0.45       1.2          6.0           5/β - 7.7    -
N=6     0.04    0.26       8.3          1.6           8.1          6/β - 10.8

3.2.3. Step 3: calculating PFDavg of each part of the SIS
In Step 2, PFD(t) represents the failure probability of a certain part of the SIS (sensor, logic
controller or actuator) at each month. The average PFD of each part of the SIS can be derived by:

\[
\mathrm{PFD}_{avg} = \frac{1}{12\,LT} \sum_{t=1}^{12\,LT} \mathrm{PFD}(t) \tag{2}
\]

4. Case study

A case study is performed in this section to illustrate the effectiveness of the above simplified
approach. The result of the approach is compared with the result of the conventional Markov approach. The
matrix model for a full Markov analysis (FMA) is built by using the automatic Markov analysis approach
described in the literature (Guo & Yang, 2008).
In that FMA, only the single β-factor model was applied. To keep the CCF model consistent, the multi-β
factor model is applied to both FMA and SMA in Section 4.2.

4.1. Parameter initiation
In the architectures discussed in this section, we assume all the channels are identical. The relevant
parameters are set as follows:

λDD = 1e-6 h^-1
λDU = 1e-7 h^-1
λSD = 1e-6 h^-1
λSU = 1e-7 h^-1
β = 0.1
MTTR = 24 h
TSD = 24 h
LT = 5 years
TI = 12 months
CTI = 0.9

4.2. Comparison of PFD results
To obtain a comparison between FMA and SMA with the multi-β factor model, the CCF-related elements in the
transition matrix in the Markov model of FMA are replaced by the failure rates calculated based on the
multi-β factor model.
Based on FMA and SMA with Equation (1) applied, the time-dependent PFDs are obtained and shown in Fig. 2.
The PFDavg values for different voting architectures derived from FMA and SMA using Equation (2) are
listed in Table 4. According to the comparison results in Fig. 2 and Table 4, the proposed SMA shows high
agreement with FMA on MooN architectures with the multi-β factor model applied.

Fig. 2. Results of FMA (red stars) and SMA (black lines) when the multi β-factor model is used. (For
interpretation of the references to color in this figure legend, the reader is referred to the web
version of this article.) [Plot of PFD against t/year, with one curve per voting architecture: 6oo6,
4oo4, 3oo3, 2oo2, 1oo1, 5oo6, 3oo4, 2oo3, 1oo2, 2oo4, 3oo6, 1oo3, 1oo4, 1oo6.]

Table 4
PFDavg result comparison with the multi β-factor model applied.

Voting architecture    FMA        SMA        Relative deviation (%)
1oo1                   5.84e-4    5.84e-4    0
1oo2                   5.88e-5    5.84e-5    0.68
2oo2                   1.11e-3    1.11e-3    0
1oo3                   1.76e-5    1.75e-5    0.57
2oo3                   1.41e-4    1.40e-4    0.71
3oo3                   1.59e-3    1.60e-3    0.63
1oo4                   8.78e-6    8.76e-6    0.22
2oo4                   4.40e-5    4.38e-5    0.45
3oo4                   2.38e-4    2.37e-4    0.42
4oo4                   2.05e-3    2.05e-3    0
1oo6                   2.34e-6    2.34e-6    0
3oo6                   4.27e-5    4.27e-5    0
5oo6                   4.72e-4    4.69e-4    0.64
6oo6                   2.87e-3    2.87e-3    0

4.3. Summary
Although the simplified modeling of SMA omits the intermediate states of the MooN architecture, the
results still show high agreement with those of the full modeling of FMA if a proper CCF model is
applied. In our opinion, this is caused by the following feature of SIS.
In common practice, a SIS usually stays in the normal state, which means the probability of the normal
state is much higher than that of the other states. As a result, if the normal state has a transition
probability to the dangerous failure state in a full Markov matrix, that transition probability will be
the dominant contribution to the calculated PFD, and other transition paths can be ignored without
significant loss of accuracy. The range of parameters within which our simplification keeps good
agreement with the full modeling still needs further study.

5. Conclusion
SIL verification is critical for assuring the reliability of SIS. In the CPI, the calculation of PFDavg
is an important and difficult part of SIL verification. Markov analysis shows great advantages in
flexibility and in its ability to describe the time-dependent PFD, but the size of Markov models
increases explosively as systems become complex. This paper presents a simplified three-step Markov
analysis approach that avoids building large Markov models. The presented approach simplifies the
modeling and calculation of Markov-based SIL verification without loss of accuracy, and makes it more
convenient to apply different CCF models or other failure-related assumptions.

Acknowledgments
The authors gratefully acknowledge financial support from the National Basic Research Program of China
(973 Program, Grant No. 2012CB720500) and the National High-Tech R&D Program of China (863 Program)
(No. 2013AA040702).

Appendix. Markov analysis for 1oo1 architecture

The transition matrix of 1oo1 between proof tests can be built as:

\[
P = \begin{pmatrix}
1-\Sigma & \lambda_{SD} & \lambda_{SU} & \lambda_{DD} & \lambda_{DU} \\
1/T_{SD} & 1-\Sigma & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
1/MTTR & 0 & 0 & 1-\Sigma & 0 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix} \tag{3}
\]

Σ means the sum of the other elements on the same row. The statuses in sequence are:
1. Normal;
2. Safe failure detected;
3. Safe failure undetected;
4. Dangerous failure detected;
5. Dangerous failure undetected.

The transition matrix during a proof test with the same sequence of statuses is

\[
W = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
C_{TI} & 0 & 1-C_{TI} & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
C_{TI} & 0 & 0 & 0 & 1-C_{TI}
\end{pmatrix} \tag{4}
\]

Considering there are 24 h each day, if we assume each month has 31 days, the monthly transition matrix
is

\[
P_m = P^{744} \tag{5}
\]

The initial status is assumed to be normal, so the initial probability distribution of the statuses is
S_0 = [1 0 0 0 0]. Then the probability distribution of the statuses at month t can be calculated by:

\[
S(t) = \begin{cases}
S_0 \left(P_m^{TI}\, W\right)^{n} P_m^{r}, & r \neq 0 \\
S_0 \left(P_m^{TI}\, W\right)^{n-1} P_m^{TI}, & r = 0
\end{cases} \tag{6}
\]

where r = t mod TI and n = (t - r)/TI.
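
As a worked instance of Eq. (6), take TI = 12 months, the value used in Section 4.1. The four months below are chosen only for illustration; they show how r and n select the branch and the number of proof tests already applied.

```latex
\begin{align*}
t = 5:  &\quad r = 5,\ n = 0, \quad S(5)  = S_0\, P_m^{5} \\
t = 12: &\quad r = 0,\ n = 1, \quad S(12) = S_0\, P_m^{12} \\
t = 17: &\quad r = 5,\ n = 1, \quad S(17) = S_0 \left(P_m^{12}\, W\right) P_m^{5} \\
t = 24: &\quad r = 0,\ n = 2, \quad S(24) = S_0 \left(P_m^{12}\, W\right) P_m^{12}
\end{align*}
```

In the r = 0 branch the distribution is evaluated at the end of a test interval, before the proof test matrix W of that month is applied, which is why the exponent of the bracket is n - 1 rather than n.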


PFD corresponds to the detected and undetected dangerous failure, so it can be calculated by:

\[
\lambda(t) = S(t) \cdot V \tag{7}
\]

where

\[
V = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 \end{pmatrix}^{T} \tag{8}
\]
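
The appendix equations, combined with Eqs. (1) and (2), can be sketched in dependency-free Python as follows. This is an illustrative reading of the equations rather than the authors' implementation: the function names are invented for the example, the blank matrix entries are taken to be zero, and the month-by-month loop is an assumed but equivalent way of evaluating Eq. (6). With the parameters of Section 4.1 the 1oo1 average should come out near the 5.84e-4 listed in Table 4.

```python
def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(m, power):
    """Integer matrix power by repeated squaring."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    while power > 0:
        if power & 1:
            result = matmul(result, m)
        m = matmul(m, m)
        power >>= 1
    return result


def vecmat(v, m):
    """Multiply a row vector by a matrix."""
    return [sum(v[k] * m[k][j] for k in range(len(v))) for j in range(len(m[0]))]


def hourly_matrix(l_sd, l_su, l_dd, l_du, t_sd, mttr):
    """Eq. (3); states in order: normal, SD, SU, DD, DU."""
    p = [[0.0, l_sd, l_su, l_dd, l_du],
         [1.0 / t_sd, 0.0, 0.0, 0.0, 0.0],
         [0.0] * 5,
         [1.0 / mttr, 0.0, 0.0, 0.0, 0.0],
         [0.0] * 5]
    for i in range(5):
        p[i][i] = 1.0 - sum(p[i])  # diagonal = 1 minus the other entries of the row
    return p


def proof_test_matrix(c_ti):
    """Eq. (4): a proof test reveals undetected failures with coverage c_ti."""
    return [[1.0, 0.0, 0.0, 0.0, 0.0],
            [1.0, 0.0, 0.0, 0.0, 0.0],
            [c_ti, 0.0, 1.0 - c_ti, 0.0, 0.0],
            [1.0, 0.0, 0.0, 0.0, 0.0],
            [c_ti, 0.0, 0.0, 0.0, 1.0 - c_ti]]


def channel_failure_probability(p, w, ti_months, lifetime_years):
    """Return [lambda(1), ..., lambda(12*LT)] following Eqs. (5)-(8)."""
    p_month = matpow(p, 744)            # Eq. (5): 31 days of 24 hours
    state = [1.0, 0.0, 0.0, 0.0, 0.0]   # S0: the channel starts in the normal state
    result = []
    for t in range(1, 12 * lifetime_years + 1):
        state = vecmat(state, p_month)
        result.append(state[3] + state[4])  # Eq. (7) with V = [0 0 0 1 1]^T
        if t % ti_months == 0:
            state = vecmat(state, w)        # proof test applied after sampling, as in Eq. (6)
    return result


def moon_pfd_avg(channel, c_moon, beta):
    """Eq. (1) applied month by month, then averaged as in Eq. (2)."""
    return sum(c_moon * beta * lam for lam in channel) / len(channel)


if __name__ == "__main__":
    p = hourly_matrix(l_sd=1e-6, l_su=1e-7, l_dd=1e-6, l_du=1e-7, t_sd=24.0, mttr=24.0)
    w = proof_test_matrix(c_ti=0.9)
    lam = channel_failure_probability(p, w, ti_months=12, lifetime_years=5)
    print("1oo1 PFDavg:", sum(lam) / len(lam))
    print("1oo2 PFDavg:", moon_pfd_avg(lam, c_moon=1.0, beta=0.1))
    print("2oo3 PFDavg:", moon_pfd_avg(lam, c_moon=2.4, beta=0.1))
```

Only the 5-by-5 channel model is ever built; moving to another voting architecture changes nothing but the CMooN value taken from Table 3.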

References
Boercsoek, J., & Holub, P. (2008). Consideration of common cause failure in safety systems. In WSEAS Conference on Recent Advances in Systems, Communication and Computers, Hangzhou, China.
Bukowski, J., & Goble, W. (1995). Using Markov models for safety analysis of programmable electronic systems. ISA Transactions, 34, 193-198.
Dutuit, Y., Innal, F., Rauzy, A., & Signoret, J.-P. (2008). Probabilistic assessments in relationship with safety integrity levels by using fault trees. Reliability Engineering and System Safety, 93, 1867-1876.
Guo, H., & Yang, X. (2007). A simple reliability block diagram method for safety integrity verification. Reliability Engineering and System Safety, 92, 1267-1273.
Guo, H., & Yang, X. (2008). Automatic creation of Markov models for reliability assessment of safety instrumented systems. Reliability Engineering and System Safety, 93, 807-815.
Hokstad, P., & Corneliussen, K. (2004). Loss of safety assessment and IEC 61508 standard. Reliability Engineering and System Safety, 83, 111-120.
Hokstad, P., Maria, A., & Tomis, P. (2006). Estimation of common cause factor from systems with different number of channels. IEEE Transactions on Reliability, 55, 18-25.
Jin, H., Lundteigen, M. A., & Rausand, M. (2011). Reliability performance of safety instrumented systems: a common approach for both low- and high-demand mode of operation. Reliability Engineering and System Safety, 96, 365-373.
Knegtering, B., & Brombacher, A. (1999). Application of micro Markov models for quantitative safety assessment to determine safety integrity levels as defined by the IEC 61508 standard for functional safety. Reliability Engineering and System Safety, 66, 171-175.
Lu, L., & Lewis, G. (2008). Configuration determination for k-out-of-n partially redundant systems. Reliability Engineering and System Safety, 93, 1594-1604.
Lundteigen, M., & Rausand, M. (2009). Architectural constraints in IEC 61508: do they have the intended effect? Reliability Engineering and System Safety, 94, 520-525.
Oliveira, L., & Abramovitch, R. (2010). Extension of ISA TR84.00.02 PFD equation to KooN architectures. Reliability Engineering and System Safety, 95, 707-715.
Rouvroye, J., & Brombacher, A. (1999). New quantitative safety standards: different techniques, different results? Reliability Engineering and System Safety, 66, 121-125.
Summers, A. (2000). Viewpoint on ISA TR84.0.02 - simplified methods and fault tree analysis. ISA Transactions, 39, 125-131.
Torres-Echeverria, A. C., Martorell, S., & Thompson, H. A. (2009). Modelling and optimization of proof testing policies for safety instrumented systems. Reliability Engineering and System Safety, 94, 834-854.
Torres-Echeverria, A. C., Martorell, S., & Thompson, H. A. (2011). Modeling safety instrumented systems with MooN voting architectures addressing system reconfiguration for testing. Reliability Engineering and System Safety, 96, 545-563.
Vaurio, J. (2011). Unavailability equations for k-out-of-n systems. Reliability Engineering and System Safety, 96, 350-352.
Wang, Y., West, H. H., & Mannan, M. S. (2004). The impact of data uncertainty in determining safety integrity level. Process Safety and Environmental Protection, 82, 393-397.

Glossary
CCF: common cause failure
CMooN: modification factor of the β factor in the multi-β factor model, unitless
CkN: the number of k-combinations from a set of N elements, unitless
CTI: coverage of proof test, unitless
FTA: fault tree analysis
FMA: full Markov analysis
HFT: hardware fault tolerance
IEC: International Electrotechnical Commission
LT: lifetime, year
mod: modulo operator
MTTR: mean time to repair, h
MooN: M out of N voting architecture
MTBF: mean time before failure, h
PFD: probability of failure on demand, unitless
PFDavg: average PFD, unitless
PFD(t): time-dependent PFD, unitless
PFS: probability of failing safely, unitless
SIL: safety integrity level
SIS: safety instrumented system
SMA: simplified Markov analysis
t: time, month
TI: proof test interval, month
TSD: time to start up, h
β: β factor in the β-factor model for quantification of CCF, unitless
λ(t): time-dependent probability of 1oo1 failure
λDD: rate of dangerous failure detected, h^-1
λDU: rate of dangerous failure undetected, h^-1
λSD: rate of safe failure detected, h^-1
λSU: rate of safe failure undetected, h^-1
