
7 Fault Tree Analysis

Don't meet troubles halfway.
Sixteenth-century proverb

Dig a well before you are thirsty.
Chinese proverb

Nothing is so easy as to deceive one's self; for what we wish, that we readily believe.
Demosthenes, Orations, Vol. 1, 349 BCE

Fault tree analysis (FTA) is a graphical method commonly used in both reliability engineering and system safety engineering (though it is better known in reliability circles). It is a deductive approach that is very powerful as a qualitative analysis tool and that can also be quantified. You postulate a top event (or fault), such as train derailment, then branch down from the top event, listing the faults in the system that must occur for the top event to occur. This top-down method forces you to go through the system systematically, listing the various sequential and parallel events or combinations of faults that must occur for the undesired top event. Logic gates and standard Boolean algebra allow you to quantify the fault tree with event probabilities and thus determine the probability of the top event.
It is important to understand that this is not a model of all possible system failures or all possible causes, but rather a model of particular system failure modes and their constituent faults that lead to the top event. Not all system or component failures are listed, only the ones leading to the top event. Like the other safety analysis techniques discussed previously, only credible faults are assessed. The faults can be events associated with component hardware failures, software glitches, human errors, and environmental conditions; in short, any of the elements that make up the complete system.
The fault tree was first developed in 1961 for the U.S. military intercontinental
missile program. The U.S. Nuclear Regulatory Commission published a guide in
1981, and since then, FTA has been used in almost every engineering discipline
around the world, from mass transit to commercial nuclear power plants, chemical
process plants, oil drilling platforms, NASA satellites, and aircraft control centers.
Fault trees are used extensively in accident investigation. NASA used fault trees to recreate the events that led up to the Challenger and Columbia Space Shuttle accidents. Fault trees, combined with event trees and other root cause analyses, have been used very effectively in accident investigation, including the investigation of a plutonium spill at a National Institute of Standards and Technology laboratory in Boulder, Colorado.

206 System Safety Engineering and Risk Assessment: A Practical Approach

Dynamic FTA is used more commonly in computer systems fault analysis and
involves employing Markov analysis to generate the tree. Dynamic fault trees are
also frequently used to model fault-tolerant systems. The challenge is that the size of
the tree grows very quickly and can be very cumbersome to manipulate.
NASA succinctly defines the process of conducting an FTA (Stamatelatos et al., 2002):

1. Identify the objective for the FTA: Determines what the engineer wants to know before starting the analysis
2. Define the top event of the fault tree: States the end result that is being investigated and should give the information needed to meet the objective defined in Step 1 (it defines the fault mode of the system)
Downloaded by [Wayne State University] at 13:03 16 August 2016

3. Determine the scope of the FTA: Bounds how far the analysis should go and determines which faults will be included and their boundary conditions
4. Define the resolution of the FTA: Details the level of fault causes that will be followed to reach the top event
5. Define the ground rules of the FTA: Determines the naming scheme for the analysis and how the fault tree will be modeled
6. Construct the fault tree: Builds the actual fault tree (graphically and logically)
7. Evaluate the fault tree: Conducts quantitative and qualitative analysis of the fault tree through cut sets and Boolean algebra
8. Interpret and present results: Explains to the reader what all this means (this is the most important part of the analysis; results must be put into a context that makes sense and is understandable)

7.1 FAULT TREE SYMBOLS AND LOGIC


The fault tree uses logic gates to describe graphically how the top event occurs. You
read a fault tree from the top event down to the constituent events. The higher gates
are the outputs from lower gates in the tree. Therefore, the top event is the output of
all the input faults or events that occur.
First, it is important to understand the difference between fault and failure. Simply
put, failure means that something has broken. Fault means that something does not
perform the action you desire, even though it operates as designed. For example, a
valve closing is a fault if it occurs at the wrong time due to the improper functioning
of some upstream component or human error. But the same valve can fail closed due
to seizing of the poppet. Rupture of a pressure vessel is a component failure. You can
say that all failures are faults, though not all faults are failures.

NOTES FROM NICK'S FILE


Spend time clearly understanding the difference between fault and failure.
When I review fault trees, that is one of the biggest mistakes that I frequently
find. I have seen many fault trees filled with failures but few faults.

The system or subsystem fault is an undesirable state of existence of the system or subsystem. That system fault (the top event) is composed of component (or subsystem) faults. Whether the FTA faults are taken to the component, black box, or subsystem level depends on the level of detail you wish to reach. The fault tree of a nuclear plant is very large even at the subsystem level. However, it would be advantageous to go to the component fault level to analyze the plant safety subsystems.

NOTES FROM NICK'S FILE


Fault trees are very diverse and can be used in many ways. They are one of my favorite safety analysis tools. I have used them for such diverse activities as understanding integrity management of an upstream oil pipeline system, employee and management actions taken during a plutonium spill at a laboratory, and the Sydney, Australia, Waterfall rail accident investigation.

The component fault is the state of existence of that component that contributes to the mechanism that leads to the next-level fault. In understanding what the component fault is, it is important to consider what state the component is in and when it is in that state. Component faults are composed of primary, secondary, and command faults. However, most primary and secondary faults are component failures, so they are usually called primary and secondary failures.
A primary failure is a failure that occurs under normal operating and environmental conditions. A secondary failure is a failure that occurs outside of normal conditions. A command fault occurs when a component performs as designed but produces the output signal at the wrong time. Roberts et al. (1981) demonstrate command faults with a humorous story from the American Civil War.
It appears that General Beauregard had sent his courier to deliver a message to one of the commanders in the field. The battle situation changed, and sometime later, the general sent another message with updated information. The battle situation changed again, and the general amended the previous messages with a third. The messages all arrived (as designed) to the commander in the field, but in the wrong order. Because the messages arrived in the incorrect order, that fault caused the battle commander to take the wrong action, with disastrous results.
Fault tree symbols are divided into four categories: primary event, intermediate
event, gate, and transfer. Figure 7.1 defines each of the symbols used in fault tree
generation.
Primary events are end events; in other words, for one reason or another, they are not studied further. For example, the circle or basic event describes a fault that is an initiating event itself and has no inputs depicted in the fault tree. Some examples are as follows: K1 timer contacts inadvertently open; K2 relay contacts fail to close; battery 2A is 0 V; or pressure switch contacts fail to open.
An ellipse, or conditioning event, is a sort of message bubble that records any
conditions or restrictions that apply to any of the logic gates. This symbol is used
primarily with INHIBIT and PRIORITY AND gates.
Primary event symbols
BASIC EVENT: A basic initiating fault requiring no further development
CONDITIONING EVENT: Specific conditions or restrictions that apply to any logic gate (used primarily with PRIORITY AND and INHIBIT gates)
UNDEVELOPED EVENT: An event which is not further developed either because it is of insufficient consequence or because information is unavailable
EXTERNAL EVENT: An event which is normally expected to occur

Intermediate event symbols
INTERMEDIATE EVENT: A fault event that occurs because of one or more antecedent causes acting through logic gates

Gate symbols
AND: Output fault occurs if all of the input faults occur
OR: Output fault occurs if at least one of the input faults occurs
EXCLUSIVE OR: Output fault occurs if exactly one of the input faults occurs
PRIORITY AND: Output fault occurs if all of the input faults occur in a specific sequence (the sequence is represented by a CONDITIONING EVENT drawn to the right of the gate)
INHIBIT: Output fault occurs if the (single) input fault occurs in the presence of an enabling condition (the enabling condition is represented by a CONDITIONING EVENT drawn to the right of the gate)

Transfer symbols
TRANSFER IN: Indicates that the tree is developed further at the occurrence of the corresponding TRANSFER OUT (e.g., on another page)
TRANSFER OUT: Indicates that this portion of the tree must be attached at the corresponding TRANSFER IN

FIGURE 7.1 Fault tree symbols. (From Roberts, N.H. et al., Fault Tree Handbook, NUREG-0492, U.S. Nuclear Regulatory Commission, Washington, DC, 1981, p. IV-3.)

A diamond describes an undeveloped event for which no further analysis is required or for which information is unavailable to develop the event further. Undeveloped events are de facto boundary conditions to the problem. Any fault can be left undeveloped if you cannot ascertain how it occurs (what its inputs are) or if its inputs are not important to the next event.

An external event is a normal event. External events can be thought of as assumptions in the graphical analysis: gravity as the predominant force, air (with sufficient oxygen to sustain a fire) in a warehouse, etc.
Intermediate events are fault events that result from one or more antecedent causes acting through logic gates. Various input faults feed into an intermediate event, which itself feeds into the next-level-up fault. A typical intermediate event is that motor 1 fails to start. The various faults that can prevent the motor from starting are the inputs to the intermediate event. The fact that motor 1 does not start is itself an input to the next-up fault that the pump does not operate.
The next group of symbols is the fault tree graphic operators: the logic gates. AND indicates that all the faults feeding into the AND gate must occur for the output fault to occur. For example, to have an overheated wire (an intermediate event), there must be a 5 mA current in the system, AND power must be applied to the system for t > 1 ms.


An OR gate is the opposite of an AND gate: any one of the input events occurring results in the output fault. For example, for the P-2 line not to receive any water, V-3 OR V-5 OR pump 3 must fail. The other gates are variations of the AND and OR gates and are not used as frequently.
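The gate definitions can be sketched as small Boolean functions. This is only an illustration under stated assumptions: the function names are hypothetical, and representing a PRIORITY AND input as an occurrence time (or None if the fault never occurred) is a choice made for this sketch, not something prescribed by the handbook.

```python
from typing import Optional, Sequence


def and_gate(inputs: Sequence[bool]) -> bool:
    """Output fault occurs only if all of the input faults occur."""
    return all(inputs)


def or_gate(inputs: Sequence[bool]) -> bool:
    """Output fault occurs if at least one of the input faults occurs."""
    return any(inputs)


def exclusive_or_gate(inputs: Sequence[bool]) -> bool:
    """Output fault occurs if exactly one of the input faults occurs."""
    return sum(1 for fault in inputs if fault) == 1


def priority_and_gate(occurrence_times: Sequence[Optional[float]]) -> bool:
    """Output fault occurs if all input faults occur in the listed order.

    Each entry is the time at which that input fault occurred, or None if it
    never occurred (this time-stamp encoding is an assumption of the sketch).
    """
    if any(t is None for t in occurrence_times):
        return False
    return all(a < b for a, b in zip(occurrence_times, occurrence_times[1:]))


def inhibit_gate(input_fault: bool, enabling_condition: bool) -> bool:
    """Output fault occurs if the single input fault occurs while the
    enabling (conditioning) event is present."""
    return input_fault and enabling_condition


# The two examples from the text, with made-up input states.
overheated_wire = and_gate([True, True])          # 5 mA current AND power for t > 1 ms
no_water_in_p2 = or_gate([False, True, False])    # V-3 OR V-5 OR pump 3 fails
print(overheated_wire, no_water_in_p2)
```

Writing the gates this way makes the difference visible: the AND gate needs every input to be true, while a single true input is enough for the OR gate.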
Triangles are used to depict transfer gates. The transfer in and transfer out gates are used to indicate a continuation of the fault tree onto another sheet of paper. Figure 7.2 shows a high-level fault tree for a maglev (magnetic levitation) train. The top event is as follows: Train comes to a sudden stop. The next tier down states that if any of the four intermediate events occurs, it will lead to the top event. Transfer gate A under the "vehicle leaves guideway" event indicates that that fault is further developed on another page. The diamonds are undeveloped events that the analyst felt did not need to be pursued any further for this study. However, that does not mean that sometime in the future the analyst may not wish to take one or more of those diamonds and investigate their faults.

[Figure 7.2: fault tree graphic. Top event: sudden stop occurs. Second-tier events: vehicle collides with guideway; vehicle leaves guideway; vehicle collision occurs; untimely braking occurs. Lower-tier events shown include loss of safe hover, unauthorized individual on guideway, other vehicle present on guideway, and debris on guideway.]

FIGURE 7.2 Fault tree of sudden stop of maglev train. (From Dorer, R.M. and Hathaway, W.T., Safety of High Speed Magnetic Levitation Transportation Systems: Preliminary Safety Review of the Transrapid Maglev System, DOT-VNTSC-FRA-90-3, U.S. Department of Transportation, Washington, DC, 1991, p. C-4.)
Fault trees are relatively easy to construct. However, there are a few rules that should be followed. The U.S. Nuclear Regulatory Commission's Fault Tree Handbook, NUREG-0492 (Roberts et al., 1981), though published many years ago, is still a classic and provides some good ground rules:

• Write the statements that are entered in the event boxes as faults; state precisely what the fault is and when it occurs (e.g., motor fails to start when power is applied).
• If the answer to the question, "Can this fault consist of a component failure?" is "Yes," classify the event as a state-of-component fault, add an OR gate below the event, and look for primary, secondary, and command modes.
• If the answer is "No," classify the event as a state-of-system fault, and look for the minimum necessary and sufficient immediate cause or causes. As a general rule, when energy originates outside the component, the event may be classified as state of the system.
• If the normal functioning of a component propagates a fault sequence, then it is assumed that the component functions normally. In other words, no miracles are allowed. If a fault is going to occur, it must occur.
• All the inputs to a particular gate should be completely defined before further analysis of any one of them is undertaken.
• Gate inputs should be properly defined fault events, and gates should not be connected directly to other gates. Many people shortcut the FTA by hooking the outputs of gates directly into another gate without describing the event. Do not do that. It is sloppy.
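Two of these ground rules (state what the fault is and when it occurs, and never connect a gate directly to another gate) can be built into the way a tree is stored. The following is a minimal sketch, not a standard FTA data model: the class name, field names, `check` function, and the motor example events are all hypothetical. Because a gate exists only as an attribute of a described event, one gate cannot feed another without an event statement in between.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FaultEvent:
    what: str                      # what the fault is
    when: str                      # when it occurs
    gate: Optional[str] = None     # "AND" or "OR"; None for a primary event
    inputs: List["FaultEvent"] = field(default_factory=list)


def check(event: FaultEvent) -> List[str]:
    """Return a list of ground-rule problems found in the tree under `event`."""
    problems = []
    if not event.what.strip() or not event.when.strip():
        problems.append(f"'{event.what}': state both what the fault is and when it occurs")
    if event.gate is None and event.inputs:
        problems.append(f"'{event.what}': inputs given but no gate defined")
    if event.gate is not None and not event.inputs:
        problems.append(f"'{event.what}': gate defined but its inputs are not")
    for child in event.inputs:
        problems.extend(check(child))
    return problems


# Hypothetical example: a state-of-component fault gets an OR gate below it.
motor = FaultEvent(
    what="Motor fails to start",
    when="when power is applied",
    gate="OR",
    inputs=[
        FaultEvent("Motor windings open", "when power is applied"),
        FaultEvent("No power at motor terminals", "when start is commanded"),
    ],
)
print(check(motor))
```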

7.2 FINDING CUT SETS


As stated earlier, the fault tree is a model of the system fault state. There are qualitative and quantitative tools to evaluate the tree. Qualitative analysis of fault trees is conducted through the use of cut sets and simple Boolean algebraic manipulation. Trees are quantified by applying probabilities or frequencies of occurrence to each event fault. The event faults are then combined through Boolean manipulation, and the top-event probability is determined. You may wish to review a math book and become familiar with Boolean algebra and probability theory. The U.S. Nuclear Regulatory Commission's Fault Tree Handbook (Roberts et al., 1981) and NASA's Fault Tree Handbook with Aerospace Applications (Stamatelatos et al., 2002) are excellent references as well.
In Boolean algebra, the OR gate represents the union of two or more events. The AND gate represents the intersection of two or more events.
In Figure 7.3, for the no current example, either B OR C must occur for event A to occur. In Boolean algebra, this can be written as

A = B + C

[Figure 7.3: two small fault trees. Left: event A (no current) is the output of an OR gate with inputs B (switch A open) and C (battery B 0 V). Right: event D (overheated wire) is the output of an AND gate with inputs E (5 mA current in system) and F (power applied t > 1 ms).]

FIGURE 7.3 Fault tree representation.

The same expression in set theory is A is B union C (A = B ∪ C). The overheated wire is stated as E AND F must occur for D to occur. Again, in Boolean algebra,

D = E * F

In set theory, D = E intersection F (D = E ∩ F).


Table 7.1 is a refresher of the rules for Boolean algebra. Because of space limitations, it is impossible to go into set theory, Venn diagrams, and probability theory; it is strongly recommended that you review a good book on probability before applying any of these rules.

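TABLE 7.1
Boolean Manipulation Rules
Algebraic Rule | Set Theory Representation | Engineering Representation
Commutative law | X ∩ Y = Y ∩ X | X * Y = Y * X
 | X ∪ Y = Y ∪ X | X + Y = Y + X
Associative law | X ∩ (Y ∩ Z) = (X ∩ Y) ∩ Z | X * (Y * Z) = (X * Y) * Z or X(YZ) = (XY)Z
 | X ∪ (Y ∪ Z) = (X ∪ Y) ∪ Z | X + (Y + Z) = (X + Y) + Z
Distributive law | X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z) | X(Y + Z) = XY + XZ
 | X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z) | X + Y * Z = (X + Y) * (X + Z)
Idempotent law | X ∩ X = X | X * X = X
 | X ∪ X = X | X + X = X
Law of absorption | X ∩ (X ∪ Y) = X | X * (X + Y) = X
 | X ∪ (X ∩ Y) = X | X + X * Y = X
Complementation | X ∩ X′ = ∅ | X * X′ = ∅
 | X ∪ X′ = Ω | X + X′ = Ω = 1
 | (X′)′ = X | (X′)′ = X
De Morgan's theorem | (X ∩ Y)′ = X′ ∪ Y′ | (X * Y)′ = X′ + Y′
 | (X ∪ Y)′ = X′ ∩ Y′ | (X + Y)′ = X′ * Y′
Other operations | ∅ ∩ X = ∅ | ∅ * X = ∅
 | ∅ ∪ X = X | ∅ + X = X
 | Ω ∩ X = X | Ω * X = X
 | Ω ∪ X = Ω | Ω + X = Ω
 | ∅′ = Ω | ∅′ = Ω
 | Ω′ = ∅ | Ω′ = ∅
Note: X′ denotes the complement of X, ∅ the empty set, and Ω the universal set.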
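Because each variable takes only two values, the rules of Boolean algebra can be checked exhaustively with a truth table. The sketch below is an illustration only: the `holds` helper and the `LAWS` dictionary are hypothetical names, and Python's `and`, `or`, and `not` stand in for *, +, and the complement.

```python
from itertools import product


def holds(law, n_vars):
    """True if `law` is satisfied for every True/False assignment of its variables."""
    return all(law(*values) for values in product((False, True), repeat=n_vars))


# (number of variables, identity to test)
LAWS = {
    "commutative (*)": (2, lambda x, y: (x and y) == (y and x)),
    "commutative (+)": (2, lambda x, y: (x or y) == (y or x)),
    "associative (*)": (3, lambda x, y, z: (x and (y and z)) == ((x and y) and z)),
    "associative (+)": (3, lambda x, y, z: (x or (y or z)) == ((x or y) or z)),
    "distributive (* over +)": (3, lambda x, y, z: (x and (y or z)) == ((x and y) or (x and z))),
    "distributive (+ over *)": (3, lambda x, y, z: (x or (y and z)) == ((x or y) and (x or z))),
    "idempotent (*)": (1, lambda x: (x and x) == x),
    "idempotent (+)": (1, lambda x: (x or x) == x),
    "absorption (*)": (2, lambda x, y: (x and (x or y)) == x),
    "absorption (+)": (2, lambda x, y: (x or (x and y)) == x),
    "complementation (*)": (1, lambda x: (x and (not x)) is False),
    "complementation (+)": (1, lambda x: (x or (not x)) is True),
    "double complement": (1, lambda x: (not (not x)) == x),
    "De Morgan (*)": (2, lambda x, y: (not (x and y)) == ((not x) or (not y))),
    "De Morgan (+)": (2, lambda x, y: (not (x or y)) == ((not x) and (not y))),
}

results = {name: holds(law, n_vars) for name, (n_vars, law) in LAWS.items()}
for name, ok in results.items():
    print(f"{name}: {'holds' if ok else 'FAILS'}")
```

The same helper can be used to test any simplification made by hand while reducing a fault tree: write the expression before and after as a lambda and confirm that the two agree for every assignment.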
[Figure 7.4: fault tree graphic. Top event A has inputs B1 and B2. B1 has inputs C3, Z1, Z2, and C4; B2 has inputs C3, Z3, Z4, and C5. C3 has inputs Z5 and Z6; C4 has inputs Z11, Z12, and C6; C5 has inputs Z9, C6, and Z10; C6 has inputs Z7 and Z8.]

FIGURE 7.4 Example of fault tree.

Figure 7.4 is a typical branch of a large fault tree. There are a number of ways to solve a fault tree: top-down substitution, bottom-up substitution, and even using Monte Carlo simulations (with actual failure data). Also, a number of computer programs can solve (and draw) the tree. It is impossible to keep up to date with the changes in software programs for fault trees. Here are some of the software programs on the market:

• CAFTA (Data Systems and Solutions)
• FaultTree+ (Isograph)
• PTC Windchill FTA (formerly Relex Fault Tree)
• Fault Tree Analysis Module of ITEM Tool Kit (Item Software [USA] Inc.)
• SAPHIRE (formerly IRRAS, U.S. Nuclear Regulatory Commission)
• Probabilistic Risk Assessment Workstation (Electric Power Research Institute)
• RiskSpectrum FTA (Lloyd's Register)
• FaultrEASE (Arthur D. Little, Inc.)
• LOGAN (RM Consultants)
• FTAnalyzer Lite (SoHaR)
• TDC FTA (TDC)
• ELMAS Fault Tree (Ramentor Oy)
• GRIF-Workshop (TOTAL, STATODEV)
• CARE FTA (BQR)

Using the top-down substitution method (actually writing Boolean equations from the top event down), we can write

A = B1 * B2

B1 = C3 + Z1 + Z2 + C4

B2 = C3 + Z3 + Z4 + C5


Start at the top event and then create Boolean equations for each level or branch on the tree. Once the next couple of levels have been written, you can use the various Table 7.1 substitution laws. So, combining B1 and B2 and through Boolean manipulation,

A = C3 + Z1 * Z3 + Z1 * Z4 + Z2 * Z3 + Z2 * Z4 + C4 * Z3 + C4 * Z4 + C4 * C5 + Z1 * C5 + Z2 * C5

Note that two branches are repeated in the tree, the C3 and C6 branches. It is not uncommon that a fault scenario is repeated in a large fault tree. If one subsystem feeds various plant units, then that branch will be repeated wherever it occurs. Parallel pumps, dual motors, or even single units (e.g., emergency backup power units) are simple examples of repeat branches. This is a very important point to note: if a repeat branch happens to be failure prone, then its faults will be replicated throughout the fault tree:

C3 = Z5 * Z6

C4 = C6 + Z11 + Z12

C5 = C6 + Z9 + Z10

C6 = Z7 + Z8

Again, using Boolean manipulation, the final fault scenario that leads to the top event, A, can be written as

A = (Z7) + (Z8) + (Z5 * Z6) + (Z1 * Z3) + (Z1 * Z4) + (Z2 * Z3) + (Z2 * Z4)
  + (Z1 * Z9) + (Z1 * Z10) + (Z2 * Z9) + (Z2 * Z10)
  + (Z3 * Z11) + (Z3 * Z12) + (Z4 * Z11) + (Z4 * Z12)
  + (Z9 * Z11) + (Z9 * Z12) + (Z10 * Z11) + (Z10 * Z12)


A cut set is a collection of basic events that will lead to the top event. A minimal cut set is the smallest combination of component failures that, if they all occur, will cause the top event to occur. A single-component minimal cut set means that if that single component fails, then the top event will occur. In the aforementioned example, parentheses have been placed around the cut sets. If the components indicated in the parentheses fail, then the system will fail. As can be seen, Z7 and Z8 are single-point failures.
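The substitution and absorption steps can also be carried out mechanically. The sketch below is one simple way to do it, not the algorithm used by any particular FTA package. It assumes the gate structure as read from Figure 7.4 (an AND gate at A and C3, OR gates at B1, B2, C4, C5, and C6); the `TREE` dictionary and the function names are hypothetical.

```python
from itertools import product

# Gate equations as read from Figure 7.4 (an assumption of this sketch).
TREE = {
    "A":  ("AND", ["B1", "B2"]),
    "B1": ("OR",  ["C3", "Z1", "Z2", "C4"]),
    "B2": ("OR",  ["C3", "Z3", "Z4", "C5"]),
    "C3": ("AND", ["Z5", "Z6"]),
    "C4": ("OR",  ["C6", "Z11", "Z12"]),
    "C5": ("OR",  ["C6", "Z9", "Z10"]),
    "C6": ("OR",  ["Z7", "Z8"]),
}


def minimize(cut_sets):
    """Drop every cut set that strictly contains another one (absorption law)."""
    unique = set(cut_sets)
    return {s for s in unique if not any(other < s for other in unique)}


def cut_sets(event):
    """Return the minimal cut sets of `event` as a set of frozensets."""
    if event not in TREE:                      # basic event
        return {frozenset([event])}
    gate, inputs = TREE[event]
    child_sets = [cut_sets(name) for name in inputs]
    if gate == "OR":                           # union of the children's cut sets
        combined = set().union(*child_sets)
    else:                                      # AND: one cut set from each child
        combined = {frozenset().union(*combo) for combo in product(*child_sets)}
    return minimize(combined)


minimal = cut_sets("A")
for cs in sorted(minimal, key=lambda s: (len(s), sorted(s))):
    print(" * ".join(sorted(cs)))
```

Sorting by size puts the single-event cut sets first, which is usually where the review of single-point failures starts.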

PRACTICAL TIPS AND BEST PRACTICE


• The more AND gates you use, the safer the system is. AND gates denote fault tolerance; for example, for a braking subsystem failure, both the primary brake AND the backup (emergency) eddy-current brake must fail.
• Likewise, if you have a lot of OR gates in your system, you are very failure prone. Any of those failures can lead to the event. A string of OR gates leading up to the top event is extremely dangerous. Try to change the system and incorporate more AND gates.

NOTES FROM NICK'S FILE

Once, I was doing a fault tree for a very complicated system that needed to improve its safety performance. I used the fault tree with lots and lots of OR gates to illustrate to senior leadership that we were seriously at risk if we could not find a way to substitute AND gates for the OR gates and build in more resiliency. The fault tree graphic did the job. Senior leaders immediately understood the risk they were facing, and we changed the design and operations.

Obviously, the bottom-up method of FTA is the exact opposite of what was just demonstrated. You start at the lowest level, substitute the Boolean equations, and solve for the top event.


As stated earlier, the opposite of a fault tree is a success tree. In Boolean algebra,
a success tree is the complement of a fault tree. The complement of a cut set is a path
set. To solve a success tree, you have two options: either draw a success tree from
the start or draw a fault tree and then take the complement of the tree (along with the
corresponding Boolean equations).
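Taking the complement of a fault tree amounts to applying De Morgan's theorem at every gate: each OR becomes an AND, each AND becomes an OR, and each basic fault becomes its complement (the corresponding success). A minimal sketch follows, assuming a hypothetical nested-tuple representation in which a tree is either an event name or a tuple such as `("OR", "B", "C")`.

```python
from itertools import product


def complement(tree):
    """Return the success tree that is the complement of `tree`."""
    if isinstance(tree, str):                  # basic fault -> its complement
        return ("NOT", tree)
    gate, *inputs = tree
    if gate == "NOT":                          # (X')' = X
        return inputs[0]
    dual = "AND" if gate == "OR" else "OR"     # De Morgan's theorem
    return (dual, *[complement(item) for item in inputs])


def evaluate(tree, state):
    """Evaluate a tree for a dict mapping basic event names to True/False."""
    if isinstance(tree, str):
        return state[tree]
    gate, *inputs = tree
    values = [evaluate(item, state) for item in inputs]
    if gate == "NOT":
        return not values[0]
    return all(values) if gate == "AND" else any(values)


# The "no current" fault from Figure 7.3: A = B + C.
no_current = ("OR", "B", "C")
current_flows = complement(no_current)         # A' = B' * C'
print(current_flows)

# The success tree is true exactly when the fault tree is false.
for b, c in product((False, True), repeat=2):
    state = {"B": b, "C": c}
    assert evaluate(current_flows, state) == (not evaluate(no_current, state))
```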

7.3 FAULT TREE QUANTIFICATION


FTA is not a quantitative analysis; however, the tree can be quantified. The most common method of quantification is to assign failure probabilities to each of the events. Then use the various laws of probability and statistics and solve for the top event. NASA's Fault Tree Handbook with Aerospace Applications (Stamatelatos et al., 2002) is a great reference. The U.S. Nuclear Regulatory Commission's Fault Tree Handbook (Roberts et al., 1981), again, is an excellent reference. Henley and Kumamoto's Probabilistic Risk Assessment book, listed under Further Reading at the end of this chapter, also goes into a lot of detail about how to assign probabilities to fault events. Before assigning failure probabilities to your tree, consult a reliability engineering book to ensure that you are manipulating the data appropriately.

PRACTICAL TIPS AND BEST PRACTICE


A very useful way of demonstrating how your safety system operates is through a success tree. The success tree will demonstrate the "must succeed" events. At times, this can be a very pointed method of demonstrating how difficult it will be to deliver an exceedingly success-oriented project.

The fault tree is drawn, and then the Boolean equations and minimal cut sets are derived for the top event. Probability estimates can be generated from hardware failure data, human error estimation, maintenance frequency, etc. Probability estimates are then assigned to the events. Be sure to take into consideration the uncertainty limits of your failure data. Through the laws of probability, combine the probabilities to determine the probability of the top event. The rare-event approximation is an excellent method to help truncate the math. It is used to facilitate the manipulation of very small probability numbers. Obviously, the smaller the probabilities, the better the approximation will be.
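To see why the rare-event approximation works, compare it with the exact result for an OR gate. The sketch below assumes the input events are statistically independent, which is a modeling assumption that must be justified for a real system; the function names and the probability values are illustrative only. For independent inputs, an AND gate multiplies the probabilities, the exact OR gate result is 1 minus the product of the (1 - p) terms, and the rare-event approximation simply sums the probabilities.

```python
import math


def p_and(probs):
    """AND gate, independent inputs: product of the input probabilities."""
    return math.prod(probs)


def p_or_exact(probs):
    """OR gate, independent inputs: 1 - product of (1 - p)."""
    return 1.0 - math.prod(1.0 - p for p in probs)


def p_or_rare_event(probs):
    """OR gate, rare-event approximation: sum of the input probabilities."""
    return sum(probs)


for probs in ([1e-4, 3e-5, 5e-6], [0.2, 0.3, 0.1]):
    exact = p_or_exact(probs)
    approx = p_or_rare_event(probs)
    rel_error = (approx - exact) / exact
    print(f"inputs={probs}  exact={exact:.6g}  approx={approx:.6g}  "
          f"relative error={rel_error:.2%}")
```

The approximation always errs on the high (conservative) side, and the error grows as the input probabilities grow, so it should not be used for events that are not rare.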

7.4 EXAMPLE OF A FAULT TREE CONSTRUCTION OF A MOTOR-PUMP PRESSURE SYSTEM

The Fault Tree Handbook, Chapter VIII, has an excellent example of how to construct a fault tree of a pump-motor pressure system, as shown in Figure 7.5. The handbook describes the problem (Roberts et al., 1981).
The function of the control system is to regulate the operation of the pump. The latter pumps fluid from an infinitely large reservoir into the tank. We shall assume that it takes 60 s to pressurize the tank. The pressure switch has contacts, which are closed when the tank is empty. When the threshold pressure has been reached, the pressure switch contacts open, de-energizing the coil of relay K2 so that relay K2 contacts open, removing power from the pump, causing the pump motor to cease operation. The tank is fitted with an outlet valve that drains the entire tank in an essentially negligible time; the outlet valve, however, is not a pressure relief valve. When the tank is empty, the pressure switch contacts close, and the cycle is repeated.
Initially, the system is considered to be in its dormant mode: switch S1 contacts open, relay K1 contacts open, and relay K2 contacts open; that is, the control system is de-energized. In this de-energized state, the contacts of the timer relay are closed. We will also assume that the tank is empty and the pressure switch contacts are therefore closed.
System operation is started by momentarily depressing switch S1. This applies
power to the coil of relay K1, thus closing K1 contacts. Relay K1 is now electrically
self-latched. The closure of relay K1 contacts allows power to be applied to the coil
of relay K2, whose contacts close to start up the pump motor.
The timer relay has been provided to allow emergency shutdown in the event that the pressure switch fails closed. Initially, the timer relay contacts are closed and the timer relay coil is de-energized. Power is applied to the timer coil as soon as relay K1 contacts are closed. This starts a clock in the timer. If the clock registers 60 s of continuous power application to the timer relay coil, the timer relay contacts open (and latch in that position), breaking the circuit to the K1 relay coil (previously latched closed) and thus producing system shutdown. In normal operation, when the pressure switch contacts open (and consequently relay K2 contacts open), the timer resets to 0 s.

[Figure 7.5: schematic of the pressure tank system showing the outlet valve, relay K1, relay K2, pressure switch S, switch S1, timer relay, pressure tank, motor, and pump fed from the reservoir.]

FIGURE 7.5 Pressure tank system. (From Roberts, N.H. et al., Fault Tree Handbook, NUREG-0492, U.S. Nuclear Regulatory Commission, Washington, DC, 1981, p. V-III.)
Figure 7.6 is the resulting fault tree. In constructing the fault tree from the pressure tank schematic, it is obvious that the top event should be "rupture of pressure tank after the start of pumping." This is a fairly simplified tree, in which piping, wiring, etc., have been ignored. The Fault Tree Handbook makes a good point of emphasizing that the fault must specify what happens and when it occurs.
An OR gate is drawn because the top event can be caused by a component failure. This is a good example of the use of primary and secondary component failures. The circle or primary failure of the tank could be due to things such as material fatigue and poor workmanship. If there is concern that the tank does not meet the minimum necessary design specifications (i.e., ASME Section VIII), then the circle could be another rectangle (or secondary failure). However, in this case, we feel that the tank was designed appropriately. Likewise, the diamond is highly unlikely and would not need to be developed further.
So, now, we concentrate on the secondary failure of tank rupture. The Fault Tree Handbook again emphasizes a critical point with primary and secondary faults, namely, that a primary failure is one in which a component fails in an environment for which it is qualified and a secondary failure is one in which it fails in an environment for which it is not qualified. These are important distinctions.
This secondary failure is composed of component failures, so again, an OR gate is drawn. The same logic as used earlier is applied here to draw the secondary failure and the diamond. The INHIBIT gate documents that the input to the fault is a continuous, t > 60 s pump operation. Remember, this is a conditional fault. The pump must operate longer than 60 s for the failure to occur.
The concept of state-of-component and state-of-system faults is worth discussing briefly here. If a state-of-component fault exists (in other words, the fault occurs because of a component failure), then OR gates are used. The use of OR gates connotes that any of the listed fault inputs can cause the event. If a state-of-system fault occurs, that means that something in the system failed that caused the event to occur and thus connotes an AND gate: all the fault inputs must occur for the event to occur.
The fact that two faults are in place without a gate between them is not incorrect;
it only indicates that the author wishes to detail the failure sequence. If more detail is
needed to understand the process, then a string of rectangles in series can be drawn.
It is obvious that for the pump to operate continuously, it must have power for longer
than 60 s.
From there, an OR gate is drawn for the state-of-component faults; however, "EMF applied to K2 relay coil for t > 60 s" is a state-of-system fault and thus requires an AND gate. This erroneous command signal to the component is due to other faults in the system.
On the left side of the AND gate, all the events end as circles or diamonds. In other
fault trees or if the top event is highly significant (such as rupture of the reactor in a
nuclear power plant), then these entries may need to be evaluated further. Likewise,
the far right side of the fault tree ends with similar faults.

[Figure 7.6: fault tree diagram. The top event, "Rupture of pressure tank after the start
of pumping," is developed down through continuous pump operation for t > 60 s to the
primary and secondary failures of the pressure tank, K2 relay, pressure switch, switch S1,
K1 relay, and timer relay.]

FIGURE 7.6 Pump-motor pressure tank fault tree. (From Roberts, N.H. et al., Fault Tree
Handbook, NUREG-0492, U.S. Nuclear Regulatory Commission, Washington, DC, 1981, p. V-III.)

[Figure 7.7: simplified fault tree diagram. E1 branches to T and E2; E2 to K2 and E3;
E3 to S and E4; E4 to S1 and E5; E5 to K1 and R.]

Legend: Faults
E1                Top event
E2, E3, E4, E5    Intermediate fault events
R                 Primary failure of timer relay
S                 Primary failure of pressure switch
S1                Primary failure of switch S1
K1                Primary failure of relay K1
K2                Primary failure of relay K2
T                 Primary failure of pressure tank

FIGURE 7.7 Fault tree example. (From Roberts, N.H. et al., Fault Tree Handbook, NUREG-
0492, U.S. Nuclear Regulatory Commission, Washington, DC, 1981, p. VIII.)
The remainder of the fault tree goes into further detail about how the relay circuit
can fail. In Figure 7.7, the fault tree has been simplified and the Boolean expressions
developed:

\[
\begin{aligned}
E1 &= T + E2 \\
   &= T + (K2 + E3) \\
   &= T + K2 + (S * E4) \\
   &= T + K2 + S * (S1 + E5) \\
   &= T + K2 + (S * S1) + (S * E5) \\
   &= T + K2 + (S * S1) + S * (K1 + R) \\
   &= T + K2 + (S * S1) + (S * K1) + (S * R)
\end{aligned}
\]

The minimal cut sets are K2, T, S * S1, S * K1, and S * R.
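The same reduction can be sketched in code. The snippet below is a minimal illustration, not the handbook's procedure: the TREE dictionary and the helper names cut_sets and minimal are assumptions made for this example, with the gate types read off the Boolean expressions for Figure 7.7. It expands the tree top-down and then drops any cut set that contains another.

```python
from itertools import product

# Gate structure read off the Boolean expressions for Figure 7.7
# (dictionary layout and names are illustrative assumptions).
TREE = {
    "E1": ("OR", ["T", "E2"]),
    "E2": ("OR", ["K2", "E3"]),
    "E3": ("AND", ["S", "E4"]),
    "E4": ("OR", ["S1", "E5"]),
    "E5": ("OR", ["K1", "R"]),
}

def cut_sets(event):
    """Return the cut sets of `event` as a set of frozensets of basic events."""
    if event not in TREE:                 # basic event: one single-element cut set
        return {frozenset([event])}
    gate, inputs = TREE[event]
    child_sets = [cut_sets(child) for child in inputs]
    if gate == "OR":                      # OR: union of the inputs' cut sets
        return set().union(*child_sets)
    # AND: one cut set from each input, merged (set union gives S * S = S)
    return {frozenset().union(*combo) for combo in product(*child_sets)}

def minimal(sets):
    """Drop any cut set that strictly contains another (Boolean absorption)."""
    return {s for s in sets if not any(other < s for other in sets)}

mcs = minimal(cut_sets("E1"))
for s in sorted(mcs, key=lambda s: (len(s), sorted(s))):
    print(" * ".join(sorted(s)))
```

For this tree the result should be the five minimal cut sets listed above. Representing each cut set as a frozenset means repeated events collapse automatically, which is the Boolean idempotence rule rather than ordinary multiplication.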



TABLE 7.2
Failure Probabilities for Pressure Tank Example
Component          Symbol    Failure Probability (Pr)
Pressure tank      T         5 × 10^-6
Relay K2           K2        3 × 10^-5
Pressure switch    S         1 × 10^-4
Relay K1           K1        3 × 10^-5
Timer relay        R         1 × 10^-4
Switch S1          S1        3 × 10^-5

Now, if we assign failure probabilities to system components, we can see how
quickly the fault tree can be quantified. Table 7.2 shows the failure probabilities.
It is very important to remember to follow the rules for manipulation of probabilities.
In this example, all failures are independent, so they can be easily multiplied
together. If they were dependent failures, then the combinations would be different.
So the resulting probabilities are

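\[
\begin{aligned}
\Pr(T) &= 5 \times 10^{-6} \\
\Pr(K2) &= 3 \times 10^{-5} \\
\Pr(S * K1) &= (1 \times 10^{-4})(3 \times 10^{-5}) = 3 \times 10^{-9} \\
\Pr(S * R) &= (1 \times 10^{-4})(1 \times 10^{-4}) = 1 \times 10^{-8} \\
\Pr(S * S1) &= (1 \times 10^{-4})(3 \times 10^{-5}) = 3 \times 10^{-9}
\end{aligned}
\]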

So, by summing the probabilities of the minimal cut sets, the top-event probability of
occurrence is approximately

\[
\Pr(E1) \approx 3.5 \times 10^{-5}
\]
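The arithmetic can be checked with a short script. This is a sketch under the stated independence assumption; the variable names are chosen for the example, and the exact calculation by enumerating all component states is added here only for comparison and is not part of the text's method. With the Table 7.2 values, the sum of the cut set probabilities comes to roughly 3.5 × 10^-5.

```python
from itertools import product
from math import prod

# Failure probabilities from Table 7.2.
p = {"T": 5e-6, "K2": 3e-5, "S": 1e-4, "K1": 3e-5, "R": 1e-4, "S1": 3e-5}

# Minimal cut sets from the Boolean reduction.
cut_sets = [("T",), ("K2",), ("S", "S1"), ("S", "K1"), ("S", "R")]

# Independent failures: a cut set's probability is the product of its members.
cut_pr = {cs: prod(p[c] for c in cs) for cs in cut_sets}

# Summing the cut set probabilities (valid when all probabilities are small).
approx = sum(cut_pr.values())

# Exact value for comparison: enumerate all 2**6 component states and add up
# the probability of every state in which at least one cut set has fully failed.
names = list(p)
exact = 0.0
for states in product([False, True], repeat=len(names)):
    failed = {n for n, f in zip(names, states) if f}
    if any(set(cs) <= failed for cs in cut_sets):
        exact += prod(p[n] if f else 1 - p[n] for n, f in zip(names, states))

print(f"sum of cut sets: {approx:.4e}")
print(f"exact:           {exact:.4e}")
```

The two single-event cut sets, T and K2, dominate the total; the three two-event cut sets together contribute only about 1.6 × 10^-8. The simple sum slightly overstates the exact value because it counts overlapping cut sets more than once, but at these probabilities the difference is negligible.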

7.5 COMMON MISTAKES IN FAULT TREES


A few mistakes you should try to avoid in constructing, quantifying, and evaluating
fault trees are the following:

• Model at the highest level for which you have data; the more data the model
  uses, the more uncertainty it carries.
• Do not put too many inputs that have very small probabilities into gates.
• Do not spend too much time on passive components in a system. Remember,
  the fault tree really looks at functions, not components.
• Do not model human errors of commission because they are very difficult
  to capture realistically and can skew results.
• Remember, garbage in, garbage out. If the results of the quantified tree
  do not make sense, do not give them too much weight. It is much better to
  use quantitative trees for comparison, not as absolute number generators.
• Do not fault tree everything. It is expensive; be judicious.
• Do not try to treat Boolean algebra expressions as regular algebraic
  equations. Be careful when combining Boolean expressions.
• Look closely at the failure modes to determine if they are independent or
  dependent. This is very important in probability manipulations.
• Be sure the top event is a high-priority concern.
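The warning about Boolean algebra can be made concrete. The sketch below is illustrative only; the helper name equivalent is an assumption for this example. It checks two Boolean identities by truth table and then shows how treating a repeated event as an ordinary algebraic factor changes a probability, using the Table 7.2 values for S, S1, and R.

```python
from itertools import product

def equivalent(f, g, n):
    """True if Boolean functions f and g agree on all 2**n input combinations."""
    return all(f(*v) == g(*v) for v in product([False, True], repeat=n))

# Idempotence: S * S = S (not "S squared").
assert equivalent(lambda s: s and s, lambda s: s, 1)

# Absorption: A + (A * B) = A, so the cut set A * B is not minimal.
assert equivalent(lambda a, b: a or (a and b), lambda a, b: a, 2)

# Consequence for probabilities: (S * S1) * (S * R) reduces to S * S1 * R,
# so S is counted once even though it appears in both cut sets.
p_s, p_s1, p_r = 1e-4, 3e-5, 1e-4        # values from Table 7.2
correct = p_s * p_s1 * p_r               # Boolean reduction applied first
naive = (p_s * p_s1) * (p_s * p_r)       # ordinary algebra, S counted twice
print(f"correct: {correct:.1e}, naive: {naive:.1e}")
```

The naive product is smaller than the correct one by a factor of Pr(S), which is the kind of error that arises when cut sets sharing an event are multiplied as if they were independent.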

PRACTICAL TIPS AND BEST PRACTICE



Fault trees are extremely powerful methods to demonstrate your safety system's
fault tolerance to an accident. The next time you want to demonstrate
how many things must go wrong for an accident, use fault trees. Fault trees are
great tools to educate a nonengineer (e.g., in a lawsuit) on how hard it is for
something to occur.

REFERENCES
Dorer, R. M. and Hathaway, W. T. 1991. Safety of High Speed Magnetic Levitation
Transportation Systems: Preliminary Safety Review of the Transrapid Maglev System.
DOT-VNTSC-FRA-90-3. Washington, DC: U.S. Department of Transportation.
Roberts, N. H., Vesely, W. E., Haasl, D. F., and Goldberg, F. F. 1981. Fault Tree Handbook.
NUREG-0492. Washington, DC: U.S. Nuclear Regulatory Commission.
Stamatelatos, M., Caraballo, J., and Vesely, W. August 2002. Fault Tree Handbook with
Aerospace Applications. Version 1.1. Washington, DC: NASA Office of Safety and
Mission Assurance, NASA Headquarters.

FURTHER READING
Anderson, T. and Lee, P. A. 1981. Fault Tolerance: Principles and Practice. Englewood Cliffs,
NJ: Prentice-Hall.
Center for Chemical Process Safety. 1999. Guidelines for Chemical Process Quantitative Risk
Analysis, 2nd edn. Hoboken, NJ: Wiley.
Center for Chemical Process Safety. 2008. Guidelines for Hazard Evaluation Procedures, 3rd
edn. Hoboken, NJ: Wiley.
Haasl, D. F. 1965. Advanced concepts in fault tree analysis. System Safety Symposium,
Seattle, WA.
Henley, E. J. and Kumamoto, H. 2000. Probabilistic Risk Assessment and Management for
Engineers and Scientists. Hoboken, NJ: Wiley-IEEE Press.
International Electrotechnical Commission. 2006. Fault Tree Analysis. IEC 61025. Geneva,
Switzerland: International Electrotechnical Commission.
Lacey, P. 2011. An application of fault tree analysis to the identification and management
of risks in government funded human service delivery. Proceedings of the Second
International Conference on Public Policy and Social Sciences, Kuching, Sarawak,
Malaysia. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2171117, downloaded
July 9, 2013.

Lapp, S. A. and Powers, G. J. 1977. Computer-aided synthesis of fault trees. IEEE Transactions
on Reliability, R-26: 2-13.
Long, A. Beauty and the beast: Use and abuse of fault tree as a tool. No date.
http://www.fault-tree.net/papers/long-beauty-and-beast.pdf, downloaded May 17, 2014.
National Institute of Standards and Technology. 2009. Root Cause Analysis Report of
Plutonium Spill at Boulder Laboratory. Gaithersburg, MD.
http://www.nist.gov/public_affairs/releases/upload/root_cause_plutonium_010709.pdf,
downloaded May 17, 2014.