
Cover Story: The FMEA method

A powerful reliability tool for data analysis that lasts for decades
Fast Forward
Use FMEA to detect unrecognized design problems.
FMEDA product data is commonly used to do safety integrity calculations.
Field failure studies are used to refine FMEDA component databases.
By William Goble

A Failure Mode and Effects Analysis (FMEA) is a system reliability and safety review technique created in
the 1960s as part of the U.S. Minuteman rocket program to find and mitigate unanticipated design
problems. The technique is rather simple: the failure modes of each component in a given system are listed in
a table, and the effect of each failure is postulated and documented. The method is systematic, effective,
and detailed, although it is sometimes called time-consuming and repetitive. The method is so
effective because every failure mode of every single component is examined. Here is a table example following
the format from MIL-STD-1629, one of the original references (see "FMEA tabular format" table).
Column one describes the name of the component under review, while column two is available to list the
component's identification number (part number or code number). Together, columns one and two must
uniquely identify the reviewed component. Column three describes the function of the component, while
column four describes the predicted failure modes. One row will likely be used for each component failure
mode. Column five is used to record the known cause of the failure mode if applicable. The effect of that
failure on the system is recorded in column six. The remaining columns vary, depending on which
of the many versions of FMEA is being followed.
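
For readers who keep FMEA worksheets in software, the six common columns described above can be sketched as a simple record type. This is only an illustration in Python and is not taken from the referenced standard; the class name FmeaRow, its field names, and the relay entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FmeaRow:
    """One FMEA table row: a single failure mode of a single component."""
    name: str          # column 1: component name
    code: str          # column 2: part number or code number
    function: str      # column 3: function of the component
    failure_mode: str  # column 4: predicted failure mode
    cause: str         # column 5: known cause, empty if not applicable
    effect: str        # column 6: effect on the system


# Hypothetical entries for illustration only: one row per failure mode.
table = [
    FmeaRow("Relay", "K1", "Switches load power", "Contacts welded closed",
            "Overcurrent", "Load cannot be de-energized"),
    FmeaRow("Relay", "K1", "Switches load power", "Coil open circuit",
            "", "Load loses power"),
]

# Columns one and two together must uniquely identify the reviewed component.
components = {(row.name, row.code) for row in table}
print(len(table), "rows covering", len(components), "component(s)")
```

Keeping one row per failure mode, rather than one row per component, mirrors the tabular layout described above and makes it easy to append the extra columns used by later FMEA variants.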

FMEA finds problems


The FMEA method has grown in popularity over the years and has become an essential part of many
design processes, especially in the automotive industry. This is primarily because it has been shown over
time to be effective and useful despite its drawbacks. During an FMEA, one may hear "Oh,
no," as it becomes clear that a particular component failure effect is a serious problem that had
previously gone unrecognized. When these problems are significant enough, a corrective action item is
recorded. The design is then improved to detect, avoid, or control the problem.

Process industry applications


Variations of the FMEA technique are used in the process industries. One place where FMEA is used is
for hazard identification in petrochemical plant design. This technique fits in nicely with the familiar
Hazard and Operability Study (HAZOP) technique, as the FMEA and HAZOP methods are nearly the same.
Both are variations of a common theme and list the components of a system in a tabular format. The fundamental
difference between FMEA and HAZOP is that HAZOP uses guide words to stimulate the participants to identify
system abnormalities, whereas FMEA uses known equipment failure modes.
A variation of the FMEA technique as applied to control systems is called Control Hazards and
Operability Analysis (CHAZOP). Known failure modes of control equipment, such as a basic process
control system (BPCS), an actuator-valve assembly, or a sensing transmitter, are listed, and the effect of
each failure is documented. An action item is recorded when this effect is a significant problem, thereby
prompting an improvement in the control system design.

An example FMEA
The "Simple reactor" figure shows a simplified reactor with an emergency cooling system from Control
Systems Safety Evaluation and Reliability, Third Edition, Chapter 5 (www.isa.org/link/BK_CSSER). The
system consists of a gravity feed water tank, a control valve (VALVE1), a cooling jacket around the
reactor, a cooling jacket drain pipe, a temperature-sensing switch (TSW1), and a power supply. Normal
operation consists of the temperature-sensing switch closed (conducting) because the reactor
temperature is below a dangerous limit. Electrical current flows from the power supply through the valve
and the temperature-sensing switch. This electrical current (energy) keeps the valve closed. If the
temperature inside the reactor gets too high, the temperature-sensing switch opens. This stops the flow
of electrical current, and the control valve opens. Cooling water flows from the tank, through the valve,
then the cooling jacket, and finally the jacket drain pipe. This water flow cools the reactor, thereby
lowering its temperature.
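
The de-energize-to-trip behavior described above can be sketched in a few lines of Python. This is a simplified logical model, not code from the referenced book: the function name, the trip temperature values, and the power_ok parameter are assumptions. The power-loss case in particular is inferred from the statement that electrical current keeps the valve closed.

```python
def valve_opens(reactor_temp: float, trip_temp: float, power_ok: bool = True) -> bool:
    """Return True if VALVE1 is open, so that cooling water flows."""
    # TSW1 is closed (conducting) while the temperature is below the limit.
    switch_closed = reactor_temp < trip_temp
    # Current flows only if the supply is healthy and the switch conducts.
    current_flows = power_ok and switch_closed
    # Current holds the valve closed; removing it opens the valve.
    return not current_flows


# Hypothetical temperatures, in arbitrary units.
print(valve_opens(80.0, trip_temp=100.0))                   # normal operation
print(valve_opens(120.0, trip_temp=100.0))                  # overtemperature
print(valve_opens(80.0, trip_temp=100.0, power_ok=False))   # supply lost
```

In this model a lost power supply opens the valve even though the reactor is not hot, which is the kind of false trip an FMEA would record alongside the dangerous failures.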
The FMEA procedure requires the creation of a table with all failure modes listed for each of the system
components. The "Simple reactor FMEA" table shows the results of this example system level FMEA. The
FMEA has identified six critical items that should be reviewed to determine the need for correction.

The system designer, in the case of a simple reactor, may consider installing two temperature switches
and wiring them in series. Alternatively, the system designer may choose a smart temperature transmitter that is
IEC 61508 safety certified, with automatic diagnostics and a relay output. The certified transmitter
would reduce the proof-testing effort otherwise needed to detect a temperature-sensing switch that has failed shorted. A second
drain pipe could be installed in parallel with the first, thereby preventing a single clogged drain from
causing a critical failure. A level sensor on the water tank could warn of insufficient water level. Many
other possible design changes could be made to mitigate the critical failures or to reduce the number of
false trips.

FMEA method evolution


The FMEA method was expanded in the 1970s to include semi-quantitative ratings (a number between
1 and 10) for severity, likelihood, and detection. Four columns were then added to the table: three
for the ratings and a fourth for the risk priority number (RPN), which is obtained by multiplying
the three ratings. This expanded method is called a Failure Modes, Effects and Criticality Analysis
(FMECA). The "FMECA reactor example" table shows the reactor example with RPN values added
(columns 7, 8, 9, and 10).
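
The RPN arithmetic is simple enough to sketch in Python. The function name and the ratings below are hypothetical and are not the values from the "FMECA reactor example" table; they only show how multiplying the three ratings produces a number that can be used to rank failure modes.

```python
def risk_priority_number(severity: int, likelihood: int, detection: int) -> int:
    """Multiply the three semi-quantitative ratings, each between 1 and 10."""
    for rating in (severity, likelihood, detection):
        if not 1 <= rating <= 10:
            raise ValueError("each rating must be between 1 and 10")
    return severity * likelihood * detection


# Hypothetical (severity, likelihood, detection) ratings for illustration only.
ratings = {
    "TSW1 fails shorted": (9, 3, 8),
    "Drain pipe clogged": (9, 2, 6),
    "TSW1 fails open": (3, 3, 2),
}

# Rank failure modes from highest to lowest RPN.
ranked = sorted(ratings, key=lambda mode: risk_priority_number(*ratings[mode]),
                reverse=True)
for mode in ranked:
    print(mode, risk_priority_number(*ratings[mode]))
```

With these assumed ratings the three RPN values are 216, 108, and 18, so the shorted switch would be reviewed first.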

FMEA techniques have continued to evolve over the years. Some of the more recent variations include
using the method for processes as well as designs. Similar to listing components, each step in a process
is listed. Each step includes all anticipated ways in which the step can go wrong, equivalent to listing
known failure modes of each component. Once the list has been completed, the method is the same as a
design FMEA. After these two fundamentally different types of FMEA were created, the "design FMEA"
was then called DFMEA, and a "process FMEA" was called PFMEA in some literature. Similar to a design
FMEA, the process FMEA has been shown to be effective in finding unanticipated problems.

Failure Modes Effects and Diagnostic Analysis


The always evolving FMEA method prompted the development of the Failure Modes Effects and
Diagnostic Analysis (FMEDA) technique. The late 1980s presented a need to model the automatic
diagnostic capability of smart devices. There was a new "architecture" in the safety PLC market called
one out of two with diagnostic switch (1oo2D), which competed with the existing triple modular redundant
architecture (called two out of three, 2oo3). As the impact on safety and availability of this new
architecture was highly dependent on diagnostic coverage, a measurement of the diagnostic coverage
was important. An FMEDA accomplishes this by adding columns for the failure rate of
each failure mode and for the probability that diagnostics detect it, on each line of the analysis.
Similar to the FMEA, the FMEDA technique also lists all components and their failure modes, as well as
the effect of each component failure mode. The added columns express the resulting failure
mode of the system, the probability that any diagnostic detects that particular failure, and the quantitative
failure rate for that failure mode. When the FMEDA is completed, the diagnostic coverage factor is
calculated based on a failure rate weighted average of the diagnostic coverage of all parts.
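
The failure rate weighted average can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name, the input layout (one failure rate and one detection probability per FMEDA line), and the example numbers are hypothetical. In practice the calculation is often applied separately to the dangerous and safe failure categories.

```python
def diagnostic_coverage(lines):
    """Failure-rate-weighted average of per-line detection probabilities.

    lines: iterable of (failure_rate, detection_probability) pairs,
    one pair per failure mode listed in the FMEDA.
    """
    lines = list(lines)
    total_rate = sum(rate for rate, _ in lines)
    if total_rate <= 0:
        raise ValueError("total failure rate must be positive")
    detected_rate = sum(rate * p_detect for rate, p_detect in lines)
    return detected_rate / total_rate


# Hypothetical failure rates (any consistent unit) and detection probabilities.
example = [(10.0, 0.9), (30.0, 0.5), (60.0, 1.0)]
print(round(diagnostic_coverage(example), 2))  # 0.84
```

Because the average is weighted by failure rate, a poorly diagnosed failure mode matters most when it is also a frequent one.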
Failure rate numbers and a failure mode distribution are required for each component in order to perform
an FMEDA. Therefore, a component database of this information is required, as shown in the "FMEDA
process" figure.
The component database must consider the key variables that impact component failure rates. This
includes environmental stress factors. Fortunately, standards exist to characterize the environments in
the process industries, and profiles can be created. The "Environmental profiles for the process
industries" table shows the profile set for the process industries from Electrical and Mechanical
Component Reliability Handbook, Second Edition (www.exida.com).
The "safety factor" built into each particular product design is another important variable in the failure
rate. This can be determined through a detailed study of each design, including the ratings of each
component and expected stress conditions.

Field Failure Data Analysis for FMEDA


Design analysis can be used to create theoretical failure databases; however, accurate information is
obtained only when the component failure rates and modes are based on a collection of field failure
studies as shown in the "Field failure studies" figure. Any unexplained difference in a product failure rate
calculated from field failure data and FMEDA must be resolved. Sometimes, the field failure data
collection process needs improvement. Sometimes, the component database is upgraded, mostly by
recognizing new failure modes and component types.
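
A first-pass comparison between field data and an FMEDA prediction might look like the sketch below. It assumes a constant failure rate and complete failure reporting, neither of which is guaranteed in practice, and it is not the procedure used by any particular certification body. The function names and the example numbers are hypothetical; FIT denotes failures per billion hours.

```python
HOURS_PER_FIT = 1e9  # 1 FIT is one failure per billion unit operating hours


def field_failure_rate_fit(failures: int, unit_operating_hours: float) -> float:
    """Point estimate of the field failure rate in FIT."""
    if unit_operating_hours <= 0:
        raise ValueError("unit operating hours must be positive")
    return failures / unit_operating_hours * HOURS_PER_FIT


def field_to_predicted_ratio(field_fit: float, predicted_fit: float) -> float:
    """Ratio above 1 means the field rate exceeds the FMEDA prediction."""
    if predicted_fit <= 0:
        raise ValueError("predicted failure rate must be positive")
    return field_fit / predicted_fit


# Hypothetical study: 25 failures over 50 million unit operating hours,
# compared with a hypothetical FMEDA prediction of 400 FIT.
observed = field_failure_rate_fit(25, 5e7)
print(round(observed), round(field_to_predicted_ratio(observed, 400.0), 2))
```

A ratio far from 1 in either direction is the kind of unexplained difference that has to be resolved, either in the data collection process or in the component database.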

Fortunately for the process industries, some functional safety certification bodies study field failure return
data as part of most product assessments, providing a strong source of field failure data. Some projects
also gather field failure data from end users. With more than 10 billion unit operating hours of field failure
data from dozens of studies, the FMEDA component database has improved greatly, especially for
functional safety. The resulting FMEDA product data is commonly used for safety integrity verification
calculations.
The FMEDA technique can also be used to evaluate manual proof test coverage of safety instrumented
functions. This number is important when safety instrumented function verification calculations are done
to determine if a given design meets a particular safety integrity level. Any particular proof test procedure
can detect some of the potentially dangerous failures, but not all. The FMEDA can identify which failures
are, or are not, detected by the proof test. This is done by adding another column where the probability of
detection during the proof test is estimated for each component failure mode. While following this
detailed, systematic method, it becomes clear that some potentially dangerous failures are not
detected by a particular proof test.
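
The extra proof test column can be evaluated with a short Python sketch. It assumes that proof test coverage is the failure rate weighted fraction of dangerous failures that the test detects, and that the listed failure modes are the dangerous ones not already found by automatic diagnostics. The function name, the failure mode labels, and the numbers are hypothetical.

```python
def proof_test_review(lines):
    """Return (coverage, missed) for a proof test procedure.

    lines: list of (label, dangerous_failure_rate, p_detect_in_proof_test).
    coverage is the failure-rate-weighted detection probability;
    missed lists the failure modes the test cannot detect at all.
    """
    total_rate = sum(rate for _, rate, _ in lines)
    if total_rate <= 0:
        raise ValueError("total failure rate must be positive")
    coverage = sum(rate * p_detect for _, rate, p_detect in lines) / total_rate
    missed = [label for label, _, p_detect in lines if p_detect == 0.0]
    return coverage, missed


# Hypothetical valve failure modes with rates in any consistent unit.
valve_lines = [
    ("Valve stuck closed", 40.0, 1.0),
    ("Valve seat partially blocked", 10.0, 0.0),
]
coverage, missed = proof_test_review(valve_lines)
print(round(coverage, 2), missed)
```

Listing the missed failure modes alongside the coverage number shows where a proof test procedure would need to be extended.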

Dealing with the negatives


The biggest challenge when performing an FMEA (or any of its variations) is the time it consumes. Many
analysts have complained about the boring, time-consuming process. A strict and focused facilitator is
needed to keep the process moving. It should always be remembered that solving the problem is not part
of the analysis; the problems are solved once the analysis has been completed. If these rules are
followed, the result is time-effective improvements in safety and reliability.
ABOUT THE AUTHOR

Dr. William Goble is a principal engineer and director of the functional safety certification group at
exida, an accredited certification body. He has over 40 years of experience in electronic design, software,
and safety system design. His Ph.D. is in quantitative reliability/safety analysis of automation systems.

Resources
Control Systems Safety Evaluation and Reliability, Third Edition
www.isa.org/link/BK_CSSER
Safety Rated Smart Transmitters - Failure Modes and Diagnostic Analysis
www.isa.org/link/paper_SRST
Field Failure Data - the Good, the Bad and the Ugly
www.isa.org/link/good_bad
