A powerful reliability tool for data analysis that lasts for decades
Fast Forward
Use FMEA to detect unrecognized design problems.
FMEDA product data is commonly used to do safety integrity calculations.
Field failure studies are used to refine FMEDA component databases.
By William Goble
A Failure Mode and Effects Analysis (FMEA) is a system reliability and safety review technique created in
the 1960s as part of the U.S. Minuteman rocket program to find and mitigate unanticipated design
problems. The technique is rather simple: the failure modes of each component in a given system are listed in
a table, and the effect of each failure is postulated and documented. The method is systematic, effective,
and detailed, although it is sometimes criticized as time-consuming and repetitive. The method is so
effective because every failure mode of every single component is examined. Here is a table example following
the format from MIL-STD-1629, one of the original references (see "FMEA tabular format" table).
Column one describes the name of the component under review, while column two is available to list the
component's identification number (part number or code number). Together, columns one and two must
uniquely identify the reviewed component. Column three describes the function of the component, while
column four describes the predicted failure modes. One row will likely be used for each component failure
mode. Column five is used to record the known cause of the failure mode if applicable. The effect of that
failure on the system is recorded in column six. The remaining columns vary, depending on which
of the many versions of FMEA is being followed.
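The six fixed columns described above can be sketched as a small record type. This is only an illustrative layout: the class name FmeaRow, its field names, the helper modes_by_component, and the sample rows are all assumptions made for this sketch and do not come from the referenced standard or from the article's tables.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FmeaRow:
    """One FMEA table row: a single failure mode of a single component."""
    name: str          # column 1: component name
    code: str          # column 2: part number or code number
    function: str      # column 3: function of the component
    failure_mode: str  # column 4: predicted failure mode
    cause: str         # column 5: known cause, empty if not applicable
    effect: str        # column 6: effect on the system


def modes_by_component(rows):
    """Group failure modes under the (name, code) pair that identifies a component."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row.name, row.code)].append(row.failure_mode)
    return dict(grouped)


# Hypothetical rows for illustration only; not taken from the article's tables.
rows = [
    FmeaRow("Relay", "K1", "Switch load power", "Contacts welded", "", "Load stays energized"),
    FmeaRow("Relay", "K1", "Switch load power", "Coil open", "", "Load cannot be energized"),
    FmeaRow("Fuse", "F1", "Protect wiring", "Fails open", "", "Load loses power"),
]
print(modes_by_component(rows))
```

Keeping one row per failure mode, as the text suggests, means a component with several failure modes simply appears in several rows that share the same name and code.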
An example FMEA
The "Simple reactor" figure shows a simplified reactor with an emergency cooling system from Control
Systems Safety Evaluation and Reliability, Third Edition, Chapter 5 (www.isa.org/link/BK_CSSER). The
system consists of a gravity feed water tank, a control valve (VALVE1), a cooling jacket around the
reactor, a cooling jacket drain pipe, a temperature-sensing switch (TSW1), and a power supply. Normal
operation consists of the temperature-sensing switch closed (conducting) because the reactor
temperature is below a dangerous limit. Electrical current flows from the power supply through the valve
and the temperature-sensing switch. This electrical current (energy) keeps the valve closed. If the
temperature inside the reactor gets too high, the temperature-sensing switch opens. This stops the flow
of electrical current, and the control valve opens. Cooling water flows from the tank, through the valve,
then the cooling jacket, and finally the jacket drain pipe. This water flow cools the reactor, therefore
lowering its temperature.
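The de-energize-to-trip behavior described above can be written as a few lines of logic. This is a minimal sketch, not a model from the referenced book: the function names and the numeric temperatures are hypothetical, and the power_on argument reflects the inference that losing power also stops the current that holds the valve closed.

```python
def switch_closed(reactor_temp, trip_temp):
    """TSW1 conducts while the reactor temperature is below the trip limit."""
    return reactor_temp < trip_temp


def valve_open(current_flows):
    """VALVE1 is held closed by current and opens when the current stops."""
    return not current_flows


def cooling_flows(reactor_temp, trip_temp, power_on=True):
    """Cooling water flows whenever the valve is open."""
    current = power_on and switch_closed(reactor_temp, trip_temp)
    return valve_open(current)


# Hypothetical temperatures, chosen only to exercise both branches.
print(cooling_flows(80, 100))                  # normal operation
print(cooling_flows(120, 100))                 # temperature too high
print(cooling_flows(80, 100, power_on=False))  # loss of power
```

Writing the logic this way makes the two kinds of failure easy to see: anything that keeps current flowing when the temperature is high prevents cooling, while anything that interrupts the current at normal temperature causes a false trip.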
The FMEA procedure requires the creation of a table with all failure modes listed for each of the system
components. The "Simple reactor FMEA" table shows the results of this example system level FMEA. The
FMEA has identified six critical items that should be reviewed to determine the need for correction.
The system designer, in the case of a simple reactor, may consider installing two temperature switches
and wiring them in series. Alternatively, the system designer may choose a smart, IEC 61508 safety-certified
temperature transmitter with automatic diagnostics and a relay output. The certified transmitter
would reduce the proof-testing effort needed to detect a temperature-sensing switch that has failed shorted. A second
drain pipe could be installed in parallel with the first, therefore preventing a single clogged drain from
causing a critical failure. A level sensor on the water tank could warn of insufficient water level. Many
other possible design changes could be made to mitigate the critical failures or to reduce the number of
false trips.
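The benefit of wiring two temperature switches in series can be checked with a small fault-injection sketch. The function names and the failed_shorted flag are assumptions made for this illustration; the sketch models only the shorted failure mode mentioned above.

```python
def conducts(temp_high, failed_shorted=False):
    """A healthy switch opens on high temperature; a shorted one always conducts."""
    return failed_shorted or not temp_high


def trips(temp_high, shorted_flags):
    """Series wiring: current flows only if every switch conducts.
    The system trips (valve opens) when the current stops."""
    states = [conducts(temp_high, shorted) for shorted in shorted_flags]
    return not all(states)


# One switch, failed shorted: the high temperature goes unanswered.
print(trips(True, [True]))
# Two switches in series, one failed shorted: the healthy switch still trips.
print(trips(True, [True, False]))
```

The same sketch also shows the limit of the change: if both switches fail shorted, the series pair no longer trips, so the second switch reduces the exposure to this failure but does not remove it.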
FMEA techniques have continued to evolve over the years. Some of the more recent variations include
using the method for processes as well as designs. Similar to listing components, each step in a process
is listed. Each step includes all anticipated ways in which the step can go wrong, equivalent to listing
known failure modes of each component. Once the list has been completed, the method is the same as a
design FMEA. After these two fundamentally different types of FMEA were created, the "design FMEA"
was then called DFMEA, and a "process FMEA" was called PFMEA in some literature. Similar to a design
FMEA, the process FMEA has been shown to be effective in finding unanticipated problems.
failure rate for that failure mode. When the FMEDA is completed, the diagnostic coverage factor is
calculated as a failure-rate-weighted average of the diagnostic coverage of all parts.
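A failure-rate-weighted average of this kind can be sketched as follows. The function name, the input layout of (failure rate, coverage) pairs, and the sample numbers are assumptions for illustration; the rates are expressed in FIT (failures per billion hours), and any consistent unit would give the same ratio.

```python
def diagnostic_coverage(modes):
    """Failure-rate-weighted average of per-mode diagnostic coverage.

    modes: iterable of (failure_rate, coverage) pairs, coverage in [0, 1].
    """
    modes = list(modes)
    total_rate = sum(rate for rate, _ in modes)
    if total_rate <= 0:
        raise ValueError("total failure rate must be positive")
    detected_rate = sum(rate * coverage for rate, coverage in modes)
    return detected_rate / total_rate


# Hypothetical failure modes: (rate in FIT, fraction detected by diagnostics).
example_modes = [(100, 0.9), (300, 0.5)]
print(diagnostic_coverage(example_modes))
```

Because the average is weighted by failure rate, a poorly covered failure mode with a high failure rate pulls the overall figure down more than a well-covered mode with a low rate can pull it up.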
Failure rate numbers and a failure mode distribution are required for each component in order to perform
an FMEDA. Therefore, a component database of this information is required, as shown in the "FMEDA
process" figure.
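To show how the two database inputs combine, the sketch below splits a component's total failure rate across its failure modes. The function name, the dictionary layout, and the numbers are hypothetical and are not taken from any real component database.

```python
def split_by_mode(total_rate, distribution):
    """Allocate a component's total failure rate to its failure modes.

    distribution: mapping of failure mode name to its fraction of all
    failures; the fractions are expected to sum to 1.
    """
    if abs(sum(distribution.values()) - 1.0) > 1e-9:
        raise ValueError("failure mode fractions must sum to 1")
    return {mode: total_rate * fraction for mode, fraction in distribution.items()}


# Hypothetical component: 10 FIT total, split between two failure modes.
rates = split_by_mode(10.0, {"open": 0.75, "short": 0.25})
print(rates)
```

Each per-mode rate produced this way becomes one row of the FMEDA, where its effect and its detectability are then assessed.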
The component database must consider the key variables that impact component failure rates. This
includes environmental stress factors. Fortunately, standards exist to characterize the environments in
the process industries, and profiles can be created. The "Environmental profiles for the process
industries" table shows the profile set for the process industries from Electrical and Mechanical
Component Reliability Handbook, Second Edition (www.exida.com).
The "safety factor" built into each particular product design is another important variable in the failure
rate. This can be determined through a detailed study of each design, including the ratings of each
component and expected stress conditions.
Fortunately for the process industries, some functional safety certification bodies study field failure return
data as part of most product assessments, providing a strong source of field failure data. Some projects
also gather field failure data from end users. Drawing on more than 10 billion unit operating hours of field failure
data from dozens of studies, the FMEDA component database is improving greatly, especially for
functional safety. The resulting FMEDA product data is commonly used to do safety integrity verification
calculations.
The FMEDA technique can also be used to evaluate manual proof test coverage of safety instrumented
functions. This number is important when safety instrumented function verification calculations are done
to determine if a given design meets a particular safety integrity level. Any particular proof test procedure
can detect some of the potentially dangerous failures, but not all. The FMEDA can identify which failures
are, or are not, detected by the proof test. This is done by adding another column where probability of
detection during the proof test is estimated for each component failure mode. When this
detailed, systematic method is followed, it becomes clear that some potentially dangerous failures are not
detected by a particular proof test.
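The extra column can be turned into a proof test coverage figure and a list of the failures the test misses. This is a sketch under stated assumptions: the rows are taken to be the dangerous failure modes the proof test is meant to reveal, the coverage is assumed to be a failure-rate-weighted average of the detection probabilities, and the labels and numbers are hypothetical.

```python
def proof_test_coverage(rows):
    """Failure-rate-weighted fraction of dangerous failures found by the test.

    rows: list of (label, dangerous_failure_rate, probability_of_detection).
    """
    total_rate = sum(rate for _, rate, _ in rows)
    if total_rate <= 0:
        raise ValueError("total failure rate must be positive")
    found_rate = sum(rate * p_detect for _, rate, p_detect in rows)
    return found_rate / total_rate


def missed_modes(rows):
    """Labels of failure modes the proof test has no chance of detecting."""
    return [label for label, _, p_detect in rows if p_detect == 0.0]


# Hypothetical failure modes: (label, rate in FIT, detection probability).
test_rows = [("mode A", 50, 1.0), ("mode B", 30, 0.5), ("mode C", 20, 0.0)]
print(proof_test_coverage(test_rows))
print(missed_modes(test_rows))
```

Listing the missed modes alongside the coverage number points directly at where a proof test procedure could be extended.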
Dr. William Goble is a principal engineer and director of the functional safety certification group at
exida, an accredited certification body. He has over 40 years of experience in electronic design, software,
and safety system design. His Ph.D. is in quantitative reliability/safety analysis of automation systems.
Resources
Control Systems Safety Evaluation and Reliability, Third Edition
www.isa.org/link/BK_CSSER
Safety Rated Smart Transmitters - Failure Modes and Diagnostic Analysis
www.isa.org/link/paper_SRST
Field Failure Data - the Good, the Bad and the Ugly
www.isa.org/link/good_bad