
Reliability, Redundancy & Fault Tree Analysis

By Paresh D. Sawant
ME(ETC) Microelectronics
Contents
• Redundancy
• Reliability
• Component Reliability
• System Reliability
• Reliability of a Network Storage Architecture (NSA)
• Fault Tree Analysis
• Single Point Failure
• Common Cause Events/Phenomena
• Examples of Fault Tree Analysis (FTA)
Redundancy:
A system is redundant if the failure of any single one of its
components does not prevent the system from fulfilling its purpose.
Reliability:
Reliability is divided into component reliability and
system reliability.

Component Reliability:
The calculation of a component's reliability value (R)
starts from its mean time between failures (MTBF).
• From the MTBF we determine the annual failure rate (AFR),
which is then used to determine the reliability value.
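
The MTBF-to-reliability chain can be sketched in a few lines of Python. The relations used here, AFR = 8760 / MTBF and R = 1 - AFR, are assumptions: they are not spelled out on this slide, but they reproduce the reliability/MTBF pairs quoted in the conclusion. The function names and the sample MTBF are hypothetical, and the linear form is only sensible when the MTBF is much larger than one year of operating hours.

```python
HOURS_PER_YEAR = 8760  # 24 hours x 365 days of continuous operation


def annual_failure_rate(mtbf_hours: float) -> float:
    """Expected failures per year, assuming AFR = 8760 / MTBF."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return HOURS_PER_YEAR / mtbf_hours


def component_reliability(mtbf_hours: float) -> float:
    """One-year reliability, assuming R = 1 - AFR."""
    return 1.0 - annual_failure_rate(mtbf_hours)


# Hypothetical component with an MTBF of 1,000,000 hours.
sample_mtbf = 1_000_000
print(f"AFR = {annual_failure_rate(sample_mtbf):.4%}")
print(f"R   = {component_reliability(sample_mtbf):.4%}")
```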

System Reliability:
In a storage system, a component is configured in
one of two ways:
1) Redundant configuration (in parallel)
2) Non-redundant configuration (in series)
• Formula for redundant configurations:

• Formula for non-redundant configurations:
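
Assuming component failures are independent, the standard textbook forms of the two formulas are R = 1 - (1 - R1)(1 - R2)...(1 - Rn) for a redundant (parallel) group and R = R1 x R2 x ... x Rn for a non-redundant (series) chain. The sketch below implements these forms; the function names and the sample values are hypothetical.

```python
from math import prod
from typing import Iterable


def series_reliability(reliabilities: Iterable[float]) -> float:
    """Non-redundant chain: every component must work."""
    return prod(reliabilities)


def parallel_reliability(reliabilities: Iterable[float]) -> float:
    """Redundant group: fails only if every component fails."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical example: two redundant controllers (R = 0.99 each)
# placed in series with a single, non-redundant switch (R = 0.98).
controllers = parallel_reliability([0.99, 0.99])
system = series_reliability([controllers, 0.98])
print(f"controller pair: {controllers:.4f}")
print(f"whole system:    {system:.6f}")
```

Note how the redundant pair is more reliable than either controller alone, while the series element caps the reliability of the whole system.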


Determining Redundancy:
• First, list an inventory of the components involved in the three
architectures, as shown in the first column.
• Next, analyze the three architectures for redundancy.
Determining Reliability:
• Using the reliability formulas, we can determine which architecture has the
highest reliability value.
• For this we use sample MTBF values (as supplied by the
manufacturer) and the AFR values shown in the table below.
Conclusion:
• When the calculations are complete, we compare
the results:
• Architecture 1: 98.33% reliability, or a system
MTBF of 524,551 hours
• Architecture 2: 99.50% reliability, or a system
MTBF of 1,752,000 hours
• Architecture 3: 99.38% reliability, or a system
MTBF of 1,412,903 hours
• The MTBF figures are the most revealing: they
indicate that Architecture 2 is statistically the most
reliable of the three.
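
As a cross-check, the system MTBF figures above can be recovered from the reliability percentages. The sketch assumes the relation MTBF = 8760 / (1 - R), which is not stated on the slide but matches all three quoted pairs; the function name is hypothetical.

```python
HOURS_PER_YEAR = 8760  # 24 hours x 365 days


def system_mtbf(reliability: float) -> float:
    """System MTBF in hours, assuming AFR = 1 - R and MTBF = 8760 / AFR."""
    afr = 1.0 - reliability
    if afr <= 0:
        raise ValueError("reliability must be below 1")
    return HOURS_PER_YEAR / afr


# Reliability values quoted in the conclusion.
architectures = {
    "Architecture 1": 0.9833,
    "Architecture 2": 0.9950,
    "Architecture 3": 0.9938,
}

for name, r in architectures.items():
    print(f"{name}: R = {r:.2%}, MTBF = {system_mtbf(r):,.0f} hours")
```

Because MTBF is inversely proportional to 1 - R, a difference of about one percentage point in reliability corresponds to a more than threefold difference in MTBF, which is why the MTBF figures separate the architectures more visibly.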
Fault Tree Analysis

What is a fault tree?


• Not a tree in the graph-theoretic sense
• A graphical representation of a logical function
• Shows the logical relationship between an event
(a failure) and its causes
• Provides a logical framework for expressing the
combinations of component failures that can
lead to system failure
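
The idea that a fault tree is a logical function can be made concrete with a minimal Python sketch: AND and OR gates combine basic events, and the top event is evaluated for a given set of failed components. The class names, the event names, and the example tree are all hypothetical and only illustrate the concept.

```python
from dataclasses import dataclass
from typing import Set, Tuple, Union


@dataclass(frozen=True)
class Basic:
    """A basic event, i.e. the failure of one named component."""
    name: str


@dataclass(frozen=True)
class Gate:
    """A logic gate combining lower-level events."""
    kind: str                   # "AND" or "OR"
    inputs: Tuple["Node", ...]


Node = Union[Basic, Gate]


def occurs(node: Node, failed: Set[str]) -> bool:
    """Return True if the event at `node` occurs for this set of failures."""
    if isinstance(node, Basic):
        return node.name in failed
    results = [occurs(child, failed) for child in node.inputs]
    if node.kind == "AND":
        return all(results)
    if node.kind == "OR":
        return any(results)
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical top event "storage unavailable":
# power is lost, OR both mirrored disks fail.
top = Gate("OR", (
    Basic("power_loss"),
    Gate("AND", (Basic("disk_a"), Basic("disk_b"))),
))

print(occurs(top, {"disk_a"}))             # one mirrored disk down
print(occurs(top, {"disk_a", "disk_b"}))   # both mirrored disks down
```

The same basic event may feed several gates, which is one reason the structure is not a tree in the graph-theoretic sense.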
Single Point Failure:
“A Failure of one independent element of a system which
causes an immediate hazard to occur and/or causes the whole
system to fail”
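
One way to look for single point failures, sketched here under the assumption that the system is described by a fault tree of AND/OR gates, is to fail each basic event alone and check whether the top event occurs. The nested-tuple representation, the function names, and the example tree are hypothetical.

```python
def top_event_occurs(node, failed):
    """Evaluate a tree given as a basic-event name or ("AND"|"OR", *children)."""
    if isinstance(node, str):
        return node in failed
    kind, *children = node
    results = [top_event_occurs(child, failed) for child in children]
    if kind == "AND":
        return all(results)
    if kind == "OR":
        return any(results)
    raise ValueError(f"unknown gate kind: {kind}")


def basic_events(node):
    """Collect the names of all basic events in the tree."""
    if isinstance(node, str):
        return {node}
    _, *children = node
    return set().union(*(basic_events(child) for child in children))


def single_point_failures(tree):
    """Basic events whose failure alone is enough to cause the top event."""
    return sorted(
        event for event in basic_events(tree)
        if top_event_occurs(tree, {event})
    )


# Hypothetical system: mirrored disks behind one shared power supply.
tree = ("OR", "power_supply", ("AND", "disk_a", "disk_b"))
print(single_point_failures(tree))
```

In this hypothetical tree the shared power supply is reported, while neither mirrored disk is, because a single disk failure is masked by its partner.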

Example:
Common Cause Events/Phenomena:
“A Common Cause is an event or a phenomenon which, if it occurs, will
induce the occurrence of two or more fault tree elements.”
* Overlooking common causes is a frequently found fault tree flaw!
Example:
Common Cause oversight correction
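
The numerical effect of overlooking a common cause can be sketched as follows. The example assumes two redundant elements under an AND gate, plus one common-cause event that fails both and is independent of their individual failures; the corrected tree then places the common cause under an OR gate with the AND gate. All probabilities and names are hypothetical.

```python
def naive_top_probability(p_a: float, p_b: float) -> float:
    """AND of two elements treated as independent (common cause ignored)."""
    return p_a * p_b


def corrected_top_probability(p_a: float, p_b: float, p_common: float) -> float:
    """OR of the common cause with the independent double failure."""
    independent_both = p_a * p_b
    return 1.0 - (1.0 - p_common) * (1.0 - independent_both)


# Hypothetical values: each element fails with probability 0.01,
# and a shared cause (for example, a common power feed) with 0.001.
p_a = p_b = 0.01
p_common = 0.001

print(f"common cause ignored:  {naive_top_probability(p_a, p_b):.7f}")
print(f"common cause included: {corrected_top_probability(p_a, p_b, p_common):.7f}")
```

With these values the common cause dominates the result, so leaving it out understates the top event probability by roughly a factor of ten.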
Example 1: The undesirable system event is: Light Fails Off
Example 2:
References:
• Network Storage Evaluations Using Reliability Calculations
by Selim Daoud, Sun Professional Services.
• Reliability Monitor Report, ATMEL PROPRIETARY.
• Atmel Corporation Quality & Reliability Handbook.
• Fault Tree Analysis of Computer-Based Systems by Prof.
Joanne Bechta Dugan, Electrical & Computer Engineering,
University of Virginia.
• Fault Tree Analysis by P. L. Clemens, 1993.
• Fault Tree Analysis by Clifton A. Ericson II, Sept 2000.
