
SYSTEM RELIABILITY

Loretta Arellano
Hughes Aircraft Company
P.O. Box 92426
Los Angeles, CA 90009
(310) 334-4248

Summary

Reliability to the consumer means that products in systems will perform
for a specified amount of time with minimal or no failures or
maintenance. A manufacturer will guarantee that their products are
reliable by offering a warranty that typically covers breakage, lack of
performance, and manufacturer's defects. To ensure that their products
will outlast the warranty, reliability must be designed into the system.
Designed-in means to understand what the reliability drivers are and to
eliminate them or minimize their effects.

Several factors and processes contribute to the reliability of a
product: the components selected, how they are applied, and how they are
manufactured and assembled in the system. Each, once understood, can be
appropriately designed and planned, and thus the effects of failures can
be minimized. Listed below are major contributors to product
reliability.

Component/material contribution
Electrical component failure mechanisms are becoming generally well
known, and the inherent reliability of components is steadily
increasing. There are several methods to calculate the failure rate of
electrical and mechanical components that are provided in handbooks.
Solder joint failures caused by the differences of Coefficient of
Thermal Expansion (CTE) of the solder adjacent to the components and the
materials of the board or heatsink contribute to the failure rate of the
boards.

Design application contribution
How the components are used in design greatly affects the system
reliability, the major drivers being the electrical and thermal stresses
applied. Several design guidelines exist that limit junction
temperature, power dissipation, and current. Designing for worst-case
conditions (e.g., temperature, altitude, humidity and vibration)
increases reliability but must be weighed against the probability of
worst-case occurrence and cost.

Architecture contribution
A fault tolerant system is one in which a failure may occur, but in
which the system detects and self-corrects or compensates. This may
involve redundant hardware, smart software or multi-functional
capability. Redundant hardware will increase reliability but may cost
and weigh more. Smart software can be as simple as error detection and
correction code available in memory components.

Fault detection/isolation contribution
Integrated diagnostics can yield significant maintenance cost savings.
Some failures, when occurring in the system, fail to repeat during
diagnostics. By employing the same built-in self test (BIT) techniques
to detect and isolate the failure during all levels
(periodically in use or during initiated diagnostics), the probability
of repeating the failures during diagnostics becomes very high.

Manufacturing contribution
The manufacturing process, if not controlled, can be the largest
contributor to high warranty costs. Environmental Stress Screening (ESS)
is a technique used to eliminate poor workmanship and defects and can be
conducted at the part, board, unit or system level. ESS involves
subjecting the hardware to several cycles of temperature extremes and
usually includes some level of vibration. By monitoring ESS failures and
determining the root causes of failure, the manufacturing processes can
be improved. In manufacturing large quantities of assemblies,
maintaining high levels of throughput requires monitoring and analyses
such as Statistical Process Control (SPC) techniques.

Repair contribution
If hardware is returned for repair, the process of refurbishing can be
life limiting. Unsoldering and re-soldering components requires heat and
solvents to be used. The effects of these should be considered when
redeploying refurbished hardware.

Once the reliability drivers and objectives are understood, analyses and
tests can be performed to balance reliability with cost, performance and
supportability. In order to design in reliability, a thorough
reliability program should include the following types of activities:

System modeling and prediction
After understanding the system architecture and the inter-dependencies,
a reliability block diagram can be developed which indicates the
dependencies. Redundancies, both physical and functional, can be
utilized and taken into account. Several tools are commercially
available to quantify the probability of system success. Some system
reliability figures of merit include: Mean Time Between Failure (MTBF),
Mean Time Between Critical Failures (MTBCF), Mean Time Between
Maintenance Actions (MTBMA), Ao (Operational Availability), or R, where
R is the probability of success in a defined mission or period of use.

Once the system reliability is quantified, trades can be performed to
weigh reliability against performance, the support concept and cost. A
non-repairable or safety critical system requires a high confidence of
survival, whereas an accessible, repairable, low-cost system may
tolerate lower reliability.

Failure Modes, Effects and Criticality Analysis (FMECA)
Evaluating the effect of failures to minimize their effects is essential
for safety critical systems. Redundancy mechanisms need to be evaluated
to prevent 'single point failures' that void the redundant design.
Reliability critical components (new technologies, temperature
sensitive, large quantities per system, etc.) are candidates for a
detailed physics of failure analysis or being subjected to accelerated
life tests. Fault detection/fault isolation analysis can be combined
with the FMECA. If built-in-test (BIT) is used in the design, all
failure modes identified in the FMECA should be detected by BIT.
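
The reliability block diagram reasoning described under system modeling and prediction can be sketched in a few lines. This is only an illustration under stated assumptions: blocks fail independently and have constant failure rates, so that R(t) = exp(-t/MTBF). The block layout, the MTBF values and the function names are hypothetical and are not taken from the paper; commercial tools handle far more general cases.

```python
import math

def mission_reliability(mtbf_hours, mission_hours):
    """R(t) = exp(-t / MTBF), assuming a constant failure rate."""
    return math.exp(-mission_hours / mtbf_hours)

def series(*block_reliabilities):
    """Series blocks: every block must work, so reliabilities multiply."""
    result = 1.0
    for r in block_reliabilities:
        result *= r
    return result

def parallel(*block_reliabilities):
    """Redundant blocks: the group fails only if every block fails."""
    all_fail = 1.0
    for r in block_reliabilities:
        all_fail *= (1.0 - r)
    return 1.0 - all_fail

# Hypothetical system: one power supply in series with two redundant processors.
mission = 100.0  # mission length in hours (assumed)
r_supply = mission_reliability(50_000.0, mission)
r_cpu = mission_reliability(10_000.0, mission)
r_system = series(r_supply, parallel(r_cpu, r_cpu))
print(f"R(system) over {mission:.0f} h = {r_system:.6f}")
```

Comparing `series(r_supply, parallel(r_cpu, r_cpu))` with `series(r_supply, r_cpu)` shows how much the redundant processor buys, which is the kind of trade against cost and weight discussed in the text.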

Electronic Parts/Circuits Tolerance Analysis (EP/CTA)
An EP/CTA, also known as a worst case analysis, is used to ensure that
component tolerance build-up does not alter system performance. It
should take into account such parameters as voltage, power, and aging
effects. This analysis can be performed using simulation or by actual
[...]

Derating Analysis
Electrical component derating is limiting the component temperatures and
electrical stresses to optimize the inherent reliability of the
components. Typical rule of thumb is to [...] less than 110°C.

[...] ESS [...] of a product, Reliability Development/Growth Tests
(RD/GT) are performed, also known as Test, Analyze And Fix (TAAF). The
purpose is to test the hardware under mission critical conditions for
sufficiently long periods to surface and fix any failures that preclude
achieving the required MTBF. During the test period all failures are to
be analyzed and fixed, and testing resumed. By continuing in this
fashion for many hours (usually 3-10 times longer than the MTBF of the
product), we can observe the growth in reliability. Via log-log plots of
MTBF vs. test time we can project achieving the MTBF required in
subsequent product production. In analyzing the failures during RD/GT, a
FMECA is a crucial tool to identify the failure mechanism causing the
failure.

Conclusion
These are several of the critical analyses to be performed to assure the
expected reliability will be achieved. However, to assure the system
reliability will not diminish as production continues, it is necessary
to continue observing failures, analyzing the root causes, and improving
on the process.
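
The log-log projection of MTBF against test time mentioned for reliability growth testing can be illustrated as follows. The paper does not name a growth model; this sketch assumes the Duane model, a common choice in which cumulative MTBF (test hours divided by failures so far) follows a straight line on log-log axes. The failure log and all function names are hypothetical.

```python
import math

def fit_duane(failure_times):
    """Least-squares line through log(cumulative MTBF) vs. log(test time).

    failure_times: cumulative test hours at which each failure occurred.
    Returns (alpha, intercept) with
    log(cumulative MTBF) = intercept + alpha * log(test hours).
    """
    xs, ys = [], []
    for n, t in enumerate(failure_times, start=1):
        xs.append(math.log(t))
        ys.append(math.log(t / n))  # cumulative MTBF after n failures
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    return alpha, mean_y - alpha * mean_x

def cumulative_mtbf(alpha, intercept, test_hours):
    """Fitted cumulative MTBF at a given amount of test time."""
    return math.exp(intercept + alpha * math.log(test_hours))

def hours_to_reach(alpha, intercept, required_mtbf):
    """Test hours at which the fitted line reaches the required MTBF.

    Only meaningful when alpha > 0, i.e. when reliability is growing.
    """
    return math.exp((math.log(required_mtbf) - intercept) / alpha)

# Hypothetical failure log: cumulative test hours at each failure.
failure_times = [25, 70, 160, 310, 560, 950, 1500]
alpha, intercept = fit_duane(failure_times)
print(f"growth slope alpha = {alpha:.2f}")
print(f"hours to reach 400 h cumulative MTBF = "
      f"{hours_to_reach(alpha, intercept, 400.0):.0f}")
```

A positive slope indicates that fixes are taking effect; a slope near zero suggests the test is surfacing failures that are not being corrected.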

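
The worst case tolerance analysis described for the EP/CTA can be sketched by evaluating a circuit at every combination of tolerance extremes. The resistive divider, its component values and tolerances, and the function names are all hypothetical and chosen only for illustration; checking the corners alone is valid only when the output is monotonic in each parameter, as it is here.

```python
from itertools import product

def divider_output(v_in, r_top, r_bottom):
    """Output voltage of a simple resistive divider."""
    return v_in * r_bottom / (r_top + r_bottom)

def corners(nominal, tolerance):
    """Low and high extremes for a value with a fractional tolerance."""
    return (nominal * (1.0 - tolerance), nominal * (1.0 + tolerance))

def worst_case(v_in, r_top, r_bottom):
    """Each argument is a (nominal, fractional tolerance) pair.

    Evaluates the divider at all 2**3 tolerance corners and returns
    the (minimum, maximum) output voltage.
    """
    outputs = [
        divider_output(v, rt, rb)
        for v, rt, rb in product(
            corners(*v_in), corners(*r_top), corners(*r_bottom)
        )
    ]
    return min(outputs), max(outputs)

# Hypothetical design: 5 V +/-5 % supply, two 10 kohm +/-1 % resistors.
low, high = worst_case((5.0, 0.05), (10_000.0, 0.01), (10_000.0, 0.01))
print(f"output range: {low:.4f} V to {high:.4f} V")
```

Aging effects could be folded in by widening the tolerance passed for each part; the resulting range is then compared with what the downstream circuit can accept.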