
Two Day Workshop on RLA of Thermal Power Plants

DATA DRIVEN APPROACH FOR RESIDUAL LIFE PREDICTION

Dr. Bijay Kumar Rout
Assistant Professor (Mechanical Engg. Dept)
Birla Institute of Technology and Science, Pilani
Presentation outline
• Introduction
• Commonly used terms in the field of Reliability & Maintainability Engg.
• Prediction of reliability and failure rate
• Discussion on Weibull failure (modeling and estimation)
• Failure due to load and capacity interaction
• Concept of residual useful life and Mean residual life
• Proportional Hazard Modeling
• Conclusion
OBJECTIVE
• To discuss prognostics strategies that assess the degeneration/degradation of components.
• To use the reliability and hazard rate models to determine Residual Useful Life (RUL), which will provide the organization an economic advantage and a safe environment to work in.
Plant Health Condition
[Figure: plant health condition versus time, passing through three stages: Functioning, Degrading, Failing. The associated cost depends on the time of replacement:]
• $$$ Prescriptive replacement of a functioning "good" item
• $ Optimum: replace the item with maximum usage before failure
• $$$$$$$ Loss of life and/or system due to catastrophic failure
Data-driven methodology
Data-driven approaches can be divided into two categories:
• Artificial intelligence (AI) techniques (neural networks, fuzzy systems, decision trees, etc.).
• Statistical techniques (multivariate statistical methods such as PCA, PHM, linear and quadratic discriminators, partial least squares, HMM, etc.).
Evolving Maintenance Strategies
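[Figure: system availability versus time (1930, 1950, 1990, 2000), rising through successive maintenance strategies, listed here from earliest to latest:]
• Run to failure
• Inspection
• Preventive Maintenance
• Computerized Maintenance Management Systems (CMMS)
• Reliability-Centered Maintenance (RCM)
• Total Productive Maintenance (TPM)
• Condition-Based Maintenance (CBM)
• Prognostics & Health Management (PHM)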
“The function of maintenance in a world-class operating environment is not to simply maintain, but to provide reliable systems and to extend the life of systems at optimum costs.” Stanley Lasday, June 1997, Industrial Heating
Evolving Maintenance Strategies
1. Modeling of stress and damage in electronic parts, structures and equipment, utilizing exposure conditions (e.g., usage, temperature, vibration, radiation) to compute accumulated damage.
2. Prediction of the expected life of the plant and its major parts.
3. Prediction of the availability of the plant.
4. Prediction of the expected maintenance load.
5. Prediction of the support system resources needed for effective operation.
Commonly used terms
MTBF: Mean Time Between Failures, µ. When applied to repairable products this is the average time that a system will operate until the next failure.
Failure Rate: The number of failures per unit of stress. The stress can be expressed in various units (operating hours, cycles, etc.), and the failure rate is equal to θ = 1/µ.
MTTF or MTFF: The mean time to failure or mean time to first failure. This is the measure applied to systems that can’t be repaired during their mission.
MTTR: Mean time to repair. This is the average elapsed time between a unit failing and its being repaired and returned to service.
Availability: The proportion of time a system is operable. This is only relevant for systems that can be repaired.
Commonly used terms
Fault Tree Analysis (FTA): Fault trees are diagrams used to trace symptoms to their root causes. Fault Tree Analysis is the term used to describe the process involved in constructing a fault tree.

Derating: Assigning a product to an application that is at a stress level less than the rated stress level for the product. This is analogous to providing a safety factor.

Censored Test: A life test where some units are removed before they fail.
Different Views about Failure
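[Figure: a developing oil leak along a time axis, showing that "failure" is declared at different points by different people:]
• Leak starts, pool of oil: “FAILED” says the safety officer
• Leak deteriorates, high oil consumption: “FAILED” says the engineer
• Equipment stops working: “FAILED” says the production engineer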
Mechanism of Failure

Distortion
• Buckling
• Creep
• Yielding
• Warping
• Thermal relaxation
• Elastic deformation
• Brinelling

Fatigue and Fracture
• Ductile fracture
• Brittle fracture
• Fatigue fracture
• High cycle fatigue
• Low cycle fatigue
• Residual stress
• Torsional fatigue

Corrosion
• Corrosion fatigue
• Stress corrosion
• Galvanic corrosion
• Biological corrosion
• Chemical attack

Wear
• Abrasive wear
• Adhesive wear
• Cavitation
• Fretting wear
• Scoring
• Surface-origin fatigue (pitting)
Reliability Modeling
Construction of a model (life distribution) that represents the times-to-failure (TTF) of the entire system based on the life distributions of the components, subassemblies and/or assemblies ("black boxes") from which it is composed.
Reliability Measures
• Cumulative failure function (the probability of failure by time t):
$F(t) = P(0 \le T \le t) = \int_0^t f(x)\,dx$
• Mean-value function µ(t): the expected number of failures experienced by time t according to the model.
• Expected time to failure:
$E(T) = \int_0^\infty t\,f(t)\,dt$
• Reliability function (the probability that the time to failure is larger than t):
$R(t) = P(T > t) = 1 - F(t) = \int_t^\infty f(x)\,dx$
Empirical Reliability Measures
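• n(t) = number of survivors (number still alive or functioning adequately) at time t; N(0) is the number of units at the start of the test.
• Empirical failure density and hazard rate over the interval from $t_i$ to $t_i + \Delta t_i$:
$f(t) = \frac{n(t_i) - n(t_i + \Delta t_i)}{\Delta t_i \, N(0)}$
$h(t) = \frac{n(t_i) - n(t_i + \Delta t_i)}{\Delta t_i \, n(t_i)}$
• Empirical failure CDF at the i-th ordered failure time, for a sample of n units:
$\hat{F}(t_i) = \frac{i}{n+1}$
• Empirical reliability function:
$\hat{R}(t_i) = 1 - \hat{F}(t_i) = 1 - \frac{i}{n+1} = \frac{n+1-i}{n+1}$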
Empirical Reliability Measures
Suppose the failure times of 10 hypothetical components on life test are {5, 10, 17.5, 30, 40, 55, 67.5, 82.5, 100, 117.5}. Determine f(t) and h(t) from the data.
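One way to work this example is sketched below in Python. It assumes the test starts at t = 0 with N(0) = 10 units, that every unit is run to failure, and that each observed failure time closes one interval; it then applies the empirical estimates of f(t) and h(t) given on the previous slide. The function name `empirical_density_and_hazard` is illustrative and not part of the original slides.

```python
def empirical_density_and_hazard(failure_times):
    """Return (interval_start, interval_end, f, h) for each interval between failures.

    Assumes a complete (uncensored) sample with distinct failure times
    and a test that starts at t = 0.
    """
    times = sorted(failure_times)
    total_units = len(times)            # N(0)
    edges = [0.0] + times
    rows = []
    for i in range(total_units):
        start, end = edges[i], edges[i + 1]
        width = end - start             # interval width (delta t)
        survivors = total_units - i     # n(t_i): units alive at the interval start
        failures = 1                    # exactly one failure closes each interval
        density = failures / (width * total_units)
        hazard = failures / (width * survivors)
        rows.append((start, end, density, hazard))
    return rows


failure_times = [5, 10, 17.5, 30, 40, 55, 67.5, 82.5, 100, 117.5]
rows = empirical_density_and_hazard(failure_times)
for start, end, density, hazard in rows:
    print(f"{start:6.1f} to {end:6.1f}: f = {density:.4f}, h = {hazard:.4f}")
```

In the first interval (0 to 5) both estimates equal 1/(5 × 10) = 0.02. In the last interval only one unit is still alive, so h = 1/17.5 while f = 1/175.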

Reliability Measures
• Hazard rate function: the conditional probability of failure per unit time in the interval (t, t + Δt), given that a failure has not occurred before t.
$h(t) = \lim_{\Delta t \to 0} \frac{P(t < T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{R(t)}$
• Mean Time To Failure (MTTF): the expected time during which the system will function successfully without maintenance or repair.
$\text{MTTF} = \int_0^\infty R(t)\,dt$
• Failure intensity function: the instantaneous rate of change of the expected number of failures with respect to time.
$\lambda(t) = \frac{d\mu(t)}{dt}$
• Cumulative hazard function:
$H(t) = \int_0^t h(x)\,dx = -\ln R(t)$
Reliability Measures
[Figure: probability of failing at a given age versus time t, comparing the curve from a constant failure rate (exponential) with the curve from the bathtub curve. The time axis runs from infant mortality to wear out.]
Reliability Measures
[Figure: cumulative failure probability F(t), from 0 to 1, versus time t, comparing the curve from a constant failure rate (1 − exponential) with the curve from the bathtub curve. The time axis runs from infant mortality to wear out. Eventually everything fails.]
Exponential Distribution Function
• The simplest distribution function, the exponential, is characterized by a constant failure rate over the lifetime of the device.
• It is useful for representing a device in which all early failure mechanisms have been eliminated.
– h(t) = λ
– R(t) = exp(−λt)
– F(t) = 1 − exp(−λt)
– f(t) = λ exp(−λt)
– MTTF = $\int_0^\infty t\,\lambda \exp(-\lambda t)\,dt = 1/\lambda$
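The exponential measures can be evaluated directly, as in the minimal Python sketch below. The failure rate of 0.001 per hour is an assumed illustrative value, and the function name `exponential_measures` is hypothetical.

```python
import math


def exponential_measures(rate, t):
    """Return h(t), R(t), F(t) and f(t) for a constant failure rate."""
    reliability = math.exp(-rate * t)
    return {
        "h": rate,
        "R": reliability,
        "F": 1.0 - reliability,
        "f": rate * reliability,
    }


rate = 0.001                 # assumed failures per hour (illustrative value)
mttf = 1.0 / rate            # closed form of the MTTF integral
at_mttf = exponential_measures(rate, mttf)
print(f"MTTF = {mttf:.0f} h, R(MTTF) = {at_mttf['R']:.4f}")
```

Evaluating R at t = MTTF gives exp(−1), so under a constant failure rate only about 37% of units are expected to survive to the MTTF.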
Weibull Distribution Function
• h(t) varies as a power of the age of the device, where α (scale), β (shape) and γ (location) are constants
• For β < 1 the failure rate decreases with time and can be used to represent early failure
• For β = 1, h(t) is constant and can be used to represent steady state; this constant failure rate is a special case of the Weibull distribution
• For β > 1, h(t) increases with time and can be used to represent the wear-out condition
$R(t) = \exp\left[-\left(\frac{t-\gamma}{\alpha}\right)^{\beta}\right]$
$f(t) = \frac{\beta}{\alpha}\left(\frac{t-\gamma}{\alpha}\right)^{\beta-1} \exp\left[-\left(\frac{t-\gamma}{\alpha}\right)^{\beta}\right], \quad t \ge \gamma$
$h(t) = \frac{f(t)}{R(t)} = \frac{\beta}{\alpha}\left(\frac{t-\gamma}{\alpha}\right)^{\beta-1}$, which reduces to $\frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1}$ for γ = 0
$\text{MTTF} = \alpha\,\Gamma\left(\frac{1}{\beta}+1\right)$ for γ = 0
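The effect of β on the hazard rate can be checked numerically. The sketch below implements the Weibull expressions above in Python; the scale α = 1000 and the three β values are assumed for illustration only, and the MTTF helper adds the location γ to the γ = 0 formula, which is the standard shift for a three-parameter Weibull.

```python
import math


def weibull_reliability(t, alpha, beta, gamma=0.0):
    """R(t); equal to 1 before the location parameter gamma."""
    if t <= gamma:
        return 1.0
    return math.exp(-(((t - gamma) / alpha) ** beta))


def weibull_hazard(t, alpha, beta, gamma=0.0):
    """h(t) for t > gamma."""
    return (beta / alpha) * ((t - gamma) / alpha) ** (beta - 1)


def weibull_density(t, alpha, beta, gamma=0.0):
    """f(t) = h(t) * R(t) for t > gamma."""
    return weibull_hazard(t, alpha, beta, gamma) * weibull_reliability(t, alpha, beta, gamma)


def weibull_mttf(alpha, beta, gamma=0.0):
    """MTTF = gamma + alpha * Gamma(1/beta + 1)."""
    return gamma + alpha * math.gamma(1.0 / beta + 1.0)


alpha = 1000.0                      # assumed scale parameter (illustrative)
for beta in (0.5, 1.0, 3.0):        # early failure, steady state, wear out
    early = weibull_hazard(100.0, alpha, beta)
    late = weibull_hazard(500.0, alpha, beta)
    print(f"beta = {beta}: h(100) = {early:.6f}, h(500) = {late:.6f}, "
          f"MTTF = {weibull_mttf(alpha, beta):.1f}")
```

For β = 0.5 the hazard falls between t = 100 and t = 500, for β = 1 it stays at 1/α, and for β = 3 it rises, matching the three cases listed on the slide.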
Weibull Distribution Function
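[Figure: Weibull cumulative failure probability plotted on Weibull graph paper. Vertical axis: F from .01 to .99 (scale based on log log 1/(1−F)). Horizontal axis: time t (log scale). Straight lines are shown for β > 1, β = 1 and β < 1, crossing at F = .63 where t = η, the characteristic life (the scale parameter, written α elsewhere in these slides).]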
Weibull Interpretation

β < 1
• Implies infant mortality

β = 1
• Implies failures are random
• An old part is as good as a new part

1 < β < 4
• Occurs for:
- Low cycle fatigue
- Most bearing and gear failures
- Corrosion or erosion

β > 4
• Implies rapid wear out in old age
• Occurs for:
- Wear-through
Weibull Distribution Function

• Given a certain form for cumulative failure:
$1 - F(t) = \exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right]$
• we can rearrange, take natural logarithms twice, and get:
$\ln \ln\left[\frac{1}{1 - F(t)}\right] = \beta\,(\ln t - \ln \alpha)$
• If we plot log log {1/[1−F(t)]} vs log t, the result is a straight line with slope β. Special graph paper exists that does these transformations
• Cumulative hazard function is
$H(t) = \left(\frac{t}{\alpha}\right)^{\beta}$
Weibull Plot
MATLAB commands:

>> X = [72, 82, 97, 103, 113, 117, 126, 127, 127, 139, 154, 159, 199, 207];
>> wblplot(X)
>> parmhat = wblfit(X)

parmhat = 144.2727 3.6437   (α = 144.2727, β = 3.6437)
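For readers without MATLAB, the straight-line relationship ln ln[1/(1−F)] = β(ln t − ln α) can also be fitted by ordinary least squares. The Python sketch below does this for the same sample X, using the plotting position F = i/(n+1) from the empirical reliability slide. This is a different estimator from `wblfit`, which uses maximum likelihood, so the estimates should be close to, but not identical with, the MATLAB values. The function name `weibull_fit_by_plot` is hypothetical.

```python
import math


def weibull_fit_by_plot(sample):
    """Estimate (alpha, beta) by least squares on the linearized Weibull plot."""
    times = sorted(sample)
    n = len(times)
    xs = [math.log(t) for t in times]
    # Plotting position F_i = i / (n + 1), then y_i = ln ln[1 / (1 - F_i)].
    ys = [math.log(-math.log(1.0 - i / (n + 1))) for i in range(1, n + 1)]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta = sxy / sxx                      # slope of the fitted line
    intercept = y_mean - beta * x_mean    # equals -beta * ln(alpha)
    alpha = math.exp(-intercept / beta)
    return alpha, beta


X = [72, 82, 97, 103, 113, 117, 126, 127, 127, 139, 154, 159, 199, 207]
alpha_hat, beta_hat = weibull_fit_by_plot(X)
print(f"alpha = {alpha_hat:.2f}, beta = {beta_hat:.3f}")
```

The regression treats ln t as the independent variable; regressing the other way round is also common and gives slightly different numbers.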
Bath Tub Curve

[Figure: bathtub curve of hazard rate h(t) versus time t, with three regions: infant mortality, constant failure rate, and wear out.]
Equipment Failure Profile
The burn-in phase (also known as infant mortality, break-in, debugging): During this phase the hazard rate decreases and failures occur due to causes such as:
• Incorrect use procedures
• Poor test specifications
• Incomplete final test
• Poor quality control
• Over-stressed parts
• Wrong handling or packaging
• Inadequate materials
• Incorrect installation or setup
• Poor technical representative training
• Marginal parts
• Poor manufacturing processes or tooling
• Power surges
Equipment Failure Profile
The useful life phase: During this phase the hazard rate is constant and the failures occur randomly or unpredictably. Some of the causes of failure include:
• Insufficient design margins
• Incorrect use environments
• Undetectable defects
• Human error and abuse
• Unavoidable failures
Equipment Failure Profile

The wear-out phase: During this phase the hazard rate increases. Some of the causes of failure include:
• Wear due to aging, degradation in strength
• Inadequate or improper preventive maintenance
• Limited-life components
• Wear-out due to friction, misalignments, corrosion and creep
• Materials fatigue
• Incorrect overhaul practices
Remedies
• Early: quality manufacture, robust design
• Wear out: physically-based models, preventative maintenance, robust design (FMEA)
• Chance: tight customer linkages, testing, HAST
Failures vs time as a function of Stress/Load
[Figure: hazard rate h(t) versus time, with separate curves for high stress, medium stress and low stress.]
Failure due to Load & Capacity variations
[Figure: probability of occurrence versus applied or failure stress, showing the load and strength distributions for new units and for units after infant mortality. The overlap of the two distributions is the failure region.]
Failure due to Load & Capacity variation
[Figure: the load and strength distributions for new units and for units after wear-out, with the failure region marked.]
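If strength and stress are normally distributed with respective (µ, σ²) combinations of (50,000 and 5,000²) and (30,000 and 3,000²), then the safety factor (the difference, strength minus stress) has (µ, σ²) = (20,000 and 5,831²). A critical failure occurs when the difference < 0 (that is, stress exceeds strength).
[Figure: normal distribution of the difference, with axis ticks at µ and at 1, 2 and 3 standard deviations on either side: 2,507; 8,338; 14,169; 20,000; 25,831; 31,662; 37,493.]
$P(\text{critical failure}) = P(\text{difference} < 0) = P\left[\frac{\text{difference} - \mu}{\sigma} < \frac{0 - 20{,}000}{5{,}831}\right] = P(Z < -3.43) \approx 0.0003$
so that reliability ≈ 0.9997.
NOTE: “reliability” = “safety”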
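The load and strength calculation can be reproduced with the standard normal CDF, as in the Python sketch below. It assumes stress and strength are independent, which is what allows the two variances to be added to obtain 5,831². The function names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def interference_failure_probability(strength_mean, strength_sd, stress_mean, stress_sd):
    """Return (z, P(stress > strength)) for independent normal stress and strength."""
    margin_mean = strength_mean - stress_mean
    margin_sd = math.hypot(strength_sd, stress_sd)   # sqrt(sd1^2 + sd2^2)
    z = (0.0 - margin_mean) / margin_sd
    return z, normal_cdf(z)


z, p_fail = interference_failure_probability(50_000, 5_000, 30_000, 3_000)
print(f"z = {z:.2f}, P(critical failure) = {p_fail:.5f}, reliability = {1 - p_fail:.5f}")
```

Reducing the spread of either distribution, or moving the means further apart, shrinks the overlap and therefore the failure probability.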
Repairable and Non-Repairable
Systems
• Non-Repairable
– Only need to track first failure
• Repairable
– Track Mean Time Between Failures (MTBF)
– Time to Repair
– Preventive maintenance schedule with interval T0:
$R_M(jT_0 + x) = [R(T_0)]^{j}\,R(x), \quad 0 \le x < T_0,\; j = 0, 1, 2, \ldots$
– The smaller the time period T0, the more notable the improvement
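The effect of the preventive maintenance interval can be illustrated with a short Python sketch. It assumes that each maintenance action restores the unit to as-good-as-new condition and uses a Weibull life with α = 1000 h and β = 2 (a wear-out case); both values are hypothetical. With a constant hazard (β = 1) the same formula gives no improvement.

```python
import math


def weibull_reliability(t, alpha, beta):
    return math.exp(-((t / alpha) ** beta))


def reliability_with_pm(t, interval, alpha, beta):
    """R_M(t) = R(T0)^j * R(x), where t = j*T0 + x and 0 <= x < T0.

    Assumes every preventive maintenance action renews the unit completely.
    """
    completed_cycles = int(t // interval)             # j
    time_in_cycle = t - completed_cycles * interval   # x
    return (weibull_reliability(interval, alpha, beta) ** completed_cycles
            * weibull_reliability(time_in_cycle, alpha, beta))


alpha, beta = 1000.0, 2.0     # assumed Weibull parameters (illustrative)
mission = 1000.0              # hours
no_pm = weibull_reliability(mission, alpha, beta)
print(f"No maintenance: R = {no_pm:.4f}")
for interval in (500.0, 250.0, 100.0):
    r = reliability_with_pm(mission, interval, alpha, beta)
    print(f"T0 = {interval:5.0f} h: R_M = {r:.4f}")
```

For this wear-out case, mission reliability rises as the interval T0 shrinks, which is the improvement the slide refers to.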
RUL and Mean RUL
• To provide a sufficiently distant planning horizon, an estimate of the remaining useful operating life of the machinery is developed.
• If the accumulated operating time is t, the conditional reliability R*(x) is the probability that the equipment survives an additional time x:
$R^*(x) = P(X_t > x) = P(T - t > x \mid T > t) = \frac{R(t + x)}{R(t)}$
• If a failure is to be prevented, the system should be stopped safely before the MRL is reached.
RUL and Mean RUL
From the reliability model based on the degradation process, the mean remaining useful life can then be estimated as the expected additional life given survival to time t:
$\text{MRL}(t) = \mu(t) = E(X_t) = E(T - t \mid T > t) = \frac{\int_t^\infty R(x)\,dx}{R(t)}$
At t = 0, MRL(0) = MTTF.
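The conditional reliability and mean residual life can be evaluated numerically for any reliability function. The Python sketch below does so for a Weibull life with assumed parameters α = 100 and β = 2, integrating R by the trapezoidal rule and truncating the infinite upper limit at t + 10α. That truncation is an assumption which is adequate here because R is negligible beyond it; a heavy-tailed case such as β < 1 would need a much longer horizon.

```python
import math


def weibull_reliability(t, alpha, beta):
    return math.exp(-((t / alpha) ** beta))


def conditional_reliability(x, t, alpha, beta):
    """R*(x) = R(t + x) / R(t): survive x more time units given survival to t."""
    return weibull_reliability(t + x, alpha, beta) / weibull_reliability(t, alpha, beta)


def mean_residual_life(t, alpha, beta, step=0.05, horizon_factor=10.0):
    """MRL(t) = (integral of R from t to infinity) / R(t), by the trapezoidal rule."""
    n_steps = int(round(horizon_factor * alpha / step))
    area = 0.0
    previous = weibull_reliability(t, alpha, beta)
    for k in range(1, n_steps + 1):
        current = weibull_reliability(t + k * step, alpha, beta)
        area += 0.5 * (previous + current) * step
        previous = current
    return area / weibull_reliability(t, alpha, beta)


alpha, beta = 100.0, 2.0      # assumed Weibull parameters (illustrative)
for age in (0.0, 50.0, 100.0):
    print(f"age {age:5.1f}: R*(10) = {conditional_reliability(10.0, age, alpha, beta):.4f}, "
          f"MRL = {mean_residual_life(age, alpha, beta):.2f}")
```

At age 0 the computed MRL matches the closed-form MTTF αΓ(1/β + 1), and for this wear-out case both R*(10) and the MRL fall as the unit ages.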
RUL and Mean RUL
Ex: A device has a decreasing failure rate characterized by a two-parameter Weibull distribution with α = 180 years and β = 1/2. The device is required to have a design life reliability of 0.90. What is the design life if the device is first subjected to a wear-in period of one month?
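One possible solution is sketched below in Python. It interprets the design life after wear-in as the time t for which the conditional reliability R(t + T0)/R(T0) equals 0.90, with T0 = 1/12 year; that interpretation and the function name `weibull_design_life` are assumptions. Solving exp[−((t + T0)/α)^β + (T0/α)^β] = 0.90 for t gives the closed form used in the code.

```python
import math


def weibull_design_life(target_reliability, alpha, beta, burn_in=0.0):
    """Solve R(t + T0) / R(T0) = target for t, with R(t) = exp(-(t/alpha)^beta)."""
    exponent = -math.log(target_reliability) + (burn_in / alpha) ** beta
    return alpha * exponent ** (1.0 / beta) - burn_in


alpha, beta = 180.0, 0.5                  # years, shape (from the example)
life_without = weibull_design_life(0.90, alpha, beta)
life_with = weibull_design_life(0.90, alpha, beta, burn_in=1.0 / 12.0)
print(f"Design life without wear-in: {life_without:.2f} years")
print(f"Design life after one month of wear-in: {life_with:.2f} years")
```

Under these assumptions the design life is about 2.0 years without wear-in and about 2.8 years with it. Because the failure rate is decreasing (β < 1), units that survive the wear-in period are more reliable than new ones.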
Proportional Hazard Model (PHM)
The reliability of an equipment or system is greatly influenced by the operating conditions, called covariates.
The proportional hazard model (PHM) was proposed by Cox (1972).
All reliability models use failure time to model the reliability.
With PHM it is also possible to include the effect of operating conditions, such as type of failure, stress, etc., in the reliability function.
Proportional Hazard Model
The proportional hazard model is the most general of the regression models:
$h(t, z) = h_0(t)\,h_z(t)$
• $h_0(t)$: this part is constant for all individuals.
• $h_z(t)$: this part is a function of the individual covariate (z) values. It adjusts $h_0$ up or down as a function of degradation.
We generally use this parametric approach:
$h_z(t) = e^{z\beta}$
Proportional Hazard Model
• If Z(t) is the covariate information (measurement) available at time t (which may also include all past information):
$h(t; z_1, z_2, \ldots, z_m) = h_0(t)\,\exp(b_1 z_1 + \cdots + b_m z_m)$
• h0(t) is the baseline hazard; it is the hazard for the respective individual when all independent variable values are equal to zero.
$\log\left[\frac{h(t; z_1, z_2, \ldots, z_m)}{h_0(t)}\right] = b_1 z_1 + \cdots + b_m z_m$
• The failure rate of a system is affected by its operation time and also by the covariates. Example: a unit may have been tested under a combination of different accelerated stresses such as humidity, temperature, voltage, etc.
Proportional Hazards Model
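Two Parametric Functional Forms:
$h(t, z) = h_0(t)\,h_z(t)$
$= \lambda\,e^{z\beta}$ (exponential)
$= \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1} e^{z\beta}$ (Weibull)
In the Weibull form, the β in $e^{z\beta}$ is the regression coefficient (written b in the multi-covariate form), which is distinct from the Weibull shape parameter β in the baseline hazard.
The baseline hazard h0(t) plays the role of the intercept in multiple regression, except that it changes with time, whereas in multiple regression the intercept remains constant.

The ratio h(t,z)/h0(t) is called the hazard ratio.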
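The proportionality property can be seen in a small Python sketch. It assumes a Weibull baseline hazard and two hypothetical covariates (a temperature rise in °C and a vibration level in mm/s) with made-up regression coefficients; in practice the coefficients would be estimated from failure data rather than chosen by hand.

```python
import math


def weibull_baseline_hazard(t, alpha, shape):
    """h0(t) for a two-parameter Weibull baseline."""
    return (shape / alpha) * (t / alpha) ** (shape - 1)


def proportional_hazard(t, covariates, coefficients, alpha, shape):
    """h(t, z) = h0(t) * exp(b1*z1 + ... + bm*zm)."""
    linear_predictor = sum(b * z for b, z in zip(coefficients, covariates))
    return weibull_baseline_hazard(t, alpha, shape) * math.exp(linear_predictor)


alpha, shape = 1000.0, 2.0      # assumed baseline parameters (illustrative)
coefficients = [0.02, 0.5]      # hypothetical b1 (per °C) and b2 (per mm/s)
baseline_unit = [0.0, 0.0]      # all covariates zero: hazard equals h0(t)
stressed_unit = [20.0, 1.0]     # 20 °C hotter, 1 mm/s more vibration

ratios = []
for t in (100.0, 500.0):
    h_stressed = proportional_hazard(t, stressed_unit, coefficients, alpha, shape)
    h_baseline = proportional_hazard(t, baseline_unit, coefficients, alpha, shape)
    ratios.append(h_stressed / h_baseline)
    print(f"t = {t:5.0f}: hazard ratio = {ratios[-1]:.4f}")
```

The hazard ratio equals exp(0.02 × 20 + 0.5 × 1) = exp(0.9) at every time t, even though the baseline hazard itself changes with time.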


Proportional Hazards Model
• Regression coefficients are estimated by maximizing the partial likelihood.
• The baseline hazard function is not fitted to a specific model and has a non-parametric form.
• It represents the hazard rate of a system when all the covariates are equal to zero.
• The model assumes that the covariates act multiplicatively on the hazard function, so that for different values of the explanatory variables the hazard functions at each time are proportional to each other.
Key merits of PHM
• Used to investigate the effects of various explanatory variables on the hazard of assets/individuals
• It is distribution free, thus it does not have to assume a specific form for the baseline hazard function
• Regression coefficients are estimated using partial likelihood without the need to specify the baseline hazard function
• This model is available for both static and dynamic explanatory variables (a more realistic and reasonable assumption)
• This model handles truncated data, non-truncated data, and tied values
• Many goodness-of-fit tests and graphical methods are available for this model
Key limitations of PHM
• A vulnerable approach when covariates are deleted or the precision of covariate measurements is changed.
• Mixing different types of covariates in one model may cause some problems.
• An asset/individual life is assumed to be terminated at the first failure time. In other words, this model depends only on the time elapsed between the starting event (e.g. diagnosis) and the terminal event (e.g. failure) and not on the chronological time.
• The influence of a covariate in PHM is assumed to be time-independent. Due to the proportionality assumption, a common baseline hazard is assumed for all assets/individuals, even in cases where they should be stratified according to baseline.
Conclusion
• Engineering components experience a variety of environmental conditions and stresses while in service. We need to anticipate failure by suitable use of modeling techniques!
• Though mathematical models attempt to capture all the degradation and failure mechanisms, they are not an end in themselves!

• Any Questions?
