Introduction

Law enforcement organizations have this down to a science. Arrive at any crime scene, and you’ll find yourself immediately in the midst of a flurry of activity. After the Generic Food Mart is burgled, the area is roped off, witnesses are gathered together and segregated from other onlookers, fingerprints are being lifted, and suspects may already be in custody. More cops are there to guard the area from accidental or purposeful intrusion.

The amount of resources expended on a major (or even many a minor) crime scene can be truly mind-boggling. You’ll find the team leader, who directs general responsibilities. The photographer documents visual evidence, a sketch artist takes descriptions and draws the crime scene, and a number of officers guard the area. Investigators interview people at the scene, while more patrolmen canvass the local residents for more data. Specially trained evidence-gathering personnel process the evidence and ensure the documentation is foolproof. Investigators immediately start researching the backgrounds of suspects, looking for clues in past history.

Why is this immediate effort so massive? When actually analyzed, the number of man-hours invested, equipment expended and depreciated, and the inter-departmental coordination required add up to a hefty wad of cash that the taxpayers must pony up. Of course, this must have been determined to be appropriate, or local law-enforcement efforts would be shut down. Is this initial level of investigation really necessary? In fact, why not wait a few days for everyone to calm down and let the emotions die off? After all, we are hurting the business owner by restricting access to the shop, bothering his customers, even appropriating pieces of his store or inventory. Let him get back on his feet. What makes this worth the effort?

The reason this is acceptable is that there is really no other method available that can reliably produce the required results. If the photographer were not there, there would be no record of the actual environment at the scene. Evidence that is not quickly and accurately recorded will be lost or modified, with no hope of retrieval. We could wait to begin researching background information, but this would just delay the successful completion of the investigation beyond reasonable time limits. Sweeping up and throwing away the broken glass gets the business up and running, but for how long? Without this process in place, the crime is almost guaranteed to happen again. The stricken store may install bars on the windows, but the criminal still at large will just find another way in, or move on to the next store down the street.

The process of determining the cause of an equipment malfunction can often seem as daunting as a major crime scene investigation. It often appears to require expert knowledge about how the equipment was operated, how it was installed, the original design specs, changes in the environment, how it was actually being used, etc. Luckily, with just the right combination of repair expertise, root cause analysis, and corrective action implementation, the process does not necessarily have to be harder to get more productive and lasting results. The right systems have usually already been purchased and put in place at most production facilities to get the data required for an accurate and detailed failure analysis. Unfortunately, the employment of these resources is not always optimum. A smarter approach to the gathering of evidence, the correct interpretation of what that evidence is telling you, and the judicious application of corrective actions will put those expensive monitoring systems to work for you.

The evidence gathering process

Most companies already have many systems in place that can help the troubleshooter narrow down his focus, but oftentimes the data is no longer available. The act of repairing the gear has already modified, moved, or destroyed key pieces of evidence. Although the failure appeared to be minor at first, these data points can be crucial to finding an actual root cause of equipment damage. Where do you get the evidence you need to determine the actual root cause of the failure?

A good place to start is with the equipment operators. How often have you heard (AFTER the gear is down), “Oh, yeah, it’s been doing that for a while,” or “It’s always been that way”? This can be one of the most frustrating times in the life of the maintenance manager, listening to an operator describe in detail the telltale signs that his gear is about to fail. However, at this point in the failure analysis, this is just INFORMATION TO BE GATHERED. The fact that the operator did not inform anyone about the previous abnormalities is yet another data point. Again, this is only data that can be used later for root cause analysis and corrective actions. Do not draw any conclusions at this time.

Some companies have trained their operators to immediately document the conditions encountered at the time of a failure. The data is often written on a standard form or in the operator’s log using an approved format. In either event, the report should include some basic information:
• Time and Date
• The initial indication of the failure (loud vibration, initial alarm, etc.)
• Operator’s name
• Operation being performed (start-up, shutdown, capacity test, etc.)
• Any alarms, indicators, warnings, or other installed indications, including pressure and temperature of the process
/ Reliability Engineering
• Environmental conditions (air conditioning secured for 3 hours)
• Physical conditions noted (smoke, noise, smell, hot to touch)
• Actions taken in response to the failure

part to determine not only what broke, but how it broke. The failure mode and failure agents must be determined to find and eliminate the actual cause of the failure.
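
The basic report fields listed above can be captured in a small structured record so that gaps are visible before the repair disturbs the evidence. The following is only a minimal sketch: the class name `FailureReport`, its field names, the `missing_fields` helper, and the sample values (operator name, date) are hypothetical and are not taken from any particular plant's form.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class FailureReport:
    """One operator report holding the basic information for a failure."""
    occurred_at: datetime                 # time and date
    initial_indication: str               # e.g. "loud vibration", "initial alarm"
    operator_name: str
    operation: str                        # e.g. "start-up", "shutdown"
    indications: dict[str, str] = field(default_factory=dict)   # alarms, pressure, temperature
    environmental_conditions: str = ""
    physical_conditions: list[str] = field(default_factory=list)  # smoke, noise, smell
    actions_taken: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Return the names of fields that were left blank."""
        return [name for name, value in vars(self).items()
                if value in ("", [], {}, None)]


# Hypothetical example values, for illustration only.
report = FailureReport(
    occurred_at=datetime(2024, 1, 15, 3, 40),
    initial_indication="loud vibration",
    operator_name="J. Smith",
    operation="start-up",
    physical_conditions=["noise", "hot to touch"],
)
print(report.missing_fields())
# ['indications', 'environmental_conditions', 'actions_taken']
```

A blank field is itself a data point: it tells the investigator what still has to be asked of the operator, without drawing any conclusion yet.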
for. They are doing the best they can, but generally they may not have the expertise or the guidance to look for the right thing. In this example, the shop removing the pump is the rigging shop. They are good at what they do, but they are not pump rebuilding experts by any stretch of the imagination.

Before disassembly, you must have a list of probable failure modes. These can be obtained from many sources. Previous troubleshooting and repair records are a great resource for recurring failures (although the fact that they are recurring should catch your attention). If you are lucky enough to have that expert on-site, use him. There may even be troubleshooting tables available that can give you guidance as to where to start looking.

In this example, the possibilities you have put together may include:
a. A bent shaft
b. Distorted pump casing due to piping strain
c. Pump / motor misalignment
d. Pump imbalance
e. Motor imbalance
f. Motor electrical problems

You can eliminate many of these causes right away (the pump had been verified in balance, the shaft was not bent, etc.). The possible remaining causes are now known, and valuable data can be brought to the jobsite to find the actual cause. You now know the right questions to ask during the equipment teardown:

1. Is there a misalignment between the pump and motor?
2. Is there casing distortion due to excessive pipe strain?

At this point, you can continue the investigation just like any other. Since you know what to ask, you know what to look for. You can go to the job site and gather the extra data that you need. In this case, before the pump is unbolted from the foundation, you notice the riggers are connecting chain falls to the discharge piping and the pump. When questioned, the riggers tell you that it took chain falls to get the piping aligned during installation, and there will be quite a bit of tension as the flange bolts are loosened.

The Root Cause?

Our timeline would now look something like Figure 2:

We found the root cause!! Those mechanics obviously don’t know what they’re doing and are flexing the pipe (and the pump casing) too much. Tell those mechanics to line it up right next time!

Sound reasonable? Of course not. Unfortunately, this is the type of response that is heard over and over again throughout industry. “Tell those guys to be more careful.” This has the same effect as
telling your son (after running over the mailbox) to drive more carefully in the future. You’ll get a half-hearted “OK,” and nothing has changed. These are not the root causes. After completing the investigation and fully analyzing the problems, several root causes may be found. For example:

1. The prints used to fabricate the piping contained a typographical error, causing the incorrect piping length to be used.
2. Riggers were not trained on the correct method of rigging pumps.
3. A procedure for rigging the pump was already written, but it was buried in the notes section of the piping print.
4. No audits had ever been conducted on rigging large pumps and valves into position.
5. Supervisors were not available during the rigging.
6. The personnel in the pump shop did not communicate effectively with the riggers.
7. After the first failure, there was no process in place to determine the actual root cause. (In actuality, this incident was discovered by an independent supervisor, working another job, who was watching the riggers install the chain falls.)

Corrective Actions

This is another point in the incident investigation process that often fails. Corrective actions must now be assigned that are meaningful and achievable, with measurable results. For example, it does no good to tell the workers to be more careful. Each of the root causes must be addressed on its own merit, with corrective actions assigned, carried through, and audited.

Best Practice

Who has time for this type of analysis? In reality, all best-in-class companies have found the time. The time spent properly following up on equipment failures is rarely wasted time. In fact, the savings are compounded twofold. In this particular case, the time spent conducting a proper equipment failure analysis would have saved the shipyard the 3 weeks and over $150,000 in delays after the first bearing failure. In addition, if the corrective actions are not implemented, this same issue is almost guaranteed to happen again, causing repeat equipment failures and delays further down the road.

Summary

Industry is spending large sums of money on predictive maintenance systems that can tell you WHEN the gear is about to fail, but none of these systems can tell you WHY. It is up to the trained investigator, with the right tools, to avoid the costly repeat failures that continue to plague the manufacturing field.

By ensuring failures are understood and fixed right the first time, enormous amounts of time, effort, and money can be saved, allowing your production lines to remain operating at peak capacity.
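
The Corrective Actions guidance (each root cause gets its own action, which is assigned, carried through, measured, and audited) can be tracked with a simple record per action. This is a hedged sketch rather than a prescribed system: the `CorrectiveAction` fields, the `open_items` helper, and the sample owners, dates, and action wording are all hypothetical. Only the root cause text comes from the example above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CorrectiveAction:
    root_cause: str
    action: str
    owner: str
    due: date
    measure: str             # how the result will be measured
    completed: bool = False
    audited: bool = False


def open_items(actions, today):
    """Return (action, reason) pairs for actions that still need attention."""
    items = []
    for a in actions:
        if not a.measure.strip():
            items.append((a, "no measurable result defined"))
        elif not a.completed and a.due < today:
            items.append((a, "overdue"))
        elif a.completed and not a.audited:
            items.append((a, "completed but not audited"))
    return items


# Hypothetical entries, for illustration only.
actions = [
    CorrectiveAction(
        root_cause="Riggers were not trained on the correct method of rigging pumps.",
        action="Train riggers on the pump rigging procedure",
        owner="Rigging shop supervisor",
        due=date(2024, 3, 1),
        measure="All riggers signed off on the procedure",
        completed=True,
    ),
    CorrectiveAction(
        root_cause="Riggers were not trained on the correct method of rigging pumps.",
        action="Tell the riggers to be more careful",
        owner="Rigging shop supervisor",
        due=date(2024, 3, 1),
        measure="",
    ),
]

for item, reason in open_items(actions, today=date(2024, 4, 1)):
    print(f"{item.action}: {reason}")
```

In this sketch, an action such as "be more careful" is flagged because it has no measurable result, and a finished action stays open until it has been audited.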