Sunteți pe pagina 1din 14

The Space Shuttle faced many vehicle control challenges during

Avionics, ascent, as did the Orbiter during on-orbit and descent operations.
Navigation, and Such challenges required innovations such as fly-by-wire, computer
Instrumentation redundancy for robust systems, open-loop main engine control, and
navigational aides. These tools and concepts led to groundbreaking
technologies that are being used today in other space programs
and will be used in future space programs. Other government agencies
Gail Chapline
as well as commercial and academic institutions also use these
Reconfigurable Redundancy
Paul Sollock analysis tools. NASA faced a major challenge in the development of
Shuttle Single Event Upset Environment instruments for the Space Shuttle Main Enginesengines that operated
Patrick ONeill at speeds, pressures, vibrations, and temperatures that were
Development of Space Shuttle unprecedented at the time. NASA developed unique instruments and
Main Engine Instrumentation software supporting shuttle navigation and flight inspections. In addition,
Arthur Hill
the general purpose computer used on the shuttle had static random
Unprecedented Rocket Engine
Fault-Sensing System access memory, which was susceptible to memory bit errors or bit flips
Tony Fiorucci from cosmic rays. These bit flips presented a formidable challenge as
Calibration of Navigational Aides they had the potential to be disastrous to vehicle control.
Using Global Positioning Computers
John Kiriazes

242 Engineering Innovations

Reconfigurable Space Shuttle Columbia successfully time-domain-multiplexed data buses,
concluded its first mission on fly-by-wire flight control, and digital
Redundancy April 14, 1981, with the worlds first autopilots for aircraft, which provided
The Novel Concept two-fault-tolerant Integrated Avionics a level of functionality and reliability
Systema system that represented a at least a decade ahead of the avionics
Behind the curious dichotomy of past and future in either military or commercial
Worlds First technologies. On the one hand, many aircraft. Beyond the technological
of the electronics components, having nuts and bolts of the on-board
Two-Fault-Tolerant been selected before 1975, were system, two fundamental yet innovative
Integrated already nearing technical obsolescence. precepts enabled and shaped the actual
Avionics System On the other hand, it used what were implementation of the avionics system.
then-emerging technologies; e.g., These precepts included the following:
n The entire suite of avionics
functions, generally referred to as
subsystemsdata processing
(hardware and software), navigation,
flight control, displays and controls,
communications and tracking, and
electrical power distribution and
controlwould be programmatically
and technically managed as an
integrated set of subsystems.
Given that new and unique types
of complex hardware and software
had to be developed and certified,
it is difficult to overstate the role
that approach played in keeping those
activities on course and on schedule
toward a common goal.
n A digital data processing subsystem
comprised of redundant central
processor units plus companion
input/output units, resident software,
digital data buses, and numerous
remote bus terminal units would
function as the core subsystem to
interconnect all avionics subsystems.
It also provided the means for the
crew and ground to access all
vehicle systems (i.e., avionics and
non-avionics systems). There were
exceptions to this, such as the landing
gear, which was lowered by the crew
via direct hardwired switches.

STS-1 launch (1981) from Kennedy Space Center, Florida. First crewed launch using two-fault-tolerant
Integrated Avionics System.

Engineering Innovations 243

Avionics System Patterned and only a modicum of off-the-shelf status and continue the mission after
After Apollo; Features equipment was directly applicable. one computer failure. Thus, four
computers were required to meet
and Capabilities Unlike Any
Why Fail Operational/Fail Safe? the fail-operational/fail-safe
Other in the Industry requirement. That level of redundancy
Previous crewed spacecraft were
The preceding tenets were very much applied only to the computers. Triple
designed to be fail safe, meaning that
influenced by NASAs experience redundancy was deemed sufficient for
after the first failure of a critical
with the successful Apollo primary other components to satisfy the
component, the crew would abort
navigation, guidance, and control fail-operational/fail-safe requirement.
the mission by manually disabling the
system. The Apollo-type guidance
primary system and switching over
computer, with additional specialized
to a backup system that had only Central Processor Units
input/output hardware, an inertial
the minimum capability to return the Were Available Off the Shelf
reference unit, a digital autopilot,
vehicle safely home. Since the shuttles Remaining Hardware
fly-by-wire thruster control, and an
basic mission was to take humans
alphanumeric keyboard/display unit and Software Would Need
and payloads safely to and from orbit,
represented a nonredundant subset of to be Developed
the fail-operational requirement was
critical functions for shuttle avionics
intended to ensure a high probability The next steps included: selecting
to perform. The proposed shuttle
of mission success by avoiding costly, computer hardware that was for
avionics represented a challenge for
early termination of missions. military use yet commercially
two principal reasons: an extensive
Early conceptual studies of a available; choosing the actual
redundancy scheme and a reliance
shuttle-type vehicle indicated that configuration, or architecture, of
on new technologies.
vehicle atmospheric flight control the computer(s), data bus network,
Shuttle avionics required the and bus terminal units; and then
required full-time computerized
development of an overarching and developing the unique hardware and
stability augmentation. Studies also
extensive redundancy management software to implement the worlds
indicated that in some atmospheric
scheme for the entire integrated first two-fault-tolerant avionics.
flight regimes, the time required for
avionics system, which met the shuttle
a manual switchover could result in In 1973, only two off-the-shelf
requirement that the avionics system
loss of vehicle. Thus, fail operational computers available for military aircraft
be fail operational/fail safei.e.,
actually meant that the avionics had to offered the computational capability for
two-fault tolerant with reaction times
be capable of graceful degradation the shuttle. Both computers were basic
capable of maintaining safe
such that the first failure of a critical processor unitstermed central
computerized flight control in a
component did not compromise the processor unitswith only minimal
vehicle traveling at more than 10
avionic systems capability to maintain input/output functionality. NASA
times the speed of high-performance
vehicle stability in any flight regime. selected a vendor to provide the central
military aircraft.
The graceful degradation requirement processor units plus new companion
Shuttle avionics would also rely on input/output processors that would be
(derived from the fail-operational/
new technologiesi.e., time-domain developed to specifications provided by
fail-safe requirement) immediately
data buses, digital fly-by-wire architecture designers. At the time, no
provided an answer to how many
flight control, digital autopilots for proven best practices existed for
redundant computers would be
aircraft, and a sophisticated software interconnecting multiple computers,
necessary. Since the computers were
operating system that had very data buses, and bus terminal units
the only certain way to ensure timely
limited application in the aerospace beyond the basic active/standby manual
graceful degradationi.e., automatic
industry of that time, even for switchover schemes.
detection and isolation of an errant
noncritical applications, much less
computersome type of computerized The architectural concept figured
for man-rated usage. Simply put,
majority-vote technique involving a heavily in the design requirements for
no textbooks were available to guide
minimum of three computers would the input/output processor and two
the design, development, and flight
be required to retain operational other new types of hardware boxes as
certification of those technologies

244 Engineering Innovations

Interconnections Were Key to Avionics Systems Success
Shuttle Systems Redundancy
Shuttle Systems Elements Diagram illustrates the eight flight-critical
buses of the 24 buses on the Orbiter.
Four General Purpose Computers
Eight Flight-critical
Connections to Multiplexer/Demultiplexers
Central Input/ Data Buses
Processing Output (24 per computer) Flight Forward 1
Unit Processor
Primary Secondary
General Purpose
Flight-critical Computer 1
Bus 1

Multiplex Interface
Adapters (24 per computer) Flight Aft 1
Primary Secondary

Flight-critical Bus 5
Two Multiplex
Interface Adapters
Flight Forward 2
General Purpose
Primary Port Assortment of Primary Secondary Computer 2
Connections to various modules
two Data Buses selected to
Secondary Port interface with Bus 2
devices in the
region supported
Control by the Flight Aft 2
Electronics demultiplexer. Primary Secondary

Connections Flight-critical Bus 6

Multiplexer/Demultiplexer to Various
Vehicle General Purpose
Subsystems Flight Forward 3 Computer 3
Primary Secondary

Bus 3

Architecture designers for the shuttle Flight Aft 3

avionics system had three goals: provide Primary Secondary

interconnections between the four Flight-critical Bus 7 General Purpose

Computer 4
computers to support a synchronization
Flight Forward 4
scheme; provide each computer access Primary Secondary

to every data bus; and ensure that the Flight-critical

Bus 4
multiplexer/demultiplexers were
Flight Aft 4 = Primary Controlling Computer
sufficiently robust to preclude a single Primary Secondary = Listen Only Unless Crew-initiated
Reconfiguration Enables Control Capability
internal failure from preventing computer Flight-critical Bus 8

access to the systems connected to that

redundancy in the form of two independent reconfiguration capability. The total
To meet those goals, engineers designed ports for connections to two data buses. complement of such hardware on the
the input/output processor to interface The digital data processing subsystem vehicle consisted of 24 data buses,
with all 24 data buses necessary to cover possessed eight flight-critical data buses 19 multiplexer/demultiplexers, and an
the shuttle. Likewise, each multiplexer/ and the eight flight-critical multiplexer/ almost equal number of other types of
demultiplexer would have internal demultiplexers. They were essential to the specialized bus terminal units.

Engineering Innovations 245

well as the operating system software, in the field, their reliability was so poor The Costs and Risks of
all four of which had to be uniquely that they could not be certified for the Reconfigurable Redundancy
developed for the shuttle digital data shuttle man-rated application; and
The benefits of interconnection
processing subsystem. Each of those following the Approach and Landing
flexibility came with costs, the most
four development activities would Tests (1977), NASA found that the
obvious being increased verification
eventually result in products that software for orbital missions exceeded
testing needed to certify each
established new limits for the so-called the original memory capacity. The
configuration performed as designed.
state of the art in both hardware and central processor units were all
Those activities resulted in a set of
software for aerospace applications. upgraded with a newer memory design
formally certified system
that doubled the amount of memory.
In addition to the input/output reconfigurations that could be invoked
That memory flew on Space
processor, the other two new devices at specified times during a mission.
Transportation System (STS)-1 in 1981.
were the data bus transmitter/receiver Other less-obvious costs stemmed from
unitsreferred to as the multiplex Although the computers were the only the need to eliminate single-point
interface adapterand the bus devices that had to be quad redundant, failures. Interconnections offered the
terminal units, which was termed NASA gave some early thought to potential for failures that began in one
the multiplexer/demultiplexer. simply creating four identical strings redundant element and propagated
NASA designated the software as with very limited interconnections. throughout the entire redundant
the Flight Computer Operating System. The space agency quickly realized, systemtermed a single-point
The input/output processors (one however, that the weight and volume failurewith catastrophic
paired with each central processor associated with so much additional consequences. Knowing such, system
unit) was necessary to interface the hardware would be unacceptable. designers placed considerable emphasis
units to the data bus network. The Each computer needed the capability on identification and elimination of
numerous multiplexer/demultiplexers to access every data bus so the failure modes with the potential to
would serve as the remote terminal system could reconfigure and regain become single-point failures. Before
units along the data buses to capability after certain failures. NASA describing how NASA dealt with
effectively interface all the various accomplished such reconfiguration by potential catastrophic failures, it is
vehicle subsystems to the data bus software reassignment of data buses to necessary to first describe how the
network. Each central processor different general purpose computers. redundant digital data processing
unit/input/output processor pair was subsystem was designed to function.
The ability to reconfigure the system
called a general purpose computer.
and regain lost capability was a novel
Establishing Synchronicity
The multiplexer/demultiplexer was an approach to redundancy management.
extraordinarily complex device that Examination of a typical mission profile The fundamental premise for the
provided electronic interfaces for the illustrates why NASA placed a premium redundant digital data processing
myriad types of sensors and effectors on providing reconfiguration capability. subsystem operation was that all four
associated with every system on the Ascent and re-entry into Earths general purpose computers were
vehicle. The multiplex interface atmosphere represented the mission executing identical software in a
adaptors were placed internal to the phases that required automatic failure time-synchronized fashion such that
input/output processors and the detection and isolation capabilities, all received the exact same data,
multiplexer/demultiplexers to provide while the majority of on-orbit operations executed the same computations, got
actual electrical connectivity to the data did not require full redundancy when the same results, and then sent the exact
buses. Multiplex interface adaptors there was time to thoroughly assess the same time-synchronized commands
were supplied to each manufacturer of implications of any failures that and/or data to other subsystems.
all other specialized devices that occurred prior to re-entry. When a
Maintenance of synchronicity
interfaced with the serial data buses. computer and a critical sensor on
between general purpose computers
The protocol for communication on another string failed, the failed computer
was one of the truly unique features
those buses was also uniquely defined. string could be reassigned via software
of the newly developed Flight
control to a healthy computer, thereby
The central processor units later Computer Operating System. All four
providing a fully functional operational
became a unique design for two general purpose computers ran in a
configuration for re-entry.
reasons: within the first several months synchronized fashion that was keyed

246 Engineering Innovations

Shuttle Single Event Upset Environment
Five general purpose computersthe heart of the Orbiters guidance, navigation, and flight control systemwere upgraded in 1991.
The iron core memory was replaced with modern static random access memory transistors, providing more memory and better
performance. However, the static random access memory computer chips were susceptible to single event upsets: memory bit flips
caused by high-energy nuclear particles. These single event upsets could be catastrophic to the Orbiter because general purpose
computers were critical to flights since one bit flip could disable the computer.

An error detection and correction code was implemented to fix flipped bits in a computer word by correcting any single erroneous bit.
Whenever the system experienced a memory bit flip fix, the information was downlinked to flight controllers on the ground in Houston,
Texas. The event time and the Orbiters ground track resulted in the pattern of bit flips around the Earth.

The bit flips correlated with the known space radiation environment. This phenomena had significant consequences for error detection
and correction codes, which could only correct one error in a word and would be foiled by a multi-bit error. In response, system architects
selected bits for each word from different chips, making it almost impossible for a single particle to upset more than one bit per word.

In all, the upgraded Orbiter general purpose computers performed flawlessly in spite of their susceptibility to ionizing radiation.

Earths Magnetic Equator

Single event upsets are indicated by yellow squares. Multi-bit single event upsets are indicated by red triangles.
In these single events, anywhere from two to eight bits were typically upset by a single charged particle.

to the timing of the intervals when That sequence (input/process/output) The four general purpose computers
general purpose computers were to repeated 25 times per second. The exchanged synchronization status
query the bus terminal units for data, aerodynamic characteristics of the approximately 350 times per second.
then process that data to select the best shuttle dictated the 25-hertz (Hz) rate. The typical failure resulted in the
data from redundant sensors, create In other words, the digital autopilot computer halting anything resembling
commands, displays, etc., and finally had to generate stability augmentation normal operation.
output those command and status data commands at that frequency for the
to designated bus terminal units. vehicle to retain stable flight control.

Engineering Innovations 247

majority-voting scheme. A nuance
associated with the practical meaning
of simultaneous warranted
significant attention from the
designers. It was quite possible for
internal circuitry in complex
electronics units to fail in a manner
that wasnt immediately apparent
because the circuitry wasnt used
in all operations. This failure could
remain dormant for seconds, minutes,
or even longer before normal
activities created conditions requiring
use of the failed devices; however,
should another unrelated failure occur
that created the need for use of the
previously failed circuitry, the
practical effect was equivalent to
two simultaneous failures.
To decrease the probability of such
pseudo-simultaneous failures, the
general purpose computers and
multiplexer/demultiplexers were
designed to constantly execute cyclic
A fish-eye view of the multifunction electronic display subsystemor glass cockpitin the background self-test operations and
fixed-base Space Shuttle mission simulator at Johnson Space Center, Texas.
cease operations if internal problems
Early Detection of Failure time in a quiescent flight control profile were detected.
such that those sensors were operating
NASA designed the four general Ferreting Out Potential
very near their null points. Prior to
purpose computer redundant set to Single-point Failures
re-entry, the vehicle executed some
gracefully degrade from either four
designed maneuvers to purposefully Engineering teams conducted design
to three or from three to two
exercise those devices in a manner to audits using a technique known as
members. Engineers tailored specific
ensure the absence of permanent null failure modes effects analysis to identify
redundancy management algorithms
failures. The respective design teams types of failures with the potential to
for dealing with failures in other
for the various subsystems were always propagate beyond the bounds of the
redundant subsystems based on
challenged to strike a balance between fault-containment region in which they
knowledge of each subsystems
early detection of failures vs. nuisance originated. These studies led to the
predominant failure modes and the
false alarms, which could cause the conclusion that the digital data
overall effect on vehicle performance.
unnecessary loss of good devices. processing subsystem was susceptible
NASA paid considerable attention to to two types of hardware failures with
means of detecting subtle latent failure Decreasing Probability of the potential to create a catastrophic
modes that might create the potential Pseudo-simultaneous Failures condition, termed a nonuniversal
for a simultaneous scenario. Engineers input/output error. As the name
There was one caveat regarding the
scrutinized sensors such as gyros and implies, under such conditions a
capability to be two-fault tolerant
accelerometers in particular for null majority of general purpose computers
the system was incapable of coping
failures. During orbital operation, the may not have received the same data
with simultaneous failures since
vehicle typically spent the majority of and the redundant set may have
such failures obviously defeat the

248 Engineering Innovations

diverged into a two-on-two
configuration or simply collapsed
into four disparate members. Loss of Two General Purpose Computers
Engineers designed and tested the
topology, components, and data
Tested Resilience
encoding of the data bus network to
ensure that robust signal levels and
data integrity existed throughout the
network. Extensive laboratory testing
confirmed, however, that the two
types of failures would likely create
conditions resulting in eventual loss
of all four computers.
The first type of failure and the
easiest to mitigate was some type of
physical failure causing either an open
or a short circuit in a data bus. Such a
condition would create an impedance
mismatch along the bus and produce Space Shuttle Columbia (STS-9) makes a successful landing at Dryden Flight Research
classic transmission line effects; Center on Edwards Air Force Base runway, California, after reaching a fail-safe condition
e.g., signal reflections and standing while on orbit.
waves with the end result being
Shuttle avionics never encountered any type (hardware or software) of single-point
unpredictable signal levels at the
receivers of any given general purpose failure in nearly 3 decades of operation, and on only one occasion did it reach
computer. The probability of such a the fail-safe condition. That situation occurred on STS-9 (1983) and demonstrated the
failure was deemed to be extremely resiliency afforded by reconfiguration.
remote given the robust mechanical and
electrical design as well as detailed While on-orbit, two general purpose computers failed within several minutes of each
testing of the hardware, before and after other in what was later determined to be a highly improbable, coincidental occurrence
installation on the Orbiter. of a latent generic hardware fault. By definition, the avionics was in a fail-safe
The second type of problem was not condition and preparations were begun in preparation for re-entry into Earths
so easily discounted. That problem atmosphere. Upon cycling power, one of the general purpose computers remained
could occur if one of the bus failed while the other resumed normal operation. Still, with that machine being suspect,
terminal units failed, thus generating NASA made the decision to continue preparation for the earliest possible return.
unrequested output transmissions.
As part of the preparation, sensors such as the critical inertial measurement unit,
Such transmissions, while originating
which were originally assigned to the failed computer, were reassigned to a healthy one.
from only one node in the network,
would nevertheless propagate to each Thus, re-entry occurred with a three-computer configuration and a full set of inertial
general purpose computer and disrupt measurement units, which represented a much more robust and safe configuration.
the normal data bus signal levels
and timing as seen by each general The loss of two general purpose computers over such a short period was later attributed
purpose computer. It should be to spacelight effects on microscopic debris inside certain electronic components. Since
mentioned that no amount of analysis all general purpose computers in the inventory contained such components, NASA
or testing could eliminate the delayed subsequent flights until sufficient numbers of those computers could be purged
possibility of a latent, generic software of the suspect components.
error that could conceivably cause all

Engineering Innovations 249

four computers to fail. Thus, Development of hundred times the force of gravity over
the program deemed that a backup almost 8 hours of an engines total
computer, with software designed Space Shuttle planned operational exposure. For these
and developed by an independent Main Engine reasons, the endurance requirements of
organization, was warranted as a the instrumentation constituent materials
safeguard against that possibility. Instrumentation were unprecedented.
This backup computer was an identical The Space Shuttle Main Engine Engine considerations such as
general purpose computer designed to operated at speeds and temperatures weight, concern for leakage that
listen to the flight data being unprecedented in the history of might be caused by mounting bosses,
collected by the primary system and spaceflight. How would NASA and overall system fault tolerance
make independent calculations that measure the engines performance? prompted the need for greater
were available for crew monitoring. redundancy for each transducer.
Only the on-board crew had the NASA faced a major challenge in the Existing supplier designs, where
switches, which transferred control of development of instrumentation for available, were single-output
all data buses to that computer, thereby the main engine, which required a new devices that provided no redundancy.
preventing any rogue primary generation capable of measuring A possible solution was to package
computers from interfering with the and survivingits extreme operating two or more sensors within a single
backup computer. pressures and temperatures. NASA transducer. But this approach required
not only met this challenge, the space special adaptation to achieve the
Its presence notwithstanding, the agency led the development of such desired small footprint and weight.
backup computer was never considered instrumentation while overcoming
a factor in the fail-operational/fail-safe numerous technical hurdles. NASA considered the option of
analyses of the primary avionics strategically placing instrumentation
system, andat the time of this devices and closely coupling them to the
publicationhad never been used in Initial Obstacles desired stimuli source. This approach
that capacity during a mission. prompted an appreciation of the inherent
The original main engine
simplicity and reliability afforded by
instrumentation concept called for
low-level output devices. The
Summary compact flange-mounted transducers
avoidance of active electronics tended
with internal redundancy, high
The shuttle avionics system, which to minimize electrical, electronic, and
stability, and a long, maintenance-
was conceived during the dawn electromechanical part vulnerability to
free life. Challenges presented
of the digital revolution, consistently hostile environments. Direct mounting
themselves immediately, however.
provided an exceptional level of of transducers also minimized the
Few instrumentation suppliers were
dependability and flexibility without amount of intermediate hardware
interested in the limited market
any modifications to either the basic capable of producing a catastrophic
projected for the shuttle. Moreover,
architecture or the original innovative system failure response. Direct
early engine testing disclosed that
design concepts. While engineers mounting, however, came at a price. In
standard designs were generally
replaced specific electronic boxes some situations, it was not possible to
incapable of surviving the harsh
due to electronic component design transducers capable of surviving
environments. Although the hot side
obsolescence or to provide improved the severe environments, making it
temperatures were within the realm of
functionality, they took great care necessary to off-mount the device.
jet engines, no sort of instrumentation
to ensure that such replacements Pressure measurements associated with
existed that could handle both high
did not compromise the proven the combustion process suffered from
temperatures and cryogenic
reliability and resiliency provided by icing or blockage issues when hardware
environments down to minus -253C
the original design. temperatures dropped below freezing.
(-423F). Vibration environments with
Purging schemes to provide positive
high-frequency spectrums extending
flow in pressure tubing were necessary
beyond commercially testable ranges
to alleviate this condition.
of 2,000 hertz (Hz) experienced several

250 Engineering Innovations

Several original system mandates were resistance temperature devices being Weakness Detection
later shown to be ill advised, such as an baselined for all thermal measurements. and Solutions
early attempt to achieve some measure
Aggressive engine performance and In some instances, the engine
of standardization through the use of
weight considerations also compromised environment revealed weaknesses
bayonet-type electrical connectors.
the optimal sensor mountings. For not normally experienced in industrial
Early engine-level and laboratory testing
example, it was not practical to include or aerospace applications. Some
revealed the need for threaded
the prescribed straight section of tubing hardware successfully passed
connectors since the instrumentation
upstream from measuring devices, component-level testing only to
components could not be adequately
particularly for flow. This resulted in experience problems at subsystem or
shock-isolated to prevent failures
the improper loading of measuring engine-level testing. Applied vibration
induced by excessive relative connector
devices, primarily within the propellant spectrums mimicked test equipment
motion. Similarly, electromagnetic
oxygen ducting. The catastrophic limitations where frequency ranges
interference assessments and observed
failure risks finally prompted the typically did not extend beyond
deficiencies resulted in a reconsideration
removal or relocation of all intrusive 2,000 Hz. The actual engine
of the need for cable overbraiding to
measuring devices downstream of the recognized no limits and continued to
minimize measurement disruption.
high-pressure oxygen turbopump. expose the hardware to energy above
Problems also extended to the sensing Finally, the deficiencies of vibration even 20,000 Hz. Therefore, a critical
elements themselves. The lessons of redline systems were overcome as sensor resonance condition might
material incompatibilities or deficiencies processing hardware and algorithms only be excited during engine-level
were evident in the area of resistance matured to the point where a real-time testing. Similarly, segmenting of
temperature devices and thermocouples. synchronous vibration redline system component testing into separate
The need for the stability of temperature could be adopted, providing a vibration, thermal, and fluid testing
measurements led to platinum-element significant increase in engine reliability. deprived the instrumentation of
experiencing the more-severe effect
of combined exposures.

Wire Failures The shuttles reusability revealed

failure modes not normally
Prompted encountered, such as those ascribed
to the differences between flight
System Redesign and ground test environments.
It was subsequently found that the
High temperature measurements
microgravity exposure of each flight
continued to suffer brittle allowed conductive particles within
fine-element wire failures until the instruments to migrate in a manner
Pratt & Whitney Rocketdyne. All rights reserved.

condition was linked to operation not experienced with units confined

above the material recrystallization to terrestrial applications. Main engine
temperature of 525C (977F) pressure transducers experienced
where excessive grain growth electrical shorts only during actual
would result. The STS-51F (1985) engine performance. During the
in-flight engine shutdown caused
countdown of Space Transportation
System (STS)-53 (1992), a
by the failure of multiple resistance
high-pressure oxidizer turbopump
temperature devices mandated a
secondary seal measurement output
redesign to a thermocouple-based pressure transducer data spike almost
High temperatures in some engine operating
system that eliminated the wire environments caused fine wires used in temperature triggered an on-pad abort. Engineers
embrittlement problem. devices to become brittle, thereby leading to failures. used pressure transducers screened

Engineering Innovations 251

by particle impact noise detection Expectations Exceeded Unprecedented
and microfocus x-ray examination
on an interim basis until a hardware As the original main engine design Rocket Engine
life of 10 years was surpassed, part
redesign could be qualified.
obsolescence and aging became a
Fault-Sensing System
concern. Later designs used more
Effects of Cryogenic current parts such as industry-standard The Space Shuttle Main Engine
Exposure on Instrumentation electrical connectors. Some suppliers (SSME) was a complex system that
chose to invest in technology driven used liquid hydrogen and liquid oxygen
Cryogenic environments revealed a as its fuel and oxidizer, respectively.
by the shuttle, which helped to ease
host of related material deficiencies. The engine operated at extreme levels
the programs need for long-term
Encapsulating materialsnecessary of temperature, pressure, and turbine
part availability.
to provide structural support for fine speed. At these levels, slight material
wires within speed sensorslacked The continuing main engine ground defects could lead to high vibration in
resiliency at extreme low temperatures. test program offered the ability to the turbomachinery. Because of the
The adverse effects of inadvertent use ongoing hot-fire testing to ensure potential consequences of such
exposure to liquefied gases within the that all flight hardware was conditions, NASA developed vibration
shuttles aft compartment produced sufficiently enveloped by older monitoring as a means of monitoring
functional failures due to excessively ground test units. Tracking algorithms engine health.
cold conditions. In April 1991, STS-37 and extensive databases permitted
was scrubbed when the high-pressure such comparisons. The main engine used both low- and
oxidizer turbopump secondary seal high-pressure turbopumps for fuel and
Industry standards called for periodic oxidizer propellants. Low-pressure
pressure measurement became erratic
recalibration of measuring devices. turbopumps served as propellant boost
due to the damaging effects of
NASA excluded this from the Space pumps for the high-pressure
cryogenic exposure of a circuit board.
Shuttle Main Engine Program at turbopumps, which in turn delivered
Problems with cryogenics also its inception to reduce maintenance fuel and oxidizer at high pressures to
extended to the externals of the for hardware not projected for use the engine main combustion chamber.
instrumentation. Cryopumping beyond 10 years. In practice, the
the condensation-driven pumping hardware life was extended to the The high-pressure pumps rotated at
mechanism of inert gases such as point that some engine components speeds reaching 36,000 rpm on the fuel
nitrogenseverely compromised the approached 40 years of use before side and 24,000 rpm on the oxidizer
ability of electrical connectors to the final shuttle flight. Aging studies side. At these speeds, minor faults were
maintain continuity. The normally validated the stable nature of exacerbated and could rapidly propagate
inert conditions maintained within the instruments never intended to fly so to catastrophic engine failure.
engine system masked a problem with long without recalibration. During the main engines 30-year
residual contamination of glassed ground test program, more than 40
resistive temperature devices used for major engine test failures occurred.
cryogenic propellant measurements.
High-pressure turbopumps were the
Corrosive flux left over from the While initial engine testing disclosed source of a large percentage of these
manufacturing process remained that instrumentation was a weak failures. Posttest analysis revealed that
dormant for years until activated during link, NASA implemented innovative the vibration spectral data contained
extended exposures to the humid and successful solutions that resulted potential failure indicators in the form
conditions at the launch site. STS-50 in a suite of proven instruments of discrete rotordynamic spectral
(1992) narrowly avoided a launch delay capable of direct application on future signatures. These signatures were prime
when a resistive temperature device rocket engines. indicators of turbomachinery health and
had to be replaced just days before the could potentially be used to mitigate
scheduled launch date.

252 Engineering Innovations

catastrophic engine failures if assessed
at high speeds and in real time.
NASA recognized the need for a
high-speed digital engine health
management system. In 1996, engineers
at Marshall Space Flight Center
(MSFC) developed the Real Time
Vibration Monitoring System and
integrated the system into the main
engine ground test program. The
system used data from engine-mounted
accelerometers to monitor pertinent
spectral signatures. Spectral data were
produced and assessed every 50
milliseconds to determine whether
specific vibration amplitude thresholds
were being violated.
NASA also needed to develop software
capable of discerning a failed sensor
from an actual hardware failure.
MSFC engineers developed the sensor
validation algorithma software
algorithm that used a series of rules and

Pratt & Whitney Rocketdyne. All rights reserved.

threshold gates based on actual
vibration spectral signature content to
evaluate the quality of sensor data
every 50 milliseconds.
Outfitted with the sensor validation
algorithm and additional software, the
Real Time Vibration Monitoring NASAs Advanced Health Monitoring System software was integrated with the Space
System could detect and diagnose Shuttle Main Engine controller (shown by itself and mounted on the engine) in 2007.
pertinent indicators of imminent main
engine turbomachinery failure and
initiate a shutdown command within system, supported the main engine This led to the concept of the SSME
100 milliseconds. ground test program throughout the Advanced Health Management
shuttle era. System as a means of extending
The Real Time Vibration Monitoring
this protection to the main engine
System operated successfully on more To prove that a vibration-based,
during ascent.
than 550 main engine ground tests high-speed engine health management
with no false assessments and a 100% system could be used for flight The robust software algorithms and
success rate on determining and operations, NASA included a redline logic developed and tested for
disqualifying failed sensors from its subscale version of the Real Time the Real Time Vibration Monitoring
vibration redlines. This, the first Vibration Monitoring System on System were directly applied to the
high-speed vibration redline system Technology Flight Experiment 2, Advanced Health Management System
developed for a liquid engine rocket which flew on STS-96 (1999). and incorporated into a redesigned

Engineering Innovations 253

version of the engine controller. Calibration of transoceanic abort landing sites
The Advanced Health Management intended for emergencies when the
Systems embedded algorithms Navigational Aides shuttle lost a main engine during ascent
continuously monitored the and could not return to KSCwere
high-pressure turbopump vibrations
Using Global located in Zaragoza and Moron in
generated by rotation of the pump Positioning Computers Spain and in Istres in France. Former
shafts and assessed rotordynamic transoceanic abort landing sites
performance every 50 milliseconds. The crew members awakened at included: Dakar, Senegal; Ben Guerir,
The system was programmed to initiate 5:00 a.m. After 10 days in orbit, they Morocco; Banjul, The Gambia;
a shutdown command in fewer than were ready to return to Earth. By Honolulu, Hawaii; and Anderson Air
120 milliseconds if vibration patterns 7:45 a.m., the payload bay doors were Force Base, Guam. NASA certified
indicated an instability that could lead closed and they were struggling into each site.
to catastrophic failure. their flight suits to prepare for descent.
The system also used the sensor- The commander called for a weather Error Sources
validation algorithm to monitor sensor report and advice on runway selection.
The shuttle could be directed to any one Because the ground portion of the
quality and could disqualify a failed
of three landing strips depending on Microwave Scanning Beam Landing
sensor from its redline suite or
weather at the primary landing site. and Tactical Air Navigation Systems
deactivate the redline altogether.
Regardless of the runway chosen, the contained moving mechanical
Throughout the shuttle era, no other
descent was controlled by systems components and depended on
liquid engine rocket system in the
capable of automatically landing the microwave propagation, inaccuracies
world employed a vibration-based
Orbiter. The Orbiter commander took could develop over time that might
health management system that used
cues from these landing systems, prove detrimental to a shuttle landing.
discrete spectral components to verify
controlled the descent, and dropped the For example, antennas could drift out of
safe operation.
landing gear to safely land the Orbiter. mechanical adjustment. Ground settling
During their approach to the landing and external environmental factors
Summary site, the Orbiter crew depended on a could also affect the systems accuracy.
complex array of technologies, Multipath and refraction errors could
The Advanced Health Management
including a Tactical Air Navigation result from reflections off nearby
System, developed and certified by
System and the Microwave Scanning structures, terrain changes, and
Pratt & Whitney Rocketdyne (Canoga
Beam Landing System, to provide day-to-day atmospheric variations.
Park, California) under contract to
precision navigation. These systems Flight inspection data gathered by the
NASA, flew on numerous shuttle
were located at each designated landing NASA calibration team could be used
missions and continued to be active on
site and had to be precisely calibrated to determine the source of these errors.
all engines throughout the remainder
to ensure a safe and smooth landing. Flight inspection involved flying an
of the shuttle flights.
aircraft through the landing system
Touchdown Sites coverage area and receiving
time-tagged data from the systems
Shuttle runways were strategically under test. Those data were compared
located around the globe to serve to an accurate aircraft positioning
several purposes. After a routine reference to determine error. Restoring
mission, the landing sites included integrity was easily achieved through
Kennedy Space Center (KSC) in system adjustment.
Florida, Dryden Flight Research
Center in California, and White Sands
Test Facility in New Mexico. The

254 Engineering Innovations

Global Positioning Satellite Summary
Position Reference NASA developed unique
for Flight Inspection instrumentation and software supporting
Technologies were upgraded several the shuttle navigation aids flight
times since first using the Global inspection mission. The agency
Positioning Satellite (GPS)-enabled developed aircraft pallets to operate,
flight inspection system. The flight control, process, display, and archive
inspection system used an aircraft data from several avionics receivers.
GPS receiver as a position reference. They acquired and synchronized
Differences between the system under measurements from shuttle-unique
test and the position reference were avionics and aircraft platform
recorded, processed, and displayed in avionics with precision time-tagged
real time on board the aircraft. An GPS position. NASA developed data
aircraft position reference used for processing platforms and software
flight inspection had to be several times algorithms to graphically display
more accurate than the system under and trend landing system performance
test. Stand-alone commercial GPS in real time. In addition, a graphical
systems did not have enough accuracy pilots display provided the aircraft
for this purpose. Several techniques pilot with runway situational
could be used to improve GPS awareness and visual direction cues.
positioning. Differential GPS used a The pilots display software, integrated
ground GPS receiver installed over a with the GPS reference system,
known surveyed benchmark. Common resulted in a significant reduction in
mode error corrections to the GPS mission flight time.
position were calculated and broadcast
over a radio data link to the aircraft. Synergy With the Federal
After the received corrections were Aviation Administration
applied, the on-board GPS position In early 2000, NASA and the Federal
accuracy was within 3 m (10 ft). Aviation Administration (FAA) entered
A real-time accuracy within 10 cm (4 into a partnership for flight inspection.
in.) was achieved by using a The FAA had existing aircraft assets to
carrier-phase technique and tracking perform its mission to flight-inspect
cycles of the L-band GPS carrier signal. US civilian and military navigation
NASA built several versions of the aids. The FAA integrated NASAs
flight inspection system customized carrier-phase GPS reference along with
to different aircraft platforms. Different shuttle-unique avionics and software
NASA aircraft were used based on algorithms into its existing control
aircraft availability. These aircraft and display computers on several
include NASAs T-39 jet (Learjet), a flight-inspection aircraft.
NASA P-3 turboprop, several C-130 The NASA/FAA partnership produced
aircraft, and even NASAs KC-135. increased efficiency, increased
Each aircraft was modified with shuttle capability, and reduced cost to the
landing system receivers and antennas. government for flight inspection of the
Several pallets of equipment were shuttle landing aids.
configured and tested to reduce the
installation time on aircraft to one shift.

Engineering Innovations 255