Memo from NASA MOD (Mission Operations Division)
As most of you have already heard, we failed to make the runway at KSC
at the end of our STS-xx D/O Prep sim earlier this week. There were
several failures throughout the entire sim, including during
the glided entry phase. However, by the time we reached the HAC
[Heading Alignment Cone: the big 300-degree turn that the shuttle
makes to do energy management; the crew "plays" the HAC to account for
high or low energy in the phase of the glide just before lining up on
final], we had configured for landing and the systems failures, etc.
were essentially behind us.
We at MCC [Mission Control Center] had basically stopped talking to
the crew, except for the HAC energy calls - things were relatively
quiet, the way you like it on the HAC. At some point on the HAC (or
earlier?) the SMS [Shuttle Mission Simulator] suffered a real hardware
failure of the CDR [Mission Commander's] ADI [Attitude-Direction
Indicator: the "8-ball" instrument installed on both the left and
right sides of the instrument panel, each containing three error
needles that are essentially flight director command bars for pitch,
roll and yaw] pitch error needle - it failed static [the pitch-axis
needle froze with no error flag]. As a result, about 20 seconds after
going CSS [Control Stick Steering: below about Mach 1.3, the mission
commander takes over from the autopilot and flies the vehicle
manually, because the shuttle's autopilot has no real
redundancy...remember it was designed in 1972] the CDR began a
continuous, gradual pitch down. About 20 seconds after that, guidance
commanded a HAC shrink [playing the cone: if you're low on energy, you
fly a smaller cone with a smaller circumference] as a result of the
altitude error low. At some point during this time, the crew called
the MCC and said that they thought they had some kind of guidance or
nav problem.
The MCC confirmed good nav, good guidance, and good sensors. (The MCC
called that the airspeed was much too high and that they should check
the airspeed; the MCC energy call at the 90 was 5 knots low.)
Within 30 seconds from the time that the CDR started the pitch down,
the situation was very serious and essentially out of hand. The MCC
knew that we were getting much too low and that we needed to pull up.
The crew knew that there was a problem but had not identified exactly
what it was, and thus had not started to correct it (still pitching
down). When the CDR realized that his ADI needle must be failed, he
handed over control to the PLT [Pilot: second-in-command to the CDR].
The PLT managed to get theta and EAS [equivalent airspeed] under
control, but the vehicle stalled (Alpha 20 [angle of attack of 20
degrees], EAS 155) about one mile short of the runway while
the crew was trying to stretch it in.
What happened? We suffered a single, real-world hardware failure, and
we lost the vehicle (and crew?) in this sim. How is this possible,
with all of our tools on the ground and with the many instruments and
built-in crosschecks onboard?
The short answer is not an easy one to come to, but the consensus is
that we essentially had a breakdown in the cockpit - a cockpit
resource management problem. The crew feels like they were able to
determine that there was a problem, but that they did not identify the
problem as quickly as they could have, and thus their response and
corrective action were too little, too late. We also think that perhaps
the MCC could have been a bit quicker and more crisp in our
recognition of the problem and in our response. Additionally, we think
that for these kinds of scenarios, the MCC should be emphatic and
forceful with our calls to the crew in order to accurately reflect the
criticality of the situation.
This is a tough case, folks, but we need to be able to sustain a single
real-world hardware failure and make it to the runway. Indeed, in such
a case we depend on the crew to be prime for psyching-out instrument
failures onboard. Everyone believes that this crew and any assigned
trained crew can (and will be able to) determine when one of their
primary instruments has failed, and ultimately recover from any
adverse effects. We certainly depend on this for many cases where we
would not be able to react in time from the ground. In a lot of cases,
the crew is on the scene and the actions are super time critical. We
also depend on the MCC to sing out when we see things that we don't
understand or that look bad, whether it's energy or altitude on the
HAC, or some other problem.
I think the message here is for all of us to remember that most likely
in the real world we will not have the second or third IMU [inertial
measurement unit: gives the crew attitude data; there are three on
board] failure, or have to perform a single APU [Auxiliary Power Unit:
the hydrazine-powered turbine that powers the shuttle's hydraulics;
there are three hydraulic systems and three APUs on board] landing, or
suffer two main bus failures. Rather, we will probably see something
like this, something that perhaps we don't understand, or that we
haven't seen before in a sim. When we do, we must be ready to resort
to our discipline and training, and we must separate those things that
we know to be true from those things that we don't understand, and
then communicate that as accurately and as expeditiously as the
situation requires. Be alert, talk to each other, and be aggressive if
it becomes necessary.
Please take time to think about this case, and others like it. I think
we can learn from this, and that we should take some time to think
about other such scenarios that could "look and feel" like this one.
Additionally, we recorded the run on the GPO w/s [Guidance Procedures
Officer's workstation] in the MCC and I would like every GPO and GSO
[Guidance Systems Officer] to see this run at least once or twice on
our GPO displays. For those of you who have not seen it yet, please
make time to do so in the next week. I encourage you to take time to
do this, and to invite other folks to go with you.