The Experienced Accident Investigator Trap
A Trap for Experienced Accident Investigators
The Accident
On February 24, 1989, United Airlines flight 811 experienced an explosive decompression while climbing out of Honolulu, Hawaii. Nine people were swept out of the resulting hole and died. The 747 made a successful, and quite miraculous, landing back in Hawaii.
The Initial Investigation by the NTSB
The National Transportation Safety Board (NTSB) was responsible for the investigation of this accident. The NTSB’s final report Executive Summary said:
“A year after the accident, the Safety Board was uncertain that the cargo door would be located and recovered from the Pacific Ocean. The Safety Board decided to proceed with a final report based on the available evidence without the benefit of an actual examination of the door mechanism. The original report was adopted by the Safety Board on April 16, 1990, as NTSB Report AAR-90-01.”
It also said:
“Before the recovery of the cargo door, the Safety Board believed that the door locking mechanisms had sustained damage in service prior to the accident flight to the extent that the door could have been closed and appeared to have been locked, when in fact the door was not fully latched. This belief was expressed in the report and was supported by the evidence available at the time.”
From the initial NTSB report, the experienced NTSB investigators concluded:
“There are no reasonable means by which the door locking and latching mechanisms could open mechanically in flight from a properly closed and locked position. If the lock sectors were in proper condition, and were properly situated over the closed latch cams, the lock sectors had sufficient strength to prevent the cams from vibrating to the open position during ground operation and flight. However, there are two possible means by which the cargo door could open while in flight. Either, the latching mechanisms were forced open electrically through the lock sectors after the door was secured, or the door was not properly latched and locked before departure. Then the door opened when the pressurization loads reached a point that the latches could not hold.”
Thus, the NTSB was implying that a ramp service person failed to properly close the door and verify that it was properly closed prior to takeoff. It was also implying that the door latching mechanisms were damaged, probably by prior manual actuation. Finally, it was implying that the second officer and the dispatch mechanic failed to detect the damage to the door latches.
Thus, in effect, the ramp service person was being blamed for the accident.
Imagine being the ramp service person and having experienced investigators conclude that you did something wrong and publish it for the world to see, when you know you performed the work correctly. You would be blamed for the death of nine people.
The Door Is Recovered
The NTSB final report Executive Summary said:
“Subsequently, on July 22, 1990, a search and recovery operation was begun by the U.S. Navy with the cost shared by the Safety Board, the Federal Aviation Administration, Boeing Aircraft Company, and United Airlines. The search and recovery effort was supported by Navy radar data on the separated cargo door, underwater sonar equipment, and a manned submersible vehicle. The effort was successful, and the cargo door was recovered in two pieces from the ocean floor at a depth of 14,200 feet on September 26 and October 1, 1990.”
That’s great news. Now the NTSB could verify their initial conclusions! Unfortunately, the conclusion from their initial report that “There are no reasonable means by which the door locking and latching mechanisms could open mechanically in flight from a properly closed and locked position” was wrong. The ramp service person was telling the truth: he had properly closed the door.
The experienced investigators were wrong. The NTSB confirmed this in the Executive Summary of their final report when they wrote:
“However, upon examination of the door, the damage to the locking mechanism did not support this hypothesis. Rather, the evidence indicated that the latch cams had been back-driven from the closed position into a nearly open position after the door had been closed and locked. The latch cams had been driven into the lock sectors that deformed so that they failed to prevent the back driving.”
The Experienced Investigator Trap
How could experienced investigators with considerable oversight (the Presidentially appointed Board of the NTSB) make such a mistake? Easy. It is a common trap. Investigators look for evidence that supports their conclusions (hypothesis) and fail to look for “counterfactuals.” What is a counterfactual? Potential evidence that disproves your theory. Evidence that is counter to what you believe. Evidence that explains an alternative sequence of events that the investigator thinks is “unlikely” or “not reasonable.”
In this case, with an electric interlock and a piece of metal preventing the doors from opening in flight, the reasonable answer was that the door must not have been closed correctly. A simple human error in hurrying to get the flight out on time caused the death of nine people. This seemed entirely likely. Other possibilities were not reasonable.
But when the evidence was recovered, an answer that had seemed unreasonable turned out to be the real cause. Experienced investigators (and their supervisors) had been trapped by their own logic.
Confirmation Bias
This type of mistake by experienced (and inexperienced) investigators has many names. In the scientific community, it is often called “confirmation bias.” It is one of the many types of bias that can influence an investigation.
This type of bias is the fatal flaw of using the “scientific method” to investigate accidents.
The Executive Summary of the NTSB’s report even mentioned having a hypothesis.
The counter-evidence IS reported in the initial investigation:
“After the accident, UAL ramp service personnel, who had been involved with the cargo loading and unloading of flight 811 before takeoff from HNL, stated that they had opened and closed the forward cargo door electrically. They said that they had observed no damage to the cargo door. The ramp service personnel said that they had verified that the forward cargo door was flush with the fuselage of the airplane, that the master door latch handle was stowed, and that the pressure relief doors were flush with the exterior skin of the cargo door.
The dispatch mechanic stated that, in accordance with UAL procedures, he had performed a “circle check” prior to the airplane’s departure from the HNL gate. This check included verification that the cargo doors were flush with the fuselage of the airplane, that the master latch lock handles were stowed, and that the pressure relief doors were flush or within 1/2 inch of the cargo door’s exterior skin. He said a flashlight was used during this inspection.
The second officer stated that, in accordance with UAL Standard Operating Procedures (SOP), he had performed an operational check of the door warning annunciator lights as part of his portion of the cockpit preparation. The second officer also stated that he used a flashlight while performing an exterior inspection, again in accordance with UAL procedures. The exterior inspection was conducted while ramp service personnel were performing cargo loading operations and the cargo doors were open. He stated that he had observed no abnormalities or damage.”
The experienced investigators must have concluded that these individuals were, somehow, wrong. Why? Because any other way this accident could have happened was not reasonable. And the investigators had seen damage to other locking mechanisms. They had seen other ramp service personnel make mistakes. That “confirmed” their bias toward their hypothesis.
Can You Avoid Confirmation Bias?
Some people who believe that the Scientific Method is the ONLY way to perform accident investigations believe that you can think confirmation bias away. How? By applying a method called “consider the opposite.” I’ve reviewed this theory and even tried it, and I think it is highly unlikely that it will work during an accident investigation. (Is that due to my confirmation bias?)
But even if “consider the opposite” might work occasionally, would it be effective enough to ensure accident investigators arrive at the right answers? Therein lies the problem. Occasionally getting the right answer by considering the opposite still leaves many wrong answers that can lead an investigation astray.
Thus, we need a better way to counter confirmation bias during an accident investigation.
The No Hypothesis Method
TapRooT® teaches investigators to start the investigation by gathering and organizing evidence. This is done using a tool called a SnapCharT®.
Items that can’t be proven are put in dashed boxes or ovals. We call these assumptions. They are explicitly called out on the SnapCharT® and investigators know that they need more evidence to confirm or refute these assumptions.
The important part of the statement above is the word refute. In the NTSB investigation, they had evidence (the testimony of the ramp service person, the dispatch mechanic, and the second officer) that the door was not damaged and that the door was properly shut prior to takeoff.
What evidence did they have to refute these statements?
- That prior flights on that aircraft had trouble latching the door (so it must have been damaged)
- That damage to latches had happened on other 747s
- That testing on doors had shown that their hypothesis could happen
- That testing had not shown any other reasonable way that this could have happened
Thus, to the investigators, it seemed highly likely that they were right and that the ramp service person, the dispatch mechanic, and the second officer were wrong.
On a SnapCharT®, the evidence could be displayed as follows:
This clearly shows that the investigators don’t have proof for either “hypothesis” and that more evidence needs to be gathered to determine whether the latches were damaged and therefore weren’t properly closed, or whether there is some other failure mechanism that isn’t well understood. Until the dashed ovals can be proven, you can’t say that you know what happened.
Also, the SnapCharT® clearly shows that there are three eyewitnesses that say that the door latches were not damaged and that the door was properly latched prior to takeoff. That IS counter-factual information to the NTSB’s assumption that the latch must have been damaged and improperly latched.
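The bookkeeping behind this idea can be sketched in a few lines of Python. This is only an illustration, not part of the TapRooT® software or the NTSB's method: the names `Status`, `Item`, `Explanation`, and `open_questions` are hypothetical, and the item texts paraphrase the evidence described above. The sketch assumes that an explanation counts as proven only when nothing it relies on is still an assumption and no recorded evidence conflicts with it.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    FACT = "fact"              # backed by evidence (a solid box on the chart)
    ASSUMPTION = "assumption"  # not yet proven (a dashed box or oval)


@dataclass
class Item:
    text: str
    status: Status


@dataclass
class Explanation:
    name: str
    supports: list[Item] = field(default_factory=list)   # items it relies on
    conflicts: list[Item] = field(default_factory=list)  # evidence against it

    def open_questions(self) -> list[str]:
        """List what must be resolved before this explanation is accepted."""
        questions = [f"prove or refute: {i.text}"
                     for i in self.supports if i.status is Status.ASSUMPTION]
        questions += [f"explain conflicting evidence: {i.text}"
                      for i in self.conflicts if i.status is Status.FACT]
        return questions

    def is_proven(self) -> bool:
        # Proven only if it rests on something and nothing is left open.
        return bool(self.supports) and not self.open_questions()


# Hypothetical items paraphrasing the evidence discussed in the text.
latch_damaged = Item("Lock sectors were damaged before the flight", Status.ASSUMPTION)
not_latched = Item("Door was not fully latched at departure", Status.ASSUMPTION)
unknown_mode = Item("Another failure mechanism opened a latched door", Status.ASSUMPTION)
ramp = Item("Ramp service personnel stated the door was closed and flush", Status.FACT)
mechanic = Item("Dispatch mechanic stated his circle check found the door flush", Status.FACT)
officer = Item("Second officer stated he observed no abnormalities or damage", Status.FACT)

improper = Explanation("Door not properly latched",
                       supports=[latch_damaged, not_latched],
                       conflicts=[ramp, mechanic, officer])
other = Explanation("Other failure mechanism",
                    supports=[unknown_mode, ramp, mechanic, officer])

for explanation in (improper, other):
    print(explanation.name, "-> proven:", explanation.is_proven())
    for question in explanation.open_questions():
        print("   ", question)
```

In this sketch neither explanation is proven. The point of the design is that the three witness statements cannot be silently dropped: they remain listed as conflicting evidence against the "not properly latched" explanation until someone accounts for them.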
Bad Corrective Actions
Finally, investigators should realize that if you base your investigation on hypotheses that aren’t proven facts, your corrective actions are based on imagined problems. There is a high likelihood that your corrective actions will not fix the real problems that you failed to discover.
For example, here are the applicable recommendations that were made from the initial (wrong) NTSB investigation:
- Issue an Airworthiness Directive (AD) to require that the manual drive units and electrical actuators for Boeing 747 cargo doors have torque limiting devices to ensure that the lock sectors, modified per AD-88-12-04, cannot be overridden during mechanical or electrical operation of the latch cams.
- Issue an Airworthiness Directive (AD) for non-plug cargo doors on all transport category airplanes requiring the installation of positive indicators to ground personnel and flightcrews confirming the actual position of both the latch cams and locks, independently.
- Require that fail-safe design considerations for non-plug cargo doors on present and future transport category airplanes account for conceivable human errors in addition to electrical and mechanical malfunctions. (Class II, Priority Action) (A-89-94)
After finding the real cause of the accident, the following recommendation was added:
- Require that the electrical actuating systems for non-plug cargo doors on transport-category aircraft provide for the removal of all electrical power from circuits on the door after closure (except for any indicating circuit power necessary to provide positive indication that the door is properly latched and locked) to eliminate the possibility of uncommanded actuator movements caused by wiring short circuits.
Note that the previous emphasis on human error and improperly closed doors due to wear and the manual operation had nothing to do with this particular failure and would NOT have prevented a failure similar to the one that caused this accident. Thus, the corrective actions based on an investigation influenced by confirmation bias would NOT have prevented future similar accidents.
What Should You Do?
The lesson learned here is simple. To avoid confirmation bias you should use TapRooT® Root Cause Analysis and display assumptions on your SnapCharT® to make them obvious. When confronted with an assumption, you need to consider ALL evidence and look for counter-factual evidence that might provide ideas about additional information that needs to be collected.
If you haven’t taken TapRooT® Training, register for a public TapRooT® Course today.
In this investigation, it was reasonable for the NTSB to issue their initial recommendations, but they also should have put more effort into looking for additional potential failure modes that might have caused the accident.
Note that the NTSB should be praised for helping to fund (along with Boeing, the FAA, and United Airlines) the continued investigation, including the U.S. Navy’s location and recovery of the door from the bottom of the ocean, which allowed the actual means of failure to be determined. Sometimes it does take years and lots of money to correctly complete a root cause analysis of a serious accident.