Taking data to a higher plane
FAA tests new analytics software to navigate the finer points of aviation safety
Success has made further reductions in airline accidents and near misses a difficult task. The most common, and most potent, causes are known and have been addressed by technology or better procedures. The remaining causes are more subtle, and therefore harder to pinpoint.
So even if the rate of mishaps stays constant, the number of accidents will rise along with flight frequencies and passenger miles flown. And they are rising.
Those facts have prompted one Federal Aviation Administration office to examine data analysis tools for possible use by airlines in rooting out subtle conditions or chains of events that point to potential accidents.
“The focus is on data because we get so much of it,” said Christopher Hart, FAA assistant administrator for system safety. Modern flight data recorders generate enough data on each flight to fill a warehouse—a thousand or more parameters measured eight times per second.
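To put those figures in perspective, here is a rough back-of-the-envelope sketch. The parameter count and sampling rate come from the description above; the two-hour flight and the two bytes per sample are assumptions chosen only for illustration, not figures from FAA or any airline.

```python
# Rough estimate of flight data recorder volume for one flight.
PARAMETERS = 1000        # "a thousand or more parameters" (from the article)
SAMPLES_PER_SECOND = 8   # "eight times per second" (from the article)
FLIGHT_HOURS = 2         # assumption: a two-hour flight
BYTES_PER_SAMPLE = 2     # assumption: each reading stored in two bytes


def samples_per_flight(parameters, rate_hz, hours):
    """Return the total number of readings recorded during one flight."""
    seconds = int(hours * 3600)
    return parameters * rate_hz * seconds


total_samples = samples_per_flight(PARAMETERS, SAMPLES_PER_SECOND, FLIGHT_HOURS)
total_megabytes = total_samples * BYTES_PER_SAMPLE / 1_000_000

print(f"{total_samples:,} readings, roughly {total_megabytes:.0f} MB")
```

Under these assumptions a single flight yields tens of millions of readings, and the total grows in proportion to the number of flights a carrier operates each day.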
Airlines also generate reams of unstructured text from forms filled out by flight, ramp and maintenance crews. Hart believes that if such data stores could be analyzed for patterns, clues to the causes of near misses and accidents would emerge before they happen.
Hart’s section of FAA doesn’t analyze the data itself. The data and any analysis of it are considered proprietary by the airlines. But what Hart hopes to do, he said, is create a market in the airline industry for analytical tools.
“We don’t analyze data, but help the airlines analyze their own. We want to make the tools widely available at low cost,” Hart said.
You can’t test analysis tools without having large data sets to test them on. So Hart’s office is working with three major carriers and two air industry trade associations that have volunteered to supply large data sets confidentially from their own operations. The airlines, Hart said, already supply large volumes of their free-text data to the associations: the International Air Transport Association and the Air Transport Association.
Text mining tools at first glance resemble search engines, because both are typically applied to data sets too large for the human mind to examine. But the technologies are fundamentally different.
Search tools simply match words. The user of a search engine typically has an idea of what he is looking for, whereas analytical tool users look for patterns and relationships that are unknown.
Put it in context
Text analyzers have the ability to understand context and similar meanings in different words. They can find the frequency of occurrences of words or phrases in proximity to other words. Some are bundled with visualization tools that generate maps of cause-and-effect relationships. The goal in all cases is to find relationships either hidden or impossible to discern from reading because of the size of the data sets.
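The proximity counting described above can be illustrated with a minimal sketch. It is not how PolyAnalyst or any other commercial tool works; the function name `cooccurrences`, the window size and the sample reports are all invented for illustration. It simply tallies which words appear within a few tokens of a chosen term across a set of free-text reports.

```python
import re
from collections import Counter


def cooccurrences(reports, target, window=3):
    """Count the words that appear within `window` tokens of `target`."""
    counts = Counter()
    for text in reports:
        # Lowercase and split into simple word tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        for i, token in enumerate(tokens):
            if token != target:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


# Invented sample reports, not real airline data.
reports = [
    "Hydraulic leak found near left main gear during walkaround",
    "Crew reported hydraulic pressure warning after gear retraction",
    "Left gear door seal worn; no leak observed",
]

counts = cooccurrences(reports, "gear", window=2)
print(counts.most_common(3))
```

In this toy data, "left" turns up near "gear" in two of the three reports. A real tool would add stemming, synonym handling and a domain dictionary on top of counts like these, and would run over far larger collections.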
This is important not only to airlines but to any field in which many small, unnoticed or unaggregated events can accumulate, leading to property- or life-threatening conditions.
But tools must be adapted to specific industries, because specific words have different meanings in different domains.
Sergei Ananyan, president of Megaputer Intelligence Inc. of Bloomington, Ind., gave an example of text analysis from thousands of police reports. Results from his company’s tool, PolyAnalyst, in searching out relationships between weapons, time of day and locations, showed that the word “pike” in this context must be associated with a type of street. Leesburg Pike, for instance, is a prominent Virginia boulevard near Washington. Even though “pike” could be a spear, in police work such an occurrence would be rare relative to the street moniker. In other domains, “pike” might be a fish.
Similarly, words such as “bank,” “tail” and “attitude” in law enforcement have very different meanings from the same words in air transport.
“Each system has underlying semantic dictionaries,” Ananyan said. Tools from his and other companies, he said, are built around WordNet, a freely available lexical database developed in academia. It must be adapted for specific industries, and vendors typically build shells around it so average users in a given profession can employ the tools.
“Airline safety offices tend to be small,” FAA’s Hart said. “They don’t have Ph.D. statisticians or linguists, so we’re looking for easy-to-use tools.”
In all, Hart said, his office is aware of 60 tools that could be used in air safety analysis. He said FAA will evaluate them and work with vendors to improve them, but it will be the airlines’ ultimate responsibility to deploy them.
“The FAA will not specify tools for the airlines. Our goal is raising awareness in the industry that these are good mining tools, and to educate the text mining industry about a market for them,” Hart said.
At this point, he said, “We’re finding we still have a way to go in tools. They still need to be adapted for [the airline] context. None of them are ready for prime time.”
A 2003 report, available on the Global Aviation Information Network Web site, said that among the improvements needed in analytical tools are greater speed and automation of functions, such as data sharing among competing carriers.