IBM's Watson breaks new ground in artificial intelligence
- By Henry Kenyon
- Jul 20, 2011
The spectacle of a computer playing "Jeopardy!" on TV might have seemed just a novelty to some, a gimmick to boost the timeworn game show's ratings. But what IBM's Watson was doing was really much more profound than that, said David McQueeney, vice president of software at IBM Research, speaking July 20 at FOSE in Washington, D.C.
One of the challenges that the computer science community had been pursuing is automatic open-domain question answering: in plain English, the ability of a machine to answer unmodified human questions.
“We consider it a long-standing challenge in artificial intelligence to emulate a slice of human behavior,” said McQueeney.
Computer scientists wanted a computer capable of selecting and answering a question with a degree of confidence. For example, one of Watson’s potential applications is supporting doctors’ medical decisions with data in the form of cited medical papers and with quick, real-time responses to questions.
But understanding unfiltered human language is hard for computers. McQueeney said that humans are used to the ambiguities of language and can adapt to them when communicating. When humans communicate with computers, they use discrete, exact programming languages. Unstructured sentences are very hard for machines to process, he said.
IBM had previously developed chess-playing computers capable of beating human masters. Chess requires lots of computing power, but it is a mathematically precise game and fairly straightforward. Human language, on the other hand, is ambiguous and full of subtleties in meaning and intent. For IBM, it took much more scientific effort and computational work to tackle human language than to win chess games, he said.
"Jeopardy!" was a good fit because of the difficulty of the questions and its requirement for rapid decision-making. IBM used the show’s format to provide a question-and-answer response platform with a high degree of precision in real time. Questions used in the show can range from very specific to very ambiguous. McQueeney noted that IBM had no control over the questions provided.
In trying to figure out how to accurately answer questions, IBM researchers tried to use a statistical analysis approach to "Jeopardy!" answers, but found that this approach did not work, McQueeney said. Then they tried a combination of statistical techniques to reach a correct answer. Watson does use some structured information for answers and geographic places, but only 15 percent of the time. The rest of the data is unstructured, he said.
Finding an answer from an ambiguous sentence requires more than a key word search. Watson parses meaning from the arrangement of words and their sentence structure.
Watson also has built-in temporal reasoning that allows it to find data through time and geospatial calculations. It can weigh evidence, not just conduct key word searches. It is also capable of understanding the meaning of rhymes. When Watson was playing "Jeopardy!," it was not connected to the Internet; it was pre-loaded with data.
When IBM began its work on Watson in 2007, it built a state-of-the-art Q&A system. However, the system could only answer "Jeopardy!" questions 50 percent of the time, well below what human champions were capable of.
IBM committed a huge amount of time and resources to overcoming this challenge, McQueeney said. The company assembled a cross-disciplinary team of several dozen experts and had them focus on the project for four years. This was a considerable gamble and a commitment of resources for the company, he said.
The team developed Watson’s underlying technology, known as DeepQA, which is a massively parallel probabilistic evidence-based architecture. The team improved the system over several years until it was ready in early 2011.
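To make the idea of a probabilistic, evidence-based approach more concrete, here is a minimal sketch in Python. It is not IBM's DeepQA code. The evidence names, the weights, the bias and the 0.5 threshold are all hypothetical values chosen for illustration, and the logistic function is just one common way to turn weighted evidence into a confidence score. The sketch scores each candidate answer independently (so the work can be spread across workers), ranks the candidates, and declines to answer when the best confidence is too low.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical evidence types and weights, invented for this illustration.
WEIGHTS = {"passage_support": 2.0, "type_match": 1.5, "date_consistency": 1.0}
BIAS = -2.5


def confidence(evidence):
    """Combine evidence scores (each 0.0 to 1.0) into a confidence via a logistic function."""
    z = BIAS + sum(WEIGHTS[name] * evidence.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def rank_candidates(candidates):
    """Score candidates independently (in parallel) and sort by confidence, best first."""
    names = list(candidates)
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(lambda name: confidence(candidates[name]), names))
    return sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)


def answer_or_pass(candidates, threshold=0.5):
    """Return the top candidate only if its confidence clears the threshold, else None."""
    best, score = rank_candidates(candidates)[0]
    return best if score >= threshold else None


if __name__ == "__main__":
    # Made-up candidates and evidence scores for a made-up clue.
    candidates = {
        "Chicago": {"passage_support": 0.9, "type_match": 1.0, "date_consistency": 0.8},
        "Toronto": {"passage_support": 0.3, "type_match": 0.0, "date_consistency": 0.5},
    }
    for name, score in rank_candidates(candidates):
        print(f"{name}: {score:.2f}")
    print("Answer:", answer_or_pass(candidates))
```

Because each candidate is scored on its own, adding more workers lets more candidates and more evidence be weighed in the same amount of time, which is the general appeal of a parallel design for a game that demands fast answers. Raising the threshold makes the sketch pass more often instead of risking a wrong answer.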
Unlike earlier work on winning chess games, where IBM had to develop specialized chips, Watson is made up of off-the-shelf components. It is a stock IBM system consisting of several racks of IBM servers running 3.55 GHz chips. While the system is not technically a supercomputer, it is very big and powerful. McQueeney noted that it has a capability of 80 teraflops of processing power.
In developing Watson, IBM scientists learned about deep analytics and how very fast parallel systems operate. In the weeks leading up to the televised event, Watson had 55 sparring matches with the show’s top champions. It won 71 percent of the time, which made its victory in the televised match more compelling, said McQueeney.
The success of Watson has demonstrated that computers can now support human interactivity tasks that they were unable to do before, said McQueeney. Among the areas where IBM is working on applying Watson are health care and life sciences, where the computer can serve as a medical advisor to doctors by providing diagnostic assistance. Other areas include technical support and help desk services, enterprise knowledge management and business intelligence systems, and improved information sharing for government and national security applications.