Software aids search of Arabic documents

Technology is emerging that will help investigators in government agencies swiftly analyze Arabic-language documents.

Since the Sept. 11, 2001, terrorist attacks, law enforcement officials have confiscated and searched thousands of computers for leads to terrorist activities. But when they encountered documents in Arabic, their searches slowed because they lacked the proper tools to analyze the language.

"When we seized a computer, it was a headache because we didn't have the resources to read that server structure," said Bill Seibert, director of technical services for Guidance Software Inc. and a former U.S. Customs Service special agent.

Last month, the company launched the fourth version of the EnCase Forensic Edition software, which enables searches in Arabic and other languages. The new software will be used by computer investigators in agencies such as the Bureau of Citizenship and Immigration Services and the State, Energy and Treasury departments.

The software creates an image copy of the entire drive, including deleted files and file remnants, and preserves the original evidence. By duplicating the drive, investigators can pick through the information without leaving behind traces that they were there. They can then analyze the data using keyword searches and thumbnail images.
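
As a rough illustration of why keyword searches over a raw drive image need language support, the sketch below looks for an Arabic keyword in a block of bytes under several text encodings. It is not EnCase's method; the function name `find_keyword` and the choice of encodings (UTF-8, UTF-16 little-endian and Windows-1256) are assumptions made for the example.

```python
def find_keyword(image, keyword, encodings=("utf-8", "utf-16-le", "cp1256")):
    """Return sorted (offset, encoding) pairs where the keyword's bytes occur.

    `image` is the raw content of a drive image, read-only from our side;
    the same Arabic word has a different byte pattern in each encoding,
    so a search has to try each one.
    """
    hits = []
    for enc in encodings:
        needle = keyword.encode(enc)
        start = image.find(needle)
        while start != -1:
            hits.append((start, enc))
            start = image.find(needle, start + 1)
    return sorted(hits)


# Usage: a tiny stand-in "image" holding the word for "book" in two encodings.
word = "\u0643\u062a\u0627\u0628"
image = b"\x00" * 4 + word.encode("utf-16-le") + b"junk" + word.encode("cp1256")
print(find_keyword(image, word))
```

A search that assumes a single encoding would miss one of the two copies, which is one reason general-purpose tools struggle with non-English evidence.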

A search that once took three days will now take about an hour.

"What this is all about is speeding up the investigation," Seibert said. "There are more and more drives and pieces of media scooped up in an investigation."

Before, investigators had to find the Arabic-language files, export them and then search them, leaving deleted files behind. They relied on several programs to translate entire documents, often compromising the documents' meaning.

"Without this tool, it would be almost impossible to do" searches in Arabic, said Don Masters, assistant to the special agent in charge of the U.S. Secret Service's Electronic Crimes Task Force.

Guidance Software had the Arabic language search tool on the "to-do" list for a while, Seibert said, but after the terrorist attacks, officials saw an immediate need. The software will likely be used in a war with Iraq. With the launch last month, Pasadena, Calif.-based Guidance Software shipped more than 2,400 orders to government and corporate investigators.

Cambridge, Mass.-based Basis Technology Corp. has also tackled Arabic-language needs, but for a different function. Basis this month launched the Rosette Arabic Language Analyzer, which plugs into mainstream search engines to aid searches in Arabic.

Although not a forensic tool like EnCase, the language analyzer searches documents in their source language, rather than relying on the English translation to return relevant results. The language analyzer enables a user to enter the English phonetic version of an Arabic word and receive a handful of relevant results. Because the data is searched in its native form, meanings aren't lost and variations aren't overlooked.
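
One way to picture how a native-language search avoids overlooking variations is to normalize Arabic text before comparing it. The sketch below is not the Rosette algorithm; it is a minimal example, with hypothetical function names, that strips vowel marks and the tatweel stretching character and folds hamza-bearing forms of alef into plain alef, so that different spellings of a word match.

```python
import unicodedata

TATWEEL = "\u0640"  # stretching character used only for visual justification


def normalize_arabic(text):
    """Drop combining marks (vowel signs, hamza above/below) and tatweel."""
    # NFD splits letters such as alef-with-hamza into alef plus a combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed
        if unicodedata.category(ch) != "Mn" and ch != TATWEEL
    )


def matches(query, document):
    """True if the normalized query appears in the normalized document."""
    return normalize_arabic(query) in normalize_arabic(document)


# Usage: a fully vowelled spelling of "Ahmad" matches the bare spelling.
vowelled = "\u0623\u064e\u062d\u0652\u0645\u064e\u062f"
bare = "\u0627\u062d\u0645\u062f"
print(matches(bare, vowelled))
```

Accepting an English phonetic spelling, as the article describes, would additionally require a transliteration step that this sketch does not attempt.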
