Software aids search of Arabic documents

Technology is emerging that will help government investigators swiftly analyze Arabic-language documents.

Since the Sept. 11, 2001, terrorist attacks, law enforcement officials have confiscated and searched thousands of computers for leads to terrorist activities. But when they encountered documents in Arabic, their searches slowed because they lacked the proper tools to analyze the language.

"When we seized a computer, it was a headache because we didn't have the resources to read that server structure," said Bill Seibert, director of technical services for Guidance Software Inc. and a former U.S. Customs Service special agent.

Last month, the company launched the fourth version of the EnCase Forensic Edition software, which enables searches in Arabic and other languages. The new software will be used by computer investigators in agencies such as the Bureau of Citizenship and Immigration Services and the State, Energy and Treasury departments.

The software creates an image copy of the entire drive, including deleted files and file remnants, and preserves the original evidence. By duplicating the drive, investigators can pick through the information without leaving behind traces that they were there. They can then analyze the data using keyword searches and thumbnail images.
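The copy-then-search workflow described above can be illustrated with a minimal Python sketch. It is not EnCase's implementation: the function names (`sha256_of`, `make_verified_copy`, `find_keyword`) are hypothetical, the "drive" is assumed to be an ordinary file, and the list of encodings is an assumption about how Arabic text might be stored. A true forensic image also captures deleted files and file remnants at the disk level, which a plain file copy does not.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large images do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_verified_copy(source: Path, image: Path) -> str:
    """Copy source to image and confirm both have the same SHA-256 hash."""
    shutil.copyfile(source, image)
    source_hash = sha256_of(source)
    if sha256_of(image) != source_hash:
        raise ValueError("image does not match source")
    return source_hash


def find_keyword(image: Path, keyword: str,
                 encodings=("utf-8", "utf-16-le", "cp1256")) -> dict:
    """Return byte offsets of the keyword for each encoding that matches.

    The encodings are an assumption: Arabic text is commonly stored as
    UTF-8, UTF-16, or Windows-1256, so the raw bytes are searched for each.
    """
    data = image.read_bytes()  # acceptable for a sketch; real images are huge
    hits = {}
    for encoding in encodings:
        needle = keyword.encode(encoding)
        offsets = []
        start = data.find(needle)
        while start != -1:
            offsets.append(start)
            start = data.find(needle, start + 1)
        if offsets:
            hits[encoding] = offsets
    return hits
```

All searching is done against the copy, and the matching hash is what lets an investigator show that the copy is identical to the original evidence.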

A search that once took three days will now take about an hour.

"What this is all about is speeding up the investigation," Seibert said. "There are more and more drives and pieces of media scooped up in an investigation."

Before, investigators had to find the Arabic-language files, export them and then search them, a process that left deleted files behind. They relied on several programs to translate entire documents, often compromising their meaning.

"Without this tool, it would be almost impossible to do" searches in Arabic, said Don Masters, assistant to the special agent in charge of the U.S. Secret Service's Electronic Crimes Task Force.

Guidance Software had the Arabic language search tool on the "to-do" list for a while, Seibert said, but after the terrorist attacks, officials saw an immediate need. The software will likely be used in a war with Iraq. With the launch last month, Pasadena, Calif.-based Guidance Software shipped more than 2,400 orders to government and corporate investigators.

Cambridge, Mass.-based Basis Technology Corp. has also tackled Arabic translation needs, but for a different function. Basis this month launched the Rosette Arabic Language Analyzer, which plugs into mainstream search engines to aid searches in Arabic.

Although not a forensic tool like EnCase, the language analyzer searches documents in their source language, rather than relying on the English translation to return relevant results. The language analyzer enables a user to enter the English phonetic version of an Arabic word and receive a handful of relevant results. Because the data is searched in its native form, meanings aren't lost and variations aren't overlooked.
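One way to see why searching the native text avoids overlooked variations is a small normalization sketch in Python. This is not the Rosette analyzer's method: the `normalize_arabic` and `search` names are hypothetical, and the rules shown (dropping vowel marks and the tatweel stretch character, and unifying three alef forms) are an assumed, deliberately small subset of what a real analyzer would handle. Matching from English phonetic spellings is not covered here.

```python
import unicodedata

# Assumed mapping: alef with madda, hamza above, and hamza below -> bare alef.
ALEF_VARIANTS = {"\u0622": "\u0627", "\u0623": "\u0627", "\u0625": "\u0627"}
TATWEEL = "\u0640"  # stretch character used only for visual justification


def normalize_arabic(text: str) -> str:
    """Reduce common spelling variants of an Arabic word to one form."""
    # Drop combining marks (category Mn), which include the short-vowel signs.
    stripped = "".join(
        ch for ch in text if unicodedata.category(ch) != "Mn"
    )
    stripped = stripped.replace(TATWEEL, "")
    return "".join(ALEF_VARIANTS.get(ch, ch) for ch in stripped)


def search(documents: list[str], query: str) -> list[str]:
    """Return the documents whose normalized text contains the query."""
    needle = normalize_arabic(query)
    return [doc for doc in documents if needle in normalize_arabic(doc)]
```

With these rules, a query typed without vowel marks still finds a document in which the same word is fully vowelled, because both sides are compared in normalized form.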

