Moving into a digital world

Optical character recognition turns text, images into digital documents

Officials at most agencies and departments have long since forsaken the idea of creating a paperless office in the near future. Instead, their focus has been on easing the translation of hard-copy documents to electronic documents.

A major piece of the puzzle for many agencies and departments is optical character recognition (OCR), a technology that allows you to scan documents and digitize text, tables and images.

Document-management projects frequently spur the use of OCR technology because they require the conversion of hard-copy paper files to electronic records. Although document conversion remains a primary reason to examine OCR, the technology might make sense in other instances as well.

For example, if an agency has employees in the field who take digital pictures, you could use OCR to quickly load their images into business documents. Or perhaps you have documents arriving from external users via an FTP site. You could use OCR to automatically pull, scan and route those documents in a workflow. Other reasons to consider OCR include the ability to quickly update Adobe Systems Inc. Acrobat PDFs or to convert the text of a hard-copy document to an audio file.

In short, OCR technology can help agency workers get a better handle on managing hard-copy and electronic information while increasing productivity.

We recently examined three OCR solutions: Abbyy Software House's FineReader 7.0 Corporate Edition, IRIS Inc.'s Readiris Pro 9 Corporate Edition and ScanSoft Inc.'s OmniPage Pro 14 Office. Because all three products have gone through several versions, we found them to be quite mature and robust. That can be both good and bad for agency evaluators.

All three solutions would be acceptable for a given agency. Although the programs share many features, each has unique capabilities. Agency evaluators will need specific criteria for choosing the right OCR solution before beginning a proof-of-concept test. For example, if one of your must-have features is converting hard-copy documents to audio, your choice of OCR solutions would be limited to those that offer this capability. If you merely need to scan, edit and save documents, your choices are much broader.
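The must-have approach described above can be sketched as a simple filter. This is a minimal illustration, not a buying tool: the feature names are our own labels, and the feature sets include only capabilities mentioned in this review, so a missing entry means the review did not cover it rather than that the product lacks it.

```python
# Feature sets summarized from this review only. An absent feature means
# the review did not mention it, not that the product lacks it.
# The feature labels themselves are hypothetical names chosen for this sketch.
PRODUCTS = {
    "FineReader": {"network_install", "concurrent_licensing",
                   "automated_proofing", "folder_monitoring"},
    "Readiris": {"digital_camera_input", "email_output"},
    "OmniPage": {"network_install", "automated_proofing",
                 "text_to_speech", "custom_workflows"},
}


def shortlist(must_have):
    """Return the products whose feature set covers every must-have."""
    required = set(must_have)
    return sorted(name for name, features in PRODUCTS.items()
                  if required <= features)


# Example: an agency that must convert documents to audio.
print(shortlist(["text_to_speech"]))
```

Writing the requirements down in this form before a proof-of-concept test makes it obvious when a single must-have feature narrows the field to one product.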

Another consideration is the option to integrate OCR functionality into a heterogeneous environment. If your agency uses many platforms, you may need to do some additional homework. The solutions we tested here support the most widely used Microsoft Corp. Windows platforms. The three companies also offer versions of their solutions for Apple Computer Inc. Macintosh systems. But if your agency is making the move to Linux, or if you use one or more flavors of Unix, you'll need to seek other options. We examine your choices for mixed-platform settings in the sidebar "Blending in OCR."

FineReader: Leading the pack

Although the race was close, we find that for most everyday OCR tasks Abbyy Software's FineReader has a slight edge over its competitors. Like ScanSoft's OmniPage Pro, FineReader can be installed across the network, and automated installations can be done using network utilities, such as Microsoft's Systems Management Server.

We especially like that FineReader uses a concurrent licensing scheme and provides a network license manager, which tracks license usage. This should help IT staffers track and use licenses wisely. The other solutions are priced based on per-seat usage.

FineReader also offers support for network-based scanning devices, such as multifunction digital copiers, and workstation-based scanners. The Abbyy solution correctly identified and connected to all of our test scanners. Other useful features include the ability to monitor a network folder or an FTP location for new documents and then open and process them. FineReader also lets agency officials execute OCR processes in a distributed manner across the network, which will prove especially useful for high-volume OCR sites.
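For readers curious about what folder monitoring involves, here is a minimal sketch of the general idea, assuming a simple polling approach. It does not reflect how FineReader implements the feature; the function name, the file extensions and the `seen` set are all our own assumptions.

```python
from pathlib import Path


def find_new_documents(folder, seen, extensions=(".tif", ".tiff", ".pdf", ".jpg")):
    """Return image or PDF files in `folder` that are not yet in `seen`.

    `seen` is a set of file names that the caller keeps between polls;
    newly found names are added to it so each file is returned only once.
    """
    new_files = []
    for path in sorted(Path(folder).iterdir()):
        is_candidate = path.is_file() and path.suffix.lower() in extensions
        if is_candidate and path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files
```

A scheduler would call `find_new_documents` every few seconds and hand each returned path to the OCR step. A production version would also need to wait until a file has finished uploading before processing it.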

When opening the FineReader interface for the first time, users can launch a wizard, access documentation or open demonstration material. The help menu also includes a tutorial. The interface is customizable: You can select which toolbars you want to see and which items in each toolbar you want to be visible.

We were able to quickly scan and read individual and multipage documents. We also opened and edited image files. The recognition and retention processes — converting documents to editable form — worked flawlessly with FineReader. This was true for text-only documents, those with text and tables, and documents that blended text, tables and images.

FineReader supports 177 languages. Although we didn't test all of them, the 10 we tried proved accurate. Aside from written languages, FineReader also can recognize some programming languages. We scanned in some hard-copy Java code, and we liked how quickly and easily we could edit the code. FineReader also recognizes regular and 2-D bar codes.

Unlike Readiris, FineReader retained the format of our documents every time. These included page layouts, flow, color retention and so on.

Once our documents were in FineReader, we used the automated proofing tools to check spelling and grammar. FineReader can check spelling in 34 languages and includes special dictionary support for legal and medical terms in addition to the capability to import user-defined dictionaries from Microsoft Word. After proofing our documents, we found FineReader's editing tools to be on par with those of its rivals.

Each of the OCR solutions we inspected offered numerous formats with which to save and export documents. Moreover, some of these tools allow you to export and launch the target application while others allow you to save to a file for later editing using another application. FineReader, like its competitors, offers a great degree of save and export capability, with the ability to use formats that include BMP, JPEG, TIFF, PNG, RTF, DOC, CSV, XLS, TXT, HTML, PPT and PDF.

One of the toughest parts of evaluating OCR solutions is deciding what formats you will need for saving and exporting files. Check closely what each solution offers. FineReader, for example, can handle Extensible Markup Language output for use in Microsoft Word 2003.

Another interesting FineReader feature lets you drag and drop an e-mail attachment into OCR. The solution is also accessible via Windows Explorer, and it supports duplex (two-sided) and book scanning in addition to the scanning of odd and even pages. We scanned duplex documents into FineReader without incident.

Readiris: Five easy steps

The Readiris solution is user-friendly. After accessing the user interface for the first time, we were presented with an OCR wizard, which asked us five easy questions about document sources, correction options, scanner types, languages needed, and the options for save and export formats.

We noticed that, in some cases, Readiris did not recognize our scanner and recommended another. We also noticed that, with some scanners, Readiris was not able to make the connection to the scanner the first time. When we tried again, it contacted the scanner.

Although Readiris was easy to install, we didn't see a way to execute network-based or automated installations. Administrators who choose Readiris will likely need to do some additional work to get the solution into standard agency end-user images.

Readiris seemed to scan our documents slightly faster than the other solutions. The product also supports input from digital cameras, and we used two digital cameras as sources. In both cases, our input was successful. We also scanned single, multipage and duplex documents.

The Readiris product effectively recognized and retained formatting of textual data and tables. In most cases, it also did the same for images. However, sometimes when our documents contained text, tables and images, Readiris would not retain the original document's image layout. This flaw was remedied when we edited the document, but retaining image layout would be preferred.

Like FineReader, Readiris supports a large number of languages — more than 100 — and this also compares favorably to ScanSoft's OmniPage language support. Using our 10-language test, Readiris accurately recognized and retained textual and table content.

When we selected a document format prior to scanning, such as Microsoft Word, Readiris automatically launched the application following its scanning and OCR processing.

Readiris does not include automated proofing as FineReader does, but once the target application was launched, we were able to check spelling and grammar there.

Meeting its rivals head on, Readiris supports numerous save and export formats. In addition to application formats — such as PDF, RTF, DOC, HTML and TXT — Readiris also supports formats for Byword, Open Office, Sun Microsystems Inc. Star Office and browsers from the Mozilla Organization and Netscape Communications Corp. We also liked that we were able to e-mail documents after scanning and OCR processing was completed.

OmniPage: Stiff competition

OmniPage provides some stiff competition for its rivals. It offers agency administrators good support for network installations and automated deployments. But OmniPage took much longer to start up and initialize than the other solutions we tested. In addition, a default installation of OmniPage took up more than twice as much disk space as FineReader.

Once opened, OmniPage by default checks whether any product updates are needed. This feature can be disabled.

When setting up our scanners, we were able to download the latest scanner database from ScanSoft. This is useful given the constant release of new scanners.

Like Readiris, OmniPage had some trouble connecting to several scanners on the first try. However, after retrying the process, it was able to communicate with the scanners. The company includes a scanner setup wizard to assist users.

During scanning, we liked that the solution prompted us to determine if we had more pages to scan. Although OmniPage did not scan as quickly as Readiris, it performed the recognition process rapidly. OmniPage, like FineReader, includes built-in proofing tools. We were able to scan and process individual, multipage and duplex documents. For sites without duplex-capable scanners, OmniPage can help you merge pages once all the pages have been scanned.
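The page-merging step for single-sided scanners can be sketched generically as follows. The review does not describe how OmniPage performs the merge, so this is only an illustration of the idea: scan all the front sides, flip the stack, scan the back sides, then interleave the two batches. The function name and the `backs_reversed` flag are our own assumptions.

```python
def merge_duplex(fronts, backs, backs_reversed=True):
    """Interleave front-side and back-side scans into reading order.

    When the stack is flipped over for the second pass, the back sides
    are usually scanned last page first, so they are reversed by default.
    """
    if len(fronts) != len(backs):
        raise ValueError("front and back batches must be the same length")
    if backs_reversed:
        backs = list(reversed(backs))
    merged = []
    for front, back in zip(fronts, backs):
        merged.extend([front, back])
    return merged


# Fronts scanned as pages 1, 3, 5; backs scanned after flipping the stack.
print(merge_duplex(["p1", "p3", "p5"], ["p6", "p4", "p2"]))
```

If the backs are scanned in the same order as the fronts, pass `backs_reversed=False`.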

OmniPage meets rival FineReader head-on in recognizing content and retaining document layout. During all of our tests, OmniPage correctly processed images, text and tables. Moreover, its built-in editing tools were easy to use.

The solution also provides solid support for languages and save and export formats. It supports 114 languages and passed our 10-language document test with ease. Like its rivals, OmniPage can save results to e-mail, the clipboard or an FTP location. It also offers save and export capabilities in common formats.

OmniPage has a couple of unique features. If agency workers need to convert documents to speech, they will be pleased to find that OmniPage offers built-in support for this function. After turning on the speech mode and selecting which voice we wanted — the British accent was most compelling — we processed several documents. Following OCR, we could have the voice read back all or parts of the document data. In addition, we could save the results as a WAV file. A useful addition to OmniPage would be the ability to also save in audio formats such as MP3 and OGG.

Officials with many document-handling requirements would also appreciate OmniPage's workflow capabilities. The product comes with some built-in workflows, such as one to process documents using PDF and RTF formats. More important to agency officials, however, is the ability to create customized workflows. We created several to load or scan documents or obtain them from an FTP location. We then selected how we wanted to process them and what we wanted to do once OCR processing was completed.

OCR choices

For broadest agency appeal, we find that FineReader is an excellent choice. Officials with unique requirements, such as conversion to audio or input from digital cameras, may prefer a different solution. As with any technology, precise requirements and a proof-of-concept test are needed to ensure the best product selection.

Biggs, a senior engineer and freelance technical writer based in Northern California, is a regular Federal Computer Week analyst. She can be reached at maggiebiggs@acm.org.

Blending in OCR

Most agency information technology infrastructures use Linux operating systems, Microsoft Corp. Windows and Apple Computer Inc. Macintosh platforms. Traditional Unix servers or midrange or mainframe systems are also heavily used. If a good bit of your document management strategy involves these platforms, you have plenty of options when it comes to optical character recognition (OCR) technology.

Shops that use Linux might consider open-source solutions, such as GOCR, Ocrad or Kooka. Most Linux distributions include one or more OCR solutions, so you can compare them.

Unix and Linux users who want to purchase OCR software might consider Vividata Inc.'s OCR Shop software. The company has formed a partnership with ScanSoft Inc. to bring OCR to a wider audience. Supported platforms include IBM Corp. AIX, Hewlett-Packard Co. HP-UX, Silicon Graphics Inc. IRIX, Sun Microsystems Inc. Solaris and Linux (x86).

Sites using midrange platforms, such as IBM's iSeries, might consider Vanguard Systems Inc.'s IMS21 solution. Moreover, agency officials with lots of clout can choose the base support in the operating system, which includes OCR functionality, or explore third-party options when expanded OCR support is needed.

Regardless of platform or document management strategy, OCR is a good bet for officials who want to reduce costs and increase productivity.

— Maggie Biggs
