OmniPage Web bridges gap between paper, the Web
- By Patrick Marshall
- Jul 25, 1999
Every so often we come across a forehead-slapper. That's a program that makes us collectively slap palms to forehead and exclaim, "Why didn't somebody come up with that before?"
Caere Corp.'s OmniPage Web is such a program, and it will be appreciated immediately by any agency Webmaster who needs to convert paper documents to attractive World Wide Web pages. OmniPage handles most of the work in hands-off fashion, freeing up the Webmaster to deal with higher-level design issues.
Anyone who has used a recent version of OmniPage Professional - Caere's optical character recognition program - will feel at home in OmniPage Web. The new product's interface offers a similar look and feel; in fact, the program incorporates the OmniPage OCR engine.
Six large buttons on OmniPage Web's main toolbar control the process of converting paper documents to Web pages. The Scan/Load Image button lets you load an existing image file or scan a page using an attached scanner. The Zoning button lets you select the types of documents you're setting up for OCR: single-column pages, multicolumn pages, spreadsheet pages or complex pages that include tables, boxed text and the like.
Zoning is a complex process. While OmniPage Web does a good job of performing the task automatically, it also provides a strong set of tools for manually correcting zoning, including those for joining and dividing zones, reordering zones and drawing zones from scratch.
The next button on the toolbar is the OCR button, which lets you choose between OCR alone or OCR followed by proofing. The proofing utility is well designed: its window shows the original hard copy characters alongside suggested spelling corrections. In most cases, all you have to do is point and click to make the appropriate correction.
It's after OCR and proofing that OmniPage Web's special features really become apparent. The Outline utility employs what Caere calls "logical structure recognition" to analyze the document and automatically generate a table of contents. It does so by interpreting the language used in the document and reading the formatting used in creating the document, including headings, headers and footers, captions, graphics and body text. OmniPage Web then pops the table of contents into a panel to the left of the document, where you can edit it. When the resulting document is viewed in a Web browser, this navigable table of contents will appear in its own frame.
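For readers curious about what a framed table of contents amounts to in practice, here is a minimal Python sketch. It is not Caere's method: OmniPage Web infers headings from the language and layout of a scanned page, whereas this example assumes the headings are already tagged as h1 through h6 in the HTML. The names doc.html, content and sec0, sec1 and so on are hypothetical, chosen only for illustration.

```python
from html import escape
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading being read, if any
        self._parts = []     # text fragments inside that heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            # Collapse runs of whitespace left over from line breaks.
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def build_toc(html):
    """Return indented links, one per heading, aimed at a frame named 'content'."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    lines = []
    for index, (level, text) in enumerate(collector.headings):
        indent = "  " * (level - 1)
        # Hypothetical target page and anchor names (doc.html, sec0, sec1, ...).
        lines.append(
            f'{indent}<a href="doc.html#sec{index}" target="content">'
            f"{escape(text)}</a><br>"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "<h1>Report</h1><p>Text</p><h2>Budget &amp; Staff</h2>"
    print(build_toc(sample))
```

The output would sit in the left-hand frame, with each link opening the matching section in the frame that holds the document. The sketch assumes the document page carries anchors with the same names.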
The final button on the toolbar is for saving the document to Hypertext Markup Language format. You can save the file or save it and automatically launch it in a Web browser for testing. Bear in mind when designing documents that OmniPage Web supports not only standard HTML but also Dynamic HTML and cascading style sheets.
We found OmniPage Web to be easy to use, and it produces impressive results. In fact, the only knock we have on the product is that - apart from the spell-checker that snags suspicious words during OCR - the software provides no tools for editing recognized documents before processing them to HTML. Nor can you open document files from word processors. That means there's no way within OmniPage Web to add data from electronic documents to the Web pages. To perform either kind of editing, you'll need to load the results of your OmniPage Web work into an HTML editor.
That said, if you have hard copy documents that you want to move to the Web, you won't find an easier way to recognize them and export them to HTML.
OmniPage Web 1.0
Caere Corp.
(408) 395-5498
www.caere.com
Price and Availability: Currently available from a number of GSA resellers for about $500.
Remarks: OmniPage Web offers an easy and accurate way to convert documents from hard copy to attractively designed Web pages. If the documents need editing, however, you'll still need to load them into an HTML editor.