MetaCarta merges text, geographic searches
- By Michelle Speir
- Jan 26, 2003
For some government agencies — not only those tracking criminals and terrorist activity, but also those keeping tabs on trade or environmental issues, for example — geography can be critical. Unfortunately, most search engines aren't designed to detect and sort on geographical markers.
MetaCarta Inc. has set out to change that with its Geographic Text Search (GTS) Appliance, which the company is marketing primarily to the military and other government agencies.
The key concept behind MetaCarta GTS is the integration of text search data with geography. Search results appear as points on a map instead of as a list of documents. Each point, in the form of a numbered rectangle, represents a document or a stack of documents related to that place.
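The article does not say how the GTS decides which documents share a stack. As a rough illustration only, the Python sketch below groups documents into fixed-size grid cells by latitude and longitude. The function name, the one-degree cell size and the sample coordinates are all assumptions made for this example, not details of MetaCarta's product.

```python
from collections import defaultdict


def stack_documents(docs, cell_degrees=1.0):
    """Group (doc_id, lat, lon) tuples into square grid cells.

    Hypothetical helper: each cell stands in for one numbered
    rectangle on the map, and its list is the stack of documents.
    """
    stacks = defaultdict(list)
    for doc_id, lat, lon in docs:
        # Floor division assigns each point to the cell containing it.
        cell = (int(lat // cell_degrees), int(lon // cell_degrees))
        stacks[cell].append(doc_id)
    return dict(stacks)


# Invented sample data: two documents near Santiago, one near Valparaiso.
docs = [
    ("report-1", -33.45, -70.66),
    ("report-2", -33.05, -71.62),
    ("report-3", -33.46, -70.65),
]

stacks = stack_documents(docs)
for cell, ids in sorted(stacks.items()):
    print(cell, len(ids), ids)
```

A real map display would likely change the cell size with the zoom level, so that stacks split apart as the user zooms in.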
Using this technology, agencies can, for example, track patterns of criminal activity and identify spots of intensity. They can also limit searches to certain geographic areas.
The GTS uses sophisticated algorithms to make sense of geographic references in documents. For example, the name "Noriega" could refer to a person or a place. The GTS uses the surrounding context to choose the more likely reading: if "Mr." or another title precedes the name, it probably refers to a person, not a place.
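The title rule described above can be sketched in a few lines of Python. This is a deliberately simple stand-in, not MetaCarta's algorithm: the list of titles, the function name and the sample sentence are assumptions, and treating every untitled mention as a place is a simplification a real system would not make.

```python
import re

# Assumed list of titles; a production system would need far more.
PERSON_TITLES = ("Mr.", "Mrs.", "Ms.", "Dr.", "Gen.", "President")


def classify_mentions(text, name):
    """Label each occurrence of name as 'person' or 'place'.

    Only the single word before the name is inspected. Anything not
    preceded by a known title defaults to 'place' (an assumption).
    """
    labels = []
    for match in re.finditer(re.escape(name), text):
        before = text[:match.start()].split()
        previous = before[-1] if before else ""
        labels.append("person" if previous in PERSON_TITLES else "place")
    return labels


sample = "Gen. Noriega left the capital. The convoy stopped near Noriega at dawn."
print(classify_mentions(sample, "Noriega"))  # ['person', 'place']
```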
The system also uses data-mining techniques to find the most common contexts for certain geographic references. These contexts are then used to build patterns for recognizing or ignoring subsequent references. In addition, the GTS will "pre-tune" a system prior to delivery so that results are more relevant to the customer's needs.
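One way to picture the context-mining step is as a tally of the words that appear next to a name when it is, and is not, used as a place. The sketch below assumes a small hand-labeled training set and looks only at the preceding word. The function names, labels and examples are hypothetical, and the article does not describe the actual technique at this level of detail.

```python
from collections import Counter


def learn_contexts(labeled_examples):
    """Count preceding words separately for 'place' and 'other' uses.

    labeled_examples is a list of (previous_word, label) pairs.
    """
    counts = {"place": Counter(), "other": Counter()}
    for previous_word, label in labeled_examples:
        counts[label][previous_word.lower()] += 1
    return counts


def looks_like_place(previous_word, counts):
    """Return True if the word was seen more often before place uses."""
    word = previous_word.lower()
    return counts["place"][word] > counts["other"][word]


# Invented training data for illustration only.
examples = [
    ("in", "place"), ("near", "place"), ("in", "place"),
    ("Mr.", "other"), ("said", "other"), ("in", "other"),
]

counts = learn_contexts(examples)
print(looks_like_place("in", counts))   # True
print(looks_like_place("Mr.", counts))  # False
```

In this toy version, a word never seen in training is treated as not indicating a place. Tuning for a customer would amount to supplying examples drawn from that customer's own documents.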
The GTS appliance is a Dell Computer Corp. 2650 server with 4 GB of memory that can be integrated into an agency's existing network. It comes preloaded with the MetaCarta software, National Imagery and Mapping Agency base maps and several third-party mapping applications.
A standard GTS appliance, which runs on the Debian GNU/Linux operating system, can hold 5 million documents, and multiple appliances can be linked together to accommodate an unlimited number of documents. Depending on the size of the documents, up to 50 million can reside on a single server if it is configured with an extended disk array and 6 GB of memory.
Agencies can "populate" the appliance with internal data and keep the system closed, or they can use it to search the Internet. It supports the following protocols: HTTP, HTTPS, Simple Object Access Protocol, Network File System (NFS) and Network News Transfer Protocol. The appliance uses X.509 certificates and Lightweight Directory Access Protocol for authenticating users.
Various input methods can be used to update the system with new documents. Common methods include HTTP and HTTPS to retrieve information from Web sites, NFS to mount remote file systems and Open Database Connectivity (ODBC) to query Structured Query Language (SQL) databases. Information can also be taken from CD-ROMs.
To get a feel for how MetaCarta GTS works, we accessed an Internet demonstration system. The interface consists of a world map and several fields for typing in keywords or names and an address.
To search documents all over the world, we could type in a keyword, a person's name, an address or any combination of these. To focus on a specific geographic area, we zoomed in by clicking on that area of the map. Multiple zoom levels are available, and a search is always limited to the area visible in the map window. We could also simply type in the name of a city or country.
For example, when we entered "Chile" into the country field and used "coup" as a keyword, the system returned a list of documents relating to the 1973 government overthrow in that country. Documents included essays, historical information and a review of a film about the coup.
Documents are returned in order of relevance as determined by MetaCarta's algorithms. Each result is marked with a "confidence value" — a percentage value estimating the likelihood that a human analyst would select that document.
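The search behavior we observed (keyword match, restriction to the visible map area, then ordering by confidence) can be approximated with the Python sketch below. Everything in it is illustrative: the `Doc` class, the `search` function, the rough bounding box for Chile and the sample documents with their confidence values are all invented for this example and do not come from the demonstration system.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    text: str
    lat: float
    lon: float
    confidence: float  # percentage, 0 to 100 (invented values below)


def search(docs, keyword, south, west, north, east):
    """Return docs inside the box that mention keyword, best first.

    Simplification: boxes that cross the 180-degree meridian are
    not handled.
    """
    hits = [
        d for d in docs
        if south <= d.lat <= north
        and west <= d.lon <= east
        and keyword.lower() in d.text.lower()
    ]
    return sorted(hits, key=lambda d: d.confidence, reverse=True)


# Made-up sample documents.
docs = [
    Doc("Film review", "A film about the coup.", -33.45, -70.66, 71.0),
    Doc("Essay", "An essay on the 1973 coup.", -33.45, -70.66, 92.0),
    Doc("Trade report", "Copper exports rose.", -33.45, -70.66, 88.0),
    Doc("Regional history", "A coup elsewhere.", -12.05, -77.04, 95.0),
]

# Approximate bounding box for Chile (an assumption for this example).
results = search(docs, "coup", south=-56.0, west=-76.0, north=-17.0, east=-66.0)
for d in results:
    print(f"{d.confidence:.0f}% {d.title}")
```

Here the trade report is dropped because it lacks the keyword, and the regional history is dropped because it falls outside the visible area, whatever its confidence value.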
Government agencies that need to analyze text information in a geographic context for intelligence or other purposes should consider purchasing this unique, sophisticated system.