Building the digital library

While the World Wide Web has given federal agencies unprecedented access to historical, scientific and reference data, it has also created a new challenge: finding the tools that make sense of that information. Whether they are homing in on key intelligence data or supplying educational materials for teachers, agencies ultimately want the capability to pull text, audio, video and images from virtual shelves anywhere in the world as easily as one picks books from a library shelf today.

The Web "is what most people consider their digital libraries,'' said Michael Lesk, author of the recent book Practical Digital Libraries and director of the National Science Foundation division that manages the interagency Digital Libraries Initiative (DLI). Now curators of online collections are looking for ways to mine this information explosion.

"One of the classic functions of a library as an organization is they collect and acquire, they organize and make accessible, and they preserve,'' said Clifford Lynch, executive director of the Coalition for Networked Information, an interest group that represents universities and research libraries on technology issues. "If you look at many Web sites, they're about publishing information but not really so concerned about long-term retention and organization.''

Librarians in the Map and Imagery Laboratory at the University of California at Santa Barbara (UCSB) began looking for ways to offer greater accessibility to their holdings of more than 5 million topographical maps, satellite data, photographs and other resources more than a decade ago. Back then, "no one even knew what we were talking about,'' said Larry Carver, the lab's director, when the library's staff said they needed a way to catalog, search and distribute their data online. "The technology was not there yet.''

With help from federal grants, the university last month took the first step toward opening its holdings via the Internet. UCSB's Alexandria Digital Library (ADL) became the first link in an effort throughout California to provide electronic research materials on the Web— first within the state university system and, after a year of testing, to the public, including agencies, universities and private companies around the world.

Other government-backed projects, at universities and within federal agencies, are chasing related ends. Through the DLI, during the past four years, NSF, the Defense Advanced Research Projects Agency and NASA jointly have poured $24.4 million into ADL and five other academic projects that aim to make digital libraries as user-friendly as their physical counterparts.

Any organized collection of electronic documents that is set up "for human beings to use'' can be a digital library, according to NSF's Lesk, and some technologies to support those collections are well-established. Databases, software for capturing images, tools for searching and retrieving text, and CD-ROMs and networks for distributing data are widely used by federal agencies today.

But technologies for tapping sound and video files, or for parsing data in multiple formats, have begun to emerge only recently. Developing search tools, including better ways to tag and index data, has been a major focus of digital libraries research.

"I need to have ways to describe what I'm looking for,'' said Nand Lal, who oversees a NASA research project, called Digital Library Technology, that is separate from the joint effort with NSF. That means making satellite imagery and other space data currently organized for in-house use "intelligible'' on the Web for scientists and the general public. "It's the whole idea of universal access in the sense of being able to deliver things that are of interest to the user, not necessarily to the producer, using facilities and language and terminology that are adapted to the user,'' Lal said.

"Search and retrieval is no longer about text,'' said Mark Demers, director of marketing and corporate communications with Excalibur Technologies, which this week is releasing software for searching video archives. Demers thinks the software could be used by federal agencies to set up libraries of training materials, surveillance tapes and historical records. "It's about all assets everywhere, and it's even about metadata,'' which is the text or software codes used to index online materials. Robust, easy-to-use search tools are "an enabling technology that is almost at the core of a digital library system,'' Demers said.

"An ideal goal for these technologies would be to make them disappear,'' said Stephen Griffin, the DLI program director, who is preparing to award a new round of grants, totaling $40 million to $50 million, beginning this fall. "If these technologies were invisible to the user and the user could work directly with the [information],'' then the user could more easily create, and learn from, new virtual environments.

Meanwhile, agencies are building basic digital libraries with software already on the market. Brand Niemann, digital librarian with the Environmental Protection Agency's new Center for Environmental Information and Statistics (CEIS), recently collected more than five dozen links for the center's Web site— links that enable users to access reports and data about local, national and international environmental conditions. Users can search these links, or only a portion of them, using the Topic search engine from Verity Inc.

One feature that makes the site a digital library, Niemann said, is that it offers visitors a single query form to search many sources, even documents hosted on other agencies' Web sites. Niemann also has helped the U.S. Geological Survey develop a "Web-connected CD-ROM'' that gives users a set of documents they can use offline but that contains links to a Web site where users can obtain updates. The USGS application is based on digital publishing software from Folio Corp. that allows access to documents in different formats through a common interface.
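The single-query-form idea can be illustrated with a small sketch. This is not the CEIS site or the Verity Topic engine; the source names, document titles and the `search_all` function below are hypothetical, and the matching is a plain substring test rather than a real search index. It only shows one query being run across several collections, or across a chosen portion of them.

```python
# Hypothetical collections standing in for linked sources; titles are invented.
SOURCES = {
    "epa-ceis": ["Air quality trends report", "Local watershed data"],
    "usgs": ["Watershed mapping survey", "Earthquake hazard maps"],
}


def search_all(query, sources=SOURCES, only=None):
    """Run one query against every source, or only those named in `only`."""
    needle = query.lower()
    hits = []
    for name in sorted(sources):  # sorted so results come back in a stable order
        if only is not None and name not in only:
            continue
        for title in sources[name]:
            if needle in title.lower():  # case-insensitive substring match
                hits.append((name, title))
    return hits


if __name__ == "__main__":
    print(search_all("watershed"))
    print(search_all("watershed", only={"usgs"}))
```

Passing `only` restricts the search to a subset of sources, which mirrors the option of searching all the links or just some of them.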

Funding: The Biggest Barrier

Niemann said funding, more than technology, has limited what CEIS has been able to include in its online collection. "The content is endless. You'll never have it [all], so just like building the Web, you've got to get so many people involved.''

According to NSF's Lesk, economic constraints, together with legal obstacles faced by agencies that want to distribute copyrighted data, form the main barriers to setting up digital libraries.

"Government libraries hold lots of materials to which they don't have the intellectual-property rights,'' said Bob Zich, director of electronic programs with the Library of Congress' National Digital Library program. Legislation pending in Congress aims to set rules governing copyrights online, but librarians and researchers, including an LOC official, have testified that these proposals could hamper public access to electronic materials in library collections.

Accessibility is another hurdle that agencies face. LOC started distributing copies of historic photographs and documents on disc eight years ago and now makes about 500,000 images, maps, film clips, audio files and texts available through its American Memory Web site. The goal of the $60 million project, which is funded mainly through private donations, is to provide broader public access to historic and cultural artifacts from library collections around the country— collections that otherwise would be accessible only by visiting the places where they are stored.

But although software such as Real-Audio and QuickTime theoretically put sound and video clips within reach of anyone with Internet access, Zich noted that unless someone has a high-speed link, he might not have the time or the patience to download these files.

"We are waiting for when millions of people will have real wideband access,'' Zich said. "We have some films of the [1906] San Francisco earthquake that are 100M.''

Digital librarians also want more robust tools behind their collections' home pages. One such tool, under development by the NSF-backed San Diego Supercomputing Center and IBM Corp., offers a method for retrieving documents from different storage platforms. Much scientific data is stored as flat files, said Chaitanya Baru, senior principal scientist for enabling technologies at SDSC. "We haven't seen too many people trying to address this issue of trying to get to data on heterogeneous storage devices,'' he added.

The project, part of a test bed for a paperless system to apply for patents, integrates IBM's High Performance Storage System, which is used to manage large data files for supercomputing applications, with a relational database. To find files, users query a database that holds a metadata directory, and a middleware application that the team has developed communicates the query to remote storage systems.
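The lookup pattern described here, a metadata directory in a relational database plus middleware that forwards requests to storage systems, can be sketched roughly as follows. This is an illustration under stated assumptions, not the SDSC/IBM software: the table layout, backend names, file paths and the `find_and_fetch` function are all hypothetical, and in-memory dictionaries stand in for the remote storage systems.

```python
import sqlite3

# Hypothetical stand-ins for remote storage systems: each maps a path to bytes.
ARCHIVE_STORE = {"/archive/patents/0001.tif": b"scanned drawing"}
FLAT_FILE_STORE = {"/data/claims/0001.txt": b"claim text"}

# The "middleware" routing table: backend name -> function that fetches a path.
BACKENDS = {
    "archive": ARCHIVE_STORE.__getitem__,
    "flatfile": FLAT_FILE_STORE.__getitem__,
}


def build_catalog():
    """Create an in-memory metadata directory recording where each file lives."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metadata (title TEXT, keyword TEXT, backend TEXT, path TEXT)"
    )
    conn.executemany(
        "INSERT INTO metadata VALUES (?, ?, ?, ?)",
        [
            ("Patent drawing 0001", "drawing", "archive", "/archive/patents/0001.tif"),
            ("Patent claims 0001", "claims", "flatfile", "/data/claims/0001.txt"),
        ],
    )
    return conn


def find_and_fetch(conn, keyword):
    """Query the metadata directory, then fetch each hit from its storage backend."""
    rows = conn.execute(
        "SELECT title, backend, path FROM metadata WHERE keyword = ? ORDER BY title",
        (keyword,),
    ).fetchall()
    return [(title, BACKENDS[backend](path)) for title, backend, path in rows]


if __name__ == "__main__":
    catalog = build_catalog()
    print(find_and_fetch(catalog, "drawing"))
```

The user only ever queries the catalog; which storage system holds a given file is a detail the routing table resolves, which is the point of putting middleware between the database and heterogeneous storage.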

There is no single technology that appears to be driving the deployment of digital libraries today. But researchers and industry experts agree on the ultimate goal: that people must be able to find what they seek.

"Things can be put in formats that are lost forever,'' said Eugene Miya, a NASA electronics engineer who is reviewing DLI grant proposals. "If you don't have the protocols and formats [to retrieve it], your data is just about useless.''

*******

AT A GLANCE

Web Info on Digital Libraries

* NSF/DARPA/NASA Digital Libraries Initiative www.cise.nsf.gov/iis/dli_home.html (includes links to the Alexandria Digital Library and other current research projects)

* Digital Libraries Initiative II program announcement www.nsf.gov/pubs/1998/nsf9863/nsf9863.htm

* Library of Congress American Memory Project/National Digital Library Program lcweb.loc.gov

* Library of Congress Internet Resource Page lcweb.loc.gov/loc/ndlf/digital.html (includes links to other digital library information)

* EPA Center for Environmental Information and Statistics www.epa.gov/ceis
