Michael Daconta: The semantic Web


Need help sorting through Web services? What is the semantic Web? Internet expert and researcher Michael Daconta, author of a book by the same title, answered questions Feb. 12 in a GCN.com Forum.

Daconta is director of Web and technology services for systems integrator APG McDonald Bradley Inc. of McLean, Va. As part of that job, he is chief architect for the Defense Intelligence Agency’s Virtual Knowledge Base, a project to compile a directory of Defense Department data through Extensible Markup Language ontologies.

Daconta has written a number of technical papers and books. Most recently, he co-wrote the 2003 book The Semantic Web, along with Leo Obrst and Kevin Smith. The book is a primer on how XML, Web services and the emerging semantic Web fit together.

Before working on the Virtual Knowledge Base, Daconta helped create a set of electronic mortgage standards for Fannie Mae. In the Army, Daconta worked as a programming section chief on combat and intelligence simulation software at Fort Huachuca, Ariz.

Daconta received a bachelor’s degree in computer science from New York University and a master’s in computer science from Nova Southeastern University.

Read Daconta's interview with GCN Associate editor Joab Jackson
GCN.com: Welcome. We'll start the forum in a few moments. Stand by.

Joab Jackson:

Welcome to today's forum. Today we have Michael C. Daconta, an author of The Semantic Web, a recently published book by Wiley Publishing.

We have found Mr. Daconta to be extremely helpful in making sense of the soup of acronyms that is Web services and the Semantic Web. Daconta is also chief architect for the Defense Intelligence Agency’s Virtual Knowledge Base, a project to compile a directory of Defense Department data through Extensible Markup Language ontologies. He is director of Web and technology services for systems integrator APG McDonald Bradley Inc. of McLean, Va.

Thanks for logging in today and welcome Mr. Daconta.

Joab Jackson: Blogging [Weblogging] has become really popular on the Internet. What can government agencies learn from the blogging community?

Michael Daconta:

Agencies should learn that decentralized sharing of data is the best way to go. Things are too dynamic and fluid to attempt to centralize data sharing.

The Intelink Management Office [an interagency office that oversees top-secret, secret and sensitive but unclassified intranets for intelligence organizations] is rapidly adopting RSS for implementing publish/subscribe of site feeds/changes for the Intelligence community. Other agencies should rapidly follow suit.

One thing *NOT* to learn from the blogging community is the tremendous amount of standards "thrashing" that has gone on with RSS. Our project uses the RDF Site Summary, which is RSS 1.0, because we want a seamless migration path to having our RSS feeds reference OWL ontologies. [Editor's note: OWL is the Web Ontology Language. RDF is the Resource Description Framework.] RDF and OWL have just become W3C recommendations. See http://www.w3.org for details.
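
[Editor's note: RSS 1.0 feeds are RDF/XML documents, which is what lets them sit alongside other RDF vocabularies. The Python sketch below is an illustration only, not code from the project described above. The feed content, the example.org URLs and the read_items helper are invented for this note. It reads each item's identifier, title and link from a small RSS 1.0 document using only the standard library.]

```python
import xml.etree.ElementTree as ET

# Namespaces used by RSS 1.0 (RDF Site Summary).
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
}
RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"

# Hypothetical feed; every example.org URL is a placeholder.
FEED = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/feed">
    <title>Example site changes</title>
    <link>http://example.org/</link>
    <description>Recent updates</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.org/report-1"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.org/report-1">
    <title>Report 1 posted</title>
    <link>http://example.org/report-1</link>
  </item>
</rdf:RDF>
"""


def read_items(feed_xml):
    """Return (identifier, title, link) for each item in an RSS 1.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    # In RSS 1.0, item elements are direct children of rdf:RDF.
    for item in root.findall("rss:item", NS):
        items.append((
            item.get(RDF_ABOUT),
            item.findtext("rss:title", namespaces=NS),
            item.findtext("rss:link", namespaces=NS),
        ))
    return items


if __name__ == "__main__":
    for about, title, link in read_items(FEED):
        print(about, title, link)
```

Because each item carries an rdf:about identifier, other RDF data can make statements about the same resource, which is the kind of migration path the answer describes.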

Alec in DC: How persistent do you anticipate ontologies will be? Put a different way, will we end up with legacy ontologies as the real world that they were mapped to changes?

Michael Daconta: Good question, Alec. We certainly will. That is why the OWL ontology specification, which JUST became a recommendation (see www.w3.org for the announcement), has features for versioning ontologies, stating that one is superseded by another and mapping between them.
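
[Editor's note: As a rough illustration of those features, the Python sketch below reads version links and class mappings from an ontology written in RDF/XML. The property names (owl:versionInfo, owl:priorVersion, owl:backwardCompatibleWith, owl:incompatibleWith, owl:equivalentClass) come from the OWL vocabulary. The vehicles ontology, its example.org URLs and the two helper functions are invented for this note and are not from any project mentioned here.]

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical ontology: version 2 declares version 1 as its prior version
# and maps its Automobile class to the older Car class.
ONTOLOGY = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Ontology rdf:about="http://example.org/vehicles/v2">
    <owl:versionInfo>2.0</owl:versionInfo>
    <owl:priorVersion rdf:resource="http://example.org/vehicles/v1"/>
    <owl:backwardCompatibleWith rdf:resource="http://example.org/vehicles/v1"/>
  </owl:Ontology>
  <owl:Class rdf:about="http://example.org/vehicles/v2#Automobile">
    <owl:equivalentClass rdf:resource="http://example.org/vehicles/v1#Car"/>
  </owl:Class>
</rdf:RDF>
"""


def version_links(ontology_xml):
    """Collect the versioning statements from the ontology header."""
    root = ET.fromstring(ontology_xml)
    header = root.find(f"{{{OWL}}}Ontology")
    links = {}
    for name in ("priorVersion", "backwardCompatibleWith", "incompatibleWith"):
        links[name] = [
            el.get(f"{{{RDF}}}resource")
            for el in header.findall(f"{{{OWL}}}{name}")
        ]
    return links


def class_mappings(ontology_xml):
    """Return (class, equivalent class) pairs declared with owl:equivalentClass."""
    root = ET.fromstring(ontology_xml)
    pairs = []
    for cls in root.findall(f"{{{OWL}}}Class"):
        for eq in cls.findall(f"{{{OWL}}}equivalentClass"):
            pairs.append((cls.get(f"{{{RDF}}}about"), eq.get(f"{{{RDF}}}resource")))
    return pairs


if __name__ == "__main__":
    print(version_links(ONTOLOGY))
    print(class_mappings(ONTOLOGY))
```

A tool that encounters data written against the older ontology could use statements like these to decide whether the newer one can stand in for it.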

Wash DC: What is the Department of Defense Discovery Metadata Standard? How is it different from the Dublin Core [a set of metadata standards for business use]?

Michael Daconta: I am part of the team creating the DDMS XML Schema. DDMS is standard discovery metadata for DOD resources that is very similar to Dublin Core. In fact, we will be creating an XSLT stylesheet to convert from DDMS to Dublin Core and vice versa. The key difference between DDMS and Dublin Core is that DDMS is more detailed than Dublin Core; however, we reused all the key concepts from Dublin Core.
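
[Editor's note: The conversion described above is planned as an XSLT stylesheet. The Python sketch below is only a stand-in that shows the idea of a two-way element mapping. The Dublin Core namespace is real, but the source namespace, the source element names (subjectCategory, classification) and the record wrapper are placeholders invented for this note. They are not taken from the DDMS schema.]

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core element namespace
SRC = "urn:example:discovery-metadata"    # placeholder, NOT the DDMS namespace

# Hypothetical mapping from source element names to Dublin Core names.
TO_DC = {"title": "title", "creator": "creator", "subjectCategory": "subject"}
FROM_DC = {dc_name: src_name for src_name, dc_name in TO_DC.items()}


def convert(record, mapping, from_ns, to_ns):
    """Copy mapped child elements of `record` into a new record in `to_ns`."""
    out = ET.Element("record")  # hypothetical wrapper element
    for child in record:
        # ElementTree tags look like "{namespace}localname".
        ns, _, local = child.tag[1:].partition("}")
        if ns == from_ns and local in mapping:
            new = ET.SubElement(out, f"{{{to_ns}}}{mapping[local]}")
            new.text = child.text
    return out


if __name__ == "__main__":
    source = ET.fromstring(
        f'<record xmlns:m="{SRC}">'
        "<m:title>Site survey</m:title>"
        "<m:subjectCategory>terrain</m:subjectCategory>"
        "<m:classification>U</m:classification>"
        "</record>"
    )
    dublin_core = convert(source, TO_DC, SRC, DC)
    print(ET.tostring(dublin_core, encoding="unicode"))
```

In this toy mapping the classification element has no Dublin Core counterpart, so it is dropped on the way out and cannot be restored on the way back. That is one practical consequence of converting from a more detailed format to a less detailed one.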

Larry McCay - Philadelphia, PA:

What if any aspects of security do you see as being within the scope of the Virtual Knowledge Base and how do you ensure the integrity of the knowledge as it is accessed across the network?

Do you plan on leveraging only the security mechanisms provided via the chosen platform?

Michael Daconta: Well, I usually defer these types of security questions to my friend and co-author Kevin Smith. Kevin is our security expert. I can tell you that we are not just leveraging the security on the platform. Kevin has come up with some innovative techniques for Web services security. He has spoken at JavaOne and recently at the Net-centric warfare conference on these techniques.

Joab Jackson: Can you describe the split between RPC and SOAP? How are they different? Which one, in your opinion, is better?

Michael Daconta: The difference between traditional RPC (which stands for Remote Procedure Call) and SOAP (which stands for Simple Object Access Protocol) is that RPC was a binary specification that allowed a client to call a procedure (also called a method or function) on a remote computer and retrieve the results from the function. SOAP allows that same functionality but using XML over HTTP. Thus, SOAP gives you the same functionality (and a little more) with open standards. The genius of RPC was allowing programmers to perform network programming in the same way they write regular programs. The genius of SOAP was making that capability Web-friendly, cross-platform and cross-programming environment (for example, both .Net and J2EE).
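
[Editor's note: To make the comparison concrete, the Python sketch below wraps a procedure call in a SOAP 1.1 envelope, dispatches it to an ordinary function and wraps the result in a response envelope. It is a simplified illustration with no network code. In practice the XML would travel in an HTTP POST. The envelope namespace is the SOAP 1.1 one, while the urn:example:calc namespace, the add procedure and the helper names are invented for this note.]

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
APP_NS = "urn:example:calc"                             # hypothetical service


def build_request(method, **params):
    """Client side: describe the procedure call as XML text."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def add(a, b):
    """The 'remote' procedure. Arguments arrive as text, so convert them."""
    return int(a) + int(b)


PROCEDURES = {"add": add}


def handle_request(request_xml):
    """Server side: find the called procedure, run it, wrap the result."""
    envelope = ET.fromstring(request_xml)
    call = envelope.find(f"{{{SOAP_ENV}}}Body")[0]
    method = call.tag.split("}", 1)[1]
    args = {child.tag: child.text for child in call}
    result = PROCEDURES[method](**args)

    response = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(response, f"{{{SOAP_ENV}}}Body")
    reply = ET.SubElement(body, f"{{{APP_NS}}}{method}Response")
    ET.SubElement(reply, "result").text = str(result)
    return ET.tostring(response, encoding="unicode")


def read_result(response_xml):
    """Client side: pull the return value out of the response envelope."""
    envelope = ET.fromstring(response_xml)
    return envelope.find(f"{{{SOAP_ENV}}}Body")[0].findtext("result")


if __name__ == "__main__":
    request = build_request("add", a=2, b=3)
    print(request)
    print(read_result(handle_request(request)))
```

Everything that crosses between the two sides here is plain text XML, so a client and server written in different languages could exchange the same messages.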

Randy Spears, Bethesda MD: Can I download code that you wrote that demonstrates this concept in a 'real world' application?

Michael Daconta: Yes. There are many real-world examples. You can go to the "Mangrove" project at the University of Washington for some good applications. Also, the company Applied Semantics has good real-world applications. So good, in fact, that the company was purchased by Google. On my Web site (www.daconta.net) I have some ontologies, and the WordNet ontology and Web services will also be posted there under the Projects folder.

Washington: How doable is it, really, to parse an infinitely flexible language like English as you describe? What languages would be easier?

Michael Daconta: Parsing English is not very difficult at all. It is easy to break up an English sentence into words. It is a little harder but possible to then break the words down into parts of speech. We do this now for our government projects and there are several open source packages on the Internet that also do it. Additionally, Google has stated that its next major enhancement for its search engine is natural language question answering. As for other languages, there are certainly other ways to receive information on what a user wants to discover, everything from Structured Query Language to graphical browsing.
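
[Editor's note: The two steps mentioned above, splitting a sentence into words and then labeling parts of speech, can be sketched in a few lines of Python. This is a toy illustration, not the approach used on any project mentioned here. The tiny lexicon and the tag names are invented for this note. Real taggers use context and much larger models, because a plain lookup cannot tell whether a word such as "report" is being used as a noun or a verb.]

```python
import re

# Tiny hand-made lexicon, invented for this illustration.
LEXICON = {
    "the": "DET",
    "a": "DET",
    "analyst": "NOUN",
    "report": "NOUN",
    "reads": "VERB",
    "new": "ADJ",
}


def tokenize(sentence):
    """Split a sentence into word tokens and punctuation marks."""
    return re.findall(r"[A-Za-z]+(?:'[a-z]+)?|[.,!?;]", sentence)


def tag(tokens):
    """Label each token by dictionary lookup; unknown words are marked as such."""
    tagged = []
    for token in tokens:
        if not token[0].isalpha():
            label = "PUNCT"
        else:
            label = LEXICON.get(token.lower(), "UNKNOWN")
        tagged.append((token, label))
    return tagged


if __name__ == "__main__":
    for token, label in tag(tokenize("The analyst reads a new report.")):
        print(token, label)
```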

Susan, Bethesda: What do you believe will be the most important career skill of the next 20 years?

Michael Daconta: Boy, that is a good question, especially in regard to the current offshoring problem. From my own career experience, I would say that "innovation" and "motivation" top the list. Especially innovation with an aggressive attitude towards solving problems.

Dave in VA: In our XML development, it is difficult to agree on tags. How difficult is it to create what you describe as smart data?

Michael Daconta:

Representing knowledge correctly is not easy; however, as semantic Web tools mature and as we get more good examples, it will become easier. This year, I will be participating on the W3C Semantic Web Activities Best Practices group. Our job will be to come up with advice, guidelines and patterns for creating data using RDF and OWL.

One more thing: even the move from unstructured data (like HTML) to semi-structured data like XML can be difficult. I highly recommend that agencies get a senior team of XML experts and functional experts to hammer out a draft standard BEFORE opening it up to a large group. Why? Because you need experienced people to easily settle the basic debate questions that stall large groups. This method has worked for customers I have supported in developing mortgage standards, military markup and now NCES standards.

Nampa, Idaho:

Many IT professionals working to bring to market associative search engines find themselves stymied, even baffled, by how difficult it is to program a search engine that is an almost exact match with how users view the data and their needs. One group created preset keyword queries that, when shown to the customer, were almost a 100% mismatch from how the user defined the keyword.

So, what is the realistic chance that search will one day be as easy and successful as asking Rover to fetch the eyeglasses case that only you and he know how to find?

Michael Daconta: Excellent question. Part of the answer lies in the fact that you cannot solve the problem with only half of the metadata equation. In other words, a server side search engine has to know something about the user in order to provide the user with relevant results. In fact, a good definition of relevance is the intersection between the user context and the data source. This fundamental mismatch is evidenced when search engine companies try to take a probabilistic query approach (a few key words that "probably" refer to what you want) and marry it up with a deterministic search space (associative search engines as you discuss). The flip side of the coin can be seen in the failure of "Ask Jeeves" that tried to marry up a deterministic query (like "How many people live in Virginia?") with a probabilistic search space (traditional keyword matching). Again, won't work. So the solution is to marry up deterministic models of both the user context and the user queries with a deterministic search space. This is the approach that the modern search engines are now aggressively exploring.
Also, the time for us to get this right is running short. Once we have location-aware cell phones and voice-recognition, the requirements for "real-time" relevance will increase dramatically.
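
[Editor's note: The definition of relevance given above, the intersection between the user context and the data source, can be read almost literally as a set operation. The Python sketch below is a deliberately simple illustration of that reading. The user context, the document names and the scoring rule are invented for this note and do not describe how any search engine works.]

```python
def relevance(user_context, document_terms):
    """Score a document by how many terms it shares with the user context."""
    return len(set(user_context) & set(document_terms))


def rank(user_context, documents):
    """Return document names with a nonzero score, most relevant first."""
    scored = [
        (relevance(user_context, terms), name)
        for name, terms in documents.items()
    ]
    # Sort by descending score, then by name so ties come out in a fixed order.
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ordered if score > 0]


if __name__ == "__main__":
    # Hypothetical context: what the system knows about the user's interest.
    context = {"virginia", "population", "census"}
    documents = {
        "va-census": {"virginia", "census", "2000"},
        "md-roads": {"maryland", "roads"},
        "va-parks": {"virginia", "parks"},
    }
    print(rank(context, documents))
```

With an empty user context every score is zero, which mirrors the point that a search engine knowing nothing about the user has only half of the equation.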

Joe B, Washington DC: How is [semantic Web] different from what the [Object Management Group] calls model-driven architecture?

Michael Daconta: They are similar efforts that are working together on certain aspects of data modeling. The semantic Web technologies are being driven by the W3C while the MDA is driven by the OMG. The roots of the semantic Web are the Web while the roots of MDA are the Unified Modeling Language (UML). The OMG is working on a UML profile for OWL. This will be a major milestone in the adoption of these technologies as there are many programmers familiar with UML. So, bridging these technologies is a major step in the right direction.

Ed, Maryland: What government agencies are taking the lead in evaluating semantic Web technologies? When do you expect we will begin to see production government applications using semantic Web technology?

Michael Daconta: I work with many of the agencies moving out aggressively in this space. I would say that in my opinion DIA, OSD, DISA, the Army and EPA are being the most forward-thinking. However, those are just the organizations that I personally know are involved in these technologies. I am sure there are many I do not know about. The Horizontal Fusion program is spearheading many semantic Web technologies and is proving that the concepts work in improving information sharing and interoperability.

Larry McCay - Philadelphia, PA: What platform is the Virtual Knowledge Base being implemented on?

Michael Daconta: The primary platform for the server side of the Virtual Knowledge Base is Sun hardware and the Solaris operating system.

Alec in DC: Debugging has to be a challenging process in a full-fledged smart data "product" like the one you have proposed for the application of Semantic Web technology to network-centric warfare. How, for example, do you test for completeness?

Michael Daconta: While debugging is always a challenging process, it is actually the same for a semantic Web product as for any large enterprise system. However, I believe the process will actually get simpler by adopting semantic Web technologies because they rely upon "declarative languages" like XML. The more software components exchange human-readable XML at the interfaces, the easier debugging will be because MOST errors occur at the interfaces.

Joab Jackson: Well, that will wrap up this forum Q&A. Thanks again, Mr. Daconta, for joining us today.

