Mappa.Mundi Magazine
Feature
Carl Malamud currently collaborates with webchick at media.org. He was the founder of the Internet Multicasting Service and is the author of eight books.

Marshall T. Rose is Chief of Protocol at Invisible Worlds, Inc. where he is responsible both for the Blocks architecture and the server-side implementation. Rose lives with internetworking technologies, as a theorist, implementor, and agent provocateur. He formerly held the position of IETF Area Director for Network Management, one of a dozen individuals who oversee the Internet's standardization process.
Maps, Space, and Other Metaphors for Metadata
By Carl Malamud and Dr. Marshall T. Rose



Avoiding the OSIfication of Space

      The architecture of mixers, builders, and servers seemed like a promising one for chopping up the search engine functionality into a distributed system. But what language should we use to describe the spaces?

      Here, we entered a world that is very old yet highly immature. Schemas for the description of spaces date back to X.500, and a variety of efforts throughout the years have attempted to create the ultimate global directory.

      For our spaces, we envisioned something a little different from the X.500 concepts of country servers, state servers, city servers, and institutional servers, all working together to format information about our global population into one framework. Whereas X.500 organized the world in terms of geography and people, we saw spaces as a much more abstract, flexible construct. In other words, we needed a language for describing spaces that helped define a schema, yet was schema-agnostic enough to accommodate a wide variety of different kinds of metadata.

      While HTML didn't serve this purpose very well, it was quickly clear that XML had emerged as the data description language for the next millennium. XML has some properties that make it quite attractive. First, the model of nested documents works well for our world of objects that contain other objects and relationships to other objects. Second, the XML committee's decision to simplify SGML, yet still support proper characters through the use of the UTF-8 and UTF-16 encodings of Unicode, makes XML a simple but very powerful language.

      XML is the generic underpinning, a language for describing data. We then looked at a variety of other XML-based initiatives to see if they added power to our ability to describe spaces. The most promising initiative is the Resource Description Framework. RDF evolved out of the earlier PICS platform of the W3C, but serves a much broader role than simply blocking out pornography sites. Indeed, documents[5] from the W3C explain that this framework serves a large number of goals, including:

  • interoperability of metadata
  • machine understandable semantics for metadata
  • better precision in resource discovery than full text search
  • future-proofing applications as schemas evolve
  • a uniform query capability for resource discovery
  • a processing rules language for automated decision-making about Web resources
  • language for retrieving metadata from third parties

      In addition, Tim Berners-Lee, in a technical note,[6] further explains that metadata (and its instantiation through the RDF framework) "will allow huge amounts of information in databases and existing applications to be put on the web, not just for human browsing but for machine understanding: searching, reasoning, and analyzing."

      Given these technical goals, it seemed to make sense to leverage the RDF effort for our own application. We thus looked at a variety of RDF specification documents and examples. A typical example is the following by Eric Miller:[8]

<?xml:namespace ns = "http://www.w3.org/RDF/RDF/" prefix ="RDF" ?>
<?xml:namespace ns = "http://purl.oclc.org/DC/" prefix = "DC" ?>

<RDF:RDF>
  <RDF:Description RDF:HREF = "http://uri-of-Document-1">
    <DC:Creator>John Smith</DC:Creator>
  </RDF:Description>
</RDF:RDF>

      The example illustrates a variety of concepts from the XML world. First is the concept of namespaces,[7] defined by Bray et al. for the W3C as a "collection of names, identified by a URI reference" which provides a mechanism for software to "recognize and act on these declarations and prefixes." In other words, the namespace is a scoping mechanism. In this example, there are two sets of names: RDF names and Dublin Core names. The Dublin Core is an earlier mechanism for tagging metadata.
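
      To make the scoping idea concrete, here is a minimal Python sketch that is not part of the original example. It assumes the declarations are rewritten in the attribute-based xmlns: form of the Namespaces in XML recommendation, which is the form that parsers such as Python's xml.etree.ElementTree understand. The namespace URIs are the ones from the example above; the helper name split_name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the article's example, with the namespace declarations
# rewritten as xmlns: attributes. The URIs are copied from the example.
DOC = """\
<RDF:RDF xmlns:RDF="http://www.w3.org/RDF/RDF/"
         xmlns:DC="http://purl.oclc.org/DC/">
  <RDF:Description RDF:HREF="http://uri-of-Document-1">
    <DC:Creator>John Smith</DC:Creator>
  </RDF:Description>
</RDF:RDF>
"""


def split_name(tag):
    """Split ElementTree's '{uri}local' form into (uri, local).

    Hypothetical helper for illustration only.
    """
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag  # name that is not in any namespace


root = ET.fromstring(DOC)

# The parser replaces each prefix with the URI it is bound to, so a name is
# identified by (namespace URI, local name) rather than by the prefix text.
names = [elem.tag for elem in root.iter()]

for tag in names:
    uri, local = split_name(tag)
    print(f"{local:12} scoped by {uri}")
```

      The prefixes RDF and DC are only shorthand. What software compares is the URI behind each prefix, which is why two schemes can use the same local name without colliding.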

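      This example illustrates the concept of "triplets" on which RDF is based: a resource, a property of that resource, and a value for the property (e.g., "URI" "about" "person"). Here the resource is the URI http://uri-of-Document-1, the statement about it is wrapped in a "description," and the property is the Dublin Core Creator, whose value is John Smith. What is interesting is the mixing of different schemas and schemes.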
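
      The triplet reading of the example can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than an RDF parser: it assumes the xmlns: attribute form of the namespace declarations, it handles only the flat Description shape shown above, and the function name triples is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the article's example.
RDF_NS = "http://www.w3.org/RDF/RDF/"

# Assumption: the example rewritten with xmlns: attribute declarations.
DOC = """\
<RDF:RDF xmlns:RDF="http://www.w3.org/RDF/RDF/"
         xmlns:DC="http://purl.oclc.org/DC/">
  <RDF:Description RDF:HREF="http://uri-of-Document-1">
    <DC:Creator>John Smith</DC:Creator>
  </RDF:Description>
</RDF:RDF>
"""


def triples(xml_text):
    """Return (resource, property, value) tuples for each Description.

    Hypothetical helper: covers only Description elements whose children
    are simple text-valued properties, as in the example.
    """
    root = ET.fromstring(xml_text)
    found = []
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # The resource being described is named by the HREF attribute.
        resource = desc.get(f"{{{RDF_NS}}}HREF")
        for prop in desc:
            # The child element's qualified name is the property;
            # its text content is the value.
            found.append((resource, prop.tag, (prop.text or "").strip()))
    return found


for resource, prop, value in triples(DOC):
    print(resource, prop, value)
```

      Because the property is identified by its namespace-qualified name, the same loop would pick up properties from any other scheme placed inside the Description, without knowing that scheme in advance.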

      A further example serves to illustrate the mixing of schemes, in this case based on the popular VCARD,[9] a digital business card that is often attached to email messages:

   <?xml version="1.0" encoding="UTF-8"?>
   <!DOCTYPE vCard PUBLIC "-//IETF//DTD vCard v3.0//EN">

   <vCard
        version="3.0">
   <fn>Frank Dawson</fn>
   <n><family>Dawson</family> <given>Frank</given></n>
   <tel tel.type="WORK MSG PREF">+1-617-693-8728</tel>
   <tel tel.type="WORK MSG">+1-919-676-9515</tel>
   <adr del.type="POSTAL PARCEL WORK">
        <street>6544 Battleford Drive</street>
        <locality>Raleigh</locality> <region>NC</region>
        <pcode>27613-3502</pcode> <country>US</country></adr>
   <label del.type="POSTAL PARCEL WORK"><![CDATA[6544 Battleford Drive
   Raleigh, NC 27613-3502
   US]]></label>
   <email email.type="INTERNET">Frank_Dawson@Lotus.com</email>
   </vCard>

      Finally, we look at a third example, this one from the Microsoft BIZTALK framework:[10]

<BizTalk xmlns=
  "urn:schemas-biztalk-org:biztalk-0.81.xml">
  <Body>
    <PurchaseOrder xmlns=
      "urn:schemas-biztalk.org:Betterdogfood/purchaseorder.xml">
      <POHeader>
        <PONumber>12345</PONumber>
        <PaymentType>INVOICE</PaymentType>
        <POShipTo>
          <street1>betterDogFood.COM</street1>
          <street2>1179 N. McDowell Blvd</street2>
          <city>Petaluma</city>
        </POShipTo>
        <POBillTo>
          <street1>betterDogFood.COM</street1>
          <street2>1179 N. McDowell Blvd.</street2>
          <city>Petaluma</city>
        </POBillTo>
      </POHeader>
      <POLines>
        <Item>
          <partno>Alpo</partno>
          <quantity>1</quantity>
          <unitPrice>14.00</unitPrice>
        </Item>
      </POLines>
    </PurchaseOrder>
  </Body>
</BizTalk>

      One can argue that neither a VCARD nor a BizTalk purchase order is metadata. However, it became clear to us that a variety of schemes would be advanced for the description of metadata, and that any mechanism we put into place should be agnostic to those schemes, allowing space makers to use the mechanism that fits most naturally into their particular space.





 Copyright © 1999, 2000 media.org.

