Visions of Aestia

07 Apr 2005

The RDF typing epiphany

Filed under: PlanetRDF, XML — JBowtie @ 2:03 pm

Late last night, I finally had an epiphany regarding typing in RDF.

You see, a couple of months ago, when I first started the refactoring to RDF series, I had a fairly clear model in my head of how it all worked. Then, while writing, I started going back to the standard to check facts (particularly after some readers picked up on some mistakes) and got horribly, horribly confused.

The XML Schema people screwed me over again. See, what some people just don’t seem to get is that static typing simply doesn’t mesh well with RDF. Objects are (frequently) incompletely specified and can have zero or more types.

Think about that - they are not restricted to one type; they can have as many as you can dream up. This is key to the power of OWL - I can essentially define new types as queries. I can create objects whose type can only be determined by inference, and I can create an object that has no type. Finally, I can have multiple values for a property, and each value can potentially be a different type.
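To make that concrete, here is a minimal sketch in Python using a toy set of triples. Everything in it is an assumption for illustration: the `ex:` names, the `ex:favorite` property, and the `has_favorites` check, which only stands in for the idea of a type defined as a query and is not real OWL.

```python
# Toy in-memory triple set; every name and prefix here is made up for illustration.
RDF_TYPE = "rdf:type"

triples = {
    ("ex:JBowtie", RDF_TYPE, "foaf:Person"),
    ("ex:JBowtie", RDF_TYPE, "ex:Blogger"),    # a second type on the same object
    ("ex:JBowtie", "ex:favorite", "RDF"),      # a string value
    ("ex:JBowtie", "ex:favorite", 42),         # same property, differently typed value
    ("ex:Mystery", "foaf:name", "Unknown"),    # an object with no rdf:type at all
}

def types_of(subject):
    """Return the set of types asserted for a subject (possibly empty)."""
    return {o for s, p, o in triples if s == subject and p == RDF_TYPE}

def values_of(subject, predicate):
    """Return every value a subject has for a predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def has_favorites(subject):
    """A 'type as query': anything with at least one ex:favorite qualifies."""
    return len(values_of(subject, "ex:favorite")) > 0
```

Here `types_of("ex:JBowtie")` yields two types, `types_of("ex:Mystery")` yields none, and membership in the "has favorites" class is decided purely by looking at the data.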

This is (usually) not a problem for dynamically typed systems - this is really the reason why so much semantic web work is happening in Python. Python lets you add and/or redefine methods and properties at runtime, and even change an object’s class by assigning to the __class__ attribute.
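A short sketch of those Python features, with hypothetical `Person` and `Author` classes invented purely for the example:

```python
class Person:
    def __init__(self, name):
        self.name = name

class Author(Person):
    def byline(self):
        return "by " + self.name

p = Person("JBowtie")

# Add a method to the class at runtime; existing instances see it immediately.
def greet(self):
    return "Hello, " + self.name

Person.greet = greet

# Add a property to a single instance at runtime.
p.homepage = "http://example.org/JBowtie"

# Change the object's class by assigning to __class__.
p.__class__ = Author
```

After the last line, `p` answers to `byline()` as an `Author` while keeping the `greet` method and the `homepage` attribute it picked up along the way.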

But the elegant RDF model suddenly special-cases the handling of types, and you get lost in the spec and the number of ways to constrain it. New converts to RDF find themselves lost trying to serialize things to XML (no wonder N3 is popular).

Bah, I say. Look at this:

<foaf:Person rdf:about="http://example.org/JBowtie" />

If I load this into my RDF store, and ask for the type of JBowtie, I get back foaf:Person. Even if I never pull in any schema information. That is all 80% of applications need, really.
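As a rough illustration of why no schema is needed, here is a sketch that recovers the type from the element name using only Python's standard library. It is not a real RDF store: the wrapping `rdf:RDF` element and the `typed_nodes` helper are my own additions, and the sketch assumes every node carries a namespace.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The rdf:RDF wrapper is added here so the snippet parses on its own.
doc = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://example.org/JBowtie" />
</rdf:RDF>"""

def typed_nodes(xml_text):
    """Map each rdf:about URI to the type implied by its element name."""
    result = {}
    for node in ET.fromstring(xml_text):
        about = node.get("{%s}about" % RDF_NS)
        if about is None:
            continue
        # ElementTree spells qualified names as "{namespace}local";
        # RDF forms the type URI by concatenating the two parts.
        ns, local = node.tag[1:].split("}")
        result[about] = ns + local
    return result

types = typed_nodes(doc)
```

The resulting dictionary maps `http://example.org/JBowtie` to the full `foaf:Person` URI, and no schema was consulted to get there.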

Most applications don’t even need XML schemas, either; they just key off element and attribute names.

I see people looking at Flickr and the rel attribute and folksonomies, thinking that’s the low hanging fruit that will get people to create useful metadata. It works to a point, then becomes useless because advertisers start stuffing every keyword they can think of into them.

RDF at its simplest:

<rdf:RDF [namespace declarations go here]>
  <core:ability rdf:about="http://example.org/SRD#Strength" />

  <core:skill rdf:about="http://example.org/SRD#Jump">
    <core:name xml:lang="en">Jump</core:name>
    <core:keyAbility rdf:resource="http://example.org/SRD#Strength" />
  </core:skill>
</rdf:RDF>

The element name is the type. The rdf:about is the unique ID. And rdf:resource points to some other object. If I understand XML, this is intuitive - I can create this with my favorite tool and consume it with any RDF parser. This is all my app needs. I don’t really need to express it as RDF; I’m just doing that to get distributed XML.
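Those three rules are simple enough to sketch in a few lines of Python with the standard library. This is only a sketch of the subset used above, not a full RDF/XML parser, and the `core` namespace URI is invented because the snippet leaves its namespace declarations out.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
RDF_TYPE = RDF + "type"

# The core namespace URI below is hypothetical; the post does not give one.
doc = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:core="http://example.org/core#">
  <core:ability rdf:about="http://example.org/SRD#Strength" />
  <core:skill rdf:about="http://example.org/SRD#Jump">
    <core:name xml:lang="en">Jump</core:name>
    <core:keyAbility rdf:resource="http://example.org/SRD#Strength" />
  </core:skill>
</rdf:RDF>"""

def uri(tag):
    """Turn ElementTree's "{ns}local" tag into the concatenated URI."""
    ns, local = tag[1:].split("}")
    return ns + local

def triples(xml_text):
    """Yield (subject, predicate, object) tuples for the simple subset above."""
    out = []
    for node in ET.fromstring(xml_text):
        subject = node.get("{%s}about" % RDF)           # rdf:about is the unique ID
        out.append((subject, RDF_TYPE, uri(node.tag)))  # the element name is the type
        for prop in node:
            target = prop.get("{%s}resource" % RDF)
            if target is not None:
                value = target                          # points to some other object
            else:
                value = (prop.text, prop.get(XML_LANG)) # literal plus its language tag
            out.append((subject, uri(prop.tag), value))
    return out

result = triples(doc)
```

The four tuples in `result` are the two type statements, the English name of the Jump skill, and its link to the Strength ability.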

You know what? Someone else can write a tiny bit of OWL to allow useful inferences and/or interface this to other vocabularies, and suddenly the world is my oyster. All I had to do was add a namespace and use URLs for my identifiers. Everybody wins. And it’s harder to abuse than simplistic and arbitrary tagging, since you can create defensive ontologies to filter out bad RDF.
