Visions of Aestia

19 Feb 2005

XML vs RDF

Filed under: — JBowtie @ 10:15 pm

Edd Dumbill’s Weblog: Behind the Times

Edd discusses why he chose RDF over XML for the DOAP project.
It’s really a very good post and I urge you all to read it. There are two points in his post I want to run with.

First, his example of RDF/XML markup looks like normal XML. This is a very, very important thing - it looks absolutely nothing whatsoever like the horrible, horrible ugly syntax the W3C documents use for their examples.

Secondly, in the list of “For XML” points, Edd says:

It’s hard to lock an RDF vocabulary down, should you want to.

Locking down an RDF vocabulary is counterproductive. AIs and ontologies and all those other things you personally don’t care about can slurp up and annotate and derive useful information from an RDF vocabulary, and you get the benefits when you can pick up all those terms you understand in data all over the web, learning new things and perspectives as you go.

Everything we do involves change; our understanding of our data and the things we are trying to model changes, our requirements change, new people accessing our data bring exciting new viewpoints to our model. Locking down a vocabulary means you can’t assimilate new ideas. RDF is far more agile than, say, a SQL database, because it handles change so much more gracefully.

I think Edd’s work is a really good example of RDF as distributed XML. Note how DOAP defines all its own terms, then uses an ontology to map to other well-known vocabularies.

17 Feb 2005

Upgrade complete

Filed under: — JBowtie @ 12:00 am

The upgrade went relatively painlessly. I am very happy.

All my posts (including the old ones) now have sane permalinks that appear in the RSS feeds. I will need to regenerate some FOAF information now that the old plugin no longer works, and I will be looking to add the CC licensing info to future posts.

16 Feb 2005

Apology in advance

Filed under: — JBowtie @ 11:13 pm

I’m about to upgrade my blogging software, so please ignore any insanely strange things that happen in the next 24 hours.

In the meantime, watch the hilarious reactions to the announcement of IE7 on the day when Firefox saw 25 million downloads.

08 Feb 2005

RDF as database

Filed under: — JBowtie @ 10:49 am

Yesterday at a job interview, someone asked me about O/R mapping tools. It’s not really a problem I’ve actually given a lot of thought to recently, and I realized why this morning. I promise part 4 of the refactoring next post.

Relational databases are going to be dead in five years. OK, not really dead, everyone has way too much invested, too useful for small problems, etc. But the real action will be in RDF-as-database.

As others have pointed out over on Planet RDF, RDF can be seen as a database. You have primary keys (the rdf:ID) and foreign keys (rdf:resource) and lots and lots of relationships. But RDF is far more flexible, automatically distributed, and allows your data to become agile. Add to that multiple serialization formats, the ability to evolve ontologies and schemas, and the fact that you can use AI reasoners to do your data mining and deduce new properties and relationships, and we have a winner.

Now, this is a bold prediction, so let’s try and back it up with a little more detail.

One - XML allows for the natural expression of hierarchies and parts explosions; this is one of those things that relational models have trouble with. My little refactoring series hopefully shows that moving from pure XML to RDF is trivial.
As more and more of the world’s data moves towards XML, tools for indexing and querying it are becoming more and more powerful. These same tools can be used for RDF serialized as XML.

Two - Agile development methodologies are permeating languages such as C# and Python. But as our code becomes more flexible and sees higher rates of change, the relational models have difficulty keeping up. Refactoring the database is complicated by the general rigidity of the model, the need to transform data and stored procedures using DDL, and the difficulty in distributing changes.
RDF+OWL, on the other hand, makes it very easy to evolve your data. Data can be added and removed incrementally; the usual formats are text-based, so diff, patch, and version control systems all play nicely with them; and the metadata is just more data.

Three - Currently, data warehousing and data mining require specialized tools and knowledge. RDF allows general-purpose reasoners to work with data anywhere on the web, extracting new relationships and spotting trends. Thanks to OWL, domain knowledge can now be written down in a form that allows a general-purpose program to make use of it - and that definition can also evolve over time.
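Point three is easy to sketch even today. A minimal demo assuming rdflib plus the owlrl reasoner (both are my choices, and the URIs are entirely made up): a general-purpose program picks up written-down domain knowledge and uses it.

from rdflib import Graph, Namespace, RDF, RDFS
from owlrl import DeductiveClosure, RDFS_Semantics

EX = Namespace("http://example.org/zoo#")
g = Graph()
g.add((EX.Platypus, RDFS.subClassOf, EX.Mammal))   # domain knowledge
g.add((EX.perry, RDF.type, EX.Platypus))           # plain data

DeductiveClosure(RDFS_Semantics).expand(g)         # general-purpose reasoner

assert (EX.perry, RDF.type, EX.Mammal) in g        # deduced, never asserted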

Four - SPARQL is close enough to SQL that a lot of developers will be able to transfer their hard-won skills to the new medium. Yes, you need to learn to think in triples, but if you already think in terms of JOINs that’s not a big leap.

Five - The triple translates nicely into relational space. One to three tables, depending on your favorite representation, a handful of stored procedures and you can use today’s engines. Serialize as XML and you can use tomorrow’s engines. And in a few years high-performance triple-based engines with native SPARQL support will be in wide use.
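As a sketch of point five, here is the one-table representation with sqlite3; the table layout and the abbreviated URIs are my own illustration, not a standard schema:

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
db.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("order#17", "order.schema#customer", "customer#12"),
    ("customer#12", "foaf#name", "John Smith"),
])

# Thinking in triples is thinking in self-joins:
# find the name of the customer on order 17.
rows = db.execute("""
    SELECT name.object FROM triples AS cust
    JOIN triples AS name ON name.subject = cust.object
    WHERE cust.subject = 'order#17'
      AND cust.predicate = 'order.schema#customer'
      AND name.predicate = 'foaf#name'
""").fetchall()
print(rows)   # [('John Smith',)]

And that self-join is exactly the shape of a two-pattern SPARQL query, which is point four as well.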

RDF has some natural advantages that make it a contender. But it is OWL and SPARQL that give it real traction, because those allow the data people to transfer their current skills into the new space. And once you have your head around the RDF/OWL combination, you finally understand just enough to actually make effective use of an AI - maybe those will be a widespread reality this time around.

01 Feb 2005

More on assertions

Filed under: — JBowtie @ 10:58 pm

I wandered pretty far off-topic in my last post, so let me add some more thoughts.

It is possible, through the addition of metadata about statements, to negate, change, or override existing statements. However, I do not believe there is a well-documented, carefully thought out consensus on how to do so.

This consensus needs to come about, and it needs to be a core part of RDF or OWL (probably the latter) so that it is widely supported by reasoners. In the next few years, there will be hordes of newcomers to this technology, and they all will be struggling with the same issues. If there is not a ready answer, they will declare RDF a failure and go back to their XML files with OO wrappers.

Here are a couple of minor proposals to start the ball rolling; the element names below are only placeholders.

First, an element that retracts a statement:

<rdf:Retraction rdf:about="#JBowtie">
  <foaf:name>John C Barstow</foaf:name>
</rdf:Retraction>

This is the opposite of rdf:Description. Instead of telling the parser to add the assertion, it instructs the parser to remove any such statement and ignore it in future.

Second, an element that marks a statement as knowingly contradictory:

<owl:Override rdf:about="#JBowtie">
  <foaf:name>JBowtie</foaf:name>
</owl:Override>

This tells the parser to expect a contradiction and accept this version as true. I would imagine most reasoners would actually add some metadata or adjust definitions to avoid unqualified contradictions.

When Do Assertions Become Facts?

Filed under: — JBowtie @ 1:21 pm

In his post, Semantic Wave: When Do Assertions Become Facts?, Jamie Pitts struggles with some of the same issues I have been struggling with recently.

Jamie notes:

As time passes, the latest assertions about role will inevitably contradict previous assertions.

For some relationships, such as an individual’s role, the addition of a property indicating “effective dates” is an appropriate way of avoiding contradiction.
All individuals fulfill a role for a limited time; that’s why we put dates on resumes. It’s entirely possible and desirable to reflect this in RDF.
As with the database, the “current” understanding is really an implicit query looking for statements that are effective today. Thankfully, it’s much easier to model statements about statements in RDF than in relational models.
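A sketch of that implicit query in Python; the statement layout and the names are my own illustration:

from datetime import date

# each statement carries its own effective dates ("statements about statements")
statements = [
    ("JamieP", "hasRole", "developer", date(2001, 3, 1), date(2004, 6, 30)),
    ("JamieP", "hasRole", "architect", date(2004, 7, 1), None),  # no end date yet
]

def effective(statements, on=None):
    """Return the statements whose date range includes the given day."""
    on = on or date.today()
    return [(s, p, o) for (s, p, o, start, end) in statements
            if start <= on and (end is None or on <= end)]

print(effective(statements))   # the "current" understanding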

It goes without saying that interpretations of reality are formed in the mind through an ongoing process of re-assessment. We qualitatively compare present impressions with recent impressions. Enough contradiction, and we form a new working state of understanding.

Well, this is really AI territory, isn’t it? An AI needs to recognize assertions that contradict its current knowledge base, decide whether to resolve the contradiction by throwing out the new or old assertion, and re-initialize its deduction/inference engine to recreate all derived knowledge.

Hmmm…let’s run with that for a minute (thinking out loud here).

Here’s an AI; it manages one or more knowledge bases in RDF/OWL and serves up answers to SPARQL queries, possibly formulated by some natural-language parser.
It gathers information by spidering or exchanging RDF with other AIs. It reasons by tapping external deduction/inference engines.
One of the knowledge bases contains trust metrics, used to weight the new RDF statements.
A reasoner is set up to check the new information for internal consistency - if the information contradicts itself, the source is presented with a proof and its metrics may be adjusted.
A reasoner then compares the new information with one or more existing knowledge bases, looking for contradictions. Multiple reasoners should be consulted.
If no contradictions are found, the new information is added. If contradictions are found, the stronger assertion persists; strength depends on many metrics, including trustworthiness of the original sources, degree of corroboration from other sources, the number of other assertions affected, and so forth.
One or more reasoning daemons take the new knowledge base and deduce as many new facts as possible. The strength of the underlying premises determines the strength of the new assertions.
A couple of things we can add - an imagination. Create random triples, then use a reasoner to look for contradictions or proofs. Add surviving assertions to the knowledge base. We can try less random things such as making assertions about superclasses or removing restrictions.
How about experimentation? Some sensors and motors, and now we can start adding experimentally derived facts to the knowledge base.

Of course, for Jamie’s example, he can accomplish what he wants simply by adding new statements. All that’s needed is a shift in thinking; you’re not making assertions about the current state of things, you’re making assertions about a date range that happens to include today (and possibly not making any assertions, yet, about the end of the date range). Now facts are not changing, you’re merely acquiring new facts.

What I want to do with RDF

Filed under: — JBowtie @ 9:46 am

Some of the tasks I want to accomplish with RDF, in no particular order.

Keep track of the relationships between characters in a novel, movie, or role-playing game. (FOAF?) Maybe I will finally figure out who shot Mr. Burns.

Create or query for NPCs with some stated constraints, such as “a level 4 character with the Quick Draw feat and at least 5 ranks in the Tumble skill.” (OWL or SPARQL) A good random character generator. No, really.

Represent the semantics of a snippet of Python code with OWL. Generate Python code from an OWL Full description (since both Python and OWL Full allow classes to be treated as instances). Because I am a geek.

Generate human-readable RPG rules from RDF. I can now find and eliminate contradictions in the rule sets before publication.

Write a reasoner that can play a card game like Rokugan, with each card and its rules represented in RDF. No longer do I have to convince my wife to play.

Encode enough semantic meaning into a screenplay that I can change any detail about a character and have all dialogue and descriptions reflect this. This will of course displace the standard Hollywood script writer, especially when the program is smart enough to handle replacing a human character with a talking dog.

27 Jan 2005

Evil Software Patents

Filed under: — JBowtie @ 10:24 am

ongoing · More Patent Funnies

This is NOT funny. Google has just this month received a patent on “highlighting matches to search terms in retrieved documents”.

How could this even happen? One of the criteria for a patent is supposed to be that an idea is non-obvious. There are plenty of RDF tools that could be considered to be covered by this.

There is no longer any justification, in my mind, for a 20-year patent (what business plan goes out more than 5 years, anyway?). And I’m seriously thinking of joining the camp that says all patents are worthless.

The point of a patent is that if you invent something, you, rather than some copycat, get to make money off of it. The problem is that in today’s world, you don’t - either you spend all your time and money in monitoring and litigation, or you use it as a weapon against competitors.

This patent should never have been granted. I personally had prior art in 1991, nine years before the patent was even filed. And I was still in high school at the time!

25 Jan 2005

Overrides in OWL

Filed under: — JBowtie @ 12:58 pm

One of the issues I’m struggling with at the moment is the fact that OWL doesn’t really handle subclassing very completely. I’d be pleased to be shown that I am wrong about my assertions here.

In the OWL model I can:
- Add new properties.
- Refine the domain or range of a property.

However, I cannot:
- Remove a property.
- Change the domain or range of a property to something completely different.

Frankly, these are not onerous restrictions. However, I think it’s important to remember that RDF is meant to be distributed; these restrictions make it difficult to code around errors or model poorly-defined domains.

If there is an error in a document, I have no standard way of using the definition sans error. Consider the following scenario.

animals.rdf (property names are illustrative):

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  ...
  <owl:Class rdf:ID="Mammal">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#givesBirthToLiveYoung"/>
        <owl:hasValue rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</owl:hasValue>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
australia.rdf

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:animals="http://example.org/animals#">
  <animals:Mammal rdf:ID="platypus"/>
</rdf:RDF>
Here, the first document encodes a common, if misleading, definition of a mammal. Mammals (usually) give birth to live young, and the original author mistakenly included this assumption in his ontology.

However, this definition is wrong. The platypus is a mammal that lays eggs. Clearly, the definition needs to be changed by removing the restriction, and if I in fact have control over both documents, this is easy to accomplish.
Thanks to the distributed nature of RDF, however, this may not be possible. What if I can’t correct the document or convince the author he is mistaken? I need a way to assert that the platypus is indeed a mammal, but an exceptional one that lays eggs instead of giving birth to live young (note that without cardinality restrictions I can say an animal does both).
The tack I would like to take is to define a special subclass of mammal, say egglayingMammals, that removes the restriction. But I don’t know of any way to do this short of defining my own, non-standard properties.

Right now, most ontologies are being created by domain experts. But as RDF trickles down to the common developer, more and more ontologies are going to be hastily created without the input of domain experts, and will contain errors and contradictions that need to be coded around. I think we need a way to create definitions that lift or ignore the restrictions laid down by a superclass. We need a way to assert that some parts of a definition or certain data must be ignored. We need a standard way to assert that something is wrong, or that a perceived contradiction can be ignored. It may be messy and inelegant, but frankly so is most code (and data) in the real world. Ever try to normalize address data?

19 Jan 2005

rdf:ID and relative URIs

Filed under: — JBowtie @ 11:43 am

One thing which consistently confuses me is the fact that rdf:about and rdf:resource follow the usual rules for resolving relative URIs, but rdf:ID does not.

Specifically, as Ed Davies points out in his comment to Refactoring to RDF, step 1, a relative rdf:ID is always interpreted as an anchor within the current document.

This means that within the document “http://example.org/orders”:

“http://example.org/customer#12”, being an absolute URI, is handled identically by rdf:ID, rdf:about, and rdf:resource.

The relative URI “customer#12” is interpreted as “http://example.org/customer#12” by rdf:about and rdf:resource, but as “http://example.org/orders#customer#12” by rdf:ID.

This difference in interpretation of relative URIs extends to cases where xml:base is specified.
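A quick check of both rules with today’s Python, using urljoin for the ordinary case:

from urllib.parse import urljoin

base = "http://example.org/orders"

# rdf:about and rdf:resource follow ordinary relative-URI resolution:
print(urljoin(base, "customer#12"))   # http://example.org/customer#12

# rdf:ID always denotes a fragment of the current document instead:
print(base + "#" + "customer#12")     # http://example.org/orders#customer#12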

The best bet for me, personally, is to either always use rdf:about or to always use absolute URIs.

18 Jan 2005

Hi, PlanetRDF!

Filed under: — JBowtie @ 9:56 am

Dave Beckett has looked over my RDF data and kindly chosen to add me to the list of semantic bloggers. This is very, very cool since I read Planet RDF on a daily basis.

I’ve been working with XML since the spec was released and have recently become an RDF convert. For the longest time I was completely put off by both the unreadability of the specs and the hideous, hideous serialization format. Thankfully, OWL came along with a readable document that built on RDF, and I found a little python module called sparta that made the scales fall from my eyes.

My version of the RDF elephant is quite simple - it is a distributed form of XML. I know that’s an oversimplification on many levels, but it also makes it really easy to make the transition from producing pure XML to producing ‘dumb’ RDF - that is, pure data just below the semantic layer. Thanks to its distributed nature, some of you smart folks can produce ontologies, triple stores, and inference engines.

Anyway, for the next few weeks I’ll keep working on my series on refactoring XML to RDF.

17 Jan 2005

Refactoring to RDF, step 3

Filed under: — JBowtie @ 2:35 pm

Step 3 of our refactoring is moving from an XML parser to an RDF parser. This can get a bit tricky, so we’re going to make a few assumptions to avoid outlining every possible scenario.

Let’s assume that your load/save logic is reasonably isolated, and looks something like this:

class order:
    def load(self, node):
        self.id = int(node.id)
        self.customer = customer(node.customer)
        self.product = [product(p) for p in node.product] # a list of products

Here we’re relying on something vaguely DOMish to traverse down the XML tree, creating objects that correspond to the various elements and attributes we encounter. Type inference could easily happen if some sort of schema description existed.

Now, RDF parsers don’t use the DOM as such, because they may be pulling together information from multiple documents. Instead, most parsers use the triple store as their interface.
Conceptually, this is quite simple. Each object we’re going to create has an rdf:ID, and it may be serialized across multiple documents (using rdf:about to add to the original XML). The fields we are serializing are either references to other objects (rdf:resource links) or simple types (element/attribute values).
All you need to know for this step is that you can get the field values by issuing a query - the rest will attend to itself. Since we no longer have a context node, the id will need to be passed in to our load function.

class order:
    def load(self, store, id):
        self.id = id # this will be the rdf id, so it is a URI like "order#17" instead of an int
        self.customer = store.query(id, "order.schema#customer", None)
        self.product = store.query(id, "order.schema#product", None) # a list of products

See how similar this is? Instead of assuming that the interesting elements/attributes are children of the current node, we ask the RDF parser for the children of the current node. This is why RDF can be distributed - the parser hides the fact that some of the data may once have lived in another document.
Now, I have deliberately left calling the object constructors to the store.query() method. Most RDF stores can use schema or type information and do the right thing. However, some stores cannot create objects and only return the rdf:ID of the children we are interested in.
In this case, the code becomes:

class order:
    def load(self, store, id):
        self.id = id # this will be the rdf id, so it is a URI like "order#17" instead of an int
        self.customer = customer(store.query(id, "order.schema#customer", None))
        self.product = [product(p) for p in store.query(id, "order.schema#product", None)] # a list of products

Which is even more like our original sample.
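If it helps, here is roughly the store interface these samples assume - a purely illustrative in-memory stand-in, not any real RDF library’s API:

class TripleStore:
    def __init__(self):
        self.triples = []   # (subject, predicate, object) tuples

    def add(self, subject, predicate, object_):
        self.triples.append((subject, predicate, object_))

    def query(self, subject=None, predicate=None, object_=None):
        # None acts as a wildcard; returns the matching objects,
        # just like the store.query() calls above
        return [o for (s, p, o) in self.triples
                if (subject is None or s == subject)
                and (predicate is None or p == predicate)
                and (object_ is None or o == object_)]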

One final note - some RDF parsers may require your document’s root element to be rdf:RDF. Live with it or find a more liberal parser.

In part four we’ll get into that type information deferred from this step, show a helper class to create objects of the correct type, and start looking at the more interesting things we can do now that we are RDF-enabled.

Refactoring to RDF, step 2

Filed under: — JBowtie @ 11:26 am

You might want to review Part 1 before proceeding.

Our document fragment last looked like this:

<customer rdf:ID="12">
  John Smith
</customer>

<order rdf:ID="17">
  <customer rdf:resource="#12"/>
  <product rdf:resource="#101"/>
</order>
Remember, this is a fragment. It assumes there is a root element declaring the RDF namespace.

Step 2 of our refactoring is to add RDF-style namespaces. Technically we do not need to define our own - there are plenty of existing RDF namespaces you might want to reuse, including FOAF, RSS, and Dublin Core.

An RDF-style namespace is just like any other XML namespace, except that by convention it ends with a hash (#) or a slash (/). For example:

xmlns="http://example.org/order.schema#"
Just in case it hasn’t clicked, the reason for this convention is to allow us to write RDF that describes the elements. When we combine the namespace and local name, we get an RDF ID - now we can write our schema in RDF!

<rdf:Description rdf:about="http://example.org/order.schema#customer">
  <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
</rdf:Description>

You’ll note both styles of namespace here - the RDF namespace ends with a hash sign, while the FOAF namespace (http://xmlns.com/foaf/0.1/) ends with a slash. This example says that the customer element has type foaf:Person.

Actually specifying and using type information is something we’ll cover in step 3; for now the important thing is that we’ve added a namespace. Any existing serialization code will need to be updated to handle the namespace. Here’s our fleshed out sample showing the root element.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://example.org/order.schema#">
  <customer rdf:ID="12">
    John Smith
  </customer>
  <order rdf:ID="17">
    <customer rdf:resource="#12"/>
    <product rdf:resource="#101"/>
  </order>
</rdf:RDF>
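Updating the parsing code is mostly mechanical. A sketch with Python’s xml.etree, assuming the (illustrative) namespace from the sample above:

import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS = {"o": "http://example.org/order.schema#"}

doc = '''<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                  xmlns="http://example.org/order.schema#">
  <order rdf:ID="17"/>
</rdf:RDF>'''

root = ET.fromstring(doc)
for order in root.findall("o:order", NS):   # elements are now namespaced
    print(order.get("{%s}ID" % RDF))        # prints: 17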
To recap:

  • Step 0: Make sure your XML parser understands namespaces
  • Step 1: Replace any existing ID values in your XML with RDF-specific IDs. That is, the value becomes a URI and the attribute becomes rdf:ID, rdf:resource, or rdf:about.
  • Step 2: Add RDF-style namespaces. The namespace URIs should end in a hash(#) or slash(/). All elements and attributes should be in an RDF-style namespace.

10 Jan 2005

instancesOf for sparta

Filed under: — JBowtie @ 12:24 pm

I’ve been using sparta for my RDF serialization lately, and have come up with a useful patch. Turns out a frequent use case (for me, anyway) is getting all objects of a specific type.

I’ve found the following addition to ThingFactory does exactly what I want it to do:

def instancesOf(self, typeUri):
    # every subject with the given rdf:type, wrapped as a Thing
    return [self(s) for s in self.store.subjects(TYPE, URI(typeUri))]

What it’s doing is getting all objects in the store that have the given rdf:type. I use this all the time when hooking things up to the UI.

My usage looks like this:

def loadPeople(self, store):
    self.create = ThingFactory(store)
    # example_person is the URI of the rdf:type we are interested in
    people = self.create.instancesOf(example_person)

06 Jan 2005

RSS 1.0 Feed

Filed under: — JBowtie @ 1:58 pm

I’ve changed the template to generate an RSS 1.0 feed by default instead of the RSS2 or 0.92 feeds. In theory nobody will notice.

This is a prerequisite to getting syndicated on some Planet feeds.

16 Dec 2004

Refactoring to RDF, step 1

Filed under: — JBowtie @ 12:52 pm

OK, let’s say you’ve decided to take the plunge and switch from plain XML to the distributed kind you get with RDF. How are you going to get there?

Step 0: Make sure your XML parser understands namespaces. You’re going to be needing them.

The first refactoring should be pretty easy. You need to replace any existing ID values in your XML with RDF-specific IDs.
Let’s look at an example snippet:

<customer id="12">
  John Smith
</customer>

<order id="17">
  <customer id="12"/>
  <product id="101"/>
</order>
Probably this XML comes from a database somewhere. Doesn’t really matter. The thing to recognise here is that the “id” attribute is actually being used as both an anchor and a reference.

Pass one, we replace the anchors.

<customer rdf:ID="12">
  John Smith
</customer>

<order rdf:ID="17">
  <customer id="12"/>
  <product id="101"/>
</order>
Remember, RDF just uses the ID value as a unique key. It doesn’t have to point to a real document.

In pass two, we replace the references.

<customer rdf:ID="12">
  John Smith
</customer>

<order rdf:ID="17">
  <customer rdf:resource="#12"/>
  <product rdf:resource="#101"/>
</order>
We also have to update any existing XML parsing we do to use the new attribute values instead of the old ones, but that should be straightforward.
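For example, with an ElementTree-style parser (an assumption on my part - your parser may differ), the lookup changes like this:

import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
node = ET.fromstring('<customer xmlns:rdf="%s" rdf:ID="12"/>' % RDF)

print(node.get("id"))             # the old lookup - now returns None
print(node.get("{%s}ID" % RDF))   # the new, namespaced lookup - prints 12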

UPDATE: Fixed broken example tags. Yeesh - and the characters WordPress ate.
UPDATE (27-Dec-2004): Fixed embarrassing typo in response to Dave’s comment below.

Dumb RDF and Smart RDF

Filed under: — JBowtie @ 12:06 pm

Once you understand that RDF is really just a distributed form of XML (see RDF/OWL == XML/XSLT, part 1 if needed), you’ll probably want to start taking advantage of it. Unfortunately, most of the existing tools seem to be written by AI specialists.

Dumb RDF is the kind most of us actually want to write. We don’t want to bother about transitive properties, triples, or inductive logic. All we really want is the ability to split our XML blocks across multiple files and magically piece it together. The other stuff is cool, but utterly irrelevant to most of us at this stage.

Examples of Dumb RDF include Dublin Core (for tagging documents), FOAF (for tagging people), and RSS (for tagging syndicated content). All these examples involve some fairly simple XML that doesn’t require any special knowledge. Notice how popular these formats are.

Smart RDF is where we get into topic maps, ontologies, triples, and other complicated areas. This is where the AI people add all sorts of tags to constrain relationships, definitions, and carefully-defined vocabularies. Smart RDF can deduce that my father’s brother’s only brother’s daughter is my sister, is female, and has only one paternal uncle.

Dumb RDF is just data. Everybody can understand it and generate it. This is where most of us will always want to be. But Smart RDF is where the active practitioners are. That’s why it seems so complicated and unapproachable.

You know what? That division is just fine, because RDF is distributed. You and I can go off and write Dumb RDF, W3C implementors can write Smart RDF, and we never have to know about each other. An RDF parser that stumbles across Smart RDF will suddenly get smarter and be able to answer more questions, but it still works if it sticks to Dumb RDF.

10 Dec 2004

RDF/OWL == XML/XSLT, part 2

Filed under: — JBowtie @ 3:13 pm

Sometimes, when I’m working with XML I need to change vocabularies. That’s why we invented XSLT, so that I can take XML in one flavor, modify it a bit, and get out some other flavor. Probably the most common transformation is {your favourite XML variant} to XHTML.

When we’re doing transforms, what we’re really doing is exploring the relationships between chunks of XML, mostly with XPath. So we discover that “relatives/uncle” in my example is the same as “relation[type=’uncle’]” in your example, and we write some transforms to convert back and forth.

XSLT lets me take the things you write in XML and turn them into XML that I understand. So instead of rewriting my program to recognize “foaf:name”, I can write an XSLT filter that turns it into “person/name”. I can do lots of other things with it, but that’s the core use case.

<xsl:template match="foaf:name">
  <name><xsl:value-of select="."/></name>
</xsl:template>
OWL lets me do the same thing on an RDF level. In other words, it takes some XML you understand and maps it to some XML I understand.

<rdf:Description rdf:about="http://xmlns.com/foaf/0.1/name">
  <owl:equivalentProperty rdf:resource="http://example.org/person.schema#name"/>
</rdf:Description>
So this is a description block that should be added to the XML that describes the name element in the FOAF schema. And instead of manually mapping it to an element that we understand, we are telling the OWL processor to do the mapping for us. It’s the same thing, but at a higher level.

With XSLT, we stop talking about the data contained in an XML document and start talking about the elements and attributes that make it up. Often we look at the data in an XPath to figure out what to do, but mostly we move it around between elements.

OWL also concentrates on the elements and attributes that make up a document. Instead of using XPaths to pick out nodes, we put nodes into categories.

<xsl:template match="relation[@type='uncle']">
  <uncle>
    <xsl:apply-templates/>
  </uncle>
</xsl:template>

is conceptually the same as:

<owl:Class rdf:about="http://example.org/family.schema#uncle">
  <owl:equivalentClass>
    <owl:Restriction>
      <owl:onProperty rdf:resource="http://example.org/family.schema#type"/>
      <owl:hasValue>uncle</owl:hasValue>
    </owl:Restriction>
  </owl:equivalentClass>
</owl:Class>
It looks a little more complicated in OWL because everything needs an ID, even your elements and attributes. In a lot of cases you’d already have some sort of XML-based schema somewhere defining your terms.

But if we’re composing XML from blocks scattered around in multiple documents, OWL shines, because XSLT is miserable at it.
Imagine the following, spread across two documents:

<wine rdf:ID="mywine">
  <age>25</age>
</wine>

<rdf:Description rdf:about="http://example.org/cellar.xml#mywine" color="white"/>

Pointing our transform at either of these documents is not going to produce results, because the color attribute is not actually defined on the wine element. OWL, however, builds on the distributed nature of RDF and can easily output the WhiteWine element.

I know I’ve left a lot out, but hopefully this will help someone else start grokking the Semantic Web.

RDF/OWL == XML/XSLT, part 1

Filed under: — JBowtie @ 2:43 pm

Once upon a time, you used to have to explain XML to a group of IT people who had never heard of it. You’d give a little speech starting “XML is just like HTML except….”, followed by the five or six key points about XML. It was of course a drastic oversimplification and lots of details were glossed over, but everyone understood the basic concepts.

This is my attempt at a similar stab at getting everyone to understand RDF/OWL.

RDF and OWL are just like XML and XSLT. Let me illustrate.

A chunk of XML describing a person might look like this:

<person id="cliff">
  <name>Clifford</name>
  <address>123 Main St</address>
</person>
If you wanted to link to this chunk of XML, you would use something like:

<relatives>
  <uncle href="http://example.org/family.xml#cliff"/>
</relatives>
Notice how the ID is used to identify the chunk of XML you care about, and how a URL is used to refer to it from somewhere else. In this example, we might use some XSLT to create a web page with links to the XML describing my relatives. Most of us have seen stuff like this; it is how HTML links work.

RDF works exactly the same way. We identify some chunk of RDF with an ID, and use a URL to refer to it from somewhere else. We use rdf:resource in place of the href, and rdf:ID so that RDF recognizes the id.

<person rdf:ID="cliff">
  <name>Clifford</name>
  <address>123 Main St</address>
</person>

<relatives>
  <uncle rdf:resource="http://example.org/family.xml#cliff"/>
</relatives>
Cool. I can still turn this into a web page without thinking too hard. Any RDF-aware program can follow the links and get the XML describing my uncle Cliff.

I’m not just confined to linking to my uncle’s XML, though. I can add to it pretty easily by using an rdf:Description element.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.org/family.xml#cliff">
    <nickname>Cliff</nickname>
    <country>United States</country>
  </rdf:Description>
</rdf:RDF>
As far as any RDF application is concerned, this is exactly the same as if I edited the original document to look like this:

<person id="cliff">
  <name>Clifford</name>
  <address>123 Main St</address>
  <nickname>Cliff</nickname>
  <country>United States</country>
</person>

What the RDF parser is doing (conceptually) is following the link to get the original XML, then adding anything contained in the description block to get the final XML it ends up showing you. I could do this manually in XSLT by writing a transform, at least for simple cases.

There is only one other thing you need to know about RDF.

Most RDF browsers don’t actually care if a document you link to exists; they just won’t be able to show you much information. If example.org/family.xml is deleted, the browser will only show my uncle’s nickname and country. But that’s fine, it can still combine any other description blocks it finds that link to him. From an RDF perspective, the URL is just a unique identifier, like a database key.

The bit about adding description blocks together is really key. When I don’t have access or permission to modify an XML document, I can create a description block so that it appears (to an RDF browser) that I’ve made the changes anyway.
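A sketch of that behaviour with rdflib (an assumption on my part; the documents are cut-down versions of the examples above):

from rdflib import Graph, URIRef

family = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                     xmlns="http://example.org/terms#">
  <person rdf:ID="cliff">
    <name>Clifford</name>
    <address>123 Main St</address>
  </person>
</rdf:RDF>"""

notes = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                    xmlns="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/family.xml#cliff">
    <nickname>Cliff</nickname>
    <country>United States</country>
  </rdf:Description>
</rdf:RDF>"""

g = Graph()
g.parse(data=family, format="xml", publicID="http://example.org/family.xml")
g.parse(data=notes, format="xml")   # parsed into the same graph

cliff = URIRef("http://example.org/family.xml#cliff")
for p, o in g.predicate_objects(cliff):
    print(p, o)   # name, address, nickname and country - merged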

On to OWL in my next post.

07 Dec 2004

Python and RDF

Filed under: — JBowtie @ 10:51 pm

Sparta is a small python library I came across on Planet RDF. Basically it parses RDF into Python objects.

It’s pretty simple, gets the job done and will require very little work to serialize things the way I want to.

The main thing I need is an instances collection for rdf:type and owl:Class objects. I would think this a fairly obvious starting point for a lot of application processing - get me all the person instances so I can display them in a contact book, for example.

This should be trivial to implement as a predefined query against the underlying triple store; along the way some of the OWL relationships, such as subclasses, may need to be addressed.

Finally, mapping an RDF type to a specific, preexisting Python type is a must for decent refactoring of existing code. I need to investigate Sparta a bit more to understand if this is currently possible.
