Visions of Aestia

05 Jul 2006

OpenOffice code is ugly

Filed under: Programming — JBowtie @ 12:39 am

I’ve always been annoyed by RSS feeds that hide information “below the fold”, where you have to click on a link to read the whole thing. Sometimes, especially in Planet feeds, it can be downright misleading.

Stuart Yeates’ latest post on why developers don’t love OpenOffice.org is a case in point. Thanks to a lead that cuts off halfway through his list, combined with the fact that the list is unordered, the most important reason that open source developers don’t embrace OpenOffice.org is invisible.

* OOo is not built using the open source development methodology. OOo is not planned, structured, implemented or run as a typical open source software development project, which makes it much harder for open source developers to contribute on a casual basis

When I got into translation work, there were three main targets for me - OpenOffice.org, Firefox, and GNOME. I quickly got commit access to GNOME and have the odd spate of activity when I actually get stuff done. This is casual involvement at its finest.

Unfortunately, the entire process around OpenOffice is completely opaque in comparison. The agenda is (still) largely controlled by Sun managers rather than individual community members; the build process is complex, involved, and underdocumented; it has a needless dependency on the still-proprietary Java virtual machine, a dependency that is increasing rather than decreasing; it does not build on the common internationalisation framework used by nearly every other project; and so forth.

I’m a dabbler, but I have my hand in many places. I’m a (recently added) member of the OpenDocument standards committee, do GNOME translation, help maintain a kernel driver and an X.org driver, am a community member of the Ubuntu Laptop Testing Team, contribute when I can to Redland, rolled my own Linux distro once upon a time, and have several projects in the queue for public unveiling.

I did intend to contribute to OpenOffice once. After seeing the code, I’d much rather contribute to AbiWord (and in fact will when I’m no longer overcommitted on other projects). I know that Sun is still learning to play nice with the FOSS world, and I expect someday I will be a dabbler in that code base, too. But not now.

16 Jun 2006

RDF+SPARQL is Data 2.0

Filed under: PlanetRDF — JBowtie @ 4:45 pm

Alex James had an interesting post about the future of data, in which he equates foreign keys to hyperlinks.

So a foreign key is a hyperlink (or url), but it has one MASSIVE limitation: the foreign key must point to a row in the SAME database. Not much good for the web I think you will agree!

Continuing with the webpage analogy: this is a hyperlink to another page on the same site. Or somewhat more formally, but less accurately, you can think of it as a RELATIVE hyperlink, i.e. something like this: ‘/Table/Key’ rather than something like this: ‘http://Server/Database/Table/Key’.

Running blindly with this analogy some more: What we need to create ‘a virtual internet database’ is the ability to use ABSOLUTE hyperlinks too: i.e. allowing the foreign key to get *really* foreign and point anywhere that data might exist on the internet.

This is a really good analogy for the RDF-as-agile-database view of the world. It’s particularly nice since it picks up on the distributed aspect of RDF at the same time.

As I’ve previously asserted, SPARQL is key because it allows people to transfer their database skills to the semantic web instead of learning RDF-as-AI methods of querying (i.e. cwm or LISP-style queries). But it also makes for a compelling analogy that I for one will be using when talking about RDF in future.
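
To make this concrete: if you know SQL, a SPARQL query reads like a SELECT whose joins are graph patterns - except that the “foreign key” (?author below) can be an absolute URI minted by a completely different site. The query is my own, purely illustrative:

PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?title ?name
WHERE {
    ?article dc:title ?title ;
             dc:creator ?author .
    ?author foaf:name ?name .
}

Any database developer can read that without ever having touched a description logic.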

25 May 2006

Learning how to parse PDF

Filed under: Programming — JBowtie @ 6:38 pm

Since I joined OASIS, I’ve decided I’d better bone up on existing document formats in order to maximise my contribution to the OpenDocument spec. So I jumped headfirst into the wonderful world of PDF parsing.

Actually, the Adobe spec is really well written and organized. Sadly, the same cannot be said of some of the poppler code; the newer stuff is all right but the older XPDF-based code is pretty hard to slog through.

After dealing with the poorly implemented/documented ball of lint that is RTF, I was bracing myself for the worst. PDF is actually not that hard to follow - it has some odd quirks, but nothing too daunting. I’ve written an analysis over at my company site. Hopefully I will be able to implement some of this in the near future.

15 May 2006

ARGH!! Sticker shock!

Filed under: PlanetRDF — JBowtie @ 6:22 pm

Normally, I would just complain about something like this in muted mutters under my breath. But frankly this is so outrageous I can’t even see straight at the moment.

I have internet access through my local cable company (TelstraClear for those in NZ). They have a transfer limit of 10GB/month - with a charge of $0.20/MB after you hit the cap. The rate itself is obscene, but not much of a worry most of the time, since when you go over you pretty drastically curtail your usage.

Unfortunately, since TelstraClear chose to de-peer from the local exchanges, all local traffic has essentially become international traffic. The result, when I spent last month downloading various Ubuntu ISOs without seeing a single warning email: $800 for 4GB of traffic.

The thing that really burns me is that the difference between their 10GB and 20GB plans is $20. They could have cut me off, throttled my connection, upsold me to the 20GB plan, or called, or sent a letter, or *something*.

Even better - because I called about the caps when I first broke them (by 100MB) back in November, they say they can’t do anything about the bill. So no warning and no mercy.

I’ll complain to the Commerce Commission when their office opens in the morning, but damn am I angry right now.

02 May 2006

Taking the plunge

Filed under: PlanetRDF — JBowtie @ 3:35 pm

First of all, I want to thank everyone who responded to my post about being available for work; I’ve been overwhelmed - to the point where I’ve decided to work for myself.

Now, I promised myself not to be a Web 2.0 company; there are already enough of them failing to deliver as it is. The hard part when you get into any kind of technology consulting is picking a focus, particularly when you get such an interesting range of offers (and have such a varied background!)

What I really want to do is focus on moving people towards the future - moving towards open source, open standards, and open data. That means explaining why and when RDF is better than XML, why and when OpenDocument is better than DOC, and why and when Linux is better than Windows. That also means helping port things from Windows to Linux and vice versa.

I liked porting Redland to Windows - it’s a royal pain at times, but it means (or at least I hope it means) that at least one more Windows program will be cross-platform, which means future migrations to Linux are that much easier.

Unfortunately, I’ve lost my access to a Windows machine for the short term; until I have enough work to pay the bills I don’t dare buy an expensive proprietary license - if I’m going to spend money, it’s to join OASIS as an individual contributor so I can get in on the OpenDocument metadata work!

Now I’m off to Flickr to try and find some nice CC-licensed fern pictures that I can use as the basis of a logo.

21 Apr 2006

Available for work

Filed under: GNOME, PlanetRDF, Python — JBowtie @ 9:40 am

My employer is looking to move all development work to Australia. As a result, I’ll become available in the very near future.

I’m looking to continue working with RDF, Python and Linux (I prefer GNOME, Ubuntu and Debian personally). I have a strong background in C#, C++, and Windows, so would consider Mono work, including porting things away from Windows. I am really interested in escalating my involvement in free software - in fact I’m about to join OASIS as an individual.

I won’t leave the Wellington, New Zealand area, but am certainly willing to entertain offers to work remotely. If you’re interested, e-mail me at jbowtie AT amathaine.com and I’ll drop you a CV to peruse. In the meantime it looks like I’ll have a chance to catch up on some of my deferred writing and projects.

Redland Windows 1.0.3 ready

Filed under: PlanetRDF, Python — JBowtie @ 9:20 am

Built and signed the files some time ago, but due to other time commitments didn’t get around to announcing them. You can download them from their temporary location: http://nzlinux.virtuozzo.co.nz/files/redland/1.0.3/

I’ve made some changes to the packaging. I use deferred linking for the MySQL backend, so I no longer include the MySQL client DLL. I’ve statically linked the sqlite3 backend, so it’s built in. The rapper and rasqal programs are included in the DLLs package.

Finally, I’ve only produced Python and C# bindings this go round. I will be releasing an update with Postgres support, but I’m not sure of the timeline right now. PHP bindings will require me to update to the latest SWIG to build properly, and I need to investigate options for producing Ruby bindings as a single statically linked DLL (because Gems packaging for bindings is broken on Windows).

29 Mar 2006

Neat Python trick

Filed under: Python — JBowtie @ 10:46 pm

This is mostly as a reminder to myself, in case I forget again.

If you want to put all your unit tests in a subdirectory called “tests”, do so and add “..” to the Python path at runtime, as the script below does; that way the test modules can import the code under test.

import sys, os, re, unittest

def regressionTest():
    # Find every "*tests.py" file in the same directory as this script.
    path = os.path.abspath(os.path.dirname(sys.argv[0]))
    files = os.listdir(path)
    test = re.compile("tests.py$", re.IGNORECASE)
    files = filter(test.search, files)
    # Import each matching file and collect the tests it defines.
    filenameToModuleName = lambda f: os.path.splitext(f)[0]
    moduleNames = map(filenameToModuleName, files)
    modules = map(__import__, moduleNames)
    load = unittest.defaultTestLoader.loadTestsFromModule
    return unittest.TestSuite(map(load, modules))

if __name__ == "__main__":
    # Make the parent directory (the code under test) importable.
    sys.path.append(os.path.abspath(".."))
    unittest.main(defaultTest="regressionTest")

Now I am happy.

20 Mar 2006

Why we need explicit temporal labelling

Filed under: PlanetRDF — JBowtie @ 11:50 am

Some good feedback on my rdf-lite post from Jeen Broekstra and Seth Ladd.

In his response, Jeen says:

Anyway, whether the solution works well in all cases or not (and I’m sure there are other modeling solutions possible for the example I just gave), a tacit assumption in general seems to be that in order to incorporate both provenance and time in the RDF model, we need to extend from triples (subject, predicate, object) to not just quads but quints (subject, predicate, object, source, time) or something like that.

Perhaps I should have been more explicit. The RDF model already provides reification so that we can make assertions about a statement; and certainly most stores provide a context that can be leveraged for this information. I’m not saying we need to extend the model for these reasons.

The reason I asked for temporal labelling is that in the real world, no-one explicitly models time intervals for all properties; yet almost all data actually varies over time.

For example, consider dc:title. Roughly 99.9% of the time, we have a triple like so:

:article dc:title "I like Cheeses"

However, on the web, titles change all the time. I may look at that article tomorrow and see:

:article dc:title "I like Cheese"

In the current model, I would end up with two titles for this article. While technically correct, it is intuitively wrong - and that difference is what holds back RDF for most developers. They expect to see a single title with the updated value.

In the real world, people do not update their models when data starts changing. They update their document instances to have the new, current values. That’s why developers need version control and that is what all RDF consumers really need to handle.

What I’m suggesting is that we build versioning of statements directly into the model. That we make it easy to say:

:article dc:title "I like Cheeses" [-'19-Mar-2006']
:article dc:title "I like Cheese" ['20-Mar-2006'-]

I know that from a technical point of view this is not needed. I’m saying that from the point of view of a real-world developer this kind of addition makes it far simpler to correctly specify the kind of data that consumers need to derive intuitively correct entailments. It’s an artificial constraint on the logical model that makes the whole thing more tractable for non-KR people. If you combine this with the suggestions in the temporal RDF paper (PDF), you get better data, more completely specified. You replace the need to explicitly model changes over time (as Seth seems to suggest) and/or manage contextual data with the ability to temporally constrain the scope of an assertion.
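
To show the entailment I want consumers to get for free, here’s a minimal Python sketch. The quint layout and helper function are my own invention, purely illustrative:

import datetime

# Hypothetical quint layout: (subject, predicate, object, valid_from, valid_to),
# where a valid_to of None means "still current".
statements = [
    (":article", "dc:title", "I like Cheeses",
     datetime.date(2006, 1, 1), datetime.date(2006, 3, 19)),
    (":article", "dc:title", "I like Cheese",
     datetime.date(2006, 3, 20), None),
]

def value_on(subject, predicate, day):
    # Return the one value whose validity interval covers the given day.
    for s, p, o, start, end in statements:
        if s == subject and p == predicate:
            if start <= day and (end is None or day <= end):
                return o
    return None

print value_on(":article", "dc:title", datetime.date.today())
# -> "I like Cheese" - one title, not two

The dates ride along with the statement, so the “what is the title now?” question has exactly one answer, with no reification machinery in sight.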

Time affects every ontology in unexpected ways. Even the sample wine ontology is making assertions that can change over time; the assertions about what counts as a French wine change over time because political borders are not static. And the ontology as written doesn’t account for the fact that corrections may take place or mistakes may be made (oops, this year’s batch of chardonnay is actually a mislabeled Riesling). For an RDF-lite I want a super-simple way to fix this; and the simplest way I can think of is to build date ranges into the basic model.

17 Mar 2006

Thinking about RDF-lite

Filed under: PlanetRDF — JBowtie @ 1:32 pm

In a lot of ways, I think RDF finds itself in the same place SGML was a decade ago. It’s extremely powerful, poorly understood by developers in general, and mostly sees limited or extremely vertical application.

And I put that down to the same factors; it’s a little too difficult to write a consumer application. With SGML you had just a little too much wiggle room in the spec; as a result it was really hard to write a good parser. With RDF/OWL, the open world assumption and the lack of a unique name assumption combine to make it very difficult to write normal business apps.

So, imagine for a moment that I am writing an RDF-lite spec. What needs to happen?

  • Lists need to not suck. Keep your LISP out of my serialization format. Either let me specify that order matters in the schema (allowing my data to be implicitly ordered when serialized) or give the equivalent of xhtml:ol to order it with.
  • Formally include provenance and temporal labelling in the model without requiring reification. There’s no reason I can’t have optional who and when parameters that default to “source document” and “now”, respectively (see the sketch after this list).
  • Following on the above add a unique name assumption to the model by default, and allow me to turn it off in my schemas (and/or override it in my reasoner).
  • Add a closed world operator to the model that can be turned on in queries.
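
As a sketch of what the second point might look like at the API level - names entirely hypothetical, not any real store’s interface:

import datetime

class Store(object):
    # Hypothetical store: provenance and time ride along with every
    # statement instead of requiring reification.
    def __init__(self, current_source):
        self.current_source = current_source
        self.statements = []

    def assert_statement(self, subject, predicate, obj, who=None, when=None):
        who = who or self.current_source        # default: "source document"
        when = when or datetime.datetime.now()  # default: "now"
        self.statements.append((subject, predicate, obj, who, when))

store = Store("http://example.org/feed.rdf")
store.assert_statement(":article", "dc:title", "I like Cheese")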

There’s probably more, but in my mind this stuff fixes a lot of problems that people run into when building real-world applications. Those who need the full expressive power can always use the complete, more powerful model.

Lists we know are a problem because everybody and his brother keeps “fixing” RSS with proprietary extensions. The truth of the matter is that people find XML’s implicit ordering too convenient not to use.

Per my previous post on context, it’s pretty clear that we need to track sources for all real applications, so we might as well add it to the model. And time is a big problem because nobody explicitly includes it in their ontologies until they’ve been burned by it. So add that to the model; it’ll be that much less painful when we discover that people can change their names over time.

The lack of a unique name assumption is really powerful, allowing us to infer all kinds of useful relationships. But it blows big holes in attempts to work with real-world data. Actually what we want in most domains is “assumes-true”: presume that names are unique unless explicitly told otherwise. This follows the principle of least surprise because this is what most of us do in reality.

The open world assumption is something you can’t realistically turn off for RDF as a whole; doing so effectively removes one of its biggest strengths. However, there are plenty of domains where reasoning in the context of a closed world assumption can produce material benefits; document validation immediately springs to mind.

Readying Redland for Windows

Filed under: PlanetRDF — JBowtie @ 12:15 pm

Just so people don’t think I’ve completely dropped the ball: I am hard at work porting the 1.0.3 release over to Windows. I’m nearly there, but it will probably take until this time next week to actually release.

Besides upgrading to the latest version, I’m putting together a FAQ (because I’m slow about responding to email), a proper project page, and patches against the svn tree, as well as correcting previous oversights.

At the moment, everything compiles, the unit tests for the Python bindings are passing, and the rapper and roqet utilities have finally been added to the Windows binaries packages.

Currently I’m working on enabling PCRE support (need to link against the correct MSVC runtime) and enabling the various database backends (with MySQL and Postgresql at the top of the list). I suspect I still have one or two configuration bugs left but progress is being made.

I’m also going to try to get the Ruby and PHP4 bindings working for this release but won’t hold it up just for them.

08 Mar 2006

URIs are essential

Filed under: PlanetRDF — JBowtie @ 11:26 am

At first, I wasn’t sure what point this post from Phil Dawes was trying to make. Was he saying that local identifiers scale?

So, I had a bit of a think about it, and concluded that he’s not making the argument he thinks he’s making. :)

His assertion that some properties are time dependent does not logically mean that global identifiers don’t scale. It means that he does need to think more about his model; the same would be true in a SQL database or OO program.

Ways to improve his model abound in RDF; let’s look at a few:

  • He could design a more specific URI. He alluded to this; it means a URI like id:2006/03/07/philDawes
  • He could add a date range to his model to indicate when the value was known to be valid.
  • He could actually use the context information available in most RDF stores.

Next he makes the point that ambiguity is inevitable; I agree with this because ambiguity is a feature of the RDF model - an extremely desirable one that reflects the messy reality of communication.

The fact is that a URI like id:PhilDawes maps unambiguously to a specific individual, rather than any of the thousands of people that share that name (although a real URI based on his email or domain would be a better example URI). Sure, I can say ambiguous things about him, but that’s a function of the vocabulary I choose for properties.

I think the blind men and the elephant is a good example here. The blind men agree they are describing the same creature, but they have different information available and therefore conflicting accounts. If they each publish an RDF document describing the elephant, we have one creature (the elephant’s URI) with multiple values for some properties and presumably divergent properties. Untangling this mess is left to the hapless user.

HOWEVER, the beauty of RDF is that the sighted man can come along and create another RDF document, and use the same URI to describe the elephant, resolving the conflict. He can also look at the existing properties and create a unified, consistent description using a standard ontology; which uses URIs to unambiguously identify the properties, which the OWL developer can use in his inference rules…

As for Phil’s belief that you can use database primary keys without worrying about global namespace collision…how do you do that again when a record has different IDs in different databases? Oh, you prepend a namespace so I know which database I’m talking about? And you map between the IDs by creating a dictionary (see OWL again for functional properties, equivalence, etc)?
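
Which is exactly the point: prepending a namespace *is* minting a URI, and the mapping dictionary already exists in OWL. For example (both URIs invented for illustration):

@prefix owl: <http://www.w3.org/2002/07/owl#> .

<http://crm.example.com/customers/1042>
    owl:sameAs <http://billing.example.net/accounts/A-77> .

Assert that one statement and a reasoner can merge everything known about the two records.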

All that said, Phil is right when he says you need to track context - see my previous post. But that doesn’t lessen the usefulness of the global identifier; if anything, that global identifier makes it far easier to spot issues with data that has come from multiple places.

So to get back on message - Phil makes a very good argument that context is required when working with RDF. Stores that throw away information about the origin of data make that data hard to work with. But he doesn’t make a good case for his contention that global identifiers don’t scale.

07 Mar 2006

Essential RDF context

Filed under: PlanetRDF — JBowtie @ 12:52 pm

I always have too many projects on; today is no different. In addition to my semantic wiki prototype, I’m also working on a little app I’m calling Diogenes (for an XTech 2006 presentation I probably won’t be giving, as they seem to have far too few speaker slots).

Diogenes uses assumption-based logic and the GnuPG web of trust to try and determine the truth of RDF assertions. And that ties into today’s topic, because these assertions come from various sources.

Right now, I suspect most reasoners out there pretty blindly accept any given RDF. And there are plenty of places that don’t even process the RDF; they simply aggregate it for further distribution.

Unfortunately, that’s not really useful at all for real applications that are going to get popular. As soon as you move into the distributed world, you have all kinds of liars and hackers giving you mistaken, inconsistent, misleading, and/or outright wrong data. So at minimum we need to track the source of a given statement, if only so we can blacklist it in the future.

So, what do I consider an essential context for statements in an RDF store?

  • The source - who made the assertion? Note that a reasoner is its own source for inferred statements.
  • Timestamp - when was the assertion made? Because people edit pages, correct data, and sell domains. Here I mean the moment the data was picked up for entry into the store, not the actual authoring date (though, if available, that might be interesting too).
  • Trust - well, this is still being researched, but raw info such as whether the data was signed, sent over a secure link, or encrypted can be helpful when trying to figure out why your wine agent ordered spam-pill-of-the-month instead of a nice merlot.

Now, there’s no reason that these values couldn’t be expressed as triples, and in fact I would also expect to be able to export the information as triples for reporting or aggregation. But since we need them for every statement, it makes more sense to make them properties recorded by the store.
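
As an illustration of why store-level context pays off, consider how cheap source-level cleanup becomes. The layout below is my own hypothetical sketch:

# Hypothetical store layout: every triple carries its context along.
statements = [
    ((":wine42", ":varietal", ":Merlot"),
     {"source": "http://wines.example.org/feed",
      "timestamp": "2006-03-07T12:00:00Z", "signed": True}),
    ((":wine42", ":varietal", ":SpamPillOfTheMonth"),
     {"source": "http://spammer.example.net/feed",
      "timestamp": "2006-03-07T12:05:00Z", "signed": False}),
]

blacklist = set(["http://spammer.example.net/feed"])

# Drop every assertion from a blacklisted source; everything else survives.
statements = [(triple, context) for (triple, context) in statements
              if context["source"] not in blacklist]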

The reason I call them “essential” is that I think any RDF application needs to record these values. Once data is in the store and you start reasoning over it, these values will be the only reliable way to correct issues without dumping and recreating the whole store.

Finally, any advanced reasoners will want to include “truth value” as part of the context. The values Diogenes assigns are ‘unknown’, ‘true’, ‘false’, ‘assumes-true’, and ‘assumes-false’. A fuzzy-logic reasoner would probably assign values between 0 and 1. Keeping false and assumes-false data in the store can be handy for certain kinds of constraints and proofs. More research needed.

03 Mar 2006

Scoping a Semantic Wiki

Filed under: PlanetRDF, Python — JBowtie @ 12:15 pm

One of the reasons I’m implementing my own wiki instead of sticking with MediaWiki is that I hate PHP. Another, more valid reason is that I want to experiment with better approaches to dealing with structured information.

I picked Django for my implementation, because most of the code I’m going to integrate (such as sparta) is already in Python. I also think the model closely fits what I want to do.

Here are some samples to help make it clear what types of things I want to do; a rough URLconf sketch follows the lists.

Normal XHTML wiki stuff - for starters, we can just do the normal wiki magic. Some subset of this stuff will also work for other mime types but the focus is on text.

  • /pageName
  • /pageName/+edit
  • /pageName/history

Structured document export formats - It won’t be perfect until we modify the input language, but we can get something quite reasonable in most cases.

  • /pageName/format/odt
  • /pageName/format/DocBook
  • /pageName/format/pdf

RDF - Of course, this is the real motivation. Here we’re dealing with all the arbitrary RDF you want to provide. The first one will be an XHTML page displaying the data; the others will be served with more appropriate mime-types.

  • /pageName/metadata
  • /pageName/metadata/rdf
  • /pageName/metadata/n3
  • /pageName/metadata/+edit
  • /pageName/metadata/history

Explicitly supported subsets - This is the kicker. We explicitly provide display and edit support for specific ontologies (especially ones that are easy to implement).

  • /pageName/metadata/foaf/
  • /pageName/metadata/DublinCore/+edit
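
A rough sketch of how the first few of these look as a Django URLconf (the view names are placeholders for code I haven’t written yet):

from django.conf.urls.defaults import *

urlpatterns = patterns('wiki.views',
    (r'^(?P<page_name>\w+)/$',                             'page_view'),
    (r'^(?P<page_name>\w+)/\+edit/$',                      'page_edit'),
    (r'^(?P<page_name>\w+)/history/$',                     'page_history'),
    (r'^(?P<page_name>\w+)/format/(?P<format>\w+)/$',      'page_export'),
    (r'^(?P<page_name>\w+)/metadata/$',                    'metadata_view'),
    (r'^(?P<page_name>\w+)/metadata/(?P<syntax>rdf|n3)/$', 'metadata_export'),
)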

I’m still ramping up on Django, so for now I’m looking at the basic XHTML stuff, but already I have XHTML Strict output for straightforward MediaWiki markup, including tables, and tomorrow I hope to have the full page lifecycle implemented.

I’m also very much cheating by just grabbing some MediaWiki output and using that as the basis for my initial templates; that way I have something familiar-looking to experiment with.

Learning Django backwards

Filed under: Python — JBowtie @ 11:01 am

I’ve been planning to learn Django lately, but after working through the tutorial, was struggling a little to figure out how to model what I wanted.

Then I had a small epiphany - by doing model-first design, I was approaching the problem backwards. As soon as I realised that, I was able to start making real headway on my first project.

Obviously, if you already have a model worked out, model first is the way to go. What is far more valuable for many (especially when prototyping) is to start with the screens - then figure out what model is needed to support them.

So here’s my “backwards” methodology. It’s not a real tutorial, so you’ll still need to look at the actual Django docs.

  1. Name your app (the ever popular “myapp” for this example) and run “manage.py startapp myapp”
  2. Figure out what URLs you’d like and work out the regular expressions to support them. This will tell you which views you need to create.
  3. Prototype your screens as raw XHTML + CSS with some placeholder content.
  4. Stick the prototypes in the templates directory, then write basic views that invoke the appropriate template (see the sketch after this list).
  5. Take a first pass at a model based on the placeholder content.
  6. Use “manage.py install myapp” to initialize the database.
  7. Fill in some starter content with the shell or the admin app.
  8. Modify the views to get model objects and pass them to the templates.
  9. Modify the templates to use the passed-in objects.
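
The placeholder views in step 4 really are one-liners. A sketch using Django’s render_to_response shortcut (the app and template names are whatever you picked in steps 1 and 3):

from django.shortcuts import render_to_response

def page_list(request):
    # Step 4: just render the static prototype, no model objects yet.
    return render_to_response('myapp/page_list.html')

def page_detail(request, page_name):
    # Step 8 swaps this placeholder dict for real model objects.
    return render_to_response('myapp/page_detail.html', {'page_name': page_name})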

The nice part of this is that after step 4, you (should) have enough info to build a realistic model, and at the same time you have a static prototype to show to clients. After step 9 you can strip out unimplemented bits from the templates and be in a shipping state; from there on in it’s normal development methodology, including incrementally adding new screens, revamping the URL structures, or evolving the model.

It’s not the best approach for everyone; there are a lot of people who prefer the ‘bottom-up’ approach shown in the tutorial. But I suspect a lot of Web developers prefer to start with the screens and work backwards to the underlying model. It’s certainly more agile since you only need to build the model bits needed to support your screen.

Blogging again

Filed under: General, PlanetRDF — JBowtie @ 10:34 am

Back in September I started transitioning to a new (virtual) server. I moved everything over, upgraded all the software and never got around to updating DNS.

I figured I’d stop blogging until the new server was up - which was supposed to happen once I got the distro mastered. Which, of course, has still not happened.

So, I’ve started blogging again, today. Lots of writing in the queue and I really need to get moving. But I’ve disabled comments until the transition is complete. That way I have a reason to actually, you know, finish the server transition.

28 Sep 2005

Getting in deep

Filed under: GNOME, Programming — JBowtie @ 6:07 pm

Not too long after I upgraded my Ubuntu box to Breezy Badger, I started having issues with applications writing to my DVD+RW drive. Serpentine was the first; it actually told me there was no burner in my machine. Eventually I tracked this to the underlying nautilus-cd-burner library which all my apps except cdrecord had been migrated to.

Over the last couple of days I actually did some deep digging, which resulted in a bug report for the HAL team. As a result, I ended up digging really, really deep - looking at the nautilus cd code, looking at kernel drivers, and eventually looking at the hal code.

This is all low-level C code, which I never really understood very well; the semantics have always been so foreign to me and I absolutely hate macros. I was pleasantly surprised that in all the cases I encountered, the code was actually, well, legible. Things had reasonable names, macro use was kept to a minimum, and I could follow the flow without a huge effort.

The really nice thing is that I was able to use the hald verbose output to locate and diagnose the bug in CVS code; I went from being intimidated by the idea of getting down to that level of detail to actually understanding a great deal about how HAL and the kernel work (at least with respect to CD-ROM drives).

In days gone by I would have been content to wait for my distro to fix a problem like this, rather than do any hacking at the system level. It’s part of the knock-on effect working with open source has on you - eventually, you discover that no part of the system needs to be a black box. If you really want to know how something works, you can find out, with just a little effort.

I mean, I always knew that, it’s just that I would have been passive in the face of all that code before; now, I’m becoming more proactive simply because I can. It’s sort of like having the open-source epiphany all over again.

22 Sep 2005

SWIG vs PHP

Filed under: PlanetRDF — JBowtie @ 11:08 am

So, the experimental PHP bindings I put up don’t work. It took some digging, but I think I’ve figured out why.

Most of the bindings are SWIG-generated; that’s the simplest way to maintain multiple language bindings to C libraries, and it doesn’t require you to know much about the target languages.

However, it turns out that PHP support in SWIG has been largely unmaintained. So the Redland bindings are generated for PHP4, the last version supported by SWIG; unfortunately, I went ahead and compiled against the PHP5 libraries. Oops - my fault for grabbing the latest stable version, apparently.

Now, I’ll of course spend the next couple of days attempting to build some PHP4 bindings on Windows. However, if someone wants to use this with PHP5, they’ll have to step up and roll something by hand. I’m not a PHP developer by any stretch of the imagination, nor do I want to learn; I have enough trouble just configuring the PHP apps I do use (Wordpress and MediaWiki).

20 Sep 2005

Contrarian

Filed under: PlanetRDF — JBowtie @ 11:17 am

Danny Ayers has lots of links to the various posts on the “RDF crisis” point, and his own response in the post “Danny Ayers, Raw Blog » Wrong heel”.

My personal view is that RDF doesn’t need simplifying; on the contrary, it needs to add some of the OWL complexity.

See, the real issue is not the RDF model or even the RDFS model. It’s that people haven’t transferred their existing skill sets over yet. Most of them are still thinking in XML terms rather than in triples.

My hypothesis is this: all RDF parsers need to understand OWL semantics. Sure, OWL came out of the AI space and was originally intended for formal ontology management. But at a far more practical level it’s a really useful way to model your RDF as objects.

If SPARQL is what makes RDF usable for the relational-oriented crowd, OWL is what makes RDF usable for the object-oriented set (bad pun intended). OWL lets you write constraints in terms of classes and properties; it gives you sophisticated query-by-example semantics, and if properly implemented it can deduce the existence of implicit relationships. When you’re working with OWL, you define classes, properties, and enumerations. While it doesn’t map *exactly* to true OO semantics, it certainly maps to a very, very similar space - one that’s close enough for a native interface in most languages. mnot’s sparta is a great example of leveraging this (though it uses very few OWL constructs).
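
To give a feel for what I mean, here’s a purely illustrative Python sketch of OWL-as-objects - the flavour of the mapping, not sparta’s actual API:

# Purely illustrative: an owl:Class surfacing as a Python class.
class OwlClass(object):
    _instances = []
    @classmethod
    def instances(cls):
        # A real implementation would return asserted *and* inferred members.
        return cls._instances

class Merlot(OwlClass):
    _instances = []

wine = Merlot()
wine.madeFromGrape = "MerlotGrape"  # OWL properties surface as attributes
Merlot._instances.append(wine)

for w in Merlot.instances():
    print w.madeFromGrape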

Of course, this adds a lot of complexity to the underlying parser. But my contention is that it actually simplifies things for users by giving them the chance to build up an RDF skill set that maps well onto their existing tool set.

10 Sep 2005

Redland/Windows refresh (1.0.2-2)

Filed under: PlanetRDF, Python — JBowtie @ 12:02 am

Redland 1.0.2-2 for Windows is out! The .sig files are detached GPG signatures (use gpg --verify to check them).

Signed by:
John C Barstow (Redland Win32 port signing key) [1024D/548D7543] with fingerprint: 369C 9E25 4FDB 55AC 6D62 70F4 D33D B3F6 548D 7543

I’ve updated the Redland/Windows packages to incorporate a bug fix for file URI parsing. This should fix the problem a number of people have reported. Other than this bug fix, the packages are identical to the previous 1.0.2 packages.

In addition, I’ve added an EXPERIMENTAL build of the PHP bindings. It compiles cleanly but I have no idea how useful or usable it is.
