Visions of Aestia

28 Sep 2005

Getting in deep

Filed under: GNOME, Programming — JBowtie @ 6:07 pm

Not too long after I upgraded my Ubuntu box to Breezy Badger, I started having issues with applications writing to my DVD+RW drive. Serpentine was the first; it actually told me there was no burner in my machine. Eventually I tracked this to the underlying nautilus-cd-burner library, to which all my apps except cdrecord had migrated.

Over the last couple of days I actually did some deep digging, which resulted in a bug report for the HAL team. As a result, I ended up digging really, really deep - looking at the nautilus cd code, looking at kernel drivers, and eventually looking at the hal code.

This is all low-level C code, which I never really understood very well; the semantics have always been so foreign to me and I absolutely hate macros. I was pleasantly surprised that in all the cases I encountered, the code was actually, well, legible. Things had reasonable names, macro use was kept to a minimum, and I could follow the flow without a huge effort.

The really nice thing is that I was able to use the hald verbose output to locate and diagnose the bug in CVS code; I went from being intimidated by the idea of getting down to that level of detail to actually understanding a great deal about how HAL and the kernel work (at least with respect to CD-ROM drives).

In days gone by I would have been content to wait for my distro to fix a problem like this, rather than do any hacking at the system level. It’s part of the knock-on effect working with open source has on you - eventually, you discover that no part of the system needs to be a black box. If you really want to know how something works, you can find out, with just a little effort.

I mean, I always knew that, it’s just that I would have been passive in the face of all that code before; now, I’m becoming more proactive simply because I can. It’s sort of like having the open-source epiphany all over again.

22 Sep 2005

SWIG vs PHP

Filed under: PlanetRDF — JBowtie @ 11:08 am

So, the experimental PHP bindings I put up don’t work. It took some digging, but I think I’ve figured out why.

Most of the bindings are SWIG-generated; that’s the simplest way to maintain multiple language bindings to C libraries, and it doesn’t require you to know much about the target languages.

However, it turns out that PHP support in SWIG has been largely unmaintained. So the Redland bindings are generated for PHP4, the last version supported by SWIG; unfortunately, I went ahead and compiled against the PHP5 libraries. Oops - my fault for grabbing the latest stable version, apparently.

Now, I’ll of course spend the next couple of days attempting to build some PHP4 bindings on Windows. However, if someone wants to use this with PHP5, they’ll have to step up and roll something by hand. I’m not a PHP developer by any stretch of the imagination, nor do I want to learn; I have enough trouble just configuring the PHP apps I do use (Wordpress and MediaWiki).

20 Sep 2005

Contrarian

Filed under: PlanetRDF — JBowtie @ 11:17 am

Danny Ayers has lots of links to the various posts on the “RDF crisis” point, and his own response in the post “Danny Ayers, Raw Blog : » Wrong heel”.

My personal view is that RDF doesn’t need simplifying; on the contrary, it needs to add some of the OWL complexity.

See, the real issue is not the RDF model or even the RDFS model. It’s that people haven’t transferred their existing skill sets over yet. Most of them are still thinking in XML terms rather than in triples.

My hypothesis is this: all RDF parsers need to understand OWL semantics. Sure, OWL came out of the AI space and was originally intended for formal ontology management. But at a far more practical level it’s a really useful way to model your RDF as objects.

If SPARQL is what makes RDF usable for the relational-oriented crowd, OWL is what makes RDF usable for the object-oriented set (bad pun intended). OWL lets you write constraints in terms of classes and properties; it gives you sophisticated query-by-example semantics, and if properly implemented it can deduce the existence of implicit relationships. When you’re working with OWL, you define classes, properties, and enumerations. While it doesn’t map *exactly* to true OO semantics, it certainly maps to very, very similar space - one that’s close enough for a native interface in most languages - mnot’s sparta is a great example of leveraging this (though it uses very few OWL constructs).
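The attribute-per-predicate idea that sparta leverages can be sketched in a few lines of plain Python. The Resource class and the little triple list below are hypothetical illustrations of the technique, not sparta’s actual API:

```python
# A toy sparta-style wrapper: predicates become attributes on a resource.
# The triple store is just a list of (subject, predicate, object) tuples.

TRIPLES = [
    ("srd:acrobatics", "rdf:type", "rpg:skill"),
    ("srd:acrobatics", "rpg:name", "Acrobatics"),
    ("srd:acrobatics", "rpg:keyAbility", "DEX"),
]

class Resource:
    def __init__(self, store, uri):
        self._store = store
        self.uri = uri

    def __getattr__(self, name):
        # Map pythonic attribute names like rpg_name to the rpg:name predicate.
        predicate = name.replace("_", ":", 1)
        values = [o for s, p, o in self._store
                  if s == self.uri and p == predicate]
        if not values:
            raise AttributeError(name)
        return values[0] if len(values) == 1 else values

skill = Resource(TRIPLES, "srd:acrobatics")
print(skill.rpg_name)        # -> Acrobatics
print(skill.rpg_keyAbility)  # -> DEX
```

An OWL-aware version of the same wrapper could go further - using owl:Class and property restrictions to decide which attributes a resource should expose, instead of just reflecting whatever triples happen to be present.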

Of course, this adds a lot of complexity to the underlying parser. But my contention is that it actually simplifies things for users by letting them build up an RDF skill set that maps well onto their existing tool set.

10 Sep 2005

Redland/Windows refresh (1.0.2-2)

Filed under: PlanetRDF, Python — JBowtie @ 12:02 am

Redland 1.0.2-2 for Windows is out! The .sig files are detached GPG signatures (use gpg --verify to check them).

Signed by:
John C Barstow (Redland Win32 port signing key) [1024D/548D7543] with fingerprint: 369C 9E25 4FDB 55AC 6D62 70F4 D33D B3F6 548D 7543

I’ve updated the Redland/Windows packages to incorporate a bug fix for file URI parsing. This should fix the problem a number of people have reported. Other than this bug fix, the packages are identical to the previous 1.0.2 packages.

In addition, I’ve added an EXPERIMENTAL build of the PHP bindings. It compiles cleanly but I have no idea how useful or usable it is.

07 Sep 2005

Trouble with PHP bindings

Filed under: PlanetRDF — JBowtie @ 3:45 pm

Ian Davis [foaf] asked for PHP Redland bindings on Windows in response to my latest adventures.

Having had other such requests in the past, I decided to give it a shot. PHP is not one of my core languages, even though I use applications written in it. Luckily there is a test PHP script in the Redland bindings package that I could use.

Long story short, it’s not happening today. I’m close, but there are still some issues:

error LNK2019: unresolved external symbol _librdf_php_get_world referenced in function __wrap_librdf_php_get_world
error LNK2019: unresolved external symbol _librdf_php_world_finish referenced in function __wrap_librdf_php_world_finish
error LNK2019: unresolved external symbol __imp__compiler_globals referenced in function _zm_startup_redland
php_redland.dll : fatal error LNK1120: 3 unresolved externals

I need to resolve these three symbols before I can get any further. Any advice would be appreciated - I’m especially stumped by the librdf_php ones, as I’ve never seen those functions defined anywhere and would have assumed they were SWIG-generated.

06 Sep 2005

Exploring RDF with XPath

Filed under: PlanetRDF, XML — JBowtie @ 9:15 pm

If I keep up this string of posts I risk turning into Danny Ayers (not necessarily a bad thing). It’s the most I’ve written in ages.

It’s been my observation that XPath is one of the things keeping people chained to XML. It’s familiar, easy to explore with, and heavily supported. So why not co-opt it for exploring RDF? If we do that, we might be able to show people some of the virtues of the semantic model.

This is a thought experiment, not a concrete, fully-thought out proposal. In fact, it’s probably not even cohesive. I just wanted to get it down in writing while I thought of it.

So here’s a bit of RDF in abbreviated XML format:

<rpg:skill rdf:about="https://nzlinux.org.nz/srd#acrobatics">
<rpg:name>Acrobatics</rpg:name>
<rpg:keyAbility>DEX</rpg:keyAbility>
<rpg:blurb>You can flip, dive, roll, tumble, and perform other acrobatic maneuvers.</rpg:blurb>
</rpg:skill>

Assume we have namespace definitions for rpg and srd and that we’re positioned on the srd:acrobatics node. What kind of XPath expressions could we write?

“rpg:name” would follow the arc and return the value “Acrobatics”.
“rdf:type” would follow the arc and return the “rpg:skill” node. “rdf:type/rdfs:label[@xml:lang=’en’]” would return the value “Skill”. Without the qualifier it would return all the labels.

Assuming owl:instanceOf is the inverse of rdf:type -
“rdf:type/owl:instanceOf” would return all nodes with the “rpg:skill” type.
“rdf:type/owl:instanceOf[rpg:name=’Acrobatics’]” would return the “srd:acrobatics” node.

A couple of XPath constructs would need special interpretation.
There’s no root as such, so “/” would not be a node in itself, but rather simply indicate that the next part of the path should be interpreted as a subject. So, “/srd:acrobatics” positions us on the “srd:acrobatics” node.
“/srd:acrobatics/rpg:name” is therefore the same as the triple “srd:acrobatics rpg:name ?”, whereas “/rpg:name/rdf:type” would presumably return the “rdfs:Literal” node.

There’s no parent node as such, so “..” would best be interpreted as following the arc back to the original node. If there is no arc to follow back, it doesn’t do anything.
“/srd:acrobatics/rpg:name/../rpg:keyAbility” would return the value “DEX”.
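To make the thought experiment concrete, here’s a toy evaluator for the forward and inverse steps above, over a handful of triples. The “!” prefix stands in for the hypothetical inverse (owl:instanceOf) step; everything here is illustrative, not a real XPath engine:

```python
# A toy evaluator for simple path steps over a list of
# (subject, predicate, object) triples. "!pred" follows an arc
# in reverse (standing in for the hypothetical owl:instanceOf).

TRIPLES = [
    ("srd:acrobatics", "rdf:type", "rpg:skill"),
    ("srd:acrobatics", "rpg:name", "Acrobatics"),
    ("srd:acrobatics", "rpg:keyAbility", "DEX"),
    ("srd:tumble", "rdf:type", "rpg:skill"),
]

def step(nodes, pred):
    """Follow pred forward, or backward if prefixed with '!'."""
    if pred.startswith("!"):
        return [s for s, p, o in TRIPLES if p == pred[1:] and o in nodes]
    return [o for s, p, o in TRIPLES if p == pred and s in nodes]

def evaluate(start, path):
    nodes = [start]
    for pred in path.split("/"):
        nodes = step(nodes, pred)
    return nodes

print(evaluate("srd:acrobatics", "rpg:name"))
# -> ['Acrobatics']
print(evaluate("srd:acrobatics", "rdf:type/!rdf:type"))
# -> ['srd:acrobatics', 'srd:tumble']
```

Predicate filters like [rpg:name=’Acrobatics’] would just be another filtering step over the node list, which is part of what makes XPath such a natural fit here.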

I think this would be a fruitful area for research. XPath is implicitly graph-oriented and reasonably well understood. Using it to explore RDF graphs opens up new opportunities to bring people on board and exposes some of the power of the underlying model.

Finally figured it out

Filed under: General, PlanetRDF — JBowtie @ 8:44 pm

So, I finally figured out the problem with file URI parsing on Redland/Windows.

Basically, the problem boiled down to this - the Windows-specific raptor code was assuming a leading slash, and blowing up when there wasn’t one (as in the shipped examples). I misdiagnosed the problem and removed the attempt to skip that character. This, however, blew up in other circumstances; I couldn’t reproduce it because all my test cases left off the leading slash.

So, after taking a nice break from the code, the problem was immediately obvious. Now the code looks at the first character; if it’s a leading slash it skips it. Everything works again.
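For the record, the corrected check amounts to something like this. The real code is C inside raptor; this Python rendering is just illustrative:

```python
# Rough Python rendering of the fixed logic (the real code is C in
# raptor): a Windows file: URI path may or may not carry a leading
# slash before the drive letter; only skip it when it's there.

def file_uri_to_path(uri):
    assert uri.startswith("file://")
    path = uri[len("file://"):]
    # "file:///C:/data/foo.rdf" leaves "/C:/data/foo.rdf" - drop the
    # leading slash; "file://C:/data/foo.rdf" leaves "C:/data/foo.rdf"
    # and must be left alone.
    if path.startswith("/") and len(path) > 2 and path[2] == ":":
        path = path[1:]
    return path

print(file_uri_to_path("file:///C:/data/foo.rdf"))  # -> C:/data/foo.rdf
print(file_uri_to_path("file://C:/data/foo.rdf"))   # -> C:/data/foo.rdf
```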

New packages are being generated - I should be able to sign and upload them tomorrow.

Autotools are bad, m’kay?

Filed under: Aestia, PlanetRDF — JBowtie @ 1:06 pm

I finally mucked in and added autotool support to my character generator (which now has a name: Lorax-RPG). It’s ugly as sin but at least it’s done. Required a massive code re-org to get things into the “right” places; I probably will end up doing another one once I actually understand why I needed to make some of those changes.

It’s really skeletal support in the sense that most of the files don’t get deployed yet, but I have convinced myself that things are going to the right places, and the whole configure/make/make install trinity works. Now I’m fixing up all the paths to wipe out assumptions about file layout.

In another rev or two XML support will be finally dropped completely in favour of the new Redland-based RDF loader. All my rules will be expressed entirely in RDF, which allows me to stop maintaining serialization code. It also allows me to start seeing the benefits of distributed data, though I still need to work out a strategy for handling conflicting assertions.

Condolences to Dave Beckett

Filed under: PlanetRDF — JBowtie @ 12:52 pm

Dave announced he is leaving the UK for the USA.

California is probably the least of evils given the circumstances, but this refugee from the old country finds it hard to believe that anyone would voluntarily immerse themselves in the culture there when they don’t have to. And they don’t have Doctor Who over there.

Ah well. I should probably start a pool on how long Dave will last before returning to civilization. [smiley face - yes, I’m avoiding UTF8 art]

In the meantime perhaps I should get my act together and finish fixing that showstopper in the Windows port. I’ve been trying to avoid that platform in principle, but even Windows users need the Semantic Web.

23 Aug 2005

Why SPARQL is important

Filed under: PlanetRDF, XML — JBowtie @ 11:47 am

At first glance, a query language like SPARQL doesn’t actually make that much sense to me. Why? Because the RDF model has some tremendous properties that make it amenable to other types of analysis, such as graph traversal and AI reasoning.

People are still coming to grips with handling large RDF stores. I suspect this process will go on for a couple more years. In the meantime, while we work out access patterns and how to use graph-oriented algorithms effectively with very large data sets, people are turning to that old familiar friend, the relational database.

That’s where SPARQL comes in, and why it is important. People who manage databases learn how to use SQL to ask for data, and make use of highly optimized set-oriented algorithms. SPARQL enables knowledge transfer - it allows people with a longstanding investment in database skills to become familiar with RDF; using it in familiar applications and allowing for a controlled transition to the brave new world of agile, incomplete databases.
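The set-oriented core of a SPARQL query - matching triple patterns and joining variable bindings - fits in a few lines of plain Python. This is a toy illustration of the idea, not a real SPARQL engine:

```python
# The core of SPARQL-style querying is matching triple patterns with
# variables ("?x") against a graph and joining the resulting bindings.

GRAPH = [
    ("srd:acrobatics", "rdf:type", "rpg:skill"),
    ("srd:acrobatics", "rpg:keyAbility", "DEX"),
    ("srd:climb", "rdf:type", "rpg:skill"),
    ("srd:climb", "rpg:keyAbility", "STR"),
]

def match(pattern, bindings):
    """Yield extended bindings for one (s, p, o) pattern."""
    for triple in GRAPH:
        new = dict(bindings)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if new.setdefault(term, value) != value:
                    break  # conflicts with an earlier binding
            elif term != value:
                break      # constant term doesn't match
        else:
            yield new

def query(patterns):
    solutions = [{}]
    for pattern in patterns:
        solutions = [b for sol in solutions for b in match(pattern, sol)]
    return solutions

# Roughly: SELECT ?skill WHERE { ?skill rdf:type rpg:skill .
#                                ?skill rpg:keyAbility "DEX" }
print(query([("?skill", "rdf:type", "rpg:skill"),
             ("?skill", "rpg:keyAbility", "DEX")]))
# -> [{'?skill': 'srd:acrobatics'}]
```

The join across patterns is exactly the kind of operation relational databases have spent decades optimizing, which is why this model feels so familiar to the SQL crowd.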

Of course there are going to be issues - SQL never really dealt with NULLs effectively, and what is RDF but a database full of NULLs? SQL is built around sets, and has real problems with effectively handling graph-oriented data like hierarchies and parts explosions. RDF, like XML before it, is a problem child for such technology. But the next generation of query tools will use directed graphs as the underlying organizational principle.

When that happens, SPARQL will likely be replaced by something more useful; maybe something that resembles XPath or something else entirely. But people who manage data will now understand the new structures and models, and that will make all the difference in the world.

18 Aug 2005

Using Pangocairo

Filed under: GNOME, Python — JBowtie @ 4:46 pm

I’ve been experimenting with the new Pango/Cairo integration in the latest PyGTK and have figured out a few things (cairo is somewhat sparsely documented at the moment).

I decided to try my hand at PDFs to start with, since printing is my biggest issue at present. Presumably other backends work in a similar fashion.

The first challenge was figuring out the units for specifying the surface width and height. After some work, I discovered that this should be in Postscript points. A4 paper is 595 x 842 points; I found this and other well-known page sizes defined in the Scribus online documentation.

So, the workflow seems to be as follows.

  • Create a surface, specifying size in Postscript points. Each backend seems to define its own surface type.
  • Create a rendering context. Pango has its own context that wraps a standard cairo context.
  • Create a pango layout for each paragraph, preferably using the Pango markup format to apply formatting within the paragraph. Pango exposes a bunch of other ways to format the text, have fun. Note that font sizes come out 4/3 bigger than you’d expect (fontSize/0.75 = Postscript points); that’s because Postscript points are 72 to the inch, but pangocairo assumes a 96 dpi resolution by default.
  • For a given paragraph, call layout.set_width() so that the paragraph wraps correctly. It’s not obvious, but width is expressed in Pango units (pango.SCALE, which is 1024 units per point), so multiply the desired width in points by pango.SCALE.
  • Line spacing is supposed to be controlled by layout.set_spacing() [also in Pango units], but the version I have installed seems to ignore this - I’ll assume this is a bug that will be fixed in short order.
  • Position the paragraph using cairocontext.move_to() or cairocontext.rel_move_to(). Again, the coordinates should be in Postscript points. Calling layout.get_pixel_size() will give you the width and height in Postscript points.
  • Render the paragraph by calling cairocontext.show_layout().
  • Finished laying out a page? Call cairocontext.show_page() to create the actual page. Any work you do after this point will be on a new page.
  • Finally, call surface.finish() to write out the actual PDF file and release your handles.
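The unit bookkeeping above is the error-prone part, so here are the conversions as a few pure-Python helpers (no cairo or pango imports needed; the page sizes come from the Scribus table mentioned above). One correction worth noting: Pango’s scale constant, pango.SCALE, is actually 1024 units per point rather than an even 1000.

```python
# Helper constants for the unit conversions in the workflow above.
# Plain numbers only; PANGO_SCALE matches Pango's actual pango.SCALE
# constant (1024 Pango units per point).

PANGO_SCALE = 1024           # Pango units per point
POINTS_PER_INCH = 72.0       # PostScript points per inch

# Common page sizes in PostScript points (width, height).
A4 = (595, 842)
US_LETTER = (612, 792)

def pango_units(points):
    """Value to pass to layout.set_width() / set_spacing()."""
    return int(points * PANGO_SCALE)

def inches_to_points(inches):
    return inches * POINTS_PER_INCH

# A paragraph wrapped at the A4 width, minus one-inch margins:
usable = A4[0] - 2 * inches_to_points(1)
print(pango_units(usable))  # -> 461824
```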

If you want to lay out the individual lines yourself, you can get individual lines using layout.get_line(index) and use the resulting object to access the line measurements. You will probably want to do this when spanning page or column breaks. Use cairocontext.show_layout_line() to render individual lines.

08 Aug 2005

Random updates

Filed under: General, Aestia, PlanetRDF — JBowtie @ 2:21 pm

I haven’t blogged in quite some time - since I couldn’t take a vacation from work I took one from all my personal projects. Did you miss me?

I’m catching up on my reading/e-mail now, so you might start seeing a few things from me. Here are some random updates on various things.

Redland’s parsing of file: URIs on Windows does not appear to follow the standard (see RFC 1738), which seems to cause problems on some systems. See also the proposed update. I need to investigate further.

A somewhat obvious if not necessarily easy fix is to allow the language bindings to pass in something that Redland considers a file-like object (can this even be done with SWIG?), or punt off the handling of file URIs to something like CURL. That moves the pain to someone who has already solved it, but may not play nicely with underlying code assumptions.

During a remote SSH session, I attempted to open evolution. It failed since I wasn’t tunnelling X - but it erased my email store!! Three years of emails and all my contacts gone. And in the spirit of true disaster I had erased my email backup for some storage space, intending to recreate it on a new device a bit later. Thank goddess for the archiving of mailing lists.

Work on my character generator was stopped for a bit. I did work up a pretty nice character sheet using Reportlab, but really want to see GnomePrinting/cairo integration. The Gnomeprint roadmap does not seem to exist. Next on the list is transitioning the races/classes to RDF and some refactoring to handle the lack of default logic in OWL.

I agreed to review two role-playing products in exchange for free copies. Have read both of them, working on the reviews now.

The Aestia campaign setting is largely converted to True20 mechanics. As a byproduct I’ve been able to put together several articles on handling the transition for a “live” campaign. May or may not be ready for the Green Ronin setting search, but I’m still uncertain about signing up for that anyway. Not really in the spirit of open gaming (mind you, Green Ronin has an impressive track record of open content).

Thanks to some links on PlanetRDF, I’m beginning to understand why there are certain holes in OWL’s coverage. You can work around those holes, but we need more OWL-based reasoners to explore the space of really solving them. I’ll write more on those later.

Also, filed my first Ubuntu bug; it was fixed in an impressively brief timespan. Malone is shaping up to be a pretty slick interface over the UI horror that is Bugzilla. They really need to get their HTTPS certificates fixed, though.

26 Jul 2005

Using Rosetta

Filed under: General, GNOME — JBowtie @ 3:39 pm

Well, I’m now running the Ubuntu Maori Translators group, and have had my first attempt to use Rosetta for translation purposes.

You can only see ten items at a time. No obvious way to increase the number you’re working on at once.

There doesn’t seem to be any search facility. It would be really useful, for example, to be able to translate all the variants on “font” at once.

The tool shows other translations of the phrase in question, but you can’t click on the displayed items to determine context (which can be quite important when there are conflicting possibilities).

It would be nice to know which items have been translated upstream vs translated ‘locally’ in Rosetta.

The organization of files is less than obvious. It would be really nice if one could trace the dependencies to figure out which libraries would provide the biggest bang for the buck.

Otherwise I’m happy; it’s really convenient to just do a few items at a time and not have to worry about CVS issues until I’m ready to commit a batch of translations.

07 Jul 2005

Semantic skills

Filed under: Aestia, PlanetRDF — JBowtie @ 11:47 am

A fairly significant milestone has just been crossed (actually, a week ago, but I’ve been too busy to blog) with my character generator - all skill information is now represented in RDF, parsed by Redland and Sparta.

Skills were chosen to go first since they have the smallest number of properties - just the ones common to all rules and a key ability score. I knocked together a small Python program to generate the RDF, using Redland+Sparta for reading and writing files.
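The generator itself used Redland + Sparta, but the shape of the data is easy to show. This dependency-free sketch emits the same kind of skill record as N-Triples; the rpg: namespace URI below is made up for illustration:

```python
# A dependency-free sketch of the skill-to-RDF step (the real program
# used Redland + Sparta). Emits N-Triples; the rpg: namespace URI is
# an assumption for illustration.

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
RPG = "https://nzlinux.org.nz/rpg#"   # assumed namespace
SRD = "https://nzlinux.org.nz/srd#"

def skill_triples(ident, name, key_ability):
    s = "<%s%s>" % (SRD, ident)
    yield "%s %s <%sskill> ." % (s, RDF_TYPE, RPG)
    yield '%s <%sname> "%s" .' % (s, RPG, name)
    yield '%s <%skeyAbility> "%s" .' % (s, RPG, key_ability)

for line in skill_triples("acrobatics", "Acrobatics", "DEX"):
    print(line)
```

Because every skill is just a bag of triples, adding a new common property later means emitting one more triple per skill - no serialization code to rewrite.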

Patching the campaign reader to load (only) the skill data from an RDF file while the remainder continues loading from XML was slightly trickier, but helped immensely by Sparta’s pythonic API.

This is really cool to me for a number of reasons. I have a specific desktop application that gets measurable decrease in code complexity by adopting RDF; I can move to distributing rules across multiple files without any extra coding; I’ve successfully migrated an XML-centric workflow to an RDF-centric workflow with minimal effort; and I can use OWL to interface to other, interesting vocabularies such as FOAF. One of my longtime goals is to keep track of fictional social webs with social networking software.

27 Jun 2005

RDF vs control

Filed under: PlanetRDF — JBowtie @ 1:01 pm

Lately I’ve been struggling with a bit of a dilemma. I’m not sure there is an answer or even a consensus that it’s a problem, but I’ve been thinking about it nonetheless.

If we’re serious about bringing about the Semantic Web, there’s a couple of problems that we will have to contend with, some obvious, some less so.

The first issue we will face is that liars, scam artists, advertisers, and zealots of various persuasions are going to start contaminating our machine-readable data. In other words, we need to find a reasonable, easy-to-implement solution to the trust problem before we are drowning in a useless sea of data.

Currently, people who filter their RDF (if they do so at all) use blacklists, whitelists, or spam-processing code. But as the amount of machine-readable data reaches epic proportions, all of these mechanisms start to break down. We need to well and truly distribute the work and build the processing in at the parser level, or we will never get a handle on it. I mean, what good are software agents going to be if you ask them to restock the wine cabinet and they order herbal supplements?

Even assuming we can eliminate spam, there are other, more subtle problems that creep in. People will lie on their FOAF files (or even serve them up selectively) to attract potential dates or deflect attention. RDF feeds will end up carrying propaganda or advertisements. Wikipedia-type wars will rage (where two sides make contradictory assertions). Triplestores will fill up with inconsistent, misattributed data.

There’s also the issue of sensitive data. Personal information may be serialized into the wrong files. If your bot wrongly sucks up my tax ID number, how do I ask it to forget it or not disclose it? And if I can make that request, what keeps me from asking it forget or prevent disclosure of public information, like a Senator’s voting record?

Secrecy and privacy are already under serious threat due to data aggregation. What happens when an autonomous software agent discloses information under court seal? What happens when a computer intelligence is able to infer the identity of a protected witness or victim?

As long as “real” AI is still 20 years off, we can (and have) deferred thinking about these issues. But once we have powerful and reasonably autonomous reasoners harvesting triples and drawing conclusions, the data becomes a black box. We no longer keep track of where the data comes from, how connections are made, or get involved in weighting or filtering information. Instead, we start relying on the computer to do it for us - in fact, we need the machine to do the filtering because otherwise we end up completely overwhelmed by the mountains of data.

I can’t manually process the amount of spam or spam comments I get anymore. I get so much e-mail, I don’t even have time to manually sort it anymore; if I did, I’d never read it - as it is I only cope by scanning the subject lines in the pre-sorted folders. I need a general purpose AI available to me in the next 10 years, because I am barely keeping up with the things I care about as it is. I need people to start publishing machine-readable metadata, or they will become invisible to me. I need planet aggregators and categorized posts. Like most democratic citizens, I need information about various candidates summarized and their positions analyzed, because I don’t have the leisure time to sift through the raw material and cannot rely on the media to do so reliably anymore.

But that circles back to my original issues. How do I know which sources of information to trust? How do I track trustworthiness over time? How do I verify information? How do I detect and weed out mistakes and falsehoods? How do I know when to throw out my assumptions? How do I find bugs in a reasoning engine, and what do I do when multiple reasoners disagree?

Look - these are all the same issues we are struggling with in regard to people, and it’s silly to think we can solve them definitively anytime soon. Just - before we start relying too heavily on our software, we should build in what safeguards we can. I know one day the computer will know more than me, be able to reason more rigorously than I can, write and maintain programs that I couldn’t touch, and be off having refined conversations with other AIs (probably in some n3-derivative language). When that day comes, I want it to be able to exercise critical thinking and have some thought for humanity’s welfare.

23 Jun 2005

Converting sparta to use redland

Filed under: PlanetRDF, Python — JBowtie @ 10:42 am

One of the TODO items for Sparta is to make it work with a Redland back end. I hacked one in (very, very messily) some time ago, but since Danny Ayers recently poked me about it, I figured I should look at the latest version.

The new version cleans a few things up, and it’s been pretty straightforward to get things sorted. The way I’ve approached it is far less hackish, and could actually be refactored into a real solution over the next couple of weeks.

Basically, I’ve provided an implementation of TripleStore that wraps a Redland Model and aliased all the classes Sparta was importing - with that I can avoid touching 90% of the remaining code. It works well enough to pass the provided test script and for my other, more nefarious purposes.
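In sketch form, the adapter looks like this - with the Redland Model replaced by an in-memory stand-in, and all class names illustrative rather than Sparta’s real API:

```python
# The shape of the refactor: Sparta talks to a TripleStore interface,
# and a thin adapter maps those calls onto whichever backend is in use.
# The Redland Model is replaced here by an in-memory stand-in.

class InMemoryModel:
    """Stand-in for a Redland Model object."""
    def __init__(self):
        self.triples = set()

class RedlandTripleStore:
    """Adapter exposing the store operations the wrapper layer expects."""
    def __init__(self, model):
        self.model = model

    def add(self, s, p, o):
        self.model.triples.add((s, p, o))

    def objects(self, s, p):
        # All objects for a given subject/predicate pair.
        return [o for (s2, p2, o) in self.model.triples
                if (s2, p2) == (s, p)]

store = RedlandTripleStore(InMemoryModel())
store.add("srd:acrobatics", "rpg:keyAbility", "DEX")
print(store.objects("srd:acrobatics", "rpg:keyAbility"))  # -> ['DEX']
```

Switching backends then becomes a matter of passing a different model into the adapter, which is exactly the refactor that would make a patch palatable upstream.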

Slightly better would be to refactor to allow switching between the implementations; once that is done hopefully Mark will consider a patch to enable both backends.

16 Jun 2005

Redland 1.0.1 and 1.0.2

Filed under: PlanetRDF — JBowtie @ 10:43 pm

If you’re seeing this post, you’re looking at my new server. In which case, Windows binaries are available for both of Dave’s new releases.

Aside from the features mentioned in the official Redland announcements, I’ve also turned on the Berkeley DB storage. The Python and C# unit tests now pass with the new packages.

Redland 1.0.1 binaries (DLLs, dev packages, Python and C# bindings) can be found in:
https://nzlinux.org.nz/files/redland/1.0.1/

Redland 1.0.2 binaries can be found in:
https://nzlinux.org.nz/files/redland/1.0.2/

UPDATED: Made links clickable. Sorry about that.
UPDATED: Fixed link typos. Sorry about that.

07 Jun 2005

Solved BDB problem

Filed under: PlanetRDF — JBowtie @ 10:20 pm

It took me longer than it should have to twig to the problem. As with other problems, it was down to the run-time library fiasco that is Microsoft. Eventually I built a version of the libdb DLL that didn’t lock up my debugger.

That means binaries are in the offing; unfortunately my pitiful hosting does not have space so I will have to do some housecleaning before I can upload them.

Beyond offering the new features promised by the point release, I can now say that the Python and C# unit test suites run successfully with the new binaries. I’m not sure how to package the Ruby stuff correctly, so no binaries for that this time.

In the meantime, I’ll be sending new patches off to redland-dev for both the bindings and the core libraries. These are a lot smaller and tighter this time. With any luck, all I’ll have to do next release is update the macro definitions and run my packaging script.

03 Jun 2005

Redland slides

Filed under: PlanetRDF — JBowtie @ 10:16 pm

I gave a brief presentation on Redland/Windows to my employer earlier this week. With their permission I’m reposting the slides for that presentation here.

Hopefully it will be of some use to those trying to explain RDF to others.

01 Jun 2005

Working on 1.0.1

Filed under: PlanetRDF — JBowtie @ 10:55 am

I’ve not been posting for a week because my wife was down with the flu. Now that things are more or less back to normal, I’m looking at porting the latest Redland 1.0.1 release and the associated bindings.

Things are getting much faster; it took me only a short time last night to get things working at rough parity with the last release I did (it helped that most of my patch from last time was accepted). Of course, I’m a little more ambitious this time and would like to get the BDB storage working - this is the only stumbling block for the Python bindings unit test suite.

I think I have things mostly sorted on that front, but I’m getting a crash when BDB attempts to print out an error message using the logging handler (something is not checking for null). This might be a binding issue; I’m going to trace into it tonight.

Powered by WordPress