Visions of Aestia

25 Feb 2005

Sparta Redland

Filed under: — JBowtie @ 10:17 am

Last night I spent a few hours hacking Sparta to use the Redland API as its underlying store. It’s not pretty code by any means, as I simply added the new imports and kept tweaking until my personal use case was working. I haven’t even tried to make spartaTest.py work against it.


#!/usr/bin/env python

"""
    sparta.py  (a Simple API for RDF)
    Copyright 2001-2004 Mark Nottingham
    Portions Copyright 2005 John C Barstow

Sparta is a simple API for RDF that binds RDF nodes to Python
objects and RDF arcs to attributes of those Python objects. As
such, it can be considered a "data binding" from RDF to Python.

THIS SOFTWARE IS SUPPLIED WITHOUT WARRANTY OF ANY KIND, AND MAY BE
COPIED, MODIFIED OR DISTRIBUTED IN ANY WAY, AS LONG AS THIS NOTICE
AND ACKNOWLEDGEMENT OF AUTHORSHIP REMAIN.

Requires rdflib.

TODO:
 * redland support; http://redland.opensource.ac.uk/
   * take a redland context as a factory arg
 * take object type information from its rdf:type too (?)
   * type list members?
 * complete schema type support (date/time types; wait for PEP 321)
 * document / refactor
 * unit tests
CHANGES:
 * rdf:Seq support (just like lists) (needs testing)
 * if a property isn't unique, it'll return a PropertySet upon get.
 * Factory takes an optional 'schema_store' arg to keep schemas separate.
   If not specified, the store will be used.
"""
        
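import urlparse, base64, types, sets
import RDF
from RDF import Uri as URI
from RDF import Node as BNode
#from rdflib.constants import FIRST, REST, NIL
# The rdflib constants are no longer imported, so define rdf:first, rdf:rest
# and rdf:nil here for the list helpers below (untested against Redland).
FIRST = URI("http://www.w3.org/1999/02/22-rdf-syntax-ns#first")
REST = URI("http://www.w3.org/1999/02/22-rdf-syntax-ns#rest")
NIL = URI("http://www.w3.org/1999/02/22-rdf-syntax-ns#nil")
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"
LIST = "http://www.w3.org/1999/02/22-rdf-syntax-ns#List"
SEQ = "http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq"

__version__ = "0.6.7"

RDF_SEQi = "http://www.w3.org/1999/02/22-rdf-syntax-ns#_%s"
MAX_CARD = URI("http://www.w3.org/2002/07/owl#maxCardinality")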
        
def loadFile(filename):
        uri=RDF.Uri(string="file:"+filename)
        storage=RDF.Storage(storage_name="hashes",
                name="test",
                options_string="new='yes',hash-type='memory',dir='.'")
        model=RDF.Model(storage)
        parser=RDF.Parser('raptor')
        for s in parser.parse_as_stream(uri,uri):
                model.add_statement(s)
        return model
        
class TripleStore:
        def load(self, filename):
                self.model=loadFile(filename)
                self.prefix_ns_map = {}
                self.ns_prefix_map = {}
        
        def prefix_mapping(self, key, url):
                self.prefix_ns_map[key] = url
                self.ns_prefix_map[str(url)] = key
        
        def triples(self, (s, p, o)):
                qs = RDF.Statement(subject = s,
                    predicate = p,
                    object = o)
                for statement in self.model.find_statements(qs):
                        yield statement.subject,statement.predicate,statement.object
        
        def add(self, (s, p, o)):
                qs = RDF.Statement(s, p,o)
                self.model.add_statement(qs)
        
        def subjects(self, p, o):
                if isinstance(p, str):
                        p = URI(p)
                return self.model.get_sources(p,o)
        
class ThingFactory:
    """
    Fed a store, return a factory that can be used to instantiate
    Things into that world.
    """
    def __init__(self, store, schema_store=None):
        self.store = store
        if schema_store is not None:
            self.schema_store = schema_store
        else:
            self.schema_store = self.store
        #self.update_prefix_mapping()
        
    def __call__(self, name, **props):
        return _Thing(self.store, self.schema_store, name, props)
        
    def update_prefix_mapping(self):
        """
        Update the prefix-to-namespace URI mapping. store.parse()
        and possibly other methods will blow it away; call it
        afterwards (or just set your prefixes afterwards).
        """
        for namespace, prefix in self.store.ns_prefix_map.items():
            self.store.prefix_ns_map[prefix] = namespace
        
class _Thing:
    """ An RDF Resource, as uniquely identified by a URI. Properties
        of the Resource are available as attributes; for example:
        .prefix_localname is the property in the namespace mapped
        to the "prefix" prefix, with the localname "localname".
    """
    def __init__(self, store, schema_store, name, props={}):
        self._store = store
        self._schema_store = schema_store
        self._object_types = {}
        if name is None:# or name.isdigit():
            self._name = BNode()
        #elif isinstance(name, ID):
        #    self._name = name
        else:
            self._name = self._AttrToURI(name)
        for attr, obj in props.items():
            try:
                self.__setattr__(attr, obj)
            except TypeError:      ### hack
                self.__getattr__(attr).add(obj)
        
    def __getattr__(self, attr):
        if attr[0] == '_':
            return self.__dict__[attr]
        else:
            try:
                pred = self._AttrToURI(attr)
            except ValueError:
                raise AttributeError
            results = self._store.triples((self._name, pred, None))
            if self._isUniqueObject(pred):
                try:
                    obj = results.next()[2]
                    obj_type = self._getObjectType(pred, obj)
                    return self._rdfToPython(obj, obj_type)
                except StopIteration:
                    raise AttributeError
            else:
                return PropertySet(self, pred)
        
    def __setattr__(self, attr, obj):
        if attr[0] == '_':
            self.__dict__[attr] = obj
        else:
            try:
                pred = self._AttrToURI(attr)
                obj_type = self._getObjectType(pred, obj)
                if self._isUniqueObject(pred):
                    self._store.remove((self._name, pred, None))
                    self._store.add((self._name, pred, self._pythonToRdf(obj, obj_type)))
                elif isinstance(obj, (sets.BaseSet, PropertySet)):
                    PropertySet(self, pred, obj.copy())
                else:
                    raise TypeError
            except ValueError:
                raise AttributeError
        
    def __delattr__(self, attr):
        if attr[0] == '_':
            del self.__dict__[attr]
        else:
            self._store.remove((self._name, self._AttrToURI(attr), None))
        
    def _rdfToPython(self, obj, obj_type):
        """Given a RDF object and its type, return the equivalent Python object."""
        #print "rdfToPython", obj, obj.__class__
        if isinstance(obj, RDF.Node) and obj.is_literal():  # typed literals
                return SchemaToPython.get(obj_type, SchemaToPythonDefault)[0](obj)
        elif obj_type == LIST:
            return self._rdfToList(obj)
        elif obj_type == SEQ:
            l, i = [], 1
            while True:
                try:
                    item = self._store.triples((obj, URI(RDF_SEQi % i), None)).next()[2]
                    l.append(self._rdfToPython(item, None)) ### type?
                    i += 1
                except StopIteration:
                    return l
        elif isinstance(obj, RDF.Node) and obj.is_resource():
            return self.__class__(self._store, self._schema_store, obj)
        else:
            raise ValueError
        
    def _pythonToRdf(self, obj, obj_type):
        """Given a Python object and its type, return the equivalent RDF object."""
        if obj_type == LIST:
            blank = BNode()
            self._listToRdf(blank, obj)   ### this actually stores things...
            return blank
        elif obj_type == SEQ:  ### so will this
            blank = BNode()
            i = 1
            for item in obj:
                self._store.add((blank, URI(RDF_SEQi % i), self._pythonToRdf(item, None))) ### type?
                i += 1
            return blank
        elif isinstance(obj, self.__class__):
            return obj._name
        else:
            return RDF.Node(SchemaToPython.get(obj_type, SchemaToPythonDefault)[1](obj))
        
    def _rdfToList(self, subj):
        """Given a RDF list, return the equivalent Python list."""
        try:
            first = self._store.triples((subj, FIRST, None)).next()[2]
        except StopIteration:
            return []
        try:
            rest = self._store.triples((subj, REST, None)).next()[2]
        except StopIteration:
            # a list node with rdf:first but no rdf:rest is malformed
            raise ValueError
        return [self._rdfToPython(first, None)] + self._rdfToList(rest)  ### type first?
        
    def _listToRdf(self, subj, members):
        """Given a Python list, store the equivalent RDF list under subj."""
        first = self._pythonToRdf(members[0], None) ### type members[0]?
        self._store.add((subj, FIRST, first))
        if len(members) > 1:
            blank = BNode()
            self._store.add((subj, REST, blank))
            self._listToRdf(blank, members[1:])
        else:
            self._store.add((subj, REST, NIL))
        
    def _AttrToURI(self, method_name):
        """Given an attribute, return a URIRef."""
        #print "AttrToURI", method_name
        if isinstance(method_name, RDF.Node) and method_name.is_resource():
            #print "resolve", method_name.uri
            return method_name.uri
        prefix, localname = method_name.split("_", 1)
        return URI("".join([self._store.prefix_ns_map[prefix], localname]))
        
    def _URIToAttr(self, uri):
        """Given a URIRef or a URI, return an attribute."""
        for ns_uri, prefix in self._store.ns_prefix_map.items():
            #print uri, uri.__class__
            if ns_uri == str(uri)[:len(ns_uri)]:
                return "_".join([prefix, str(uri)[len(ns_uri):]])
        raise ValueError
        
    def _getObjectType(self, pred, obj):
        """Given a predicate and an object, figure out the object's type."""
        if self._object_types.has_key(pred):
            return self._object_types[pred]
        else:
            try:
                obj_type = self._schema_store.triples((pred, RDFS_RANGE, None)).next()[2]
            except StopIteration:
                obj_type = None
            self._object_types[pred] = obj_type
            return obj_type
        
    def _isUniqueObject(self, pred):
        """Given a predicate, figure out if it is limited to a single object (i.e. its maximum cardinality is one)."""
        try:
            obj_maxcard = self._schema_store.triples((pred, MAX_CARD, None)).next()[2]
        except StopIteration:
            return False
        if isinstance(obj_maxcard, RDF.Node) and obj_maxcard.is_literal():
            obj_maxcard = str(obj_maxcard)
        elif  isinstance(obj_maxcard, RDF.Node)  and obj_maxcard.is_blank():
            return True
        if int(obj_maxcard) == 1:
            return True
        else:
            return False
        
    def __repr__(self):
        return str(self._name)  # __repr__ must return a string, not a node
        
    def __str__(self):
        return self._URIToAttr(self._name)
        
    def properties(self):
        """List unique properties."""
        return [str(self.__class__(self._store, self._schema_store, p) )
          for (s,p,o) in self._store.triples((self._name, None, None))]
        
class PropertySet:
    """
    A set interface to the object(s) of a non-unique RDF predicate. Interface is a subset
    (har, har) of sets.Set. .copy() returns a sets.Set instance.
    """
    def __init__(self, subject, predicate, iterable=None):
        self._subject = subject
        self._predicate = predicate
        self._store = subject._store
        if iterable is not None:
            for obj in iterable:
                self.add(obj)
    def __len__(self):
        return len(list(self._store.triples((self._subject._name, self._predicate, None))))
    def __contains__(self, obj):
        if not isinstance(obj, self._subject.__class__):
            obj_type = self._subject._getObjectType(self._predicate, obj)
            # rdflib's Literal is not imported here; build a Redland node as _pythonToRdf does
            obj = RDF.Node(SchemaToPython.get(obj_type, SchemaToPythonDefault)[1](obj))
        try:
            self._store.triples((self._subject._name, self._predicate, obj)).next()
            return True
        except StopIteration:
            return False
    def __iter__(self):
        for obj in self._store.triples((self._subject._name, self._predicate, None)):
            obj_type = self._subject._getObjectType(self._predicate, obj)
            yield self._subject._rdfToPython(obj[2], obj_type)
    def copy(self):
        return sets.Set(self)
    def add(self, obj):
        obj_type = self._subject._getObjectType(self._predicate, obj)
        self._store.add((self._subject._name, self._predicate,
          self._subject._pythonToRdf(obj, obj_type)))
    def remove(self, obj):
        if not obj in self:
            raise KeyError
        self.discard(obj)
    def discard(self, obj):
        if not isinstance(obj, self._subject.__class__):
            obj_type = self._subject._getObjectType(self._predicate, obj)
            # rdflib's Literal is not imported here; build a Redland node as _pythonToRdf does
            obj = RDF.Node(SchemaToPython.get(obj_type, SchemaToPythonDefault)[1](obj))
        self._store.remove((self._subject._name, self._predicate, obj))
    def clear(self):
        self._store.remove((self._subject._name, self._predicate, None))
        
SchemaToPythonDefault = (unicode, unicode)
SchemaToPython = {  #  (schema->python, python->schema)  Does not validate.
    'http://www.w3.org/2001/XMLSchema#string': (unicode, unicode),
    'http://www.w3.org/2001/XMLSchema#normalizedString': (unicode, unicode),
    'http://www.w3.org/2001/XMLSchema#token': (unicode, unicode),
    'http://www.w3.org/2001/XMLSchema#language': (unicode, unicode),
    'http://www.w3.org/2001/XMLSchema#boolean': (bool, lambda i:unicode(i).lower()),
    'http://www.w3.org/2001/XMLSchema#decimal': (float, unicode),
    'http://www.w3.org/2001/XMLSchema#integer': (long, unicode),
    'http://www.w3.org/2001/XMLSchema#nonPositiveInteger': (int, unicode),
    'http://www.w3.org/2001/XMLSchema#long': (long, unicode),
    'http://www.w3.org/2001/XMLSchema#nonNegativeInteger': (int, unicode),
    'http://www.w3.org/2001/XMLSchema#negativeInteger': (int, unicode),
    'http://www.w3.org/2001/XMLSchema#int': (int, unicode),
    'http://www.w3.org/2001/XMLSchema#unsignedLong': (long, unicode),
    'http://www.w3.org/2001/XMLSchema#positiveInteger': (int, unicode),
    'http://www.w3.org/2001/XMLSchema#short': (int, unicode),
    'http://www.w3.org/2001/XMLSchema#unsignedInt': (long, unicode),
    'http://www.w3.org/2001/XMLSchema#byte': (int, unicode),
    'http://www.w3.org/2001/XMLSchema#unsignedShort': (int, unicode),
    'http://www.w3.org/2001/XMLSchema#unsignedByte': (int, unicode),
    'http://www.w3.org/2001/XMLSchema#float': (float, unicode),
    'http://www.w3.org/2001/XMLSchema#double': (float, unicode),  # doesn't do the whole range
#    duration
#    dateTime
#    time
#    date
#    gYearMonth
#    gYear
#    gMonthDay
#    gDay
#    gMonth
#    hexBinary
    'http://www.w3.org/2001/XMLSchema#base64Binary': (base64.decodestring, lambda i:base64.encodestring(i)[:-1]),
    'http://www.w3.org/2001/XMLSchema#anyURI': (str, str),
}
        
if __name__ == '__main__':
    # use: "python -i sparta.py [URI for RDF file]"
    import sys
    store = TripleStore()
    store.load(sys.argv[-1])  # the TripleStore above defines load(), not parse()
    Thing = ThingFactory(store)
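
The heart of the binding is the attribute naming rule described in the _Thing docstring: prefix_localname expands to a namespace URI plus a local name, and back again. Below is a minimal standalone sketch of just that rule, written for a current Python 3 rather than the Python 2 of the listing, with plain strings instead of Redland nodes. The PREFIX_NS table and both function names are illustrative assumptions, not part of Sparta.

```python
# Hypothetical prefix table; in sparta.py the equivalent data lives on the
# store as prefix_ns_map / ns_prefix_map.
PREFIX_NS = {"foaf": "http://xmlns.com/foaf/0.1/"}


def attr_to_uri(attr, prefix_ns=PREFIX_NS):
    """Expand 'prefix_localname' by splitting at the first underscore."""
    if "_" not in attr:
        raise ValueError("expected prefix_localname, got %r" % attr)
    prefix, localname = attr.split("_", 1)
    return prefix_ns[prefix] + localname


def uri_to_attr(uri, prefix_ns=PREFIX_NS):
    """Collapse a URI back into 'prefix_localname' using the first matching namespace."""
    for prefix, namespace in prefix_ns.items():
        if uri.startswith(namespace):
            return prefix + "_" + uri[len(namespace):]
    raise ValueError("no prefix registered for %r" % uri)


name_uri = attr_to_uri("foaf_name")
round_trip = uri_to_attr(name_uri)
```

Splitting at the first underscore only means local names may themselves contain underscores, but prefixes may not.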

I have also put together a preliminary RDF-object mapper.

#!/usr/bin/env python

"""
    spartaObj.py  (object mapper for Sparta)
    Copyright 2005 John C Barstow

THIS SOFTWARE IS SUPPLIED WITHOUT WARRANTY OF ANY KIND, AND MAY BE
COPIED, MODIFIED OR DISTRIBUTED IN ANY WAY, AS LONG AS THIS NOTICE
AND ACKNOWLEDGEMENT OF AUTHORSHIP REMAIN.

TODO:
 * document / refactor
 * unit tests
 * relicense under GPL
"""
import sparta
TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
from RDF import Uri as URI
import RDF
        
class ObjectFactory(sparta.ThingFactory):
        def __init__(self, store, schema_store=None):
                sparta.ThingFactory.__init__(self,store,schema_store)
                self.typemap = {}
        
        def __call__(self,name, **props):
                thing = sparta._Thing(self.store, self.schema_store, name, props)
                for t in thing.rdf_type:
                        if isinstance(t, RDF.Node):
                                print "node URI?", t.is_resource()
                        if self.typemap.has_key(sparta.URI(t._name)):
                                return self.typemap[sparta.URI(t._name)](thing)
                return thing
        
        def MapType(self, uri, targettype):
                self.typemap[uri] = targettype
        def instancesOf(self, uriText):
                l = [self(s) for s in self.store.subjects(TYPE, URI(uriText))]
                #print "instances of", uriText, l
                return l
class thingType:
        def __init__(self, thing):
                self.thing = thing
        
        def __getattr__(self, attr):
                if self.__dict__.has_key(attr):
                        return self.__dict__[attr]
                return self.thing.__getattr__(attr)
        
        def __setattr__(self,attr,obj):
                if attr=="thing":
                        self.__dict__[attr]=obj
                        return
                try:
                        self.thing.__setattr__(attr,obj)
                except AttributeError:
                        self.__dict__[attr]=obj
        
        def properties(self):
                return self.thing.properties()

23 Feb 2005

US outsourcing torture

Filed under: — JBowtie @ 3:11 pm

OUTSOURCING TORTURE, in the New Yorker.

Boudella’s wife said that she was astounded that her husband could be seized without charge or trial, at home during peacetime and after his own government had exonerated him. The term “enemy combatant” perplexed her. “He is an enemy of whom?” she asked. “In combat where?” She said that her view of America had changed. “I have not changed my opinion about its people, but unfortunately I have changed my opinion about its respect for human rights,” she said. “It is no longer the leader in the world. It has become the leader in the violation of human rights.”

There are no words to convey the depth of my outrage and anger.

19 Feb 2005

XML vs RDF

Filed under: — JBowtie @ 10:15 pm

Edd Dumbill’s Weblog: Behind the Times

Edd discusses why he chose RDF over XML for the DOAP project.
It’s really a very good post and I urge you all to read it. There are two points in his post I want to run with.

First, his example of RDF/XML markup looks like normal XML. This is a very, very important thing - it looks absolutely nothing whatsoever like the horrible, horrible ugly syntax the W3C documents use for their examples.

Secondly, in the list of “For XML” points, Edd says:

It’s hard to lock an RDF vocabulary down, should you want to.

Locking down an RDF vocabulary is counterproductive. AIs and ontologies and all those other things you personally don’t care about can slurp up and annotate and derive useful information from an RDF vocabulary, and you get the benefits when you can pick up all those terms you understand in data all over the web, learning new things and perspectives as you go.

Everything we do involves change; our understanding of our data and the things we are trying to model changes, our requirements change, new people accessing our data bring exciting new viewpoints to our model. Locking down a vocabulary means you can’t assimilate new ideas. RDF is far more agile than, say, a SQL database, because it handles change so much more gracefully.

I think Edd’s work is a really good example of RDF as distributed XML. Note how DOAP defines all its own terms, then uses an ontology to map to other well-known vocabularies.

17 Feb 2005

gChargen 0.2

Filed under: — JBowtie @ 12:04 am

Well, it looks like I have met the functionality milestones for an alpha release. I need to do some basic sanitation (license files, dependency checking) and I can then unleash this monster on the world.
Unfortunately the data files are a tricky problem due to Open Gaming Content restrictions; some of my test cases include closed content. Meaning nobody will be able to use it for, well, a while. And being able to load and save is part of the next milestone.
Maybe I should keep it to myself until 0.4 after all…

Upgrade complete

Filed under: — JBowtie @ 12:00 am

The upgrade went relatively painlessly. I am very happy.

All my posts (including the old ones) now have sane permalinks that appear in the RSS feeds. I will need to regenerate some FOAF information now that the old plugin no longer works, and I will be looking to add the CC licensing info to future posts.

16 Feb 2005

Apology in advance

Filed under: — JBowtie @ 11:13 pm

I’m about to upgrade my blogging software, so please ignore any insanely strange things that happen in the next 24 hours.

In the meantime, watch the hilarious reactions to the announcement of IE7 on the day when Firefox saw 25 million downloads.

14 Feb 2005

evolt.org - Browser Archive

Filed under: — JBowtie @ 4:29 pm

This is a reminder to self - all the web browsers I might ever need to test against in one place. Isn’t the diversity HTTP has spawned amazing?

evolt.org - Browser Archive

Too long since last post

Filed under: — Ama @ 9:04 am

Hmm I’ve been getting lazy - mostly because of how popular this blog has started to get, but John assures me I should continue posting! :BLUSH:

Anyway. Cecily is walking EVERYwhere, faster and faster. She even carries around rather heavy/bulky items like the plastic chair her nanny gave her for her birthday. It’s all I can do to keep her out of trouble out in public; I’m actually considering using that toddler harness I bought ages ago! There’s a reason that most kids don’t start walking till a bit later - 1 year olds have no business being so fast!
To top that stress, she’s become an actual toddler. That is to say, besides the walking we have full-on tantrums and, what is it they call it here, wobblies? The tantrums are kinda cute, since it’s just her throwing herself forward over her front legs and ‘crying’. It’s easy enough to ignore (though I have to hide the giggle), which is what was suggested to me - it seems to work, no audience makes it boring for her! The wobblies are a bit more annoying… she actually screams, very high pitched, with this look of utter anger and shock that something isn’t going her way. :D I just ignore it too, but my poor ears are paying for it!
And now poor John… the kid has decided 5 am is a good time to wake, and because of my meds sedating me he’s the one that has to get up with her. Wonder how that will go? We were so used to her going down by 8pm and up, at the earliest, by 7:30am. Ah the good ‘ole days. In contrast though, I’m getting longer naps from her during the day :D Can’t say that bothers me much, but that’s being a little selfish.

Stats for little from 1 year plunket visit: 80.5cm 11kg, “advanced in language (has a 30+ word vocab) and gross and delicate motor skills such as walking/balance (catches herself very well before falling, and can carry items while walking quickly) and hand use (can hold a pen with her fingers, not whole hand, while scribbling)”
So basically, she’s above the 95th percentile on the plunket graph for height, just below it for weight (therefore proportionate), and any suggested reading material for age groups has to be given 4-8 months in advance. Not bad for an (almost) 13 month old! It’s so fun to read child development books because she’s always doing things months in advance, sometimes 6 months, like stacking 8 blocks.

Okay enough show off :D I better change said genius, I can smell her across the room. I wonder if potty training will be sooner too!

08 Feb 2005

RDF as database

Filed under: — JBowtie @ 10:49 am

Yesterday at a job interview, someone asked me about O/R mapping tools. It’s not really a problem I’ve actually given a lot of thought to recently, and I realized why this morning. I promise part 4 of the refactoring next post.

Relational databases are going to be dead in five years. OK, not really dead, everyone has way too much invested, too useful for small problems, etc. But the real action will be in RDF-as-database.

As others have pointed out over on Planet RDF, RDF can be seen as a database. You have primary keys (the rdf:ID) and foreign keys (rdf:resource) and lots and lots of relationships. But RDF is far more flexible, automatically distributed, and allows your data to become agile. Add to that multiple serialization formats, the ability to evolve ontologies and schemas, and the fact that you can use AI reasoners to do your data mining and deduce new properties and relationships, and we have a winner.

Now, this is a bold prediction, so let’s try and back it up with a little more detail.

One - XML allows for the natural expression of hierarchies and parts explosions; this is one of those things that relational models have trouble with. My little refactoring series hopefully shows that moving from pure XML to RDF is trivial.
As more and more of the world’s data moves towards XML, tools for indexing and querying it are becoming more and more powerful. These same tools can be used for RDF serialized as XML.

Two - Agile development methodologies are permeating languages such as C# and Python. But as our code becomes more flexible and sees higher rates of change, the relational models have difficulty keeping up. Refactoring the database is complicated by the general rigidity of the model, the need to transform data and stored procedures using DDL, and the difficulty in distributing changes.
RDF+OWL, on the other hand, makes it very easy to evolve your data. Data can be added and removed incrementally, the usual formats are text-based, meaning diff and patch and version control systems all play nicely with it, and the metadata becomes just more data.

Three - Currently, data warehousing and data mining require specialized tools and knowledge. RDF allows general-purpose reasoners to work with data anywhere on the web, extracting new relationships and spotting trends. Thanks to OWL, domain knowledge can now be written down in a form that allows a general-purpose program to make use of it - and that definition can also evolve over time.

Four - SPARQL is close enough to SQL that a lot of developers will be able to transfer their hard-won skills to the new medium. Yes, you need to learn to think in triples, but if you already think in terms of JOINs that’s not a big leap.

Five - The triple translates nicely into relational space. One to three tables, depending on your favorite representation, a handful of stored procedures and you can use today’s engines. Serialize as XML and you can use tomorrow’s engines. And in a few years high-performance triple-based engines with native SPARQL support will be in wide use.
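
Points four and five can be made concrete with the single-table representation. The sketch below uses Python's built-in sqlite3 module; the table layout, the ex:/foaf: names and the sample data are assumptions for illustration, not a recommended schema. A two-pattern SPARQL query becomes a self-join on the triples table.

```python
import sqlite3

# One-table representation: every statement is a (subject, predicate, object) row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:alice", "foaf:name", "Alice"),
        ("ex:alice", "foaf:knows", "ex:bob"),
        ("ex:bob", "foaf:name", "Bob"),
    ],
)

# Roughly the SPARQL pattern
#   SELECT ?name WHERE { ex:alice foaf:knows ?friend . ?friend foaf:name ?name }
# Each triple pattern is one alias of the table; the shared variable ?friend
# becomes the join condition.
rows = conn.execute(
    """
    SELECT t2.object
    FROM triples AS t1
    JOIN triples AS t2 ON t2.subject = t1.object
    WHERE t1.subject = 'ex:alice'
      AND t1.predicate = 'foaf:knows'
      AND t2.predicate = 'foaf:name'
    """
).fetchall()
friend_names = [row[0] for row in rows]
```

Every extra triple pattern in a query adds one more self-join, which is why thinking in JOINs carries over so directly.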

RDF has some natural advantages that make it a contender. But it is OWL and SPARQL that give it real traction, because those allow the data people to transfer their current skills into the new space. And once you have your head around the RDF/OWL combination, you finally understand just enough to actually make effective use of an AI - maybe those will be a widespread reality this time around.

03 Feb 2005

python IDE requirements

Filed under: — JBowtie @ 2:11 pm

I thought I’d record my feelings on what I want to see in a Python IDE. For the moment, I’m quite happy to stick with GEdit, but I might look at pulling something together later.

For GUI development, I want integrated Glade support. Maybe the Gazpacho editor could be built on.

Some sort of GUI similar to jUnit/NUnit for all unit tests. I should be able to hit a button and get a green or red bar at any time, with a tree drill-down to individual suites and tests.

A Refactoring menu that leverages BicycleRepairMan to automate common refactorings. Highlight code, select “Extract Method”, type method name, done.

A graphical, single-step debugger. The language has debugging capabilities built in, this should not be hard.

Syntax highlighting and decent indentation management (i.e., automatically adopt what the current file does). Code completion should be trivial to implement using class dictionaries.

Integration with Subversion (using pysvn) for source control commands. Having TRAC bug reports appear as TODO items is a cool extra.

Creating new projects should handle the usual open-source boilerplate (README, INSTALL, HACKING, ….) and allow selection of a license and default indentation policy.

Some sort of distutils magic would be good, especially for modules: public modules intended to go into site-path versus private modules intended for a single application.

That’s all I really need. It’s not a huge feature list and one could leverage plenty of existing projects for most of the functionality. Like all good Python projects, it’s just about tying together good libraries.
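
On the code completion point, here is a rough sketch of what "using class dictionaries" could look like: collect names from the instance dictionary and from every class in the method resolution order, then filter by the typed prefix. The complete function and the Character class are made up for the example; a real IDE would also need to cope with code that cannot safely be imported.

```python
def complete(obj, prefix):
    """Return the attribute names of obj that start with prefix.

    Names come from the instance dictionary plus the dictionary of every
    class in the method resolution order.
    """
    names = set(getattr(obj, "__dict__", {}))
    for klass in type(obj).__mro__:
        names.update(vars(klass))
    return sorted(name for name in names if name.startswith(prefix))


class Character(object):
    """Toy class standing in for whatever object the editor is inspecting."""

    def __init__(self):
        self.name = "Vara"

    def name_upper(self):
        return self.name.upper()

    def level_up(self):
        pass


suggestions = complete(Character(), "na")
```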

01 Feb 2005

More on assertions

Filed under: — JBowtie @ 10:58 pm

I wandered pretty far off-topic in my last post, so let me add some more thoughts.

It is possible, through the addition of metadata about statements, to negate, change, or override existing statements. However, I do not believe there is a well-documented, carefully thought out consensus on how to do so.

This consensus needs to come about, and it needs to be a core part of RDF or OWL (probably the latter) so that it is widely supported by reasoners. In the next few years, there will be hordes of newcomers to this technology, and they all will be struggling with the same issues. If there is not a ready answer, they will declare RDF a failure and go back to their XML files with OO wrappers.

Here are a couple of minor proposals to start the ball rolling.


  John C Barstow

This is the opposite of rdf:Description. Instead of telling the parser to add the assertion, it instructs the parser to remove any such statement and ignore it in future.


  

This tells the parser to expect a contradiction and accept this version as true. I would imagine most reasoners would actually add some metadata or adjust definitions to avoid unqualified contradictions.

When Do Assertions Become Facts?

Filed under: — JBowtie @ 1:21 pm

In his post, Semantic Wave: When Do Assertions Become Facts?, Jamie Pitts struggles with some of the same issues I have been struggling with recently.

Jamie notes:

As time passes, the latest assertions about role will inevitably contradict previous assertions.

For some relationships, such as an individual’s role, the addition of a property indicating “effective dates” is an appropriate way of avoiding contradiction.
All individuals fulfill a role for limited time; that’s why we put dates on resumes. It’s entirely possible and desirable to reflect this in RDF.
As with the database, the “current” understanding is really an implicit query looking for statements that are effective today. Thankfully, it’s much easier to model statements about statements in RDF than in relational models.
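
A small sketch of that implicit query, in plain Python rather than RDF: each statement carries an effective date range, and "current" just means "effective on a given day". The tuple layout, the ex: names and the dates are invented for illustration.

```python
from datetime import date

# (subject, predicate, object, effective_from, effective_to)
# effective_to is None while no end date has been asserted yet.
statements = [
    ("ex:jamie", "ex:role", "Developer", date(2001, 3, 1), date(2003, 6, 30)),
    ("ex:jamie", "ex:role", "Architect", date(2003, 7, 1), None),
]


def effective_on(stmts, subject, predicate, day):
    """Return the objects asserted for (subject, predicate) that hold on day."""
    return [
        obj
        for (s, p, obj, start, end) in stmts
        if s == subject
        and p == predicate
        and start <= day
        and (end is None or day <= end)
    ]


role_in_2002 = effective_on(statements, "ex:jamie", "ex:role", date(2002, 1, 1))
role_in_2005 = effective_on(statements, "ex:jamie", "ex:role", date(2005, 2, 1))
```

Nothing is ever overwritten here: a change of role is recorded by closing one date range and adding a new statement.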

It goes without saying that interpretations of reality are formed in the mind through an ongoing process of re-assessment. We qualitatively compare present impressions with recent impressions. Enough contradiction, and we form a new working state of understanding.

Well, this is really AI territory, isn’t it? An AI needs to recognize assertions that contradict its current knowledge base, decide whether to resolve the contradiction by throwing out the new or old assertion, and re-initialize its deduction/inference engine to recreate all derived knowledge.

Hmmm…let’s run with that for a minute (thinking out loud here).

Here’s an AI; it manages one or more knowledge bases in RDF/OWL and serves up answers to SPARQL queries, possibly formulated by some NLS parser.
It gathers information by spidering or exchanging RDF with other AIs. It reasons by tapping external deduction/inference engines.
One of the knowledge bases contains trust metrics, used to weight the new RDF statements.
A reasoner is set up to check the new information for internal consistency - if the information contradicts itself the source is presented a proof and metrics may be adjusted.
A reasoner then compares the new information with the one or more existing knowledge bases looking for contradictions. Multiple reasoners should be consulted.
If no contradictions are found, the new information is added. If contradictions are found, the stronger assertion persists; strength depends on many metrics, including trustworthiness of original sources, degree of corroboration from other sources, metrics concerning number of other assertions affected, and so forth.
One or more reasoning daemons take the new knowledge base and deduce as many new facts as possible. The strength of the underlying premises determines the strength of the new assertions.
A couple of things we can add - an imagination. Create random triples, then use a reasoner to look for contradictions or proofs. Add surviving assertions to the knowledge base. We can try less random things such as making assertions about superclasses or removing restrictions.
How about experimentation? Some sensors and motors, and now we can start adding experimentally derived facts to the knowledge base.
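
The "stronger assertion persists" step can be sketched in a few lines. This is a deliberately crude toy: every predicate is treated as single-valued, strength is nothing more than a per-source trust number, and no reasoner is involved. The TRUST table, the assimilate function and the ex: names are all assumptions made for the example.

```python
# Hypothetical trust metrics per source, in the range 0.0 to 1.0.
TRUST = {"ex:official": 0.9, "ex:wiki": 0.6, "ex:rumour": 0.2}


def assimilate(kb, subject, predicate, obj, source):
    """Add an assertion to kb; on contradiction keep the stronger one.

    kb maps (subject, predicate) to (object, strength). Returns True if the
    new assertion is what the knowledge base holds afterwards.
    """
    strength = TRUST.get(source, 0.0)
    key = (subject, predicate)
    if key not in kb:
        kb[key] = (obj, strength)
        return True
    held_obj, held_strength = kb[key]
    if held_obj == obj:
        # Corroboration: same claim again, remember the best backing so far.
        kb[key] = (obj, max(held_strength, strength))
        return True
    if strength > held_strength:
        # Contradiction, and the newcomer is better supported.
        kb[key] = (obj, strength)
        return True
    return False


kb = {}
first = assimilate(kb, "ex:sparta", "ex:backend", "rdflib", "ex:wiki")
second = assimilate(kb, "ex:sparta", "ex:backend", "Redland", "ex:rumour")
third = assimilate(kb, "ex:sparta", "ex:backend", "Redland", "ex:official")
```

A fuller version would keep the losing assertion around with its metrics instead of discarding it, so that later corroboration could revive it.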

Of course, for Jamie’s example, he can accomplish what he wants simply by adding new statements. All that’s needed is a shift in thinking; you’re not making assertions about the current state of things, you’re making assertions about a date range that happens to include today (and possibly not making any assertions, yet, about the end of the date range). Now facts are not changing, you’re merely acquiring new facts.

What I want to do with RDF

Filed under: — JBowtie @ 9:46 am

Some of the tasks I want to accomplish with RDF, in no particular order.

Keep track of the relationships between characters in a novel, movie, or role-playing game. (FOAF?) Maybe I will finally figure out who shot Mr. Burns.

Create or query for NPCs with some stated constraints, such as “a level 4 character with the Quick Draw feat and at least 5 ranks in the Tumble skill.” (OWL or SPARQL) A good random character generator. No, really.

Represent the semantics of a snippet of Python code with OWL. Generate Python code from an OWL Full description (since both Python and OWL Full allow classes to be treated as instances). Because I am a geek.

Generate human-readable RPG rules from RDF. I can now find and eliminate contradictions in the rule sets before publication.

Write a reasoner that can play a card game like Rokugan, with each card and its rules represented in RDF. No longer do I have to convince my wife to play.

Encode enough semantic meaning into a screenplay that I can change any detail about a character and have all dialogue and descriptions reflect this. This will of course displace the standard Hollywood script writer, especially when the program is smart enough to handle replacing a human character with a talking dog.
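
As a taste of the NPC query in the list above, here is a sketch over plain Python triples rather than a real RDF store or SPARQL engine. The npc: and rpg: names, the sample characters, and the choice to model Tumble ranks as a single rpg:tumbleRanks property are all assumptions made for the example.

```python
from collections import defaultdict

# Toy data: (subject, predicate, object) triples describing three NPCs.
triples = [
    ("npc:vara", "rpg:level", 4),
    ("npc:vara", "rpg:feat", "Quick Draw"),
    ("npc:vara", "rpg:tumbleRanks", 6),
    ("npc:dorn", "rpg:level", 4),
    ("npc:dorn", "rpg:feat", "Power Attack"),
    ("npc:dorn", "rpg:tumbleRanks", 8),
    ("npc:ilse", "rpg:level", 4),
    ("npc:ilse", "rpg:feat", "Quick Draw"),
    ("npc:ilse", "rpg:tumbleRanks", 2),
]


def index(stmts):
    """Group objects by subject and predicate: subject -> predicate -> set."""
    props = defaultdict(lambda: defaultdict(set))
    for s, p, o in stmts:
        props[s][p].add(o)
    return props


def find_npcs(stmts, level, feat, min_tumble_ranks):
    """Return subjects at the given level, with the feat and enough Tumble ranks."""
    return sorted(
        subject
        for subject, p in index(stmts).items()
        if level in p["rpg:level"]
        and feat in p["rpg:feat"]
        and any(r >= min_tumble_ranks for r in p["rpg:tumbleRanks"])
    )


matches = find_npcs(triples, level=4, feat="Quick Draw", min_tumble_ranks=5)
```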
