I’ve been reading about Topic Maps, an ISO standard that has been
augmented with an XML representation.
Topic maps are very similar to RDF, in that they
are all about graphs of topics (representing real-world
subjects) connected with associations. The difference is that
the topic-maps paradigm seems easier to understand. Maybe it’s
because they draw a distinction between the topics and the
subjects they stand in for, whereas RDF tends to
conflate the two. Or maybe it’s the way a few important
relationships (like occurrence and instanceOf) are treated
specially in topic maps, which makes maps a little less
bewilderingly generic.
Topic maps have a system of using URIs to stand in for particular
abstract subjects.
Separate topic maps using the same URI
http://www.topicmaps.org/xtm/language.xtm#en
as the subject indicator for the English language
know they are referring to the same thing.
When they are merged, the corresponding topics will be combined
automatically. One of the activities of various topic-map
committees is creating published subject indicators for
various generically useful types of topic, in order to promote
interoperability between topic maps.
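To make the merging rule concrete, here is a minimal sketch in Python. It assumes a topic is nothing more than a set of subject-indicator URIs plus a set of names; the `Topic` class and `merge_topic_maps` function are my own illustrative names, not part of any topic-map toolkit, and real merging involves more rules than this.

```python
from dataclasses import dataclass, field

# The subject indicator for the English language quoted above.
EN = "http://www.topicmaps.org/xtm/language.xtm#en"


@dataclass
class Topic:
    """Hypothetical, stripped-down topic: indicators and names only."""
    subject_indicators: set = field(default_factory=set)
    names: set = field(default_factory=set)


def merge_topic_maps(*topic_maps):
    """Combine topics from several maps when they share a subject indicator."""
    merged = []
    for topic_map in topic_maps:
        for topic in topic_map:
            combined = Topic(set(topic.subject_indicators), set(topic.names))
            untouched = []
            for existing in merged:
                if existing.subject_indicators & combined.subject_indicators:
                    # Same subject: fold the existing topic into the new one.
                    combined.subject_indicators |= existing.subject_indicators
                    combined.names |= existing.names
                else:
                    untouched.append(existing)
            merged = untouched + [combined]
    return merged


map_a = [Topic({EN}, {"English"})]
map_b = [Topic({EN}, {"anglais"})]
for topic in merge_topic_maps(map_a, map_b):
    print(sorted(topic.names))
```

The two maps never mention each other; the shared URI alone is what makes their topics collapse into one.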
Other (meta)data systems use URIs to represent subjects:
RDF does (using a weird convention where XML element names turn into URIs),
and RSS 0.9x/2.0 does (inasmuch as category names may be interpreted
relative to a domain specified by a URL). It would
be kind of cool if we could all agree to use the same
subject identifiers, so our various efforts interoperate as much
as possible.
I had an idle thought about using topic-maps’
processing model (or even topic maps themselves) to represent
the information encoded in RSS and RSS data resources. The attraction is that
topic maps have a concept of merging built in, so writing an
aggregator would in principle be straightforward.
Obviously stories are topics, and categories are topics. What
about dates? These become what in topic maps are called
occurrences (the concept of occurrence is stretched a little).
I assume any topic-map query-engine is willing to do
grouping and sorting on topics according to occurrence values?
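As a rough sketch of the kind of query I have in mind, here is some Python that groups story topics by the value of a date occurrence and sorts the groups. The dictionary layout and the `PUBLISHED` occurrence type are assumptions made up for illustration, and the dates are assumed to be ISO 8601 strings, which happen to sort chronologically as plain text.

```python
from collections import defaultdict

# Hypothetical occurrence type; a real map would use a topic or a PSI here.
PUBLISHED = "published"

stories = [
    {"name": "Story A", "occurrences": {PUBLISHED: "2002-10-13"}},
    {"name": "Story B", "occurrences": {PUBLISHED: "2002-09-22"}},
    {"name": "Story C", "occurrences": {PUBLISHED: "2002-10-13"}},
]


def group_by_occurrence(topics, occurrence_type):
    """Return (value, [topic names]) pairs, sorted by occurrence value."""
    groups = defaultdict(list)
    for topic in topics:
        value = topic["occurrences"].get(occurrence_type)
        if value is not None:
            groups[value].append(topic["name"])
    return sorted(groups.items())


for day, names in group_by_occurrence(stories, PUBLISHED):
    print(day, names)
```

This only works because every value is in the same format, which is exactly the assumption that breaks down once maps from different sources are merged.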
This got me thinking about dates in metadata. Most metadata
examples I’ve seen use what might be called free-form
dates. This is perhaps OK within one, isolated database, but
when I am merging two topic-maps, how does my
computer know how to compare dates in random formats like
13 Oct 02 and 2002-09-22? I would
rather not rely on the cute guessing games that programs like
Microsoft® Excel resort to (e.g., if I enter 12-09-2002
and 13-09-2002 in separate cells, they end up holding
9 December and 13 September).
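The ambiguity is easy to reproduce. The sketch below tries a date string against both day-first and month-first patterns using Python's standard `datetime.strptime`; the `parse_candidates` helper is just an illustrative name, and this is not a claim about how any particular spreadsheet does its guessing.

```python
from datetime import datetime


def parse_candidates(text, formats=("%d-%m-%Y", "%m-%d-%Y")):
    """Return every distinct date the text could denote under the formats."""
    results = set()
    for fmt in formats:
        try:
            results.add(datetime.strptime(text, fmt).date())
        except ValueError:
            # The text does not fit this format (e.g. month 13).
            pass
    return sorted(results)


print(parse_candidates("12-09-2002"))  # two possible readings
print(parse_candidates("13-09-2002"))  # only one reading is valid
```

A program that guesses has to pick one reading for the first string, and nothing in the string itself says which one was meant.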
My suggestion is that occurrence types subclass special
topics that are conventionally used for dates in particular
formats. These special topics in turn would require published
subject indicators (PSIs). I have created a page that contains PSIs for Date-Time occurrence
types to illustrate the idea. Note: this is just for
discussion. Also, it really needs an attached
machine-processable metadata resource (in XTM, say).
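Here is a sketch of how software might use such occurrence types. The two PSI URIs below are placeholders invented for this example (they are not the ones on my page), and the mapping from PSI to a `strptime` pattern is an assumption about how an implementation could work.

```python
from datetime import datetime

# Hypothetical PSIs naming two date formats; placeholders only.
ISO_DATE = "http://psi.example.org/datetime/#iso-8601-date"
DAY_MONTHNAME_YEAR2 = "http://psi.example.org/datetime/#day-monthname-year2"

# Each date-format topic maps to one unambiguous parsing pattern.
# Note: %b depends on the locale; English month names are assumed here.
FORMATS = {
    ISO_DATE: "%Y-%m-%d",
    DAY_MONTHNAME_YEAR2: "%d %b %y",
}


def parse_occurrence(occurrence_type, value):
    """Parse a date occurrence according to the format its type declares."""
    fmt = FORMATS.get(occurrence_type)
    if fmt is None:
        raise ValueError(f"unknown date occurrence type: {occurrence_type}")
    return datetime.strptime(value, fmt).date()


a = parse_occurrence(DAY_MONTHNAME_YEAR2, "13 Oct 02")
b = parse_occurrence(ISO_DATE, "2002-09-22")
print(b < a)  # the two dates are now directly comparable
```

Because the format travels with the occurrence type, merged maps can compare dates without any guessing.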
Another slightly weird approach would be to reify the dates.
That is, create topics representing the dates themselves. The
relationship between a story and the date it is published on then
is represented as an association between the story topic and the
date topic. Date topics would use PSIs with a format like
http://psi.example.com/2002/date/#2002-09-22
or
http://psi.example.com/2002/date/?d=2002-09-22
(the
latter has the advantage that the server can run a check on the
format of its parameter). Probably less efficient than using occurrences.
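For what it's worth, generating and checking the fragment-style PSIs is simple. This sketch uses the example base URI from above; the function names are made up, and the rule that the fragment must be an ISO 8601 calendar date is my assumption.

```python
from datetime import date

# Base URI taken from the example above.
BASE = "http://psi.example.com/2002/date/"


def date_psi(d):
    """Build the PSI for a date topic, e.g. .../date/#2002-09-22."""
    return f"{BASE}#{d.isoformat()}"


def date_from_psi(psi):
    """Recover the date from a date PSI, rejecting anything malformed."""
    base, sep, fragment = psi.partition("#")
    if base != BASE or not sep:
        raise ValueError(f"not a date PSI: {psi}")
    # fromisoformat raises ValueError if the fragment is not a valid date.
    return date.fromisoformat(fragment)


psi = date_psi(date(2002, 9, 22))
print(psi)
print(date_from_psi(psi))
```

Two maps that reify the same day this way end up with the same PSI, so their date topics merge like any other topics.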