Thinking about Topic Maps and dates
I had an idle thought about using topic-maps’ processing model (or even topic maps themselves) to represent the information encoded in RSS and RSS data resources. The attraction is that topic maps have a concept of merging built in, so writing an aggregator would in principle be straightforward.
Obviously stories are topics, and categories are topics. What about dates? These become what topic maps call occurrences (stretching the concept of occurrence a little). I assume any topic-map query engine is willing to group and sort topics according to occurrence values?
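To make the merging idea concrete, here is a minimal sketch in Python. It is not a real topic-map engine: the `Topic` class, the `merge` function, and the example.com story URIs are all invented for illustration, and it assumes at most one occurrence per type, which real topic maps do not require.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    subject: str                                      # subject identifier (a URI)
    names: set = field(default_factory=set)
    occurrences: dict = field(default_factory=dict)   # occurrence type -> value


def merge(*topic_maps):
    """Merge topic maps: topics sharing a subject identifier become one topic."""
    merged = {}
    for topic_map in topic_maps:
        for topic in topic_map:
            target = merged.setdefault(topic.subject, Topic(topic.subject))
            target.names |= topic.names
            target.occurrences.update(topic.occurrences)
    return merged


# Two hypothetical feeds that both mention story 1.
feed_a = [Topic("http://example.com/story/1", {"First story"}, {"date": "2002-09-22"})]
feed_b = [
    Topic("http://example.com/story/1", {"First story (syndicated)"}),
    Topic("http://example.com/story/2", {"Second story"}, {"date": "2002-09-13"}),
]

merged = merge(feed_a, feed_b)

# Sorting on the date occurrence only works because both values use the
# same format (ISO 8601 strings sort chronologically).
by_date = sorted(merged.values(), key=lambda t: t.occurrences["date"])
ordered_subjects = [t.subject for t in by_date]
```

The sort at the end is the weak point: it gives a sensible order only if every feed writes its dates the same way.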
This got me thinking about dates in metadata. Most metadata examples I’ve seen use what might be called free-form dates. This is perhaps OK within one, isolated database, but when I am merging two topic-maps, how does my computer know how to compare dates in random formats like 13 Oct 02 and 2002-09-22? I would rather not rely on the cute guessing games that programs like Microsoft® Excel resort to (e.g., if I enter 12-09-2002 and 13-09-2002 in separate cells, they end up holding 9 December and 13 September).
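The ambiguity is easy to reproduce. The sketch below uses Python's standard `datetime.strptime` to read the same string under a day-first and a month-first convention; the helper `parses_as` is a name made up for this example, and none of this is a claim about how Excel works internally.

```python
from datetime import datetime

text = "12-09-2002"

# The same characters yield two different dates depending on the assumed format.
day_first = datetime.strptime(text, "%d-%m-%Y").date()     # 12 September 2002
month_first = datetime.strptime(text, "%m-%d-%Y").date()   # 9 December 2002


def parses_as(value, fmt):
    """Return True if value is a valid date under the given strptime format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


# "13-09-2002" is only valid day-first, since there is no month 13. A program
# that guesses per value could therefore read two neighbouring cells under
# two different conventions.
thirteenth_month_first = parses_as("13-09-2002", "%m-%d-%Y")
thirteenth_day_first = parses_as("13-09-2002", "%d-%m-%Y")
```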
My suggestion is that occurrence types subclass special topics that are conventionally used for dates in particular formats. These special topics in turn would require published subject indicators (PSIs). I have created a page that contains PSIs for Date-Time occurrence types to illustrate the idea. Note: this is just for discussion. Also, it really needs an attached machine-processable metadata resource (in XTM, say).
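As a rough illustration of what such typed occurrences would buy, the following Python sketch maps occurrence-type PSIs to parsing formats. The two PSI URIs are placeholders invented for this example (they are not the ones on the page mentioned above), and the function name `occurrence_date` is likewise hypothetical.

```python
from datetime import date, datetime

# Hypothetical PSIs for two date occurrence types (placeholders only).
ISO_DATE = "http://psi.example.com/date-time/#iso-8601-date"
DMY_SHORT = "http://psi.example.com/date-time/#day-month-short-year"

# Each occurrence type declares exactly one format, so nothing is guessed.
FORMATS = {
    ISO_DATE: "%Y-%m-%d",    # e.g. 2002-09-22
    DMY_SHORT: "%d %b %y",   # e.g. 13 Oct 02 (month names are locale-dependent;
                             # an English or C locale is assumed here)
}


def occurrence_date(occurrence_type, value):
    """Parse an occurrence value using the format its type declares."""
    return datetime.strptime(value, FORMATS[occurrence_type]).date()


a = occurrence_date(DMY_SHORT, "13 Oct 02")
b = occurrence_date(ISO_DATE, "2002-09-22")

# Once both values are real dates, comparing across formats is trivial.
earlier = min(a, b)
```

An unknown occurrence type raises a `KeyError` rather than falling back to a guess, which seems the right failure mode when merging maps from different sources.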
Another slightly weird approach would be to reify the dates. That is, create topics representing the dates themselves. The relationship between a story and the date it is published on is then represented as an association between the story topic and the date topic. Date topics would use PSIs with a format like http://psi.example.com/2002/date/#2002-09-22 or http://psi.example.com/2002/date/?d=2002-09-22 (the latter has the advantage that the server resolving the URL can run a check on the format of its parameter). This is probably less efficient than using occurrences.
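A sketch of the reified-date approach in Python might look like the following. The base URI is the example one used above; the function names, the `published-on` association type, and the example.com story URI are assumptions made up for illustration.

```python
import re
from datetime import date

PSI_BASE = "http://psi.example.com/2002/date/"

# Accept either the fragment form (#2002-09-22) or the query form (?d=2002-09-22).
_PSI_RE = re.compile(re.escape(PSI_BASE) + r"(?:#|\?d=)(\d{4}-\d{2}-\d{2})")


def date_psi(d, style="fragment"):
    """Build the PSI for a date topic in either of the two styles."""
    iso = d.isoformat()
    return f"{PSI_BASE}#{iso}" if style == "fragment" else f"{PSI_BASE}?d={iso}"


def date_from_psi(psi):
    """Recover the date from a date-topic PSI, rejecting malformed ones."""
    match = _PSI_RE.fullmatch(psi)
    if match is None:
        raise ValueError(f"not a date PSI: {psi!r}")
    # fromisoformat also rejects impossible dates such as 2002-13-40.
    return date.fromisoformat(match.group(1))


# An association is modelled here as a plain (story, type, date topic) triple.
associations = [
    ("http://example.com/story/1", "published-on", date_psi(date(2002, 9, 22))),
]

# Because the date is encoded in the subject identifier, two maps that mention
# the same day produce date topics that merge automatically.
published = date_from_psi(associations[0][2])
```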