Topic maps are composed of representatives of subjects, that is representatives of:
> anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever (TMDM, 3.14)
Every text is composed of representatives of subjects as well.
Does that make every text a topic map? The answer is “no,” but why?
Comparing a text and a topic map:

| Property | Text | Topic Map |
| --- | --- | --- |
| Subject representatives | yes | yes |
| Explicit rules for identification/representation | no | yes |
| Explicit rules for merging | no | yes |
I waver between saying that the explicit rules for identification/representation are sufficient by themselves and requiring explicit rules for merging as well. Certainly the rules for merging presume the former, but without rules for merging, the rules for identification/representation are nugatory.
Following both sets of rules does not necessarily result in merging all the subject representatives for the same subject. The most any topic map application can claim is that a particular map has followed a set of rules for identification/representation and that specified rules for merging have been applied.
Whether a topic map has in fact properly “merged” all the subject representatives is a judgment only a human reader can make, alongside whatever texts they happen to be reading.
PS: Merging means that a single representative for a subject results, containing all the different identifications for that subject and any properties of that subject.
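As a minimal sketch (the names `Topic`, `merge`, and `should_merge` and the PSI URLs are illustrative, not any real Topic Maps API), an explicit merging rule such as “merge on any shared subject identifier” might look like:

```python
# Hypothetical sketch of TMDM-style merging: two subject representatives
# merge when they share at least one subject identifier, yielding a single
# representative that carries all identifications and all properties.
from dataclasses import dataclass, field


@dataclass
class Topic:
    identifiers: set = field(default_factory=set)   # subject identifiers (e.g. PSIs)
    properties: dict = field(default_factory=dict)  # names, occurrences, etc.


def should_merge(a: Topic, b: Topic) -> bool:
    # The explicit rule: any shared subject identifier triggers a merge.
    return bool(a.identifiers & b.identifiers)


def merge(a: Topic, b: Topic) -> Topic:
    # One representative results, with the union of identifications
    # and the union of properties of both inputs.
    merged = Topic(identifiers=a.identifiers | b.identifiers)
    merged.properties = {**a.properties, **b.properties}
    return merged


t1 = Topic({"http://psi.example.org/shakespeare"}, {"name": "Shakespeare"})
t2 = Topic({"http://psi.example.org/shakespeare",
            "http://psi.example.org/bard"}, {"born": "1564"})

t3 = merge(t1, t2) if should_merge(t1, t2) else None
print(sorted(t3.identifiers))
print(t3.properties)

# The limit noted above: the same subject under disjoint identifiers
# never merges under this rule, however faithfully the rule is applied.
t4 = Topic({"http://psi.example.org/swan-of-avon"}, {"died": "1616"})
print(should_merge(t3, t4))  # only a human reader can judge these identical
```

Note that the last check returning false is exactly the point: the application can only attest that the stated rules were followed, not that every representative of the same subject was found.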
True, it is an advantage of Topic Maps that the rules for subject identification are explicit. I wonder, though, whether text does not have advantages as well: in order for a text to be understood, the reader must share some of the context of the author, so a text can only be understood in connection with other knowledge. A text in combination with other texts of a similar context actually allows you to guess the context of the author, so it conserves context, in a sense. And while the subject merge rules in Topic Maps are explicit, I wonder whether this isn’t, IMHO, a little bit deceiving, since all a PSI or a legend does is point you to a representation that uses a proxy system you are more likely to decipher. As far as I understand it, the original idea behind occurrences (pointers into the text corpus) also creates hints to the context, because by consulting all occurrences of a topic one might guess or know (depending on the distance between the contexts of reader and author) the subject of the topic.
The other remark I want to make is much more practical. Over the last five years I have been creating topic maps on a daily basis. My primary intention is to formalize the statements I want to record, for obvious reasons (computation and navigation). Yet there is a fair amount of text in my topic maps. Take my Topic Maps based issue tracking system: I found it impossible to find a formal model for describing software issues (even more so to create one ad hoc). The other statements I am reluctant to formalize are those describing what a piece of software does. Yet, taking the stance of an objective observer, my failure to formalize these statements indicates a lack of understanding in those domains, which in turn indicates that even if I came up with a model, or used an existing one, it might be of little use, since it would not be widely shared among the possible users of such a system (me being an ideal representative :).

This line of argumentation leads me to one additional point, which also relates to some of your other posts: the more time you invest in developing a vocabulary, the harder it will be for consumers to understand, because the time you spent developing it separates you from them. It depends on the use case whether it is economically reasonable to close that gap by training. But if you can throw out something ad hoc, best in cooperation with your users, it is probably very useful and easy for them. As easy as writing a text. Yet some things are simply hard and require people to rethink 🙂
Comment by Robert Cerny — April 13, 2010 @ 3:56 am