Carl Lemp, commenting in the XTM group at LinkedIn in a discussion of a potential redesign of topic maps:
2. There are only a few tools to help build a Topic Map.
3. There is almost nothing to help translate familiar information structures to Topic Map structures.
(…)
Getting through 2 and 3 is a bitch.
I can’t help with #2 but I may be able to help with #3.
I suggest mapping the MediaWiki structure that is used for Wikipedia into a topic map.
As a demonstration it has the following advantages:
- Conversion from SQL dump to topic map scripts.
- Large enough to test alternative semantics.
- Sub-sets of Wikipedia good for starter maps.
- Useful to merge with other data sets.
- Well known data set.
- Widespread data format (SQL).
The MediaWiki schema: MediaWiki-1.21.1-tables.sql.
The base output format will be CTM.
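As a rough sketch of what the conversion might emit, the following Python turns one row of the MediaWiki `page` table into a CTM topic block. The columns `page_id`, `page_namespace`, and `page_title` come from the MediaWiki schema; the `page-<id>` identifier scheme, the `wiki-page` topic type, the `namespace` occurrence type, and the sample row are assumptions made for illustration only, not settled modeling decisions.

```python
def ctm_string(value):
    """Quote a value as a CTM string literal, escaping backslashes and quotes."""
    return '"' + value.replace('\\', '\\\\').replace('"', '\\"') + '"'


def page_to_ctm(page_id, page_namespace, page_title):
    """Render one row of the MediaWiki `page` table as a CTM topic block.

    Assumed for illustration: the `page-<id>` identifier scheme, the
    `wiki-page` topic type, and the `namespace` occurrence type.
    """
    # MediaWiki stores titles with underscores in place of spaces.
    name = page_title.replace('_', ' ')
    lines = [
        f'page-{page_id} isa wiki-page;',    # topic identifier and its type
        f'    - {ctm_string(name)};',        # topic name (default name type)
        f'    namespace: {page_namespace}.', # occurrence with an integer value
    ]
    return '\n'.join(lines)


# Made-up sample row, not taken from a real dump.
print(page_to_ctm(12, 0, 'Topic_Maps'))
```

Whether a page title is best treated as a topic name, and a namespace as an occurrence, is exactly the kind of question the annotation pass over the schema should settle.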
When we want to test alternative semantics, I suggest that we use “.” followed by “0tm” (zero followed by “tm”) as the file extension. Comments at the head of the file should reference or document the semantics to be applied in processing the file.
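A minimal sketch of that naming and header convention, in Python. The function names are hypothetical, and the use of `#` as the comment marker is an assumption borrowed from CTM; an alternative syntax could well use something else.

```python
from pathlib import PurePath


def alt_semantics_path(ctm_path):
    """Swap the file extension for `.0tm` (zero, then "tm")."""
    return str(PurePath(ctm_path).with_suffix('.0tm'))


def semantics_header(notes):
    """Build head-of-file comment lines documenting the semantics in use.

    Assumption: `#` starts a comment, as it does in CTM.
    """
    return '\n'.join(f'# {line}' for line in notes) + '\n'


print(alt_semantics_path('wikipedia-pages.ctm'))
print(semantics_header([
    'Alternative semantics: (describe or reference them here)',
    'Source: MediaWiki-1.21.1-tables.sql',
]), end='')
```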
In terms of sigla for annotating the SQL, are there any strong feelings against the following? (Definitions drawn from the TMDM vocabulary section.)
Siglum | Term | Definition |
A | association | representation of a relationship between one or more subjects |
Ar | association role | representation of the involvement of a subject in a relationship represented by an association |
Art | association role type | subject describing the nature of the participation of an association role player in an association |
At | association type | subject describing the nature of the relationship represented by associations of that type |
Ir | information resource | a representation of a resource as a sequence of bytes; it could thus potentially be retrieved over a network |
Ii | item identifier | locator assigned to an information item in order to allow it to be referred to |
O | occurrence | representation of a relationship between a subject and an information resource |
Ot | occurrence type | subject describing the nature of the relationship between the subjects and information resources linked by the occurrences of that type |
S | scope | context within which a statement is valid |
Si | subject identifier | locator that refers to a subject indicator |
Sl | subject locator | locator that refers to the information resource that is the subject of a topic |
T | topic | symbol used within a topic map to represent one, and only one, subject, in order to allow statements to be made about the subject |
Tn | topic name | name for a topic, consisting of the base form, known as the base name, and variants of that base form, known as variant names |
Tnt | topic name type | subject describing the nature of the topic names of that type |
Tt | topic type | subject that captures some commonality in a set of subjects |
Vn | variant name | alternative form of a topic name that may be more suitable in a certain context than the corresponding base name |
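To make the annotation idea concrete, here is one way the sigla could be attached to the SQL and read back, sketched in Python. The trailing-comment convention `-- tm: <siglum>` is purely hypothetical, as is the choice of `Tn` for `page_title`; only a subset of the sigla is listed, and the rest would follow the same pattern.

```python
import re

# A subset of the sigla from the table above, mapped to their TMDM terms.
SIGLA = {
    'A': 'association',
    'At': 'association type',
    'Ii': 'item identifier',
    'O': 'occurrence',
    'Ot': 'occurrence type',
    'Si': 'subject identifier',
    'Sl': 'subject locator',
    'T': 'topic',
    'Tn': 'topic name',
}

# Hypothetical convention: a trailing SQL comment of the form "-- tm: <siglum>".
ANNOTATION = re.compile(r'--\s*tm:\s*(\w+)\s*$')


def read_annotation(sql_line):
    """Return (siglum, term) for an annotated SQL line, or None if unannotated."""
    match = ANNOTATION.search(sql_line)
    if match is None:
        return None
    siglum = match.group(1)
    if siglum not in SIGLA:
        raise ValueError(f'unknown siglum: {siglum}')
    return siglum, SIGLA[siglum]


print(read_annotation('  page_title varchar(255) binary NOT NULL, -- tm: Tn'))
print(read_annotation('  page_id int unsigned NOT NULL,'))
```

Rejecting unknown sigla up front keeps typos in the annotations from silently passing through to the generated topic map.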
The first step I would suggest is creating a visualization of the MediaWiki schema.
We will still have to iterate over the tables, but getting an overall view of the schema will be helpful.
Suggestions on your favorite schema visualization tool?
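In the meantime, a rough first pass can be scripted. The Python sketch below pulls table and column names out of `CREATE TABLE` statements and writes Graphviz DOT, one record node per table. It assumes one column per line with lowercase column names, and it treats the `/*_*/` table-prefix placeholder as optional; the sample SQL is a trimmed illustration, not the full `page` definition. It draws no edges between tables, so relationships would still have to be added by hand.

```python
import re

# Matches "CREATE TABLE /*_*/name (" and captures the table name.
CREATE_TABLE = re.compile(r'CREATE TABLE\s+(?:/\*_\*/)?(\w+)\s*\(', re.IGNORECASE)
# A column line: indentation, a lowercase identifier, then the start of a type.
COLUMN = re.compile(r'^\s+([a-z_][a-z0-9_]*)\s+\w')


def parse_tables(sql_text):
    """Map each table name to its column names, in file order."""
    tables = {}
    current = None
    for line in sql_text.splitlines():
        line = line.split('--', 1)[0]  # drop trailing SQL comments
        started = CREATE_TABLE.search(line)
        if started:
            current = tables.setdefault(started.group(1), [])
            continue
        if current is None:
            continue
        if line.lstrip().startswith(')'):  # end of the CREATE TABLE body
            current = None
            continue
        column = COLUMN.match(line)
        if column:
            current.append(column.group(1))
    return tables


def to_dot(tables):
    """Render tables as Graphviz record nodes: table name, then its columns."""
    lines = ['digraph schema {', '  node [shape=record];']
    for name, columns in tables.items():
        fields = '|'.join(columns)
        lines.append(f'  "{name}" [label="{{{name}|{fields}}}"];')
    lines.append('}')
    return '\n'.join(lines)


# Trimmed sample for illustration only.
SAMPLE = """
CREATE TABLE /*_*/page (
  page_id int unsigned NOT NULL PRIMARY KEY AUTO_INCREMENT,
  page_namespace int NOT NULL,
  page_title varchar(255) binary NOT NULL -- title without namespace
) /*$wgDBTableOptions*/;
"""

print(to_dot(parse_tables(SAMPLE)))
```

The DOT output can be rendered with Graphviz (for example `dot -Tsvg`), which at least gives a picture to argue over while better tools are suggested.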