Complex Merging Conditions In XTM

We need a way to merge topics for reasons that are not specified by the TMDM.

For example, I want to merge topics that have equivalent occurrences of type ISBN. Library catalogs in different languages may share only the ISBN of an item as a common characteristic. A topic map generated from each of them could have the ISBN as an occurrence on each topic.

I am assuming each topic map relies upon library identifiers for “standard” merging because that is typically how library systems bind the information for a particular item together.

So, how to make merging occur when there are equivalent occurrences of type ISBN?

Solution: As part of the process of creating the topics, add a subject identifier derived from the occurrence of type ISBN, constructed so that equivalent ISBNs yield equivalent subject identifiers. Topics that share equivalent occurrences of type ISBN will then merge under the standard subject identifier rule.
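A minimal sketch of that construction rule, in Python. The base URI, the function name, and the choice to normalize ISBN-10 to ISBN-13 are my assumptions for illustration; the point is only that two catalogs applying the same documented rule end up with the same subject identifier.

```python
import re

# Hypothetical base URI; any stable, documented prefix would serve.
BASE = "http://example.org/isbn/"

def isbn_subject_identifier(raw):
    """Derive a subject identifier from an ISBN occurrence value.

    Assumption: hyphens and spaces are ignored, and ISBN-10 values are
    converted to ISBN-13 so both forms of one ISBN produce one identifier.
    """
    digits = re.sub(r"[^0-9Xx]", "", raw).upper()
    if len(digits) == 10 and digits[:9].isdigit():
        # ISBN-10 to ISBN-13: prefix 978, drop the old check digit,
        # then recompute the check digit with alternating weights 1 and 3.
        core = "978" + digits[:9]
        total = sum(int(d) * (1 if i % 2 == 0 else 3)
                    for i, d in enumerate(core))
        digits = core + str((10 - total % 10) % 10)
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError("not a recognizable ISBN: %r" % raw)
    return BASE + digits

# Two catalogs record the same book differently; both values
# yield http://example.org/isbn/9780306406157
print(isbn_subject_identifier("0-306-40615-2"))
print(isbn_subject_identifier("978 0 306 40615 7"))
```

Each generated topic would carry the returned value as a subject identifier in addition to its library identifier, so an ordinary TMDM engine merges the two topics without knowing anything about ISBNs.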

While the illustration is with one occurrence, there is no limit as to the number of properties of a topic that can be considered in the creation of a subject identifier that will result in merging. Such subject identifiers, when resolved, should document the basis for their assignment to a topic.
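As a rough sketch of the multi-property case, again in Python: the identifier below is built from several property type/value pairs at once. The base URI, the function name, the lowercasing of values, and the sample property names are all assumptions of mine, not part of any standard. The key is left readable in the URI so the basis for the assignment is at least visible, though a resolvable document describing the rule would still be needed.

```python
from urllib.parse import quote

# Hypothetical base URI for composite identifiers.
BASE = "http://example.org/id/"

def composite_subject_identifier(properties):
    """Build one subject identifier from several properties of a topic.

    properties: dict mapping a property type name to its value.
    Assumption: values are compared after trimming and lowercasing.
    """
    parts = []
    # Sort by property type so the order of the input does not matter.
    for ptype, value in sorted(properties.items()):
        normalized = str(value).strip().lower()
        parts.append(quote(ptype, safe="") + "=" + quote(normalized, safe=""))
    return BASE + ";".join(parts)

a = composite_subject_identifier({"publisher": "Springer", "year": 1981})
b = composite_subject_identifier({"year": "1981", "publisher": " springer "})
print(a == b)  # True: same properties, same identifier
```

Topics merge only when every property in the rule matches, which is the "and" condition that a single subject identifier per property cannot express.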

BTW, even assuming a future TMQL that enables such merging, note that this technique already works with XTM 1.0 topic map engines.

Caution: This solution does not work for properties that can be determined only after the topic map has been constructed, such as participation in particular associations or the playing of particular roles.

PS: There is a modification of this technique to deal with participation in associations or the playing of particular roles. More on that in another post.

4 Responses to “Complex Merging Conditions In XTM”

  1. This is doable today with tolog, and scales easily to far more complex examples. Your particular example could be done like this:

    merge $T1, $T2 from
    isbn($T1, $ISBN),
    isbn($T2, $ISBN),
    $T1 /= $T2

    It’s also possible to specify in TMCL that isbn occurrences must be unique, as follows:

    isbn isa tmcl:occurrence-type;
    has-unique-value().

    Exactly what happens with different topics having the same isbn occurrence value depends on the TMCL engine, but it would be strange if they didn’t offer the option to simply merge offending topics.

  2. Patrick Durusau says:

    True, tis true, but as you point out, what happens with a TMCL engine depends upon what developers thought was odd or not. Depending on a common mindset among developers isn’t a good strategy.

    Appreciate you pointing out the tolog solution since that non-standard solution is widely available.

    I do think it is useful for people to think about solutions that involve the standard mechanisms of subject identity since those will persist across all applications that support the XTM/TMDM.

  3. Robert Cerny says:

    I am not sure I can follow you. I do not think that it makes a difference how you encode the statement in the topic map. In any case one can design a procedure for URI construction and share it. The playing of a particular role is not something that does not exist in the source dataset, is it? It is just not as clearly visible as the ISBN. Maybe I am missing something. Can you give an example?

  4. Patrick Durusau says:

    Robert,

    Good point. Even when streaming data, there will be a point when it is “known” that a particular subject is playing a role and a URI construction rule can be triggered. Which would then trigger additional merging.

    I think there may be exceptions to that statement but right now I can’t formulate an example.