Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

May 28, 2015

I Is For Identifier

Filed under: Identifiers,Topic Maps — Patrick Durusau @ 1:20 pm

As you saw yesterday, Sam Hunting and I have a presentation at Balisage 2015 (Wednesday, August 12, 2015, 9:00 AM, if you are buying a one-day ticket), “Spreadsheets – 90+ million end user programmers with no comment tracking or version control.”

If you suspect the presentation has something to do with topic maps, take one mark for your house!

You will have to attend the conference to get the full monty but there are some ideas and motifs that I will be testing here before incorporating them into the paper and possibly the presentation.

The first one is a short riff on identifiers.

Omitting the hyperlinks, the Wikipedia article on identifiers says in part:

An identifier is a name that identifies (that is, labels the identity of) either a unique object or a unique class of objects, where the “object” or class may be an idea, physical [countable] object (or class thereof), or physical [noncountable] substance (or class thereof). The abbreviation ID often refers to identity, identification (the process of identifying), or an identifier (that is, an instance of identification). An identifier may be a word, number, letter, symbol, or any combination of those.

(emphasis in original)

It goes on to say:


In computer science, identifiers (IDs) are lexical tokens that name entities. Identifiers are used extensively in virtually all information processing systems. Identifying entities makes it possible to refer to them, which is essential for any kind of symbolic processing.

There is an interesting shift in that last quote. Did you catch it?

The first two sentences are talking about identifiers but the third shifts to “[i]dentifying entities makes it possible to refer to them….” But single token identifiers aren’t the only means to identify an entity.

For example, a police record may identify someone by their Social Security Number and permit searching by that number, but it can also identify an individual by height, weight, eye/hair color, age, tattoos, etc.
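To make the contrast concrete, here is a minimal sketch in Python of the two lookups. The `Record` fields, the sample data, and both function names are assumptions invented for illustration; they are not drawn from any real records system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A hypothetical police record: one token plus several qualities."""
    ssn: str
    height_cm: int
    eye_color: str
    hair_color: str
    tattoos: frozenset


# Made-up sample data, for illustration only.
RECORDS = [
    Record("000-00-0001", 180, "brown", "black", frozenset({"anchor"})),
    Record("000-00-0002", 165, "blue", "blond", frozenset()),
    Record("000-00-0003", 180, "brown", "black", frozenset({"rose"})),
]


def find_by_ssn(records, ssn):
    """Identify by a single token."""
    return [r for r in records if r.ssn == ssn]


def find_by_description(records, **qualities):
    """Identify by whatever qualities the caller happens to know."""
    return [
        r for r in records
        if all(getattr(r, name) == value for name, value in qualities.items())
    ]
```

Searching on height and eye color alone matches two of the sample records; adding `tattoos=frozenset({"rose"})` narrows it to one. The qualities identify collectively, which is the point of the example.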

But we have been taught from a very young age that I stands for Identifier, a single token that identifies an entity. Thus:

[Image: identifier2]

Single identifiers are found in “virtually all information processing systems,” not to mention writing from all ages and speech as well. They save us a great deal of time by allowing us to say “President Obama” without having to enumerate all the other qualities that collectively identify that subject.

Of course, the problem with single token identifiers is that we don’t all use the same ones and sometimes use the same ones for different things.

So long as we remain fixated on bare identifiers:

[Image: identifier2]

we will continue to see efforts to create new “persistent” identifiers. Not a bad idea for some purposes, but a rather limited one.

Instead of bare identifiers, what if we understood that identifiers stand in the place of all the qualities of the entities we wish to identify?

That is, what if our identifiers were seen as being pregnant with the qualities of the entities they represent:

[Image: identifier-pregnant]

For some purposes, like unique keys in a database, our identifiers can be seen as opaque: the token is all there is to see.

For other purposes, such as indexing across different identifiers, our identifiers are pregnant with the qualities that identify the entities they represent.
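One way to picture the two views is a single object that can be read either way. The sketch below is only an illustration: the `Identifier` class, the `index_by_quality` helper, and the sample qualities are all hypothetical names chosen for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Identifier:
    token: str
    # Qualities are (name, value) pairs. They are excluded from equality
    # and hashing, so the token alone still works as an opaque key.
    qualities: frozenset = field(default=frozenset(), compare=False)


def index_by_quality(identifiers):
    """Map each quality to the set of tokens whose identifiers carry it."""
    index = defaultdict(set)
    for ident in identifiers:
        for quality in ident.qualities:
            index[quality].add(ident.token)
    return dict(index)


# Illustrative data only.
a = Identifier("President Obama",
               frozenset({("office", "US President"), ("inaugurated", 2009)}))
b = Identifier("Barack Obama",
               frozenset({("inaugurated", 2009), ("born", 1961)}))

# Bare view: the token is all that matters for keys and equality.
salaries = {a: "on file"}
bare_lookup = salaries[Identifier("President Obama")]

# Pregnant view: index across different tokens by shared qualities.
index = index_by_quality([a, b])
```

Under these assumptions `index[("inaugurated", 2009)]` holds both tokens, while the dictionary lookup never looks past the token.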

If we look at the qualities of the entities represented by two or more identifiers, we may discover that the same identifier represents two different entities, or we may discover that two (or more) identifiers represent the same entity.
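Both discoveries can be sketched in a few lines of Python. This is a rough illustration, not a topic maps implementation: it assumes that two usages refer to the same entity when they agree on a chosen set of "key" qualities, and the `analyze` function, the `USAGES` data, and the choice of key qualities are all hypothetical.

```python
from collections import defaultdict


def analyze(usages, key_qualities):
    """usages: iterable of (token, qualities_dict) pairs.

    Returns (collisions, aliases):
      collisions: token -> signatures, where one token names several entities
      aliases:    signature -> tokens, where several tokens name one entity
    """
    by_token = defaultdict(set)
    by_entity = defaultdict(set)
    for token, qualities in usages:
        # Assumption: an entity is characterized by its key qualities.
        signature = tuple(qualities.get(k) for k in key_qualities)
        by_token[token].add(signature)
        by_entity[signature].add(token)
    collisions = {t: s for t, s in by_token.items() if len(s) > 1}
    aliases = {s: t for s, t in by_entity.items() if len(t) > 1}
    return collisions, aliases


# Illustrative data only.
USAGES = [
    ("Mercury", {"kind": "planet"}),
    ("Mercury", {"kind": "chemical element", "atomic_number": 80}),
    ("Hg", {"kind": "chemical element", "atomic_number": 80}),
]

collisions, aliases = analyze(USAGES, key_qualities=("kind", "atomic_number"))
```

Here "Mercury" turns up as one identifier for two different entities, and "Mercury" and "Hg" turn up as two identifiers for the same entity. Which qualities count as "key" is a judgment that depends on the use case.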

I think we need to acknowledge the allure of bare identifiers (the ones we think we understand) and their usefulness in many circumstances. We should also observe that identifiers are in fact pregnant with the qualities of the entities they represent, enabling us to distinguish the case of the same identifier for different entities and to match different identifiers for the same entity.

Which type of identifier you need, bare or pregnant, depends upon your use case and requirements. Neither one is wholly suited for all purposes.

(Comments and suggestions are always welcome but especially on these snippets of material that will become part of a larger whole. On the artwork as well. I am trying to teach myself Gimp.)
