Archive for the ‘Identification’ Category

Thinking, Fast and Slow

Tuesday, December 27th, 2011

Thinking, Fast and Slow by Daniel Kahneman, Farrar, Straus and Giroux, New York, 2011.

I got a copy of “Thinking, Fast and Slow” for Christmas and it has already proven to be an enjoyable read.

Kahneman says early on (page 28):

The premise of this book is that it is easier to recognize other people’s mistakes than our own.

I thought about that line when I read a note from a friend that topic maps needed more than my:

tagging everything with “Topic Maps….”

Which means I haven’t been clear about the reasons for the breadth of materials I have covered and will be covering in this blog.

One premise of this blog is that the use and recognition of identifiers is essential for communication.

Another premise of this blog is that it is easier for us to study the use and recognition of identifiers by others, for much the same reasons we can recognize the mistakes of others more easily.

The use and recognition of identifiers by others aren’t mistakes but they may be different from those we would make. In cases where they differ from ours, we have a unique opportunity to study the choices made and the impacts of those choices. And we may learn patterns in those choices that we can eventually see in our own choices.

Understanding the use and recognition of identifiers in a particular circumstance, and the requirements for that use and recognition, is the first step towards deciding whether topic maps would be useful in that circumstance and, if so, in what way.

For example, when processing social security records in the United States, anything other than “bare” identifiers like a social security number may be unnecessary and add load with no corresponding benefit. Aligning social security records with bank records, however, might require reconsidering the judgement to use only social security numbers. (Some information sharing is “against the law.” But as the Sheriff in “O Brother, Where Art Thou?” says: “The law is a man made thing.” Laws change, or you can commission absurdist interpretations of them.)

Topic maps aren’t everywhere but identifiers and recognition of identifiers are.

Understanding identifiers and their recognition will help you choose the most appropriate solution to a problem.

Ontology Matching 2011

Tuesday, December 13th, 2011

Ontology Matching 2011

Proceedings of the 6th International Workshop on Ontology Matching (OM-2011)

From the conference website:

Ontology matching is a key interoperability enabler for the Semantic Web, as well as a useful tactic in some classical data integration tasks dealing with the semantic heterogeneity problem. It takes the ontologies as input and determines as output an alignment, that is, a set of correspondences between the semantically related entities of those ontologies. These correspondences can be used for various tasks, such as ontology merging, data translation, query answering or navigation on the web of data. Thus, matching ontologies enables the knowledge and data expressed in the matched ontologies to interoperate.

The workshop has three goals:

  • To bring together leaders from academia, industry and user institutions to assess how academic advances are addressing real-world requirements. The workshop will strive to improve academic awareness of industrial and final user needs, and therefore direct research towards those needs. Simultaneously, the workshop will serve to inform industry and user representatives about existing research efforts that may meet their requirements. The workshop will also investigate how the ontology matching technology is going to evolve.
  • To conduct an extensive and rigorous evaluation of ontology matching approaches through the OAEI (Ontology Alignment Evaluation Initiative) 2011 campaign. The particular focus of this year’s OAEI campaign is on real-world specific matching tasks involving, e.g., open linked data and biomedical ontologies. Therefore, the ontology matching evaluation initiative itself will provide a solid ground for discussion of how well the current approaches are meeting business needs.
  • To examine similarities and differences from database schema matching, which has received decades of attention but is just beginning to transition to mainstream tools.

An excellent set of papers and posters.

While I was writing this post, I realized that had the papers been described as matching subject identifications by similarity measures, I would have felt completely different about the papers.

Isn’t that odd?

Question: Do you agree/disagree that mapping ontologies is different from mapping subject identifications? Why/why not?
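
To make the question concrete, here is a minimal sketch of what “matching by similarity measures” can look like when reduced to labels alone. It is not any matcher from the OM-2011 papers: the label lists, the `align` helper and the 0.8 threshold are all assumptions for illustration, and real systems use far richer evidence than string similarity.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def align(left, right, threshold=0.8):
    """Return (left_label, right_label, score) for pairs at or above threshold.

    The result plays the role of an "alignment": a set of correspondences
    between labels drawn from two vocabularies.
    """
    pairs = []
    for l in left:
        for r in right:
            score = similarity(l, r)
            if score >= threshold:
                pairs.append((l, r, score))
    return pairs


# Hypothetical labels; they could be read as ontology classes or as
# subject identifications without changing a line of the code.
vocabulary_a = ["Author", "Title", "Zebra"]
vocabulary_b = ["author", "Book Title"]

alignment = align(vocabulary_a, vocabulary_b)
print(alignment)
```

Nothing in the sketch depends on whether the labels are called ontology terms or subject identifications, which is one way to frame the question above.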

OCLC Developer Network

Monday, October 24th, 2011

OCLC Developer Network

From the webpage:

The OCLC Developer Network is a community of developers collaborating to propose, discuss and test OCLC Web Services. This open source, code-sharing infrastructure improves the value of OCLC data for all users by encouraging new OCLC Web Service uses.

While I was looking at OCLC resources, I thought I might as well give a shout out to the OCLC Developer Network, a community that has an interest in identifiers and identification for the purpose of furthering access to information. Who could be more sympathetic to topic maps?

WorldCat Identities Network

Monday, October 24th, 2011

WorldCat Identities Network

A project of OCLC Research, the WorldCat Identities Network is described as:

The WorldCat Identity Network uses the WorldCat Identities Web Service and the WorldCat Search API to create an interactive Related Identity Network Map for each Identity in the WorldCat Identities database. The Identity Maps can be used to explore the interconnectivity between WorldCat Identities.

A WorldCat Identity can be a person, a thing (e.g., the Titanic), a fictitious character (e.g., Harry Potter), or a corporation (e.g., IBM).

I can’t claim to be a fan of jumpy network node displays but that isn’t a criticism, more a matter of personal taste. Some people find that sort of display quite useful.

The information conveyed, leaving display to one side, is quite interesting. It has just enough fuzziness (to me at any rate) to approach the experience of serendipitous discovery using more traditional library tools. I suspect that will vary from topic to topic but that was my experience with briefly using the interface.

Despite my misgivings about the interface, I will be returning to explore this service fairly often.

BTW, the service is obviously mis-named. What is being delivered is what we used to call “see also” or related references, thus: WorldCat “See Also” Network would be a more accurate title.

For class:

  1. Spend at least an hour with the service and write a 2-page summary of what you liked/disliked about it. (no citations)
  2. What subject/relationship did you choose to follow? Discover anything you did not expect? 1 page (no citations)

Summing up Properties with subjectIdentifiers/URLs?

Thursday, September 8th, 2011

I was picking tomatoes in the garden when I thought about telling Carol (my wife) the plants are about to stop producing.

Those plants are at a particular address, in the backyard, in the middle garden bed of three, and are of three different varieties, but I am going to sum up those properties by saying: “The tomatoes are about to stop producing.”

It occurred to me that a subjectIdentifier could be assigned to a topic element on the basis of summing up properties of the topic.* That would have the advantage of enabling merging on the basis of subjectIdentifiers as opposed to more complex tests upon properties of a topic.
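
A minimal sketch of that idea, under assumptions of my own: topics are plain dictionaries, the “summing up” is a hash over a chosen subset of properties, and the `http://example.org/si/` prefix, the property names and both helper functions are hypothetical. Nothing here comes from the TMDM or any topic map engine.

```python
import hashlib


def summed_identifier(properties: dict, keys: tuple) -> str:
    """Sum up selected properties into one identifier (hypothetical scheme)."""
    canonical = "|".join(f"{k}={properties[k]}" for k in sorted(keys))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"http://example.org/si/{digest}"


def merge_by_identifier(topics: list, keys: tuple) -> dict:
    """Merge topics that share an identifier, without re-testing properties."""
    merged = {}
    for topic in topics:
        si = summed_identifier(topic, keys)
        merged.setdefault(si, {}).update(topic)
    return merged


# Made-up data loosely following the garden example.
topics = [
    {"address": "123 Example St", "bed": "middle", "crop": "tomato",
     "variety": "Roma"},
    {"address": "123 Example St", "bed": "middle", "crop": "tomato",
     "status": "about to stop producing"},
    {"address": "123 Example St", "bed": "left", "crop": "basil"},
]

merged = merge_by_identifier(topics, ("address", "bed", "crop"))
for si, topic in merged.items():
    print(si, topic)
```

The complex test runs once, when the identifier is assigned. After that, merging is a simple equality check on identifiers, and a consumer of the identifiers need not know which properties went into them.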

Disclosure of the basis for assignment of a subjectIdentifier is an interesting question.

It could be that a service wishes to produce subjectIdentifiers and index information based upon complex property measures, offering for consumption the subjectIdentifiers and merge-capable indexes on one or more information sets. The undisclosed basis for merging would be the competitive edge offered by the service.

If a vendor is promoting merging with its own process or format, one that seeks to become the TCP/IP of some area, the basis for merging and tools to assist with it will be supplied.

Or if you are an intelligence agency and you want an inward and outward facing interface that promotes merging of information but does not disclose your internal basis for identification, variants of this technique may be of interest.

*The notion of summing up imposes no prior constraints on the tests used or the location of the information subjected to those tests.

When Should Identifications Be Immutable?

Thursday, September 8th, 2011

After watching a presentation on Clojure and its immutable data structures, I began to wonder: when should identifications be immutable?

Note that I said when should identifications… which means I am not advocating a universal position for all identifiers but rather a choice that may vary from situation to situation.

We may change our minds about an identification, but the fact remains that at some point (dare I say state?) a particular identification was made.

For example, you make an intimate gesture at a party only to discover your spouse wasn’t the recipient of the gesture. But at the time you made the gesture, at least I am willing to believe, you thought it was your spouse. New facts are now apparent. But it is also a new identification. As your spouse will remind you, you did make a prior, incorrect identification.

As I recall, topics (and other information items) are immutable for purposes of merging. (TMDM, 6.2 and following.) That is, merging results in a new topic or other new information item. On the other hand, merging also results in updating information items other than the one subject to merging. So those information items are not being treated as immutable.

But since the references are being updated, I don’t think it would be inconsistent with the TMDM to create new information items to be the carriers of the new identifiers and thus treat the information items as immutable.

This would be application/requirement specific, but for accounting, banking, securities and similar applications, it may be important for identifications to be immutable, such that we can “unroll” a topic map, as it were, to any arbitrary prior identification or state.
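
One way to picture “unrolling” is an append-only log of frozen records, sketched below. This is an illustration under my own assumptions, not TMDM merging: the `Identification` and `IdentificationLog` names, the version counter and the party example data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Identification:
    """One identification as it was made; never changed afterwards."""
    subject: str      # what was being identified
    identifier: str   # the identifier assigned at that point
    version: int      # position in the log, starting at 1


class IdentificationLog:
    """Append-only history: a revision adds a record, it never edits one."""

    def __init__(self) -> None:
        self._history: Tuple[Identification, ...] = ()

    def identify(self, subject: str, identifier: str) -> Identification:
        record = Identification(subject, identifier, len(self._history) + 1)
        self._history = self._history + (record,)
        return record

    def as_of(self, subject: str, version: int) -> Optional[Identification]:
        """Unroll to the identification that held at the given version."""
        for record in reversed(self._history[:version]):
            if record.subject == subject:
                return record
        return None

    def current(self, subject: str) -> Optional[Identification]:
        return self.as_of(subject, len(self._history))


log = IdentificationLog()
first = log.identify("person at party", "spouse")
second = log.identify("person at party", "stranger")

print(log.current("person at party").identifier)
print(log.as_of("person at party", 1).identifier)
```

The later identification does not erase the earlier one. Both remain on record, and the log can answer what was believed at any prior state.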