New Paper: Linked Data Strategy for Global Identity
Angela Guess writes:
Hugh Glaser and Harry Halpin have published a new PhD thesis for the University of Southampton Research Repository entitled “The Linked Data Strategy for Global Identity” (2012). The paper was published by the IEEE Computer Society. It is available for download here for non-commercial research purposes only. The abstract states, “The Web’s promise for planet-scale data integration depends on solving the thorny problem of identity: given one or more possible identifiers, how can we determine whether they refer to the same or different things? Here, the authors discuss various ways to deal with the identity problem in the context of linked data.”
At first I was hurt that I didn’t see a copy of Harry’s dissertation before it was published. I don’t always agree with him (see below) but I do like keeping up with his writing.
Then I discovered this is a four-page dissertation. I guess Angela never got past the cover page. It is an article in the IEEE zine, IEEE Internet Computing.
Harry fails to mention that the HTTP 303 “trick” was made necessary by Tim Berners-Lee’s failure to understand the necessity of distinguishing identifiers from addresses. Rather than admit to or correct that failure, the solution being pushed is to create web traffic overhead in the form of 303 “tricks.” “303” should be renamed “TBL,” so we are reminded with each invocation who made it necessary. (lower middle column, page 3)
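For readers who have not run into the 303 “trick,” here is a rough sketch of the overhead I am complaining about. It is only an illustration under stated assumptions: the paths /id/alice (a URI standing for a thing) and /doc/alice (a document about that thing) are hypothetical, as are the helper names, and the server is a local toy built on Python's standard library rather than anyone's real Linked Data setup. The point is simply that asking about the thing costs two requests, not one.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical paths, invented for this sketch:
THING = "/id/alice"      # identifies a person (not a web document)
DOCUMENT = "/doc/alice"  # a web document that describes her


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == THING:
            # The 303 "trick": do not answer, point at a document instead.
            self.send_response(303)
            self.send_header("Location", DOCUMENT)
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path == DOCUMENT:
            body = b"A document describing Alice.\n"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(port, path):
    """Issue one GET; http.client does not follow redirects on its own."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", path)
    resp = conn.getresponse()
    result = (resp.status, resp.getheader("Location"), resp.read())
    conn.close()
    return result


def dereference(port, path, max_hops=5):
    """Follow 303 redirects by hand and count the requests it took."""
    requests_made = 0
    for _ in range(max_hops):
        status, location, body = fetch(port, path)
        requests_made += 1
        if status == 303 and location:
            path = location
            continue
        return status, body, requests_made
    raise RuntimeError("too many redirects")


def demo():
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return dereference(port, THING)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body, requests_made = demo()
    print(status, requests_made, body.decode().strip())
```

Every lookup of a “thing” URI pays for that extra round trip, which is the web traffic overhead mentioned above.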
I partially agree with:
We’re only just beginning to explore the vast field of identity, and more work is needed before linked data can fulfill its full potential. (on page 5)
The “just beginning” part is true enough. But therein lies the rub. Rather than first explore the “…vast field of identity…,” which changes from domain to domain, and then propose a solution, the Linked Data proponents took the other path.
They proposed a solution and, in the face of its failure to work, are now inching towards the “…vast field of identity….” Seems a mite late for that.
Harry concludes:
The entire bet of the linked data enterprise critically rests on using URIs to create identities for everything. Whether this succeeds might very well determine whether information integration will be trapped in centralized proprietary databases or integrated globally in a decentralized manner with open standards. Given the tremendous amount of data being created and the Web’s ubiquitous nature, URIs and equivalence links might be the best chance we have of solving the identity problem, transforming a profoundly difficult philosophical issue into a concrete engineering project.
The first line, “The entire bet….” omits that we would need the same URIs for everything. That is called the perfect language project, which has a very long history of consistent failure. Recent attempts include Esperanto and Loglan.
The second line, “Whether this succeeds…trapped in centralized proprietary databases…” is fear mongering. “If you don’t support linked data, (insert your nightmare scenario).”
The final line, “…transforming a profoundly difficult philosophical issue into a concrete engineering project” is magical thinking.
Identity is a very troubled philosophical issue, but proposing a solution without understanding the problem doesn’t sound like a high percentage shot to me. You?