Hash-URIs for Verifiable, Immutable, and Permanent Digital Artifacts by Tobias Kuhn and Michel Dumontier.
Abstract:
To make digital resources on the web verifiable, immutable, and permanent, we propose a technique to include cryptographic hash values in URIs. We show how such hash-URIs can be used for approaches like nanopublications to make not only specific resources but their entire reference trees verifiable. Digital resources can be identified not only on the byte level but on more abstract levels, which means that resources keep their hash values even when presented in a different format. Our approach sticks to the core principles of the web, namely openness and decentralized architecture, is fully compatible with existing standards and protocols, and can therefore be used right away. Evaluation of our reference implementations shows that these desired properties are indeed accomplished by our approach, and that it remains practical even for very large files.
I rather like the authors’ summary of their approach:
our proposed approach boils down to the idea that references can be made completely unambiguous and verifiable if they contain a hash value of the referenced digital artifact.
Hash-URIs (assuming proper generation) would be completely unambiguous and verifiable for digital artifacts.
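To make the idea concrete, here is a minimal sketch of how a hash value can be embedded in a URI and later checked against the artifact. It is my own simplification, not the authors' scheme: I assume SHA-256 over the raw bytes and an unpadded URL-safe base64 token appended after a final dot, and the function names make_hash_uri and verify_hash_uri are hypothetical. In particular, it hashes on the byte level only and does not attempt the format-independent hashing the abstract describes.

```python
import base64
import hashlib


def make_hash_uri(base_uri: str, content: bytes) -> str:
    """Append a URL-safe base64 SHA-256 digest of the content to base_uri.

    Assumption: byte-level hashing only; this is not the paper's algorithm.
    """
    digest = hashlib.sha256(content).digest()
    # Strip '=' padding so the token contains only URI-friendly characters.
    token = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return f"{base_uri}.{token}"


def verify_hash_uri(uri: str, content: bytes) -> bool:
    """Return True if the digest embedded in the URI matches the content."""
    # The token never contains '.', so the last dot separates base and token.
    base_uri, _, _token = uri.rpartition(".")
    return make_hash_uri(base_uri, content) == uri


if __name__ == "__main__":
    uri = make_hash_uri("http://example.org/r1", b"some digital artifact")
    print(uri)
    print(verify_hash_uri(uri, b"some digital artifact"))  # unchanged content
    print(verify_hash_uri(uri, b"some altered artifact"))  # altered content
```

Anyone holding the URI and the artifact can repeat the check without consulting a central authority, which is the sense in which such a reference is verifiable.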
However, the authors fail to notice two important issues with Hash-URIs:
- Hash-URIs are not human-readable.
- Because they are not human-readable, mappings between Hash-URIs and other references to digital artifacts will be fragile and hard to maintain.
For example,
In prose, an author will not say, “As found by http://example.org/r1.RA5AbXdpz5DcaYXCh9l3eI9ruBosiL5XDU3rxBbBaUO70” (the URI is from the article).
In some publishing styles, authors will say: “…as a new way of scientific publishing [8].”
In other styles, authors will say: “Computable functions are therefore those “calculable by finite means” (Turing, 1936: 230).”
That is to say, there will of necessity be a mapping between the unambiguous and verifiable reference (UVR) and the references used by human authors and readers.
Moreover, should the mapping between UVRs and their human-consumable equivalents be lost, recovery is possible but time-consuming.
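A rough sketch of what that recovery might look like, under my own assumptions: the hash values cited in a document survive, the table linking them to human-readable labels is gone, and we still hold a pile of candidate artifacts. The names digest_of and recover_mapping, the labels, and the placeholder contents are all hypothetical, and SHA-256 over raw bytes stands in for whatever hashing a real scheme would use.

```python
import hashlib


def digest_of(content: bytes) -> str:
    """Hex SHA-256 digest of the raw bytes (an assumed stand-in hash)."""
    return hashlib.sha256(content).hexdigest()


def recover_mapping(cited_digests: set, candidates: dict) -> dict:
    """Rebuild label -> digest by rehashing every candidate artifact.

    cited_digests: digests that appear in the document's hash references
    candidates: human-readable label -> artifact bytes
    """
    recovered = {}
    for label, content in candidates.items():
        digest = digest_of(content)  # every candidate must be read in full
        if digest in cited_digests:
            recovered[label] = digest
    return recovered


if __name__ == "__main__":
    # Placeholder contents; real artifacts would be whole files.
    candidates = {
        "[8]": b"artifact cited as reference 8",
        "(Turing, 1936: 230)": b"artifact cited author-date style",
        "unrelated": b"an artifact the document never cites",
    }
    cited = {
        digest_of(candidates["[8]"]),
        digest_of(candidates["(Turing, 1936: 230)"]),
    }
    for label, digest in recover_mapping(cited, candidates).items():
        print(label, "->", digest)
```

The cost grows with the total size of the candidate pool, and the approach only works if the right artifacts are in that pool to begin with, which is why I call recovery possible but time-consuming.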
The authors go to some lengths to demonstrate the use of Hash-URIs with RDF files. RDF is one approach among many to digital artifacts.
If the mapping issues between Hash-URIs and other identifiers can be addressed, a more general approach to digital artifacts would make this proposal more viable.
I first saw this in a tweet by Tobias Kuhn.