From the post:

An increasing number of universities and research organisations are starting to build research data repositories to allow permanent access in a trustworthy environment to data sets resulting from research at their institutions. Due to varying disciplinary requirements, the landscape of research data repositories is very heterogeneous. This makes it difficult for researchers, funding bodies, publishers, and scholarly institutions to select an appropriate repository for storage of research data or to search for data.

The registry allows the easy identification of appropriate research data repositories, both for data producers and users. The registry covers research data repositories from all academic disciplines. Information icons display the principal attributes of a repository, allowing users to identify the functionalities and qualities of a data repository. These attributes can be used for multi-faceted searches, for instance to find a repository for geoscience data using a Creative Commons licence.

By April 2013, 338 research data repositories were indexed in the registry. 171 of these are described by a comprehensive vocabulary, which was developed by involving the data repository community.

The search can be found at:
The information icons are explained at:

Does this sound like any of these?


The Dataverse Network Project

IOGDS: International Open Government Dataset Search

PivotPaths: a Fluid Exploration of Interlinked Information Collections

Quandl [> 2 million financial/economic datasets]

Just to name five (5) that came to mind right off hand?

Addressing the heterogeneous nature of data repositories by creating another, semantically different data repository seems like a non-solution to me.

What would be useful would be to create a mapping of this “new” classification, which I assume works for some group of users, against the existing classifications.

That would allow users of the “new” classification to access data in existing repositories, without having to learn their classification systems.
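As a rough sketch of what such a mapping might look like, here is a small crosswalk in Python. Everything in it is an assumption made for illustration: the repository names (`repo_a`, `repo_b`, `repo_c`), the subject terms, and the function names `translate` and `reverse_index` are hypothetical and do not come from the registry or from any of the repositories named above.

```python
from collections import defaultdict

# Hypothetical crosswalk: each term of the "new" classification is mapped to
# the equivalent terms in existing repositories' own classification systems.
# Repository names and subject terms are invented for illustration only.
CROSSWALK = {
    "geosciences": {
        "repo_a": ["Earth Sciences", "Geology"],
        "repo_b": ["earth-and-environment"],
    },
    "economics": {
        "repo_a": ["Social Sciences/Economics"],
        "repo_c": ["finance", "macro"],
    },
}


def translate(term, crosswalk=CROSSWALK):
    """Return {repository: [native terms]} for a term of the new classification.

    An unmapped term yields an empty dict rather than an error.
    """
    return {repo: list(terms) for repo, terms in crosswalk.get(term, {}).items()}


def reverse_index(crosswalk=CROSSWALK):
    """Build {(repository, native term): {new-classification terms}}.

    This lets someone who knows a repository's native vocabulary find the
    corresponding term in the new classification.
    """
    index = defaultdict(set)
    for new_term, per_repo in crosswalk.items():
        for repo, native_terms in per_repo.items():
            for native in native_terms:
                index[(repo, native)].add(new_term)
    return dict(index)


if __name__ == "__main__":
    # A user of the new classification asks for "geosciences" and gets the
    # native terms to query in each existing repository.
    print(translate("geosciences"))
```

The point of the sketch is that the mapping sits alongside the existing classifications instead of replacing them: each repository keeps its own vocabulary, and the crosswalk records where the vocabularies meet.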

The heterogeneous nature of information is never vanquished, but we can incorporate it into our systems.
