Data Repositories – Mother’s Milk for Data Scientists by Jerry A. Smith.
From the post:
Mothers are life givers, giving the milk of life. While there are so very few analogies so apropos, data is often considered the Mother’s Milk of Corporate Valuation. So, as a data scientist, we should treat dearly all those sources of data, understanding their place in the overall value chain of corporate existence.
A Data Repository is a logical (and sometimes physical) partitioning of data where multiple databases which apply to specific applications or sets of applications reside. For example, several databases (revenues, expenses) which support financial applications (A/R, A/P) could reside in a single financial Data Repository. Data Repositories can be found both internal (e.g., in data warehouses) and external (see below) to an organization. Here are a few repositories from KDnuggets that are worth taking a look at: (emphasis in original)
I count sixty-four (64) collections of data sets as of today.
What I haven’t seen, though perhaps you have, is an index across the most popular data set collections that dedupes the data sets and gives thumbnail information for each one.
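To make the idea concrete, here is a minimal sketch of what such an index might look like. Everything in it is an assumption for illustration: the collection names, the example.org URLs, the `IndexEntry` fields, and especially the deduplication rule, which treats two listings as the same data set when their URLs match after dropping the scheme, a leading "www.", letter case, and any trailing slash. Real collections would need a more careful rule.

```python
import re
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class IndexEntry:
    """One deduplicated data set plus its thumbnail information (hypothetical fields)."""
    name: str
    url: str
    summary: str
    collections: list = field(default_factory=list)


def dedupe_key(name, url):
    """Build a key so that trivially different listings of one data set collide.

    Assumption: host + path identifies a data set. Lowercasing the whole URL
    can wrongly merge case-sensitive paths, so this is only a first pass.
    """
    if url:
        parts = urlsplit(url.strip().lower())
        host = parts.netloc
        if host.startswith("www."):
            host = host[4:]
        return host + parts.path.rstrip("/")
    # No URL: fall back to the name with punctuation and case removed.
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def build_index(listings, summary_len=80):
    """Merge (collection, name, url, description) listings into one index."""
    index = {}
    for collection, name, url, description in listings:
        key = dedupe_key(name, url)
        entry = index.get(key)
        if entry is None:
            # First sighting: keep a short summary as the thumbnail text.
            entry = IndexEntry(name=name, url=url, summary=description[:summary_len])
            index[key] = entry
        if collection not in entry.collections:
            entry.collections.append(collection)
    return index


# Made-up listings: the same data set appears in two collections.
listings = [
    ("Collection A", "Iris", "http://example.org/data/iris/", "Flower measurements."),
    ("Collection B", "Iris data set", "https://www.example.org/data/iris", "Classic flowers."),
    ("Collection B", "Census", "https://example.org/data/census", "Population counts."),
]
index = build_index(listings)

for key, entry in index.items():
    print(key, "|", entry.summary, "|", ", ".join(entry.collections))
```

Keying on a normalized URL is only one possible choice. Mirrored copies hosted at different addresses would slip through, so a fuller index would probably also compare names, sizes, or checksums.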
Suggested indexes across data set collections?