Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

June 12, 2012

Open Content (Index Data)

Filed under: Data,Data Source — Patrick Durusau @ 3:20 pm

Open Content

From the webpage:

The searchable indexes below expose public domain ebooks, open access digital repositories, Wikipedia articles, and miscellaneous human-cataloged Internet resources. Through standard search protocols, you can make these resources part of your own information portals, federated search systems, catalogs etc. Connection instructions for SRU and Z39.50 are provided. If you have comments, questions, or suggestions for resources you would like us to add, please contact us, or consider joining the mailing list. This service is powered by Index Data’s Zebra and Metaproxy.

Looking around after reading the post on the interview with Sebastian Hammer on Federated Search, I found this listing of resources.

Database name #records Description
gutenberg 22194 Project Gutenberg. High-quality clean-text ebooks, some audio-books.
oaister 9988376 OAIster. A Union catalog of digital resources, chiefly open archives of journals, etc.
oca-all 135673 All of the ebooks made available by the Internet Archive as part of the Open Content Alliance (OCA). Includes high-quality, searchable PDFs, online book-readers, audio books, and much more. Excludes the Gutenberg sub-collection, which is available as a separate database.
oca-americana 49056 The American Libraries collection of the Open Content Alliance.
oca-iacl 669 The Internet Archive Children’s Library. Books for children from around the world.
oca-opensource 2616 Collection of community-contributed books at the Internet Archive.
oca-toronto 37241 The Canadian Libraries collection of the Open Content Alliance.
oca-universallibrary 30888 The Universal Library, a digitization project founded at Carnegie-Mellon University. Content hosted at the Internet Archive.
wikipedia 1951239 Titles and abstracts from Wikipedia, the open encyclopedia.
wikipedia-da 66174 The Danish Wikipedia. Many thanks to Fujitsu Denmark for their support for the indexing of the national Wikipedias.
wikipedia-sv 243248 The Swedish Wikipedia.
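
For readers who have not used SRU before, here is a minimal sketch of what a query against one of these databases might look like. The parameter names (`version`, `operation`, `query`, `startRecord`, `maximumRecords`) come from the SRU standard. Everything else is an assumption on my part: the host `sru.example.org` is a placeholder (the real address is in Index Data's connection instructions), the database name is assumed to be the URL path, and the `dc.title` index may or may not be supported by a given database.

```python
from urllib.parse import urlencode

# Hypothetical placeholder host; substitute the address from the
# service's own connection instructions.
BASE_URL = "http://sru.example.org"


def sru_search_url(database, cql_query, max_records=5, start_record=1):
    """Build an SRU 1.1 searchRetrieve URL for one database.

    Assumes the database name is addressed as the URL path, which is
    an assumption about this service, not something the post states.
    """
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,            # a CQL query string
        "startRecord": start_record,   # SRU record positions start at 1
        "maximumRecords": max_records,
    }
    return f"{BASE_URL}/{database}?{urlencode(params)}"


# The dc.title index is an assumption; a server may expose other indexes.
url = sru_search_url("gutenberg", 'dc.title = "moby dick"')
print(url)
```

Fetching that URL from a real endpoint should return an XML searchRetrieveResponse, which any XML parser can take apart.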

Latency is an issue, but I wonder what my reaction would be if a search quickly offered 3 or 4 substantive resources and invited me to read/manipulate them while it sought additional information/data.

Most of the articles you see cited in this blog aren’t the sort of thing you can skim, and some take more than one pass to jell.

I suppose I could be offered 50 highly relevant articles in milliseconds, but I am not capable of assimilating them that quickly.

So how many resources have been wasted to give me a capacity I can’t effectively use?
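
The idea of handing over the first few results while the search keeps going can be sketched roughly as follows. This is only an illustration: `search_source` is a hypothetical stand-in for a real SRU or Z39.50 call, and the delays are made-up numbers, not measurements of any of the databases listed above.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

# Simulated per-source delays in seconds (made-up, not measured).
SIMULATED_DELAY = {"gutenberg": 0.05, "wikipedia": 0.1, "oaister": 0.3}


def search_source(name, query):
    """Hypothetical stand-in for a real search against one database."""
    time.sleep(SIMULATED_DELAY[name])
    return name, f"top hits for {query!r} from {name}"


def incremental_search(query, sources):
    """Yield (source, result) pairs as each source responds,
    rather than waiting for the slowest one."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = [pool.submit(search_source, s, query) for s in sources]
        for future in as_completed(futures):
            yield future.result()


arrival_order = []
for source, result in incremental_search("topic maps", list(SIMULATED_DELAY)):
    arrival_order.append(source)
    print(f"{source}: {result}")  # the reader can start here while others load
```

The reader gets something to work with as soon as the fastest source answers, and the slower sources fill in behind it.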
