Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 10, 2011

Google1000 dataset

Filed under: Dataset,Image Recognition,Machine Learning — Patrick Durusau @ 6:46 pm

From the post:

This is a dataset of scans of 1000 public domain books that was released to the public at ICDAR 2007. At the time there was no public serving infrastructure, so few people actually got the 120GB dataset. It has since been hosted on Google Cloud Storage and made available for public download: (see the post for the links)

The dataset is intended for OCR and machine learning work, the results of which you may wish to unite in topic maps with other resources.
