The Correct End Of Your Telescope – Viewing Schema.org Adoption by Richard Wallis.
I have been banging on about Schema.org for a while. For those who have been lurking under a structured data rock for the last year, it is an initiative of cooperation between Google, Bing, Yahoo!, and Yandex to establish a vocabulary for embedding structured data in web pages to describe ‘things’ on the web. Apart from the simple significance of having those four names in the same sentence as the word cooperation, this initiative is starting to have some impact. As I reported back in June, the search engines are already seeing some 7%-10% of pages they crawl containing Schema.org markup. Like it or not, it is clear that Schema.org is rapidly becoming a de facto way of marking up your data if you want it to be shared on the web and have it recognised by the major search engines.
It is no coincidence, then, that at OCLC we chose Schema.org as the way to expose linked data in WorldCat. If you haven’t seen it, just search for any item at worldcat.org, scroll to the bottom of the page and open up the Linked Data tab, and there you will see the [not very pretty, but hey, it’s really designed for systems not humans] Schema.org marked up linked data for the item, with links out to other data sources such as VIAF, LCSH, FAST, and Dewey.
Schema.org has much to recommend it, but I suspect that HTML remains the “…de facto way of marking up your data if you want it to be shared on the web and have it recognised by the major search engines.”
Ten percent is no mean feat but it is still ten percent.
Patrick,
HTML remains the “…de facto way of marking up your text if you want it to be shared on the web and have it recognised by the major search engines.”
If you want to share structured data about your resources you need to choose a well recognised vocabulary, such as Schema.org, and a way to embed it in the HTML – RDFa.
To use a Google description of the difference: It’s all about ‘Things’ not ‘Strings’ – check out this Google post for a bit more insight: http://googleblog.blogspot.com/2012/05/introducing-knowledge-graph-things-not.html
Comment by RJW — November 4, 2012 @ 8:22 pm
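As a rough illustration of the distinction drawn in the comment above (plain HTML for text, a shared vocabulary embedded in that HTML for data), here is a minimal Python sketch. It assumes RDFa Lite attributes (vocab, typeof, property) carrying Schema.org terms. The sample markup, the example.org URL, and the RdfaLiteExtractor class are invented for illustration and are not what WorldCat actually emits; the parser only handles flat, un-nested markup and is no substitute for a real RDFa processor.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: a book described with Schema.org terms in RDFa Lite.
SAMPLE_HTML = """
<div vocab="http://schema.org/" typeof="Book">
  <span property="name">An Example Title</span>
  by <span property="author">A. N. Author</span>
  <a property="sameAs" href="http://example.org/id/123">identifier</a>
</div>
"""


class RdfaLiteExtractor(HTMLParser):
    """Collect (property, value) pairs from flat RDFa Lite markup."""

    def __init__(self):
        super().__init__()
        self.item_type = None   # value of the typeof attribute, e.g. "Book"
        self.properties = []    # list of (property, value) pairs
        self._pending = None    # property still waiting for its text value

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "typeof" in attrs:
            self.item_type = attrs["typeof"]
        prop = attrs.get("property")
        if prop is None:
            return
        if "href" in attrs:
            # Links carry their value in the href attribute.
            self.properties.append((prop, attrs["href"]))
        else:
            # Other elements carry their value as text content.
            self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.properties.append((self._pending, data.strip()))
            self._pending = None

    def handle_endtag(self, tag):
        # An element with no text leaves nothing pending.
        self._pending = None


def extract(html):
    """Return (item_type, properties) found in the given HTML string."""
    parser = RdfaLiteExtractor()
    parser.feed(html)
    parser.close()
    return parser.item_type, parser.properties


item_type, props = extract(SAMPLE_HTML)
print(item_type)
for name, value in props:
    print(name, "=", value)
```

To a browser the fragment is just a line of text; to a consumer that understands the vocabulary it is a Book with a name, an author, and a link to another identifier, which is the 'things' side of the distinction.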
Richard,
Thanks for the Google link but I am familiar with the insertion of extra helps so that computers can make better use of texts. Usually under the rubric “structured data.”
I don’t disagree that given the impoverished state of computer processing of language, as opposed to human processing, solutions such as Schema.org are an absolute necessity.
Where I do disagree is with the need to eradicate semantic diversity for the convenience of our computers.
In other fields, that would be called cooking the data until it fits a method chosen for the solution.
Comment by Patrick Durusau — November 5, 2012 @ 5:50 am