Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 4, 2012

The Correct End Of Your Telescope – Viewing Schema.org Adoption

Filed under: Dewey - DDC,FAST,LCSH,Schema.org,WorldCat — Patrick Durusau @ 7:16 pm

The Correct End Of Your Telescope – Viewing Schema.org Adoption by Richard Wallis.


I have been banging on about Schema.org for a while.  For those that have been lurking under a structured data rock for the last year, it is an initiative of cooperation between Google, Bing, Yahoo!, and Yandex to establish a vocabulary for embedding structured data in web pages to describe ‘things’ on the web.  Apart from the simple significance of having those four names in the same sentence as the word cooperation, this initiative is starting to have some impact.  As I reported back in June, the search engines are already seeing some 7%-10% of pages they crawl containing Schema.org markup.  Like it or not, it is clear that Schema.org is rapidly becoming a de facto way of marking up your data if you want it to be shared on the web and have it recognised by the major search engines.

It is no coincidence then, at OCLC we chose Schema.org as the way to expose linked data in WorldCat.  If you haven’t seen it, just search for any item at worldcat.org, scroll to the bottom of the page and open up the Linked Data tab and there you will see the [not very pretty, but hay it’s really designed for systems not humans] Schema.org marked up linked data for the item, with links out to other data sources such as VIAF, LCSH, FAST, and Dewey.

Schema.org has much to recommend it, but I suspect that HTML remains the “…de facto way of marking up your data if you want it to be shared on the web and have it recognised by the major search engines.”

Ten percent is no mean feat but it is still ten percent.

July 2, 2012

The strange case of eugenics:…

Filed under: Classification,Collocative Integrity,Dewey - DDC,Ontogeny — Patrick Durusau @ 4:01 pm

The strange case of eugenics: A subject’s ontogeny in a long-lived classification scheme and the question of collocative integrity by Joseph T. Tennis. (Tennis, J. T. (2012), The strange case of eugenics: A subject’s ontogeny in a long-lived classification scheme and the question of collocative integrity. J. Am. Soc. Inf. Sci., 63: 1350–1359. doi: 10.1002/asi.22686)

Abstract:

This article introduces the problem of collocative integrity present in long-lived classification schemes that undergo several changes. A case study of the subject “eugenics” in the Dewey Decimal Classification is presented to illustrate this phenomenon. Eugenics is strange because of the kinds of changes it undergoes. The article closes with a discussion of subject ontogeny as the name for this phenomenon and describes implications for information searching and browsing.

Tennis writes:

While many theorists have concerned themselves with how to design a scheme that can handle the addition of subjects, very little has been done to study how a subject changes after it is introduced to a scheme. Simply because we add civil engineering to a scheme of classification in 1920 does not signify that it means the same thing today. Almost 100 years have passed, and many things have changed in that subject. We may have subdivided this class in 1950, thereby separating the pre-1950 meaning from the post-1950 meaning and also affecting the collocative power of the class civil engineering. Other classes in the superclass of engineering might be considered too close, and are eliminated over time, affecting the way the classifier does her or his work (cf. Tennis, 2007; Tennis & Sutton, 2008). It is because of these concerns, coupled with the design requirement of collocation in classification, that we need to look at the life of a subject over time—the subject’s scheme history or ontogeny.

Deeply interesting work that has implications for topic map structures and the preservation of “collocative integrity” over time.

One suspects that preservation of “collocative integrity” is an ongoing process that requires more than simple assignments in a scheme.

What factors would you capture to trace the ontogeny of “eugenics” and how would you use them to preserve “collocative integrity” across that history using a topic map? (Remember that users at any point in that ontogeny may be ignorant of prior, and obviously of subsequent, changes in its classification.)
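One way to start on that question is to record, for each edition of the scheme, where the subject sat and under which caption, and then ask which other notations a user of one edition would have to consult to see everything about the subject. The sketch below is only an illustration: the field names, the edition numbers, and the notations are hypothetical placeholders, not data from Tennis's article or from the DDC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    edition: int      # hypothetical edition number
    notation: str     # hypothetical class number
    caption: str      # caption as it read in that edition
    superclass: str   # caption of the parent class in that edition


# Placeholder history for one subject; none of these values are real DDC data.
history = [
    Placement(1, "AAA.1", "Subject X", "Parent A"),
    Placement(2, "AAA.1", "Subject X (renamed)", "Parent A"),
    Placement(3, "BBB.7", "Subject X (renamed)", "Parent B"),
]


def notations_to_search(history, known_edition):
    """Notations used for the subject in editions other than the one the user knows."""
    known = {p.notation for p in history if p.edition == known_edition}
    others = {p.notation for p in history if p.edition != known_edition}
    return sorted(others - known)


print(notations_to_search(history, 3))
```

In a topic map, each placement might become an occurrence or association scoped by edition, so that a user who knows only one edition's number can still be led to the others. That mapping is a suggestion, not something the article prescribes.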

June 29, 2012

DDC 23 released as linked data at dewey.info

Filed under: Classification,Dewey - DDC,Linked Data — Patrick Durusau @ 3:14 pm

DDC 23 released as linked data at dewey.info

From the post:

As announced on Monday at the seminar “Global Interoperability and Linked Data in Libraries” in beautiful Florence, an exciting new set of linked data has been added to dewey.info. All assignable classes from DDC 23, the current full edition of the Dewey Decimal Classification, have been released as Dewey linked data. As was the case for the Abridged Edition 14 data, we define “assignable” as including every schedule number that is not a span or a centered entry, bracketed or optional, with the hierarchical relationships adjusted accordingly. In short, these are numbers that you find attached to many WorldCat records as standard Dewey numbers (in 082 fields), as additional Dewey numbers (in 083 fields), or as number components (in 085 fields).

The classes are exposed with full number and caption information and semantic relationships expressed in SKOS, which makes the information easily accessible and parsable by a wide variety of semantic web applications.

This recent addition massively expands the data set by over 38.000 Dewey classes (or, for the linked data geeks out there, by over 1 million triples), increasing the number of classes available almost tenfold. If you like, take some time to explore the hierarchies; you might be surprised to find numbers for Maya calendar or transits of Venus (loyal blog readers will recognize these numbers).

All the old goodies are still there, of course. Depending on which type of user agent is accessing the data (e.g., a browser) a different representation is negotiated (HTML or various flavors of RDF). The HTML pages still include RDFa markup, which can be distilled into RDF by browser plug-ins and other applications without the user ever having to deal with the RDF data directly.
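The content negotiation described in the quoted post can be sketched as a request that names a preferred representation in its Accept header. This is a minimal sketch, not dewey.info's documented interface: the class URL and the media types are assumptions (the post says only "HTML or various flavors of RDF"), and the request is constructed but never sent.

```python
from urllib.request import Request

# Assumed media types; the post does not list the ones the server offers.
ACCEPT = {
    "html": "text/html",
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
}


def build_request(url: str, flavor: str) -> Request:
    """Build (but do not send) a GET request asking for one representation."""
    return Request(url, headers={"Accept": ACCEPT[flavor]})


# Hypothetical class URL, used for illustration only.
req = build_request("http://dewey.info/class/641/", "turtle")
print(req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the fetch. Whether the server honours these particular media types is not something the post confirms.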

More details follow but that should be enough to capture your interest.

Good thing there is a pointer for the Maya calendar. Would hate for interstellar archaeologists to think we were too slow to invent a classification number for the disaster that is supposed to befall us this coming December.

I have renewed my ACM and various SIG memberships to run beyond December 2012. In the event of an actual disaster, refunds will not be an issue. 😉
