Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

February 7, 2013

The Semantic Web Is Failing — But Why? (Part 1)

Filed under: Identity,OWL,RDF,Semantic Web — Patrick Durusau @ 4:29 pm

Introduction

Before proposing yet another method for identification and annotation of entities in digital media, it is important to draw lessons from existing systems, failing systems in particular, so that their mistakes are not repeated or compounded. The Semantic Web is an example of such a system.

Doubters of that claim should read the report Additional Statistics and Analysis of the Web Data Commons August 2012 Corpus by Web Data Commons.

Web Data Commons is a structured data research project based at the Research Group Data and Web Science at the University of Mannheim and the Institute AIFB at the Karlsruhe Institute of Technology. Supported by the PlanetData and LOD2 research projects, Web Data Commons is not opposed to the Semantic Web.

But the Additional Statistics and Analysis of the Web Data Commons August 2012 Corpus document reports:

Altogether we discovered structured data within 369 million of the 3 billion pages contained in the Common Crawl corpus (12.3%). The pages containing structured data originate from 2.29 million among the 40.5 million websites (PLDs) contained in the corpus (5.65%). Approximately 519 thousand websites use RDFa, while only 140 thousand websites use Microdata. Microformats are used on 1.7 million websites. It is interesting to see that Microformats are used by approximately 2.5 times as many websites as RDFa and Microdata together. (emphasis added)

To sharpen the point, RDFa is used on 1.28% of the 40.5 million websites, eight (8) years after its introduction (2004) and four (4) years after reaching Recommendation status (2008).

Or more generally:

Parsed HTML URLs 3,005,629,093
URLs with Triples 369,254,196

Or, in layperson’s terms, for this web corpus, parsed HTML URLs outnumber URLs with Triples by approximately eight to one.
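
As a quick check on the arithmetic, here is a minimal Python sketch that recomputes the percentages and the ratio from the figures quoted in the report. The constant and function names are my own, chosen for illustration, and the website counts are the rounded values from the quotation, so the results may differ slightly from the report's own rounding.

```python
# Figures as quoted from the Web Data Commons August 2012 report.
PARSED_HTML_URLS = 3_005_629_093
URLS_WITH_TRIPLES = 369_254_196
TOTAL_WEBSITES = 40_500_000            # pay-level domains (PLDs), rounded
WEBSITES_WITH_STRUCTURED_DATA = 2_290_000  # rounded
RDFA_WEBSITES = 519_000                # "approximately 519 thousand"


def percent(part, whole):
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Share of parsed pages that carry any structured data (report: 12.3%).
url_share = percent(URLS_WITH_TRIPLES, PARSED_HTML_URLS)

# Share of websites that carry any structured data (report: 5.65%).
website_share = percent(WEBSITES_WITH_STRUCTURED_DATA, TOTAL_WEBSITES)

# Share of websites that use RDFa (the 1.28% figure).
rdfa_share = percent(RDFA_WEBSITES, TOTAL_WEBSITES)

# Parsed HTML URLs for every URL with triples (the "eight to one").
urls_ratio = round(PARSED_HTML_URLS / URLS_WITH_TRIPLES, 1)

print(f"URLs with triples:        {url_share}% of parsed HTML URLs")
print(f"Websites with structure:  {website_share}% of websites")
print(f"Websites using RDFa:      {rdfa_share}% of websites")
print(f"Parsed URLs per URL with triples: {urls_ratio}")
```

Run as written, this gives roughly 12.29%, 5.65%, 1.28% and a ratio of about 8.1 to one, in line with the numbers cited.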

Bearing in mind that the corpus covers only web-accessible data and excludes “dark data,” the need for a more robust solution than the Semantic Web is self-evident.

The failure of the Semantic Web is no assurance that any alternative proposal will fare better. Understanding why the Semantic Web is failing is a prerequisite to any successful alternative.


Before you “flame on,” you might want to read the entire series. I end up with a suggestion based on work by Ding, Shinavier, Finin and McGuinness.


The next series starts with Saving the “Semantic” Web (Part 1)
