Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

February 11, 2013

Saving the “Semantic” Web (part 2) [NOTLogic]

Filed under: Linked Data,RDF,Semantic Web — Patrick Durusau @ 5:45 pm

Expressing Your Semantics: NOTLogic

Saving the “Semantic” Web (part 1) ended by concluding that authors of data/content should be asked about the semantics of their content.

I asked if there were compelling reasons to ask someone else and got no takers.

The acronym NOTLogic may not be familiar. It expands to: Not Only Their Logic.

Users should express their semantics in the “logic” of their domain.

After all, it is their semantics, knowledge and domain that are being captured.

Their “logic” may not square up with FOL (first order logic) but where’s the beef?

Unless one of the project requirements is to maintain consistency with FOL, why bother?

The goal in most BI projects is ROI on capturing semantics, not adhering to FOL for its own sake.

Some people want to teach calculators how to mimic “reasoning” by using that subset known as “logic.”

However much I liked the Friden rotary calculator of my youth:

[Image: calculator]

teaching it to mimic “reasoning” isn’t going to happen on my dime.

What about yours?

There are cases where machine learning techniques are very productive and fully justified.

The question you need to ask yourself (after discovering if you should be using RDF at all, The Semantic Web Is Failing — But Why? (Part 2)) is whether “their” logic works for your use case.

I suspect you will find that you can express your semantics, including relationships, without resort to FOL.

Which may lead you to wonder: Why would anyone want you to use a technique they know, but you don’t?

I don’t know for sure but have some speculations on that score I will share with you tomorrow.

In the meantime, remember:

  1. As the author of content or data, you are the person to ask about its semantics.
  2. You should express your semantics in a way comfortable for you.

7 Comments

  1. I was waiting to respond until your series was finished, but apparently you are looking for feedback now.

    Your first point is that the semantic web is failing.
    I disagree there. I disagree even more with your solutions,
    as they are not practical on a large scale or over time.

    I think that your assumptions about the semantic web are incorrect, which leads to wrong conclusions.

    The semantic web is a set of social constructs (social not technical!) aimed to solve problems that many of us have in our day to day work (not only research).
    The problem we have is what is the meaning of recorded information.

    The semantic web aims to solve this with a number of social constructs.
    The first is: represent your information in simple sentences.

    e.g.
    我 爱我的 足球

    Oh, you don’t understand simplified Chinese. That’s too bad. Let’s see if we can help you here.
    我 = I = the subject of our sentence
    爱我的 = love = the predicate of our sentence, the relation between the first and the last word.
    足球 = football = the object of our sentence

    Simple enough sentence, right? Except who is I, and is the football American style or is it what the Americans call soccer, and what kind of love are we talking about anyway?
    Well, to avoid that confusion we name things with URIs.

    http://www.linkedin.com/in/jervenbolleman http://dictionary.reference.com/browse/love?s=t http://en.wikipedia.org/wiki/Football

    So OK, what do we have here? There is a “love” relationship between something represented by a “linkedin page” and a concept documented at another page.
    This is a simple enough example. So basically, by using URIs we are writing sentences by providing references into a dictionary.
    The dictionary here being the whole internet instead of a single book in your library.
    The aim here is that you can look up the meaning of words as you desire.
    You are no longer dependent on asking me what I meant.
    Which is nice because I won’t be around forever.
    This leads me to the most common reason for not asking the author what he really meant with a sentence: the author is no longer around or does not answer your questions.
    If you think that is not a problem, you have not talked to a theologian, met a management consultant or someone who maintains software 😉

    Now you can say you don’t need RDF for this, and I would agree.
    There are many ways to represent this information.
    RDF is just a set of social constructs termed in a clear technical specification which helps us communicate.
    RDF as explained here is the wheel and brake in your VW example, and powerful enough to enable a lot of useful work.
    RDF deployments can be as complicated as your F16 example as well, but that’s only because a VW van does not always suffice.
    Of course a URI might not resolve to a webpage (the dictionary might not have the word).
    This is a problem of course, but at least you have more context to work with than before.

    You complain about a second problem we have (according to you only in the semantic web).
    That problem is the use of multiple identifiers for the same concept and when they can be merged or not.
    Take the following hypothetical example

    http://www.foxnews.com/Barack_Obama owl:sameAs http://www.msnbc.com/Barack_Obama .

    Ok, fox and msnbc both talk about the same concept. The 44th president of the United States of America. Both are “news” television channels.
    The question then becomes are they really talking about the same concept?
    When I hear Bill O’Reilly on Fox talk about Barack Obama, I doubt that he is talking about the same man that Rachel Maddow (MSNBC) talks about.
    Merging the information from both sources can give really weird results.

    http://www.foxnews.com/Barack_Obama rdf:type fox:SpawnOfTheDevil .
    http://www.msnbc.com/Barack_Obama http://www.msnbc.com/last_hope_of http://en.wikipedia.org/wiki/Homo_sapiens .

    When merged this gives: “the last hope of humanity is a spawn of the devil”…
    Mapping information is hard work, and logical shared identity is rare, but can be used usefully for many problems.
    Because mapping is hard, we in the field make a lot of mistakes that need to be corrected.
    Also a mapping for one’s use case is no good for another.
    Adding ‘-‘,’+’ or ‘=’ into a JSON file is not going to help anyone there.
    Is your ‘-‘ the same as my ‘-‘ or is it closer to a ‘=’ ?

    There is of course a difference between recording data and asking questions of it.
    The semantic web has a convincing story here as well.
    SPARQL 1.1, currently undergoing ratification, has a lot to speak for it.
    The nice thing about SPARQL is that it’s storage agnostic.
    Something no other query language has come close to achieving.
    You can SPARQL against relational databases (SPARQL -> SQL translations).
    You can SPARQL against RDF stores.
    You can SPARQL against graph stores.
    You can SPARQL against web-services (XML/rest or SOAP via SADI).
    You can SPARQL against excel work sheets.
    You can SPARQL against all of the above in one go! and more…

    Name one single query language which is so flexible in the real world.

    Because of the simple data model, which supports communication, we gain a lot of direct capabilities.

    To think that it is failing at this time is to seriously underestimate the work done and achieved over the last few years.

    You don’t understand that success takes time, and that for a system with large momentum it is difficult to change direction.
    The web as you know it with HTML etc… was preceded by a lot of other systems.
    In the beginning of the web you also had people claiming Gopher was so much better because it was more widespread.

    Now about reasoning using OWL, or RDFS based constructs.
    Sure this is not for everyone, but don’t think it is not useful.
    Programming in assembly is also not for everyone, but that is useful as well.
    Declaring your semantics in a way compatible with first order logic can give you a large ROI.
    When that is the case, use it. When that is not the case, but it does not cost you anything, use it anyway.
    You would be very surprised to see how often it is useful.
    Even if you only use the surface capabilities.

    All in all, I would have been more impressed with this series of blog posts in 2004 than now.

    Comment by j22 — February 12, 2013 @ 8:42 am

  2. Had you identified yourself I would be responding to you with the courtesy of a personal name.

    1) Is the Semantic Web Failing? You say no. I point to facts gathered by Semantic Web advocates indicating a failure of adoption.

    Moreover, if the Semantic Web isn’t failing, where is the adoption by Bing, Google, etc. The “big” players? Oh, I forgot, they are building alternatives to the Semantic Web.

    Readers should judge the success/failure of the Semantic Web on the facts. The ones I have cited and others that are openly available.

    2) On asking authors, either my explanation failed or you are confusing asking authors with use of any single identifier.

    2.a Single Identifier, even a URI. The problem with a single identifier is that robust identification of a subject requires multiple properties. Take my example of owl:sameAs. One URI, multiple interpretations. Yes?

    2.b Ask the author. Yes, many times the author won’t be present, but an author who wants to make their identification clearer (a relative state) to others should have the ability to use more than a single identifier, URI or not.

    Related but distinct issues.

    3. Merging separate identifiers is always hard work and I never said or implied it was limited to the Semantic Web. The Semantic Web is poorly suited to that task was my point.

    People have struggled (and overcome) this problem with indexes for years. My JSON example was an illustration, not a solution.

    4. SPARQL – Was your question to name one of the least used query languages in the world? And yes, simple data models give you simple answers, if that is what you are looking for.

    5. Success takes time. No doubt and advocates of Esperanto are still waiting for their moment in the sun.

    The fundamental problem of the Semantic Web is thinking the substitution of one identifier (URIs) for all others is anything other than identifier substitution. It doesn’t move the ball one inch further down the field.

    6. Logic – I pointed out in my post that logic can be useful, the question is whether it is a prerequisite for the Semantic Web? Pat Hayes and others disagree with you and seem to think it is a prerequisite for using RDF for example.

    You should note that I quoted Semantic Web advocates in context to make my points. I drew my own conclusions but the facts were those espoused by Semantic Web experts themselves.

    As far as these posts being more impressive in 2004, well, many of us speculated on the coming failure of the Semantic Web even then, due to the single identifier and other issues.

    But the vagueness of the Semantic Web enterprise and the lack of real world experience led people who should have known better to say “give it a chance.”

    It has had a chance. A very long and well-funded one. And it remains ignored by all the major IT vendors.

    Contrast the Semantic Web adoption curve with that of HTML for a W3C-to-W3C comparison. See the difference?

    What more signs of failure would you want?

    Comment by Patrick Durusau — February 12, 2013 @ 10:22 am

  3. This is a late night reply, and I did not bother to spell check it or even look for basic grammar mistakes. My reply does read more hostile than it should. But that’s because I just strongly disagree with your conclusions on this topic. I really like your blogging style and do pick up a lot of very useful information from your work. And I would be very sorry if my writing lessens your enjoyment in maintaining your blog/twitter feed.

    I thought my personal name was in your system. It should be now. Not that my name matters, or does it? Wouldn’t it have been nice if I had used a URI where you could look up more about this thing called j22? There is one in my examples, you can even find my CV there! Of course this is the internet; someone else might just claim to be me.

    I agree readers should judge on their own interpretation of the facts.

    1) Are Bing, Google, and Yahoo! really the big boys?
    Seems to me they are rather small and focussed in what they do. Selling advertising eyeballs. Does anything in the semantic web make their lives easier? They don’t do that much high precision data integration. When they do, they often buy it off the shelf (e.g. flight information). Or crowdsource it so that they can use it in their systems. What works for the big boys might not work for the small ones.

    2.a) URIs are just pointers into a dictionary. Nothing more, nothing less. As you know, a dictionary explanation is still subject to interpretation. Nothing changed, you just have a better idea of what is meant. More context for you to form your opinion. The use of URIs just prevents a single identifier from pointing to two or more resources, e.g. 654924 being both a color and an NCBI taxonomy record. Use of URIs just solves this issue (and it’s surprisingly common).
    Using URIs is just a useful convention people can depend upon. Much like Java packages compared to C++ namespaces.

    2.b) Of course they could use something other than a URI. But few alternatives would be as practical.

    3. Is anything better suited? Does anything else even agree on what an identifier is and how to use it to look up more information? JSON, for sure, does not.
    Indexes don’t help if the original data is not precise enough. Something where the semantic web helps.

    4. Sure SPARQL is not used much, but the power that it gives the users is fantastic. SQL was not used much in the beginning, if you think that IT changes quickly I have other news for you. I just talked to some bank IT staff; they are happily stuck with Java 1.3, software that has been out of support for more than 5 years, just 12 years after it was launched. Things that work, or kind of work, do not get thrown out and reimplemented just because something new comes along. And if you think you can just move a large code and database infrastructure to a different platform in less than a year, you have never dealt with real infrastructure projects.

    5. Well is Esperanto a bad idea or even a failure? Does it solve someone’s problem?

    6. Logic is required; first order logic or even set theory, I think, is overdoing it. And if I disagree with Pat Hayes, well, I disagree with lots of people, that is in my nature 😉

    HTML solved a problem that many more people realized they had than the semantic web does. First show people their problem, then show them the general solution. For now most are happy with their Rube Goldberg contraptions. And as I stated in the beginning: the semantic web is a social solution.

    What should we have been doing instead of the semantic web? ISO Topic Maps? There is some great work in there, but has it been a better success?

    Relational databases took 10 years from idea to implementation, and another 10 years before they became the “standard”.

    I started on the road to RDF not because I wanted to, but because I was hired to maintain a system that used/provided data in that format. And in the beginning (2008) I thought: what crap.

    Only later did I realize the many benefits of a simple concept like using URIs as pointers and triples for a directed graph: it gives T back what it was missing, the C for communication and the I for information.

    7) People fund things because it might solve their problem (that includes entertainment against the problem of boredom 😉). And like most of these things, if it’s worth doing it will gain critical mass. If it’s not worth it, it will stay a hobby. And I know that we are building and selling solutions that many in my field have found very very useful.

    Comment by j22 — February 12, 2013 @ 4:02 pm

  4. I deeply appreciate your comments and don’t take them as “hostile” at all. They help to sharpen (correct?) ;-), my thinking on these issues.

    There are some points raised in your reply that I will defer into full posts so we can tease out some of the threads in this discussion. I will mark those in this reply.

    I will also mark several points where I simply don’t know the answer to some questions.

    You say:

    “1) Are Bing, Google, and Yahoo! really the big boys?
    Seems to me they are rather small and focussed in what they do. Selling advertising eyeballs. Does anything in the semantic web make their lives easier? They don’t do that much high precision data integration.”

    Good question. I put it that way because most users experience software via major software vendors. That may be unfortunate but I suspect you will agree it is the case.

    Your observation “They don’t do that much high precision data integration.” is interesting. If there were high demand for “high precision data integration,” I suspect they would offer it.

    So for the Semantic Web, Topic Maps and other semantic integration technologies, where is the consumer demand for “high precision data integration?” (I don’t know the answer to that question.)

    You say:

    “2.a) URIs are just pointers into a dictionary. Nothing more, nothing less.

    3. Is anything better suited?”

    A partial answer and partial deferral.

    If a URI is a pointer into a dictionary, then why ask a machine to read the dictionary entry? My experience of using dictionaries is that I read the dictionary content and decide if the term is the one I want.

    I need to defer further comment here because I want to expand on your suggestion that a URI is just a pointer into a dictionary.

    You say:

    “4. Sure SPARQL is not used much, but the power that it gives the users is fantastic. SQL was not used much in the beginning, if you think that IT changes quickly I have other news for you.”

    I posted an entry quite recently saying that new technologies wrap old ones so we don’t disagree about the rate of IT “change.” In my experience, banks are still using 2400 baud modems and the software that goes with them. That was five years ago and may have changed now. Perhaps not.

    The problem of IT change is more subtle. How do I document the semantics of queries in SQL or SPARQL, so that mapping from those queries can be performed to successors of those query languages? Those are “subjects” in the same sense as any other subject.

    You say:

    “5. Well is Esperanto a bad idea or even a failure? Does it solve someone’s problem?”

    I need to defer on this one because it is a story with history and merits a full explanation.

    You say:

    “6. Logic is required; first order logic or even set theory, I think, is overdoing it. And if I disagree with Pat Hayes, well, I disagree with lots of people, that is in my nature”

    Question: So what do you mean by logic? I was disagreeing with the RDF as specified.

    You say:

    “What should we have been doing instead of the semantic web? ISO Topic Maps? There is some great work in there, but has it been a better success?”

    In terms of adoption, no, Topic Maps have not (yet?) become a great success. 😉

    There is an interesting issue here about who is served by current Semantic Web and Topic Map approaches. I must defer here but hope to post on that issue today.

    You say:

    “7) People fund things because it might solve their problem …. And I know, that we are building and selling solutions that many in my field have found very very useful.”

    True, as are relational database solutions. The question comes when users have mixed systems, relational, XML, Semantic Web. As you say SPARQL can query them all, but how do we determine the semantics of those data structures and the data they contain?

    Taking RDF data for example, how do we determine the semantics of owl:sameAs? How do you look beyond the primitive owl:sameAs using RDF?

    Once again, I deeply appreciate your comments and will try to reach the deferred replies above as soon as possible.

    Glad you find my blog useful!

    Comment by Patrick Durusau — February 13, 2013 @ 10:12 am

  5. I am picking up the deferral on:

    ****
    You say:

    “What should we have been doing instead of the semantic web? ISO Topic Maps? There is some great work in there, but has it been a better success?”

    In terms of adoption, no, Topic Maps have not (yet?) become a great success.

    There is an interesting issue here about who is served by current Semantic Web and Topic Map approaches. I must defer here but hope to post on that issue today.
    ****

    In Saving the Semantic Web (Part 4), Democracy vs. Aristocracy, http://tm.durusau.net/?p=37279

    Comment by Patrick Durusau — February 13, 2013 @ 4:16 pm

  6. […] Saving the “Semantic” Web (part 2) [NOTLogic] […]

    Pingback by Simple Web Semantics – Index Post « Another Word For It — February 26, 2013 @ 12:59 pm

  7. […] of a recent comment on this series reads: What should we have been doing instead of the semantic web? ISO Topic Maps? […]

    Pingback by Saving the “Semantic” Web (part 4) « Another Word For It — June 16, 2013 @ 1:05 pm
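
The hypothetical owl:sameAs example from the first comment, and the question raised later in the thread about what owl:sameAs really asserts, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the triples are the commenter's made-up ones, the `merge_same_as` helper is a hypothetical name introduced here, and the merge is deliberately naive (it canonicalizes subjects only and does no OWL reasoning).

```python
from collections import defaultdict

# Hypothetical identifiers copied from the first comment's example;
# none of these URIs is claimed to resolve to real data.
FOX = "http://www.foxnews.com/Barack_Obama"
MSNBC = "http://www.msnbc.com/Barack_Obama"
SAME_AS = "owl:sameAs"

# Each statement is a (subject, predicate, object) triple.
triples = [
    (FOX, "rdf:type", "fox:SpawnOfTheDevil"),
    (MSNBC, "http://www.msnbc.com/last_hope_of",
     "http://en.wikipedia.org/wiki/Homo_sapiens"),
    (FOX, SAME_AS, MSNBC),
]


def merge_same_as(statements):
    """Naively merge subjects linked by owl:sameAs (illustrative helper)."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # First pass: treat every owl:sameAs statement as "these are one thing".
    for subject, predicate, obj in statements:
        if predicate == SAME_AS:
            parent[find(obj)] = find(subject)

    # Second pass: file every other statement under the merged subject.
    merged = defaultdict(set)
    for subject, predicate, obj in statements:
        if predicate != SAME_AS:
            merged[find(subject)].add((predicate, obj))
    return dict(merged)


merged = merge_same_as(triples)
for subject, facts in merged.items():
    print(subject)
    for predicate, obj in sorted(facts):
        print("   ", predicate, obj)
```

After the merge a single subject carries both statements, which is the “last hope of humanity is a spawn of the devil” result the comment describes. Nothing in the data says whether the two sources meant the same thing; the merge simply assumes it.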
