Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

August 27, 2011

Factual’s Crosswalk API

Filed under: Crosswalk,Mapping — Patrick Durusau @ 9:09 pm

Factual’s Crosswalk API by Matthew Hurst.

From the post:

Factual, which is mining the web for knowledge using a variety of web mining methods, has released an API in the local space which aims to expose, for a specific local entity (e.g. a restaurant) the places on the web that it is mentioned. For example, you might find for a restaurant its homepage, its listing on Yelp, its listing on UrbanSpoon, etc.

This mapping between entities and mentions is potentially a powerful utility. Given all these mentions, if some of the data changes (e.g. via a user update on a Yelp page) then the central knowledge base information for that entity can be updated.

When I looked, the Crosswalk API was still limited to the US. Matthew uncovers the mapping-accuracy issues known all too well to topic mappers.

From the Factual site:

Factual Crosswalk does four things:

  1. Converts a Factual ID into 3rd party identifiers and URLs
  2. Converts a 3rd party URL into a Factual canonical record
  3. Converts a 3rd party namespace and ID into a Factual canonical record
  4. Provides a list of URLs where a given Factual entity is found on the Internet
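To make the four operations concrete, here is a minimal in-memory sketch. It does not call the Factual API: the row layout, the function names, and the sample IDs and URLs are all invented for illustration. It only assumes that a crosswalk boils down to rows tying a Factual ID to a third-party namespace, ID and URL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrosswalkRow:
    """One hypothetical crosswalk row: a Factual ID tied to a third-party listing."""
    factual_id: str
    namespace: str
    namespace_id: str
    url: str


# Invented sample data, not real Factual records.
ROWS = [
    CrosswalkRow("f-001", "yelp", "y-42", "http://yelp.example/biz/y-42"),
    CrosswalkRow("f-001", "urbanspoon", "u-7", "http://urbanspoon.example/r/u-7"),
    CrosswalkRow("f-002", "yelp", "y-99", "http://yelp.example/biz/y-99"),
]


def third_party_ids(factual_id):
    """1. Factual ID -> third-party (namespace, id, url) triples."""
    return [(r.namespace, r.namespace_id, r.url)
            for r in ROWS if r.factual_id == factual_id]


def factual_id_for_url(url):
    """2. Third-party URL -> Factual ID, or None if the URL is unknown."""
    return next((r.factual_id for r in ROWS if r.url == url), None)


def factual_id_for_namespace_id(namespace, namespace_id):
    """3. Third-party namespace and ID -> Factual ID, or None if unknown."""
    return next((r.factual_id for r in ROWS
                 if (r.namespace, r.namespace_id) == (namespace, namespace_id)),
                None)


def urls_for(factual_id):
    """4. All URLs where the given Factual entity is recorded as appearing."""
    return [r.url for r in ROWS if r.factual_id == factual_id]
```

Every function is a lookup over the same table of identifier pairs. Nothing in a row records why the two identifiers were judged to name the same place.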

Don’t know about you but I am unimpressed.

In part because of the flatland mapping approach to identification. If all I know is Identifier1 was mapped to Identifier2, that is better than a poke with a sharp stick for identification purposes, but only barely. How do I discover what entity you thought was represented by Identifier1 or Identifier2?

I suppose piling up identifiers is one approach but we can do better than that.
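One way to picture the difference, as a rough sketch only: compare a bare identifier pair with a mapping that also states who asserted it and which properties the mapper relied on. The `Mapping` class, the `accept` rule and the sample values are hypothetical. This is neither Factual's data model nor a topic map implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Mapping:
    """A hypothetical mapping between two identifiers."""
    source: str
    target: str
    asserted_by: str
    # Properties the mapper says both identifiers share (the basis for the mapping).
    basis: dict = field(default_factory=dict)


def accept(mapping, required_keys):
    """A reader's rule: accept only if every required property is disclosed."""
    return all(key in mapping.basis for key in required_keys)


# "Flatland" mapping: Identifier1 -> Identifier2 and nothing else.
flat = Mapping("yelp:y-42", "urbanspoon:u-7", asserted_by="unknown")

# The same mapping with its basis made explicit (invented values).
explicit = Mapping(
    "yelp:y-42",
    "urbanspoon:u-7",
    asserted_by="example-mapper",
    basis={"name": "Example Diner", "postcode": "00000"},
)
```

With the basis stated, a reader who wants at least a name and a postcode can check for them: `accept(explicit, ["name", "postcode"])` succeeds, while `accept(flat, ["name", "postcode"])` fails because the flat mapping discloses nothing to judge.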


PS: I am adding Crosswalk as a category so I can cover traditional crosswalks as developed by librarians. I am interested in what implicit parts of crosswalks should become explicit in a topic map. Pointers and suggestions welcome. Or conversions of crosswalks into topic maps.

2 Comments

  1. There’s a separate Factual “Core” API for resolving entity IDs from whatever data you have about them, the results of which you can then use with Crosswalk. You can search and filter fairly flexibly, though topic mapping is admittedly lightweight. There is a three-level categorical taxonomy, and there are the basic facts about each place. The full-text search feature of the API incorporates some synonym and stemming information gleaned from stitching together the data that drives the Crosswalk data.

    I work at Factual (obviously). Definitely interested in how to make this kind of data easier to find.

    Comment by Bradley — August 29, 2011 @ 12:26 pm

  2. @Bradley,

    Appreciate your comments! I have a post, with an attempt at graphics, in draft form but let me venture a couple of comments.

    True enough that you and I can resolve identifiers based on additional information, but the question is how do we communicate the basis for that resolution to others? So they can accept or possibly reject our resolution?

    Moreover, I have come to think that identifiers are parts of composites that identify some subject. The easiest one is that all identifiers are used by someone. In a particular context.

    One of the difficulties we face is that computers don’t, for the most part, treat identifiers as parts of composites. Which makes reliable mapping all the more difficult.

    Hope you are having a great day!

    Comment by Patrick Durusau — August 29, 2011 @ 6:51 pm
