Archive for the ‘Uncategorized’ Category

The Linking Open Data cloud diagram

Thursday, September 23rd, 2010

The Linking Open Data cloud diagram is maintained by Richard Cyganiak and Anja Jentzsch.

I suppose having DBpedia at the center of linked data is better than the CIA Factbook. ;-)

I find large visualizations like this one useful as marketing tools or “that’s cool” examples, but not terribly useful for actual analysis.

Has your experience been different?

Chem2Bio2RDF: a semantic framework for linking and data mining chemogenomic and systems chemical biology data

Sunday, September 19th, 2010

Chem2Bio2RDF: a semantic framework for linking and data mining chemogenomic and systems chemical biology data

Destined to be a deeply influential resource.

Read the paper, use the Chem2Bio2RDF application for a week, then answer these questions:

  1. Choose three (3) subjects that are identified in this framework.
  2. For each subject, how is it identified in this framework?
  3. For each subject, have you seen it in another framework or system?
  4. For each subject seen in another framework/system, how was it identified there?

Extra credit: What one thing would you change about any of the identifications in this system? Why?

Experience in Extending Query Engine for Continuous Analytics

Sunday, September 5th, 2010

Experience in Extending Query Engine for Continuous Analytics by Qiming Chen and Meichun Hsu has this problem statement:

Streaming analytics is a data-intensive computation chain from event streams to analysis results. In response to the rapidly growing data volume and the increasing need for lower latency, Data Stream Management Systems (DSMSs) provide a paradigm shift from the load-first analyze-later mode of data warehousing….

Moving away from load-first analyze-later has implications for topic maps over data warehouses, particularly when events that are subjects may have only a transient existence in a data stream.
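To make the contrast concrete, here is a minimal Python sketch. It is not taken from the Chen and Hsu paper; the function names and the sample readings are hypothetical. The warehouse-style function keeps every event before analyzing, while the stream-style function updates its result per event and retains nothing, which is why a subject that appears only as an event may be gone by the time anyone asks about it.

```python
def load_first(events):
    """Warehouse style: store everything, then analyze."""
    stored = list(events)  # every event persists and can be revisited later
    return sum(stored) / len(stored)


def streaming(events):
    """Stream style: update the result per event; events are not retained."""
    count, total = 0, 0.0
    for value in events:
        count += 1
        total += value
        yield total / count  # a result is available after each event


# Hypothetical sample data, for illustration only.
readings = [10, 20, 30]
batch_result = load_first(readings)
running = list(streaming(iter(readings)))
```

After the streaming pass, only the running results remain; the individual events would have to be captured at the moment they flow past.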

This is on my reading list to prepare to discuss TMQL in Leipzig.

PS: Only five days left to register for TMRA 2010. It is a don’t-miss event.

Post Early, Post Often

Thursday, August 19th, 2010

Apologies for the lack of a post for August 18, 2010.

I was working on a post late yesterday evening when my ISP lost connectivity to the Net. :-(

I could not stay up late enough to see if it would be repaired before the end of the day.

Hence, no post for August 18, 2010.

Have a lot of stuff in the queue so will try to get an early post out most days.

PGS – Pretty Good Semantics

Thursday, August 5th, 2010

PGS – Pretty Good Semantics is the result of months of conversation with Sam Hunting.

Our starting premise: Users want to say things of interest to them, as simply as possible, for them.

Note the focus on users. Not on description logic. Not on formal ontologies. Not on reasoning, artificial or otherwise. Not even on complex mappings between identifications. But on users.

All of those other things are worthwhile enterprises, some of them anyway, which you can pursue at your own leisure.

The question is how to empower users to say things about what interests them. And, if possible, how to do so without re-writing the WWW to deal with 303 clouds, etc.?

Our answer to those questions: PGS – Pretty Good Semantics. It asks very little of users yet can annotate any identifier on the WWW to say whatever a user likes.

It uses existing HTML techniques and works with existing web servers and search engines.

Enjoy!

Lesson for Topic Maps?

Friday, July 16th, 2010

In an exchange over a MapReduce resource, Robert Barta observed how large that ecosystem has grown in just a year, and suggested there is a lesson for the TM community in that growth. But what lesson is that? (He didn’t say, but I have written to ask.)

“MapReduce” isn’t a cooler name than “Topic Maps” so that’s not the lesson.

MapReduce isn’t less complex than topic maps so that’s not the lesson either.

Two issues that MapReduce does not face:

  1. Users resisted (and still do resist) markup because it requires making explicit choices about the structure of a text. We learn text structures from users, but for the most part, they are reluctant to name those parts. Is there an analogy to making subjects explicit for a topic map?
  2. If we identify our subjects (our insider vocabulary), then what makes us special will be known by others.

MapReduce doesn’t face the first issue because users can create whatever mapping they wish, without ever saying explicitly what subjects are involved. It also preserves the special nature of insider vocabularies since it has no explicit mechanism for identifying subjects.

Are those the lessons? If they are, are there workarounds? Are there other lessons?

Pragmatic Topic Map Streaming – From Semantic Headache

Tuesday, July 6th, 2010

Pragmatic Topic Map Streaming by Jan Schreiber raises some interesting questions about how to construct a data stream for a topic map.

I particularly like the idea of creating mini-topic maps as it were. See his post for the details.

What he did not touch on was how topic map stream software would recognize subjects. A topic map stream creator with configurable subject recognition would be really useful. Most of us could use the “topic maps subjects” recognition filter while others, interested in dull subjects like the World Cup (just teasing), could have a subject filter for it. Some of us could have both, feeding different topic maps.
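A minimal Python sketch of what configurable subject recognition might look like. Nothing here comes from Jan Schreiber’s post or from any topic map library: the `route` function, the filter names, and the keyword matching are all hypothetical, chosen only to show one stream feeding several topic maps.

```python
from collections import defaultdict

# Hypothetical filters: each name maps to a predicate over a text item.
# Real subject recognition would be far richer than keyword matching.
FILTERS = {
    "topic-maps": lambda text: "topic map" in text.lower(),
    "world-cup": lambda text: "world cup" in text.lower(),
}


def route(stream, filters):
    """Send each stream item to every topic map whose filter recognizes it."""
    maps = defaultdict(list)
    for item in stream:
        for name, recognizes in filters.items():
            if recognizes(item):
                maps[name].append(item)
    return dict(maps)


# Hypothetical stream items, for illustration only.
stream = [
    "New topic map engine released",
    "World Cup final set for Sunday",
    "A topic map of World Cup squads",
    "Weather report",
]
routed = route(stream, FILTERS)
```

An item recognized by both filters lands in both topic maps, and an item recognized by neither is dropped. Swapping in a different filter set changes what the stream creator sees without touching the routing code.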

iPhone Opportunity for Topic Maps

Sunday, July 4th, 2010

The You Say God Is Dead? There’s an App for That story in the New York Times, July 2, 2010, looks like an opportunity for topic maps.

For publishers, it would be possible to map responses on the basis of topics and let the topic map handle the details of where each is the appropriate response to an “opposing” app. It should shorten the update/production cycle as new material is added to counter new arguments or variations of old ones.

On the product side, publishers could use topic maps to enable users to respond to a variety of ways of naming or phrasing particular issues. In debates over religion, as in all other areas, differences in terminology can make it difficult to come to grips with the opposing side.

Depending on how it was implemented, a topic map app could integrate other resources, ranging from study materials to personal contacts, as they relate to this application. Think of a topic map as being able to bridge between data held in mini-silos on an iPhone. Users could then add information to the app that was useful to them in such debates.

Any other critical points I should make as I contact publishers of these apps to recommend topic maps?

*****
PS: Did anyone with an iPhone try out tmjs from Jan Schreiber? I really don’t want to have to buy an iPhone just for that. Help me out here.

Constructions from Dots and Lines

Monday, June 14th, 2010

Constructions from Dots and Lines by Marko A. Rodriguez and Peter Neubauer is an engaging introduction to graphs and why they are important.

Abstract:

A graph is a data structure composed of dots (i.e. vertices) and lines (i.e. edges). The dots and lines of a graph can be organized into intricate arrangements. The ability for a graph to denote objects and their relationships to one another allow for a surprisingly large number of things to be modeled as a graph. From the dependencies that link software packages to the wood beams that provide the framing to a house, most anything has a corresponding graph representation. However, just because it is possible to represent something as a graph does not necessarily mean that its graph representation will be useful. If a modeler can leverage the plethora of tools and algorithms that store and process graphs, then such a mapping is worthwhile. This article explores the world of graphs in computing and exposes situations in which graphical models are beneficial.
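The abstract’s first example, dependencies that link software packages, makes a quick illustration of a graph mapping that pays off because an existing algorithm can process it. The Python sketch below is mine, not the authors’: the package names are hypothetical, and it uses the standard library’s `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical packages: each key (a dot) depends on the packages in its set.
# Each dependency is a line from the key to a member of the set.
dependencies = {
    "app": {"web", "db"},
    "web": {"http"},
    "db": {"http"},
    "http": set(),
}

# A valid install order places every package after its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
```

Here `http` comes first and `app` last; the relative order of `web` and `db` is left to the sorter, since neither depends on the other.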

JErlang: Erlang with Joins

Friday, June 11th, 2010

JErlang: Erlang with Joins by Hubert Plociniczak should interest anyone implementing distributed topic map systems.

The value of having a distributed architecture (did I hear “Internet?”) has been lost on the Semantic Web advocates. With topic maps you can have multiple locations that “resolve” identifiers to other identifiers and pass on information about something that has been identified.
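As a rough Python sketch of that idea, assume each location is nothing more than a table from an identifier to other identifiers it knows for the same subject. The `resolve` function, the locations, and every identifier below are hypothetical, and a real system would query remote servers rather than local dictionaries.

```python
def resolve(identifier, locations):
    """Collect every identifier reachable from `identifier` across all locations."""
    known, pending = {identifier}, [identifier]
    while pending:
        current = pending.pop()
        for mapping in locations:
            # Each location only knows the identifiers it has been told about.
            for other in mapping.get(current, ()):
                if other not in known:
                    known.add(other)
                    pending.append(other)
    return known


# Two hypothetical locations, each holding part of the picture.
location_a = {"urn:x:jerlang": ["http://example.org/a/jerlang"]}
location_b = {"http://example.org/a/jerlang": ["http://example.net/b/erlang-joins"]}

found = resolve("urn:x:jerlang", [location_a, location_b])
```

Neither location alone connects the first identifier to the last; the connection appears only when the lookups are chained. Note that this sketch follows mappings in one direction only.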

Most existing topic maps look like data silos but that is more a matter of habit than architectural limitation.

I should put in a plug for the Springer Alert Service, which brought the article with the same title, JErlang: Erlang with Joins, to my attention. Highly recommended as a way to stay current on the latest CS research. Remember, articles don’t have to say “topic map” in the title or abstract to be relevant.

PS: Topic map observations: The final report and article have the same name. In topic maps the different locations for the items would be treated as subject locators, thus allowing them to retain the same name but being distinguished one from the other. Note that the roles differ with the two subjects as well. Susan Eisenbach is the supervisor of the final report and is a co-author of the article reported by Springer.
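A small Python sketch of that observation. The `Topic` class, the `same_subject` helper, and both URLs are hypothetical stand-ins rather than any topic map engine’s API; the point is only that identity rests on the subject locators, not on the shared name.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    subject_locators: frozenset
    roles: dict = field(default_factory=dict)  # person -> role played


def same_subject(a, b):
    """Treat two topics as the same subject only if they share a subject locator."""
    return bool(a.subject_locators & b.subject_locators)


# Placeholder locators; the real locations of the two items are not given here.
report = Topic(
    "JErlang: Erlang with Joins",
    frozenset({"http://example.org/final-report"}),
    {"Susan Eisenbach": "supervisor"},
)
article = Topic(
    "JErlang: Erlang with Joins",
    frozenset({"http://example.org/springer-article"}),
    {"Susan Eisenbach": "co-author"},
)
```

The two topics carry identical names yet stay distinct because their locators differ, and the same person plays a different role in each.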

The Fourth Paradigm: Data-intensive Scientific Discovery

Wednesday, June 9th, 2010

Jack Park points to The Fourth Paradigm: Data-Intensive Scientific Discovery as a book that merits our attention.

Indeed it does! Lurking just beneath the surface of data-intensive research are questions of semantics. Diverse semantics. How does data-intensive research occur in a multi-semantic world?

Paul Ginsparg (Cornell University), in Text in a Data-centric World, has the usual genuflection towards “linked data” without stopping to consider the cost of evaluating every URI to decide if it is an identifier or a resource. Nor does he consider why adding one more name to the welter of names we have now (that is the semantic diversity problem) is going to make our lives any better.

Ginsparg writes:

Such an articulated semantic structure [linked data] facilitates simpler algorithms acting on World Wide Web text and data and is more feasible in the near term than building a layer of complex artificial intelligence to interpret free-form human ideas using some probabilistic approach.

Solving the “perfect language” problem, which has never been solved, is more feasible than “…building a layer of complex artificial intelligence to interpret free-form human ideas using some probabilistic approach” to solve it for us?

Perhaps so, but one wonders why that is a useful observation.

On the “perfect language” problem, see The Search for the Perfect Language by Umberto Eco.

The Future of the Journal

Saturday, June 5th, 2010

The Future of the Journal is another slide deck by Anita de Waard that reads like a promotional piece for topic maps, sans any mention of topic maps.

While Anita makes a strong case for annotation of data in science publishing, the same is true for government, legal, environmental, business, finance, etc., publications. All publications are as complex as depicted on these slides. It isn’t as obvious in the humanities because that “data” has been locked away so long that we have forgotten it is there.

The more complex the information we record, via “annotations” or some other mechanism, the greater the need for librarians to organize it and help us find it. Self-help in research is like the guy about to do a self-appendectomy with his doctor’s advice over the phone. Doable, maybe, but the results are pretty much what you would expect.

Rather than the future of the journal, I would say: Future of Information.