Archive for the ‘Uncategorized’ Category

Constructions from Dots and Lines

Monday, June 14th, 2010

Constructions from Dots and Lines by Marko A. Rodriguez and Peter Neubauer is an engaging introduction to graphs and why they are important.

Abstract:

A graph is a data structure composed of dots (i.e. vertices) and lines (i.e. edges). The dots and lines of a graph can be organized into intricate arrangements. The ability for a graph to denote objects and their relationships to one another allow for a surprisingly large number of things to be modeled as a graph. From the dependencies that link software packages to the wood beams that provide the framing to a house, most anything has a corresponding graph representation. However, just because it is possible to represent something as a graph does not necessarily mean that its graph representation will be useful. If a modeler can leverage the plethora of tools and algorithms that store and process graphs, then such a mapping is worthwhile. This article explores the world of graphs in computing and exposes situations in which graphical models are beneficial.
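The abstract's software-package example can be made concrete with a minimal sketch. The package names below are hypothetical and the function is only one simple way to process such a graph (a breadth-first topological sort), not anything taken from the article itself.

```python
from collections import deque

# Hypothetical lines (edges): each pair is (package, dependency).
edges = [("app", "web"), ("app", "db"), ("web", "http"), ("db", "http")]

def install_order(edges):
    """Return the dots (vertices) so that each package follows its dependencies."""
    vertices = {v for edge in edges for v in edge}
    dependents = {v: [] for v in vertices}   # dependency -> packages that need it
    pending = {v: 0 for v in vertices}       # package -> unmet dependency count
    for package, dependency in edges:
        dependents[dependency].append(package)
        pending[package] += 1

    # Start from packages with no dependencies; sorting keeps the result deterministic.
    ready = deque(sorted(v for v in vertices if pending[v] == 0))
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for package in sorted(dependents[current]):
            pending[package] -= 1
            if pending[package] == 0:
                ready.append(package)

    if len(order) != len(vertices):
        raise ValueError("dependency cycle detected")
    return order

print(install_order(edges))
```

Once the dependencies are written down as dots and lines, an ordering question ("what must be installed first?") becomes a standard graph algorithm, which is the kind of payoff the abstract has in mind.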

JErlang: Erlang with Joins

Friday, June 11th, 2010

JErlang: Erlang with Joins by Hubert Plociniczak should interest anyone implementing distributed topic map systems.

The value of having a distributed architecture (did I hear “Internet?”) has been lost on the Semantic Web advocates. With topic maps you can have multiple locations that “resolve” identifiers to other identifiers and pass on information about something that has been identified.

Most existing topic maps look like data silos but that is more a matter of habit than architectural limitation.

I should put in a plug for the Springer Alert Service, which brought the article with the same title, JErlang: Erlang with Joins, to my attention. Highly recommended as a way to stay current on the latest CS research. Remember, articles don’t have to say “topic map” in the title or abstract to be relevant.

PS: Topic map observations: The final report and the article have the same name. In topic maps the different locations of the two items would be treated as subject locators, which lets them keep the same name while remaining distinct from one another. Note that the roles differ between the two subjects as well: Susan Eisenbach is the supervisor of the final report and a co-author of the article reported by Springer.
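A rough sketch of that observation, assuming a deliberately simplified Topic class of my own (not any real topic map API) and placeholder example.org addresses standing in for the actual locations:

```python
from dataclasses import dataclass, field

@dataclass
class Topic:
    """Simplified, hypothetical stand-in for a topic in a topic map."""
    name: str
    subject_locator: str                       # where the item itself lives
    roles: dict = field(default_factory=dict)  # person -> role played

TITLE = "JErlang: Erlang with Joins"

# Placeholder locators; the real addresses are not given here.
report = Topic(TITLE, "http://example.org/jerlang/final-report",
               {"Susan Eisenbach": "supervisor"})
article = Topic(TITLE, "http://example.org/jerlang/springer-article",
                {"Susan Eisenbach": "co-author"})

def same_subject(a, b):
    """Two topics denote the same item only if their subject locators match."""
    return a.subject_locator == b.subject_locator

print(report.name == article.name)    # the names collide
print(same_subject(report, article))  # the locators keep the subjects apart
```

The name alone cannot tell the two items apart; the locator does, and the differing roles travel with each subject.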

The Fourth Paradigm: Data-intensive Scientific Discovery

Wednesday, June 9th, 2010

Jack Park points to The Fourth Paradigm: Data-Intensive Scientific Discovery as a book that merits our attention.

Indeed it does! Lurking just beneath the surface of data-intensive research are questions of semantics. Diverse semantics. How does data-intensive research occur in a multi-semantic world?

Paul Ginsparg (Cornell University), in Text in a Data-centric World, has the usual genuflection towards “linked data” without stopping to consider the cost of evaluating every URI to decide if it is an identifier or a resource. Nor does he explain why adding one more name to the welter of names we have now (that is the semantic diversity problem) is going to make our lives any better.

Ginsparg writes:

Such an articulated semantic structure [linked data] facilitates simpler algorithms acting on World Wide Web text and data and is more feasible in the near term than building a layer of complex artificial intelligence to interpret free-form human ideas using some probabilistic approach.

Solving the “perfect language” problem, which has never been solved, is more feasible than “…building a layer of complex artificial intelligence to interpret free-form human ideas using some probabilistic approach” to solve it for us?

Perhaps so, but one wonders why that is a useful observation.

On the “perfect language” problem, see The Search for the Perfect Language by Umberto Eco.

The Future of the Journal

Saturday, June 5th, 2010

The Future of the Journal is another slide deck by Anita de Waard that reads like a promotional piece for topic maps, sans any mention of topic maps.

While Anita makes a strong case for annotation of data in science publishing, the same is true for government, legal, environmental, business, finance, etc., publications. All publications are as complex as depicted on these slides. It isn’t as obvious in the humanities because that “data” has been locked away so long that we have forgotten it is there.

The more complex the information we record, via “annotations” or some other mechanism, the greater the need for librarians to organize it and help us find it. Self-help in research is like the guy about to do a self-appendectomy with his doctor’s advice over the phone. Doable, maybe, but the results are pretty much what you would expect.

Rather than the future of the journal, I would say: Future of Information.