Archive for the ‘Stanbol’ Category

Integration of Information Workbench with Stanbol: Public Demo Available

Thursday, December 20th, 2012

From the post:

We are happy to announce a public demo showcasing the integration of Apache Stanbol into the Information Workbench platform by fluid Operations AG. This integration is the result of fluid Operations’ participation in the IKS Early Adopters program.

With the help of Apache Stanbol enhancement engines, the Information Workbench is able to enrich free-text content with references to semantic data instances. This enables advanced data management capabilities to Information Workbench users in the area of semantic content management and publishing: both making it easier to organize internal data by linking structured and free-text content as well as assisting in content authoring and publishing.

Our public demo illustrates these capabilities by presenting a competitive intelligence scenario. In this demo, a collection of documents is annotated with relevant DBpedia entities representing companies, people, and locations mentioned in these documents. These annotations are used to browse the document collection and visualize it using different widgets: e.g., presenting mentioned locations using Google Maps and number of entity mentions with charts. In addition, Stanbol content enhancement is used to enrich information imported from external Web sources: particularly, abstracts of relevant news articles accessed via the New York Times Article Search API.

A promising demonstration of Apache Stanbol.

I was less impressed with the content.

Take one of the top ten companies being tracked, Ferrari.

In the Information Workbench, the Ferrari entry displays a Google map to your right, marking a location in Italy. I suspect I know the meaning of that location on the map but some reassurance on that score would be nice.

The “relevant news” includes “Italy’s Premier Refuses To Commit to Running,” rather puzzling for Ferrari until you read more of the story to find: “Luca Cordero di Montezemolo, the president of Ferrari who started a civic movement last month and said it would endorse Mr. Monti.”

On the other hand, DBpedia may be so coarse that searches based upon it are on par with the average search engine.

I applaud the early use of Stanbol but stronger data sources are going to be required for interesting results.

Apache Stanbol graduates to Top-Level Project

Tuesday, October 2nd, 2012

From the post:

The Apache Software Foundation (ASF) has announced that Apache Stanbol has graduated from project incubation. Stanbol is an open source Java stack designed to interface with a content management system (CMS) to enhance it with semantic information. With the elevation to a Top-Level Project, the ASF recognises that the project’s community has been “well-governed” according to the foundation’s principles and follows “The Apache Way” for running a project.

Stanbol is a modular collection of reasoning engines, content enhancers and components to manage rules and metadata for content fed into the framework, all wrapped with a RESTful API and orchestrated within an Apache Felix OSGi container. A CMS adapter allows the system to connect to content management systems from which it can extract data to use in evaluating and developing rules and annotations.

The RESTful API can then be used to provide semantic information for content from a different source based upon information the server has previously analysed. Stanbol is more of a collection of reusable components than a complete solution for semantic searching, however. It is designed to work alongside CMS systems and existing search software.

I suppose too much *nix experience has made me suspicious of “complete solutions” for anything. Components, particularly interchangeable ones, seem a lot more robust.
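
To give a feel for how small the "component" surface is, here is a minimal sketch of posting plain text to the RESTful enhancer the post describes. It assumes a Stanbol launcher running locally at http://localhost:8080 with the enhancer mounted at /enhancer and Turtle as the requested response format; adjust both for your installation. The function names are mine, for illustration only, and are not part of Stanbol.

```python
import urllib.request

# Assumption: a local Stanbol launcher on port 8080 with the enhancer at /enhancer.
STANBOL_ENHANCER_URL = "http://localhost:8080/enhancer"


def build_enhancer_request(text, url=STANBOL_ENHANCER_URL, accept="text/turtle"):
    """Build (but do not send) a POST request carrying plain text to the enhancer."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": accept,  # assumed RDF serialization for the returned annotations
        },
        method="POST",
    )


def enhance(text, url=STANBOL_ENHANCER_URL, timeout=10):
    """Send the text to the enhancer and return the raw response body as a string."""
    request = build_enhancer_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Requires a running Stanbol instance; prints whatever annotations come back.
    print(enhance("Luca Cordero di Montezemolo, the president of Ferrari"))
```

Building the request separately from sending it keeps the HTTP details inspectable without a server, which is handy when swapping in a different enhancer endpoint or response format.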