From the post:
Baloo is the next generation of the Nepomuk project. It’s responsible for handling user metadata such as tags, ratings and comments. It also handles indexing and searching for files, emails, contacts, and so on. Baloo aims to be lighter on resources and more reliable than its parent project.
The Nepomuk project started as a research project in the European Union. The goal was to explore the use of relations between data for finding what you are looking for. It was built completely on top of RDF. While RDF is great from a theoretical point of view, it is not the simplest tool to understand or optimize. The databases which currently exist for RDF are not suited for desktop use.
The Nepomuk developers have tried very hard over the last years to optimize the indexing and searching infrastructure, and they have now come to the conclusion that Nepomuk cannot be further optimized without migrating away from RDF.
RDF also heavily relied on ontologies. These ontologies are a way to describe how the data should be stored and represented. They used the ontologies from the original EU research project – Shared Desktop Ontologies. These ontologies were designed at a time when it was not very clear how they would work, and they suffer from sub-optimal performance and ease of use. They are quite vague in certain areas and often duplicate information. This leads to scenarios where it takes forever to figure out how the data should be stored. Additionally, since all the data needs to be stored in RDF, one cannot optimize for one specific data type.
Given these shortcomings and the many lessons learned over the last years, the Nepomuk developers decided to drop RDF and rechristen the project under the name of Baloo. You can find more technical background and info on its architecture here.
I suggested to someone in synchronous time that authoring support for schema.org-based metadata could be a win-win for users and document processing software.
For users, search appliances, local or even Google, can ingest “lite” schema definitions that provide immediate ROI on adding semantics to your documents. Well, I say immediate, as soon as they are indexed.
That should require no more skill than being able to type, assuming your document software can recognize the terms you use and annotate them properly.
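As a rough sketch of what such authoring support might emit, here is one way a document could be described with schema.org terms, serialized as JSON-LD. The function names `schema_org_metadata` and `to_json_ld_script` are hypothetical, and the choice of the `Article` type with its `headline`, `author` and `keywords` properties is only an assumption about what a "lite" schema might cover, not anything Baloo or a particular word processor does.

```python
import json


def schema_org_metadata(headline, author_name, keywords):
    """Build a minimal schema.org Article description as a JSON-LD dict.

    Hypothetical helper: the selection of properties is illustrative only.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        # schema.org allows keywords as a comma-separated text value.
        "keywords": ", ".join(keywords),
    }


def to_json_ld_script(metadata):
    """Wrap the metadata in a script element that an indexer could pick up."""
    body = json.dumps(metadata, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    meta = schema_org_metadata(
        "Baloo replaces Nepomuk", "Jane Doe", ["KDE", "desktop search"]
    )
    print(to_json_ld_script(meta))
```

The author only supplies the headline, name and tags; the software supplies the vocabulary, which is the "no more skill than being able to type" part.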
If you want a successful strategy, do you follow the one that has resulted in a user base measured in increments of hundreds of millions, or do you prefer the righteous remnant approach with, say, fewer than 50,000?
I’m no marketing person but even I know the answer to that one. 😉
PS: There are some ankle biters who complain about the MS Office user numbers. Let’s just say that between MS Office, Apache OpenOffice and the other ODF-based word processors, DocBook users are outnumbered by at least 20,000 to 1. Who needs more accurate numbers than that?
PPS: Microformats don’t have the precision that RDF and/or Topic Maps have to offer. But precision without adoption can’t be very precise. With adoption of microformats, more precision can be added as required by particular use cases.
I first saw this in a tweet by Jan Schnasse.