“Are simplified hadoop interfaces the next web cash cow?” is a question that Brian Breslin is asking these days.
It isn’t hard to imagine that not only will Hadoop interfaces become cash cows, but so will canned analyses of public data sets that can be incorporated into those interfaces.
But then the semantics question comes back up when you want to join that canned analysis to your own. What did they mean by X? Or Y? Or for that matter, what are the semantics of the data set?
But we can solve that issue with explicit subject identification! Did I hear someone say topic maps? 😉 So our identifications of subjects in public data sets will themselves become a commodity. There could even be competing set-similarity analyses of public data sets.
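A minimal sketch of what explicit subject identification buys you when joining public data sets. Two data sets use different local identifiers for the same subject; a shared mapping table (playing the role of a topic map’s subject identifiers) lets the records be merged. All names, URIs, and values here are hypothetical, chosen purely for illustration:

```python
# Data set A keys companies by ticker symbol (hypothetical values).
dataset_a = {"BP": {"spill_volume_bbl": 4_900_000}}

# Data set B keys the same company by its full legal name.
dataset_b = {"BP p.l.c.": {"headquarters": "London"}}

# Explicit subject identification: map each local identifier to a shared
# subject identifier (a URI-like string, as topic maps use).
subject_ids = {
    "BP": "http://example.org/subject/bp",
    "BP p.l.c.": "http://example.org/subject/bp",
}

def merge_by_subject(*datasets):
    """Merge records from several data sets that refer to the same subject."""
    merged = {}
    for data in datasets:
        for local_id, record in data.items():
            subject = subject_ids[local_id]
            merged.setdefault(subject, {}).update(record)
    return merged

merged = merge_by_subject(dataset_a, dataset_b)
# Both records now sit under one subject identifier, so the join "just works"
# even though the two sources never agreed on a key.
```

The point is that the `subject_ids` table is itself a reusable, sellable artifact: once someone has done the identification work, anyone can join against it.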
If a simplified Hadoop interface is the next cash cow, we need to be ready to stuff it with data mapped to subject identifications to make it grow even larger. A large cash cow is a good thing, a larger cash cow is better, and a BP-sized cash cow is just about right.
Patrick,
Interesting viewpoint. I honestly don’t know much about topic maps, but am glad that I got you thinking about using Hadoop for this purpose.
Comment by Brian Breslin — July 14, 2010 @ 12:16 pm
Brian,
Glad you like it! Topic maps bridge the gap between speakers when they use different identifications for the same subject. Happens in all walks of life, including IT.
Mapping data without explicit subject identification can be made to work. Many semantic integration solutions take that route.
The problem is: How do you re-use that mapping six months from now when you aren’t really sure what some of the terms mean? Or how do you share that mapping with someone else (as in selling it to them)?
Comment by Patrick Durusau — July 14, 2010 @ 3:36 pm