Twitter Flies by Hadoop on Search Quest by Ian Armas Foster.
From the post:
People who use Twitter may not give a second thought to the search bar at the top of the page. It’s pretty basic, right? You type something into the nifty little box and, like the marvel of efficient search that it is, it offers suggestions for things the user might wish to search during the typing process.
On the surface, it operates like any decent search engine. Except, of course, this is Twitter we’re talking about. There is no basic functionality at the core here. As it turns out, a significant amount of effort went into designing the Twitter search suggestion engine and the network is still just getting started refining this engine.
A recent Twitter-published scientific paper tells the tale of Twitter’s journey through their previously existing Hadoop infrastructure to a custom combined infrastructure. This connects the HDFS to a frontend cache (to deal with queries and responses) and a backend (which houses algorithms that rank relevance).
In short, the latency of the Hadoop solution was too high for suggestions that have to appear while the user is still typing.
Makes me think about topic map authoring with a real-time “merging” interface: one that displays the results of merging the topic, association or occurrence currently being authored into the map.
Or at least the option to see such a display with a reasonable response time.
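To make the idea a little more concrete, here is a minimal sketch of what a merge preview might look like. It assumes a heavily simplified model: topics only, merged when they share at least one subject identifier, with an in-memory index so the lookup is fast enough to run on every edit. The names `Topic`, `TopicMap`, `add` and `preview_merge` are hypothetical and not taken from any topic map engine, and associations, scope and the other merging rules are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A simplified topic: identifiers, names and occurrences only (an assumption)."""
    subject_identifiers: set = field(default_factory=set)
    names: set = field(default_factory=set)
    occurrences: set = field(default_factory=set)


class TopicMap:
    """Hypothetical in-memory map indexed by subject identifier."""

    def __init__(self):
        # Each subject identifier points at the topic that currently carries it.
        self._by_identifier = {}

    def preview_merge(self, draft):
        """Return what the draft would look like after merging, without changing the map."""
        merged = Topic(
            set(draft.subject_identifiers),
            set(draft.names),
            set(draft.occurrences),
        )
        pending = list(draft.subject_identifiers)
        seen = set()  # ids of stored topics already folded into the preview
        while pending:
            existing = self._by_identifier.get(pending.pop())
            if existing is None or id(existing) in seen:
                continue
            seen.add(id(existing))
            # Identifiers picked up from a merged topic may trigger further merges.
            pending.extend(existing.subject_identifiers - merged.subject_identifiers)
            merged.subject_identifiers |= existing.subject_identifiers
            merged.names |= existing.names
            merged.occurrences |= existing.occurrences
        return merged

    def add(self, topic):
        """Commit a topic: store the merged result under every identifier it carries."""
        merged = self.preview_merge(topic)
        for identifier in merged.subject_identifiers:
            self._by_identifier[identifier] = merged
        return merged


topic_map = TopicMap()
topic_map.add(Topic({"http://example.org/hadoop"}, {"Hadoop"}, {"batch processing on HDFS"}))

# The author is still typing this topic; nothing is committed yet.
draft = Topic({"http://example.org/hadoop"}, {"Apache Hadoop"})
preview = topic_map.preview_merge(draft)
print(sorted(preview.names))  # ['Apache Hadoop', 'Hadoop']
```

An authoring interface could call something like `preview_merge` on each edit and show the result beside the draft, calling `add` only when the author commits. Whether that stays responsive on a large map would depend on the real merging rules and on where the index lives, which this sketch does not address.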