SpiderDuck: Twitter’s Real-time URL Fetcher
This is a bit of a walk on the engineering side, but in order to be relevant, topic maps do have to be written and topic map software implemented.
This is a very interesting write-up of how Twitter relied mostly on open source tools to create a system that could be very relevant to topic map implementations.
For example, the fetch/no-fetch decision for URLs is based on a comparison to URLs fetched within X days. Hmmm, comparison of URLs, oh, those things that occur in subjectIdentifier and subjectLocator properties of topics. Do you smell relevance?
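To make the comparison concrete, here is a minimal sketch of a fetch/no-fetch check. It is not Twitter's code. The names `normalize`, `FetchLog`, `should_fetch` and `record_fetch` are hypothetical, the in-memory dictionary stands in for whatever store a real system would use, and the seven-day window is only a placeholder for the unspecified "X days".

```python
from datetime import datetime, timedelta
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Hypothetical canonical form: lowercase scheme and host, drop the fragment."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


class FetchLog:
    """Remembers when each normalized URL was last fetched (in memory only)."""

    def __init__(self, max_age: timedelta):
        self.max_age = max_age      # the "X days" window; the value is an assumption
        self._last_fetched = {}     # normalized URL -> datetime of last fetch

    def should_fetch(self, url: str, now: datetime) -> bool:
        # Fetch if the URL was never seen, or if the last fetch is older than the window.
        last = self._last_fetched.get(normalize(url))
        return last is None or now - last > self.max_age

    def record_fetch(self, url: str, now: datetime) -> None:
        self._last_fetched[normalize(url)] = now


# Usage: the same check could be run over subjectIdentifier or subjectLocator values.
log = FetchLog(max_age=timedelta(days=7))
now = datetime(2011, 11, 15)
if log.should_fetch("http://example.com/page", now):
    log.record_fetch("http://example.com/page", now)
```

The part most likely to matter for topic maps is `normalize`: two subjectIdentifiers only compare equal if both sides agree on what counts as the same URL, and the rule used here (case-folding the scheme and host, ignoring fragments) is just one possible choice.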
And there is harvesting of information from web pages. One assumes that could be done on “information items” from a topic map as well, except there it would be properties, etc. Even more relevance.
What parts of SpiderDuck do you find most relevant to a topic map implementation?