Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

December 4, 2010

Zoie: Real-time search indexing

Filed under: Full-Text Search,Indexing,Lucene,Search Engines,Software — Patrick Durusau @ 10:04 am

It seems somehow appropriate that following a lead on Kafka would bring me to Zoie (and other goodies to be reported).

From the website:

Zoie is a real-time search and indexing system built on Apache Lucene.

Donated by LinkedIn.com on July 19, 2008, and has been deployed in a real-time large-scale consumer website: LinkedIn.com handling millions of searches as well as hundreds of thousands of updates daily.

News: Zoie 2.0.0 is released … – Compatible with Lucene 2.9.x.

In a real-time search/indexing system, a document is made available as soon as it is added to the index. This functionality is especially important to time-sensitive information such as news, job openings, tweets etc.

Design Goals:

  • Additions of documents must be made available to searchers immediately
  • Indexing must not affect search performance
  • Additions of documents must not fragment the index (which hurts search performance)
  • Deletes and/or updates of documents must not affect search performance.
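To make the design goals concrete, here is a minimal sketch of one way to meet them: a small in-memory buffer sits in front of a larger main index, searches consult both, and deletes are applied as a filter until the next batch merge. This is a toy illustration in Python, not Zoie's API or its actual architecture; the class `RealTimeIndex` and all of its method names are hypothetical.

```python
from collections import defaultdict


class RealTimeIndex:
    """Toy inverted index: a small buffer in front of a larger 'main' index."""

    def __init__(self):
        self.main = defaultdict(set)    # term -> doc ids; stands in for the big index
        self.buffer = defaultdict(set)  # term -> doc ids; recent additions only
        self.deleted = set()            # doc ids hidden from results until flush()

    def add(self, doc_id, text):
        # New documents touch only the buffer, so the main index is not
        # rewritten or fragmented on every addition.
        # (Re-adding a previously deleted id is not handled in this sketch.)
        for term in text.lower().split():
            self.buffer[term].add(doc_id)

    def delete(self, doc_id):
        # A delete is recorded as a filter instead of rewriting postings.
        self.deleted.add(doc_id)

    def search(self, term):
        # Searching both structures makes additions visible immediately.
        term = term.lower()
        hits = self.main.get(term, set()) | self.buffer.get(term, set())
        return sorted(hits - self.deleted)

    def flush(self):
        # Batch step: fold the buffer into the main index and apply deletes.
        for term, ids in self.buffer.items():
            self.main[term] |= ids
        self.buffer.clear()
        for postings in self.main.values():
            postings -= self.deleted
        self.deleted.clear()


index = RealTimeIndex()
index.add(1, "Lucene search engine")
print(index.search("lucene"))   # [1]
index.flush()
index.add(2, "real-time Lucene indexing")
print(index.search("lucene"))   # [1, 2]
index.delete(1)
print(index.search("lucene"))   # [2]
```

The point of the split is that the expensive work (merging into the main index) happens in batches, while adds and deletes stay cheap and are visible to the very next search.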

In topic map terms:

  • Additions to a topic map must be made available to searchers immediately
  • Indexing must not affect search performance
  • Additions to a topic map must not fragment the index (which hurts search performance)
  • Deletes and/or updates of a topic map must not affect search performance.

I would say that items 3 and 4 are research questions at this point.

Additions, updates and deletions in a topic map may have unforeseen (unforeseeable?) consequences, such as causing:

  • merging to occur
  • merging to be undone
  • roles to be played
  • roles to not be played
  • association to be valid
  • association to be invalid

to name only a few.
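The first of those consequences, merging, is easy to show in miniature. The sketch below assumes a drastically simplified merging rule (two topics merge when they share any subject identifier) and uses made-up `example.org` identifiers; the class `TinyTopicMap` is hypothetical and ignores everything else a real topic map engine would check.

```python
class TinyTopicMap:
    """Toy topic map: topics merge when they share a subject identifier."""

    def __init__(self):
        # Each topic is a dict: {"ids": set of identifiers, "names": set of names}.
        # Invariant: no two stored topics share an identifier.
        self.topics = []

    def add_topic(self, identifiers, names):
        merged = {"ids": set(identifiers), "names": set(names)}
        keep = []
        for topic in self.topics:
            if topic["ids"] & merged["ids"]:
                # Shared identifier: fold the existing topic into the new one.
                merged["ids"] |= topic["ids"]
                merged["names"] |= topic["names"]
            else:
                keep.append(topic)
        self.topics = keep + [merged]
        return merged

    def find(self, name):
        # Return every topic that carries the given name.
        return [t for t in self.topics if name in t["names"]]


tm = TinyTopicMap()
tm.add_topic({"http://example.org/id/puccini"}, {"Puccini"})
tm.add_topic({"http://example.org/id/giacomo"}, {"Giacomo Puccini"})
print(len(tm.topics))  # 2

# One addition that happens to carry both identifiers merges the two topics.
tm.add_topic(
    {"http://example.org/id/puccini", "http://example.org/id/giacomo"},
    {"G. Puccini"},
)
print(len(tm.topics))  # 1
print(sorted(tm.find("Puccini")[0]["names"]))
# ['G. Puccini', 'Giacomo Puccini', 'Puccini']
```

A single addition changed what two other topics are, so any index entries built for them are now stale. That is the kind of ripple an index over a topic map would have to absorb without hurting search performance.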

It may be possible to formally prove the impact that certain events will have, but I am not aware of any definitive analysis on the subject.

Suggestions?
