Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

June 5, 2011

Your Data, Your Search

Filed under: ElasticSearch,Search Engines,Searching — Patrick Durusau @ 3:21 pm

Your Data, Your Search by Karel Minařík.

A slide deck, but a very interesting one. It covers the shortcomings of search, gives an overview of inverted indexing, and ends up with ElasticSearch. Along the way he observes that the “analysis” step is often more important than the “search” step. I suspect that analysis is nearly always more important than searching. And certainly harder.
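To make the analysis/search distinction concrete, here is a minimal sketch of an inverted index in Python. It is not taken from the slides and it is not how ElasticSearch or Lucene implement things; the stopword list, the sample documents, and the function names (`analyze`, `build_index`, `search`) are all assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical stopword list, chosen only for this example.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "is"}


def analyze(text):
    """The "analysis" step: lowercase, split into word tokens, drop stopwords."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_index(docs):
    """Map each analyzed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """The "search" step: ids of documents containing every analyzed query term."""
    terms = analyze(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(t, set()) for t in terms))


# Made-up documents for the example.
docs = {
    1: "The Search for Topic Maps",
    2: "Searching is hard; analysis is harder",
    3: "Topic maps and semantic diversity",
}
index = build_index(docs)

print(sorted(search(index, "topic MAPS")))  # [1, 3]
print(sorted(search(index, "searching")))   # [2]
print(sorted(search(index, "search")))      # [1]
```

Note that the lookup itself is a trivial set intersection. Whether “search” should also find document 2 (“Searching”) is decided entirely in `analyze`, which here does no stemming. That is one small illustration of why the analysis step is the harder one.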

May 19, 2011

Search Your Gmail Messages with ElasticSearch and Ruby

Filed under: Dataset,ElasticSearch,Search Data,Search Engines,Search Interface — Patrick Durusau @ 3:26 pm

Search Your Gmail Messages with ElasticSearch and Ruby

From the website:

If you’d like to check out ElasticSearch, there’s already lots of options where to get the data to feed it with. You can use a Twitter or Wikipedia river to fill it with gigabytes of public data, or you can feed it very quickly with some RSS feeds.

But, let’s get a bit personal, shall we? Let’s feed it with your own e-mail, imported from your own Gmail account.

A useful way to teach basic searching.

After all, a search of Wikipedia or Twitter may return impressive results, but are they correct results?

That is hard for a user to say, because both Wikipedia and Twitter are large enough that verifying search results (other than with other programs) isn’t practical.

Assuming your Gmail inbox is smaller than Wikipedia, you should be able to recognize which results are “correct” and which ones look “off.”

And you may learn some Ruby in the bargain.

Not a bad day’s work. 😉


PS: You may want to try the links on mining Twitter, Wikipedia and RSS feeds with ElasticSearch.

May 14, 2011

Data Visualization with ElasticSearch and Protovis

Filed under: ElasticSearch,Visualization — Patrick Durusau @ 6:24 pm

Data Visualization with ElasticSearch and Protovis

This is a great article on ElasticSearch and visualization with Protovis, but I mention it because of the following:

Nevertheless, a modern full-text search engine can do much more than that. At its core lies the inverted index, a highly optimized data structure for efficient lookup of documents matching the query. But it also allows to compute complex aggregations of our data, called facets. (Emphasis and links in original)

Do you think of facets as aggregations of data?

Is merging an aggregation of data?
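For readers who have not met facets, the quoted description of them as aggregations can be sketched in a few lines of Python. This is only an illustration of the idea (counting documents per field value), not the ElasticSearch facet API; the function name `terms_facet` and the `posts` records are assumptions, with the category values borrowed from the “Filed under” lines on this page.

```python
from collections import Counter


def terms_facet(documents, field):
    """Count how many documents carry each value of `field`, most frequent first."""
    counts = Counter()
    for doc in documents:
        # set() so a value repeated within one document is counted once.
        counts.update(set(doc.get(field, [])))
    return counts.most_common()


# Illustrative records; the tags mirror the categories of the posts on this page.
posts = [
    {"title": "Your Data, Your Search",
     "tags": ["ElasticSearch", "Search Engines", "Searching"]},
    {"title": "Search Your Gmail Messages with ElasticSearch and Ruby",
     "tags": ["Dataset", "ElasticSearch", "Search Data",
              "Search Engines", "Search Interface"]},
    {"title": "Data Visualization with ElasticSearch and Protovis",
     "tags": ["ElasticSearch", "Visualization"]},
    {"title": "99 Problems, But The Search Ain't One",
     "tags": ["ElasticSearch", "Search Engines", "Searching"]},
]

facet = terms_facet(posts, "tags")
for value, count in facet:
    print(f"{value}: {count}")
```

The facet groups documents by the literal string in the field. It aggregates counts, but it does nothing to decide whether two different strings identify the same subject, which is where the comparison with merging gets interesting.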

March 20, 2011

99 Problems, But The Search Ain’t One

Filed under: ElasticSearch,Search Engines,Searching — Patrick Durusau @ 1:25 pm

99 Problems, But The Search Ain’t One

Slides and video from a UK PHP presentation on ElasticSearch by Andrei Zmievski.

From the webpage:

ElasticSearch is the new kid on the search block. Built on top of Lucene and adhering to the best concepts of so-called NoSQL movement, ElasticSearch is a distributed, highly available, fast RESTful search engine, ready to be plugged into Web applications. Come to this session and learn how to set up, index, search, and tune ElasticSearch in less time than it takes to order a latte (disclaimer: at sufficiently busy central Starbucks locations. Side effects may include euphoria, stuff getting done, and extra time to spend with girlfriend).

While I appreciate an optimistic (enthusiastic?) presentation and I like ElasticSearch, predictions of the end of searching problems are a bit premature. 😉

I commend the presentation to you, but would note that the search problems addressed by topic maps, such as:

  1. Different identifications of the same subject
  2. Re-use of the same identifiers for different subjects
  3. Inability to reliably merge indexes from more than one source

all remain with ElasticSearch.
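A small Python sketch shows how the three problems listed above surface when two indexes are merged by term alone. It is a toy, not ElasticSearch or topic map code; the two indexes, their document ids, and the `naive_merge` function are hypothetical.

```python
from collections import defaultdict


def naive_merge(*indexes):
    """Merge term -> document-id indexes by unioning postings for identical terms."""
    merged = defaultdict(set)
    for index in indexes:
        for term, doc_ids in index.items():
            merged[term] |= doc_ids
    return dict(merged)


# Hypothetical index from source A: "river" means an ElasticSearch data feed.
index_a = {
    "elasticsearch": {"a:1", "a:2"},
    "river": {"a:3"},
}

# Hypothetical index from source B: "es" means ElasticSearch, "river" a waterway.
index_b = {
    "es": {"b:1"},
    "river": {"b:2"},
}

merged = naive_merge(index_a, index_b)

# Problem 1: one subject, two identifiers, so its documents stay split.
print(sorted(merged["elasticsearch"]), sorted(merged["es"]))

# Problem 2: one identifier, two subjects, so unrelated documents are lumped together.
print(sorted(merged["river"]))
```

Problem 3 follows from the first two: the merge runs without error, but its result is only as reliable as the assumption that identical strings mean identical subjects.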
