Archive for the ‘Search Analytics’ Category

Prostitutes Appeal to Pope: Text Analytics applied to Search

Sunday, April 29th, 2012

Prostitutes Appeal to Pope: Text Analytics applied to Search by Tony Russell-Rose.

It is hard for me to visit Tony’s site and not come away with several posts he has written that I want to mention. Today was no different.

Here is a sampling of what Tony talks about in this post:

Consider the following newspaper headlines, all of which appeared unambiguous to the original writer:

  • DRUNK GETS NINE YEARS IN VIOLIN CASE
  • PROSTITUTES APPEAL TO POPE
  • STOLEN PAINTING FOUND BY TREE
  • RED TAPE HOLDS UP NEW BRIDGE
  • DEER KILL 300,000
  • RESIDENTS CAN DROP OFF TREES
  • INCLUDE CHILDREN WHEN BAKING COOKIES
  • MINERS REFUSE TO WORK AFTER DEATH

Although humorous, they illustrate much of the ambiguity in natural language, and just how much pragmatic and linguistic knowledge must be employed by NLP tools to function accurately.

A very informative and highly amusing post.

What better way to start the week?

Enjoy!

Relevance Tuning and Competitive Advantage via Search Analytics

Sunday, January 8th, 2012

Relevance Tuning and Competitive Advantage via Search Analytics

It must be all the “critical” evaluation of infographics I have been reading, but I found myself wondering about the following paragraph:

This slide shows how Search Analytics can be used to help with A/B testing. Concretely, in this slide we see two Solr Dismax handlers selected on the right side. If you are not familiar with Solr, think of a Dismax handler as an API that search applications call to execute searches. In this example, each Dismax handler is configured differently and thus each of them ranks search hits slightly differently. On the graph we see the MRR (see Wikipedia page for Mean Reciprocal Rank details) for both Dismax handlers and we can see that the one corresponding to the blue line is performing much better. That is, users are clicking on search hits closer to the top of the search results page, which is one of several signals of this Dismax handler providing better relevance ranking than the other one. Once you have a system like this in place you can add more Dismax handlers and compare 2 or more of them at a time. As the result, with the help of Search Analytics you get actual, real feedback about any changes you make to your search engine. Without a tool like this, you cannot really tune your search engine’s relevance well and will be doing it blindly.

Particularly the line:

That is, users are clicking on search hits closer to the top of the search results page, which is one of several signals of this Dismax handler providing better relevance ranking than the other one.

Really?

Here is one way to test that assumption:

For any search, return as the #1 or #2 result “private cell-phone number for …” and fill in one of the top ten movie actresses for 2011. You can do better than that: make sure the cell-phone number is one that rings at your search analytics desk. Now see how many users are “…clicking on search hits closer to the top of the search results page….”

Are your results more relevant than a movie star?

Don’t get me wrong, search analytics are very important, but let’s not get carried away about what we can infer from largely opaque actions.

Some other questions: Did users find the information they needed? Can they make use of that information? Does that use improve some measurable or important aspect of the company business? Let’s broaden search analytics to make search results less opaque.
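To make the objection concrete, here is a minimal sketch of the click-based MRR comparison described in the quoted paragraph. Everything in it is an assumption for illustration: the function name, the two hypothetical handlers, and the made-up click ranks. It also treats the first clicked hit as a stand-in for the first relevant hit, which is exactly the inference in question.

```python
from statistics import mean


def mean_reciprocal_rank(click_ranks):
    """Average of 1/rank over searches.

    click_ranks holds, per search, the 1-based rank of the first clicked
    hit, or None when the user clicked nothing (scored as 0).
    """
    if not click_ranks:
        return 0.0
    return mean(0.0 if rank is None else 1.0 / rank for rank in click_ranks)


# Hypothetical click logs for two differently configured handlers.
handler_a = [1, 2, 1, 3, None, 1]
handler_b = [4, 2, 5, None, 3, 6]

# A handler that plants an irresistible but irrelevant hit at rank 1.
bait_handler = [1, 1, 1, 1, 1, 1]

mrr_a = mean_reciprocal_rank(handler_a)
mrr_b = mean_reciprocal_rank(handler_b)
mrr_bait = mean_reciprocal_rank(bait_handler)

print(f"handler A: {mrr_a:.3f}")
print(f"handler B: {mrr_b:.3f}")
print(f"bait:      {mrr_bait:.3f}")
```

The bait handler gets the best possible score without answering anyone's question, which is why click position alone is a weak proxy for relevance.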

Search Analytics

Sunday, November 6th, 2011

Search Analytics

From the post:

Here is another take on Search Analytics, this one being presented at Enterprise Search Summit Fall 2011 in Washington DC, to an audience coming mainly from the US government agencies, very large enterprises, and large international companies with 10s of thousands of employees world wide. The audience was good and posed a number of good questions after the talk. The full slide deck is below as well as in Sematext@Slideshare.

I like the:

If you can’t measure it, you can’t fix it! [emphasis in original; I did fix the punctuation, moving the comma from "measure, it" to "measure it,".]

line, although I would have liked it better when I was an undergraduate student taking empirical methodology in political science. A number of years later I still agree that measurement is important, but I am less insistent that measurement is always possible or even useful.

Still, a very good slide deck and a good way to start off the week!

Search Analytics for Your Site

Thursday, October 20th, 2011

Search Analytics for Your Site

From the website:

Any organization that has a searchable web site or intranet is sitting on top of hugely valuable and usually under-exploited data: logs that capture what users are searching for, how often each query was searched, and how many results each query retrieved. Search queries are gold: they are real data that show us exactly what users are searching for in their own words. This book shows you how to use search analytics to carry on a conversation with your customers: listen to and understand their needs, and improve your content, navigation and search performance to meet those needs.

I haven’t read this book so don’t take this post as an endorsement or “buy” recommendation.

While watching the slide deck, it occurred to me that if search analytics could improve your website, why not use search analytics to develop the design and content of a topic map?

Design, in the sense that the most prominent and easiest to find content is whatever is popular with users. Prominence could even vary by time of day if your topic map is accessible 24 x 7.

Content, in the sense that what is included, what we say about it, and perhaps how it is found are all based on search analysis.

If you were developing a topic map about Sarah Palin, perhaps searching for “dude” should return her husband as a topic. I can think of other nicknames but this isn’t a political blog.
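A rough sketch of how a search log might drive both aspects follows. The log format (hour of day plus raw query), the alias table, and the function names are all assumptions made up for illustration, not anything taken from the book or from a topic map tool.

```python
from collections import Counter, defaultdict

# Hypothetical alias table: normalized query -> topic it should resolve to.
ALIASES = {
    "sarah palin": "Sarah Palin",
    "palin": "Sarah Palin",
    "dude": "Sarah Palin's husband",
}


def topic_for(query):
    """Return the topic a query resolves to, or None if no alias matches."""
    return ALIASES.get(query.strip().lower())


def popular_topics_by_hour(log):
    """Design: for each hour, list topics from most to least searched."""
    counts = defaultdict(Counter)
    for hour, query in log:
        topic = topic_for(query)
        if topic is not None:
            counts[hour][topic] += 1
    return {hour: [t for t, _ in c.most_common()] for hour, c in counts.items()}


def unmatched_queries(log):
    """Content: queries that found no topic, i.e. candidate names to add."""
    return Counter(q.strip().lower() for _, q in log if topic_for(q) is None)


# Made-up log entries: (hour of day, query as typed).
log = [
    (9, "Palin"), (9, "dude"), (9, "palin"),
    (21, "Dude"), (21, "dude"), (21, "Sarah Palin"), (21, "first dude"),
]

by_hour = popular_topics_by_hour(log)
missing = unmatched_queries(log)
print(by_hour)
print(missing)
```

The per-hour ranking suggests what to make prominent and when, while the unmatched queries suggest names or topics the map is missing.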

Comments on this book or suggestions of other search analytics resources appreciated.