We’re Bringing Learning to Rank to Elasticsearch.
From the post:
It’s no secret that machine learning is revolutionizing many industries. This is equally true in search, where companies exhaust themselves capturing nuance through manually tuned search relevance. Mature search organizations want to get past the “good enough” of manual tuning to build smarter, self-learning search systems.
That’s why we’re excited to release our Elasticsearch Learning to Rank Plugin. What is learning to rank? With learning to rank, a team trains a machine learning model to learn what users deem relevant.
When implementing Learning to Rank you need to:
- Measure what users deem relevant through analytics, to build a judgment list grading documents as exactly relevant, moderately relevant, not relevant, for queries
- Hypothesize which features might help predict relevance such as TF*IDF of specific field matches, recency, personalization for the searching user, etc.
- Train a model that can accurately map features to a relevance score
- Deploy the model to your search infrastructure, using it to rank search results in production
Don’t fool yourself: underneath each of these steps lie complex, hard technical and non-technical problems. There’s still no silver bullet. As we mention in Relevant Search, manual tuning of search results comes with many of the same challenges as a good learning to rank solution. We’ll have more to say about the many infrastructure, technical, and non-technical challenges of mature learning to rank solutions in future blog posts.
… (emphasis in original)
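To make the four quoted steps concrete for myself, here is a toy sketch in Python. Everything in it is my own assumption for illustration: the judgment grades, the two feature names, and the simple pointwise linear model trained by gradient descent. It is not the plugin's API and not the model the authors use; it only shows how a judgment list, features, training, and ranking fit together.

```python
# Toy sketch of the four learning-to-rank steps. All data, feature names,
# and the linear model are hypothetical and chosen only for illustration.

# Step 1: judgment list of (query, doc_id, grade).
# Grade 2 = exactly relevant, 1 = moderately relevant, 0 = not relevant.
JUDGMENTS = [
    ("rambo", "doc_a", 2),
    ("rambo", "doc_b", 1),
    ("rambo", "doc_c", 0),
    ("rocky", "doc_d", 2),
    ("rocky", "doc_e", 0),
]

# Step 2: hypothesized features per (query, doc): [title_match, recency].
FEATURES = {
    ("rambo", "doc_a"): [0.9, 0.2],
    ("rambo", "doc_b"): [0.5, 0.7],
    ("rambo", "doc_c"): [0.1, 0.5],
    ("rocky", "doc_d"): [0.8, 0.6],
    ("rocky", "doc_e"): [0.2, 0.4],
}


def predict(weights, feats):
    """Linear score: bias plus weighted sum of the feature values."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feats))


def train(judgments, features, epochs=2000, lr=0.05):
    """Step 3: fit weights that map features to grades (squared error, SGD)."""
    n_feats = len(next(iter(features.values())))
    weights = [0.0] * (n_feats + 1)
    for _ in range(epochs):
        for query, doc_id, grade in judgments:
            feats = features[(query, doc_id)]
            error = predict(weights, feats) - grade
            weights[0] -= lr * error
            for i, x in enumerate(feats):
                weights[i + 1] -= lr * error * x
    return weights


def rank(weights, query, doc_ids, features):
    """Step 4: order candidate documents by the model's predicted score."""
    return sorted(
        doc_ids,
        key=lambda d: predict(weights, features[(query, d)]),
        reverse=True,
    )


weights = train(JUDGMENTS, FEATURES)
print(rank(weights, "rambo", ["doc_c", "doc_b", "doc_a"], FEATURES))
```

A real system would hold out judgments for evaluation and would likely optimize a ranking metric rather than squared error on the grades; the sketch skips both.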
A great post as always, but of particular interest to topic map fans is this passage:
…
Many of these features aren’t static properties of the documents in the search engine. Instead they are query dependent – they measure some relationship between the user or their query and a document. And to readers of Relevant Search, this is what we term signals in that book.
… (emphasis in original)
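A small sketch of the distinction in that passage, as I read it: a static feature depends only on the document, while a query-dependent feature (a "signal") changes with the query. The field names and the term-overlap measure are my own assumptions, not anything taken from the post or the plugin.

```python
# Hypothetical document; the field names are assumptions for illustration.
DOC = {"title": "Rambo First Blood", "popularity": 0.7}


def static_features(doc):
    """Properties of the document alone: identical for every query."""
    return {"popularity": doc["popularity"]}


def query_dependent_features(query, doc):
    """Measure a relationship between the query and the document.

    Here: the fraction of query terms that appear in the title.
    """
    query_terms = set(query.lower().split())
    title_terms = set(doc["title"].lower().split())
    if not query_terms:
        return {"title_overlap": 0.0}
    overlap = len(query_terms & title_terms) / len(query_terms)
    return {"title_overlap": overlap}


for query in ("rambo", "rocky balboa"):
    print(query, static_features(DOC), query_dependent_features(query, DOC))
```

The same document yields the same static features for both queries but different query-dependent ones, which is the property the passage is pointing at.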
Do you read this as suggesting that the merging shown to users should depend upon their queries?
That two or more users with different query histories could (should?) get different merged results from the same topic map?
Now that’s an interesting suggestion!
Enjoy this post and follow the blog for more of the same.
(I have a copy of Relevant Search waiting to be read so I had better get to it!)