Fundamentals of Information Retrieval: Illustration with Apache Lucene by Majirus FANSI.
From the description:
Information Retrieval is becoming the principal means of access to information. It is now common for web applications to provide an interface for free-text search. In this talk we start by describing the scientific underpinnings of information retrieval. We review the main models on which the main search tools are based, i.e. the Boolean model and the Vector Space Model. We illustrate our talk with a web application based on Lucene. We show that Lucene combines both the Boolean and vector space models.
The presentation will give an overview of what Lucene is, and where and how it can be used. We will cover the basic Lucene concepts (index, directory, document, field, term), text analysis (tokenizing, token filtering, stop words), indexing (how to create an index, how to index documents), and searching (how to run keyword, phrase, Boolean and other queries). We’ll inspect Lucene indices with Luke.
After this talk, attendees will understand the fundamentals of IR as well as how to apply them to build a search application with Lucene.
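To make the two models in the description concrete, here is a minimal sketch in plain Python. It is not Lucene's API and not Lucene's scoring formula; the stop-word list, the TF-IDF weighting, the sample documents and every function name are assumptions chosen for illustration. It analyzes text (tokenizing, dropping stop words), builds an inverted index, uses the Boolean model to select documents that contain every query term, and then uses the vector space model (cosine similarity over TF-IDF vectors) to rank them.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption, not Lucene's list).
STOP_WORDS = {"a", "an", "and", "the", "of", "is", "in", "to"}

# Hypothetical sample documents, keyed by document id.
DOCS = {
    1: "Lucene is a search library",
    2: "The Boolean model of information retrieval",
    3: "Lucene search combines the Boolean model and the vector space model",
}

def analyze(text):
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def build_index(docs):
    """Inverted index: map each term to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index

def boolean_and(index, query):
    """Boolean model: ids of documents that contain every query term."""
    terms = analyze(query)
    if not terms:
        return set()
    return set.intersection(*[index.get(t, set()) for t in terms])

def tfidf_vector(tokens, index, n_docs):
    """Sparse TF-IDF vector; terms absent from the index are skipped."""
    vec = {}
    for term, tf in Counter(tokens).items():
        df = len(index.get(term, ()))
        if df:
            vec[term] = tf * math.log(1 + n_docs / df)
    return vec

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * \
           math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def search(docs, query):
    """Select with the Boolean model, then rank with the vector space model."""
    index = build_index(docs)
    n_docs = len(docs)
    q_vec = tfidf_vector(analyze(query), index, n_docs)
    scored = [
        (cosine(q_vec, tfidf_vector(analyze(docs[d]), index, n_docs)), d)
        for d in boolean_and(index, query)
    ]
    # Highest score first; ties broken by doc id for a stable order.
    return [d for _, d in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

With these sample documents, `search(DOCS, "lucene search")` matches documents 1 and 3 and ranks the shorter document 1 first, because a larger share of its vector lies along the query terms. A real Lucene index also stores term positions, which is what makes phrase queries possible; this sketch leaves that out.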
I am assuming that the random lines in the background of the slides are an artifact of the recording. Quite annoying.
Otherwise, a great presentation!