Archive for the ‘Retrieval’ Category

Content-Based Image Retrieval at the End of the Early Years

Tuesday, January 22nd, 2013

Content-Based Image Retrieval at the End of the Early Years by Arnold W.M. Smeulders, Marcel Worring, Simone Santini, Amarnath Gupta, and Ramesh Jain. (Smeulders, A.W.M.; Worring, M.; Santini, S.; Gupta, A.; Jain, R., “Content-based image retrieval at the end of the early years,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349-1380, Dec. 2000, doi: 10.1109/34.895972)


Presents a review of 200 references in content-based image retrieval. The paper starts with discussing the working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap. Subsequent sections discuss computational steps for image retrieval systems. Step one of the review is image processing for retrieval sorted by color, texture, and local geometry. Features for retrieval are discussed next, sorted by: accumulative and global features, salient points, object and shape features, signs, and structural combinations thereof. Similarity of pictures and objects in pictures is reviewed for each of the feature types, in close connection to the types and means of feedback the user of the systems is capable of giving by interaction. We briefly discuss aspects of system engineering: databases, system architecture, and evaluation. In the concluding section, we present our view on: the driving force of the field, the heritage from computer vision, the influence on computer vision, the role of similarity and of interaction, the need for databases, the problem of evaluation, and the role of the semantic gap.

Excellent survey article from 2000 (not 2002 as per the Ostermann paper).

I think you will appreciate the treatment of the “semantic gap,” both its description and the ways to address it.

If you are using annotated images in your topic map application, definitely a must read.
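To make the abstract's "accumulative and global features" and "similarity of pictures" a little more concrete, here is a minimal sketch of one classic combination: a global color histogram compared by histogram intersection. This is my illustration, not code from the survey. The function names, the choice of 4 bins per channel, and the toy "images" (plain lists of RGB tuples) are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Global color histogram over all pixels, normalized to sum to 1."""
    if not pixels:
        raise ValueError("image has no pixels")
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy "images": hypothetical data, just lists of pixels.
red_image = [(250, 10, 10)] * 4
mostly_red = [(250, 10, 10)] * 3 + [(10, 10, 250)]
blue_image = [(10, 10, 250)] * 4

sim_close = histogram_intersection(color_histogram(red_image), color_histogram(mostly_red))
sim_far = histogram_intersection(color_histogram(red_image), color_histogram(blue_image))
print(sim_close, sim_far)  # 0.75 0.0
```

A histogram like this knows nothing about what is depicted, only which colors occur how often, which is one small way to see why the semantic gap is a problem.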

How do you measure the impact of tagging on retrieval?

Thursday, May 31st, 2012

How do you measure the impact of tagging on retrieval? by Tony Russell-Rose.

From the post:

A client of mine wants to measure the difference between manual tagging and auto-classification on unstructured documents, focusing in particular on its impact on retrieval (i.e. relevance ranking). At the moment they are considering two contrasting approaches:

See Tony’s post for details.

What do you think?
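For readers who want something to experiment with, one common way to compare two relevance rankings is average precision against a set of relevance judgments. To be clear, this is not necessarily either of the approaches in Tony's post; it is a sketch under my own assumptions. The document ids, the judgments, and the two rankings (one supposedly from manually tagged documents, one from auto-classified documents) are made up for illustration.

```python
from typing import List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average of precision@k taken at each rank k where a relevant document appears.

    Relevant documents that never appear in the ranking contribute zero.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


# Hypothetical judgments and rankings for a single query.
relevant_docs = {"d1", "d3"}
ranking_manual_tags = ["d1", "d3", "d2", "d4"]
ranking_auto_class = ["d2", "d1", "d4", "d3"]

ap_manual = average_precision(ranking_manual_tags, relevant_docs)
ap_auto = average_precision(ranking_auto_class, relevant_docs)
print(ap_manual, ap_auto)  # 1.0 0.5
```

In practice you would average this over many queries and check whether the difference between the two conditions holds up, rather than trusting a single query.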

How accurate can manual review be?

Friday, December 23rd, 2011

How accurate can manual review be?

From the post:

One of the chief pleasures for me of this year’s SIGIR in Beijing was attending the SIGIR 2011 Information Retrieval for E-Discovery Workshop (SIRE 2011). The smaller and more selective the workshop, it often seems, the more focused and interesting the discussion.

My own contribution was “Re-examining the Effectiveness of Manual Review”. The paper was inspired by an article from Maura Grossman and Gord Cormack, whose message is neatly summed up in its title: “Technology-assisted review in e-discovery can be more effective and more efficient than exhaustive manual review”.

Fascinating work!

Does this give you pause about automated topic map authoring? Why/why not?

Spectral Based Information Retrieval

Saturday, December 25th, 2010

Spectral Based Information Retrieval Author: Laurence A. F. Park (2003)

Every now and again I run into a dissertation that is an interesting and useful survey of a field and an original contribution to the literature.

Not often but it does happen.

It happened in this case with Park’s dissertation.

It is the beginning of an interesting thread of research that treats terms in a document as a spectrum and then applies spectral transformations to the retrieval problem.
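A minimal sketch of what "terms as a spectrum" can mean, assuming one simple reading: count a term's occurrences in equal-sized position bins of the document to get a signal, then take a discrete Fourier transform of that signal. This shows only the representation step and is not Park's scoring method. The bin count, the function names, and the toy document are my assumptions.

```python
import cmath
from typing import List


def term_signal(tokens: List[str], term: str, n_bins: int = 8) -> List[int]:
    """Occurrences of `term` in each of n_bins equal-width position bins."""
    signal = [0] * n_bins
    for position, token in enumerate(tokens):
        if token == term:
            signal[position * n_bins // len(tokens)] += 1
    return signal


def dft(signal: List[int]) -> List[complex]:
    """Plain O(n^2) discrete Fourier transform, fine for a handful of bins."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


# Toy document (hypothetical): the term "x" appears at positions 0 and 2.
tokens = ["x", "y", "x", "y"]
signal = term_signal(tokens, "x", n_bins=4)
spectrum = dft(signal)

# Magnitude says how strongly the term occurs at each frequency;
# phase carries information about where in the document it occurs.
magnitudes = [abs(c) for c in spectrum]
phases = [cmath.phase(c) for c in spectrum]
print(signal)
```

The appeal of such a representation is that it keeps some positional information about terms that a plain bag-of-words count throws away.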

The technique has been developed and extended since the appearance of Park’s work.

Highly recommended, particularly if you are interested in tracing the development of this technique in information retrieval.

My interest is in the use of spectral representations of text in information retrieval as part of topic map authoring and its potential as a subject identity criterion.

Actually I should broaden that to include retrieval of images and other data as well.


  1. Prepare an annotated bibliography of ten (10) recent papers using spectral analysis for information retrieval.
  2. Spectral analysis helps retrieve documents but what if you are searching for ideas? Does spectral analysis offer any help?
  3. How would you extend the current state of spectral based information retrieval? (5-10 pages, project proposal, citations)

SRU Search/Retrieval via URL

Sunday, December 12th, 2010

SRU Search/Retrieval via URL

Standards and resources, including free implementations, for the SRU effort.

SRU: the protocol – SearchRetrieve Operation: Binding for SRU 2.0 (draft)

CQL: The Contextual Query Language – CQL: The Contextual Query Language (draft)
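To give a feel for "search/retrieval via URL," here is a small sketch that builds a searchRetrieve request with a CQL query in the URL's query string. The parameter names (operation, version, query, maximumRecords) follow SRU 1.x usage as I understand it; the 2.0 draft may differ, so check the specification. The base URL is a placeholder, not a real server, and whether a given server supports the dc.title index is an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_sru_url(base_url: str, cql_query: str,
                  maximum_records: int = 10, version: str = "1.2") -> str:
    """Build a searchRetrieve URL; urlencode percent-encodes the CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,
        "maximumRecords": str(maximum_records),
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and query, for illustration only.
url = build_sru_url("http://sru.example.org/catalog", 'dc.title = "image retrieval"')
print(url)

# Round trip: decoding the query string gives back the original CQL.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

The response would be XML that the caller has to parse; that part is left out here because it depends on the server and the record schema requested.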

The website reports that standardization is to be completed soon, and the available drafts date from 2010.

However, if you follow the “known servers” link, you will find only thirteen (13) servers listed as of 12 December 2010.

Standards can be written prior to widespread adoption, but before spending too much effort on this protocol and query language, I think we need to watch its adoption curve closely.