A Faster LZ77-Based Index by Travis Gagie and Pawel Gawrychowski.
Abstract:
Suppose we are given an AVL-grammar with $r$ rules for a string (S [1..n]) whose LZ77 parse consists of $z$ phrases. Then we can add $\Oh{z \log \log z}$ words and obtain a compressed self-index for $S$ such that, given a pattern (P [1..m]), we can list the occurrences of $P$ in $S$ in $\Oh{m^2 + (m + \occ) \log \log n}$ time.
Not the best abstract I have ever read. At least in terms of attracting the most likely audience to be interested.
I would have started with: “Indexing of genomes, which are 99.9% same, can be improved in terms of searching, response times and reporting of secondary occurrences.” Then follow with the technical description of the contribution. Don’t make people work for a reason to read the paper.
Any advancement in indexing, but particularly in an area like genomics, is important to topic maps.
Update: See the updated version of this paper: A Faster Grammar-Based Self-Index.
[…] Updated version of the paper I covered at: A Faster LZ77-Based Index. […]
Pingback by A Faster Grammar-Based Self-Index « Another Word For It — December 25, 2011 @ 6:07 pm