Pattern Compression – 7 Orders of Magnitude of Reduction

Making Pattern Mining Useful.

Jilles Vreeken’s dissertation was a runner-up for the 2010 ACM SIGKDD Dissertation Award.

Vreeken proposes selecting patterns by how well they “compress” the data, on the basis of Minimum Description Length (MDL) (see The Minimum Description Length Principle), and presents KRIMP, “a heuristic parameter-free algorithm for finding the optimal set of frequent itemsets” (SIGKDD, vol. 12, issue 1, page 76).
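To make the MDL idea concrete, here is a minimal Python sketch of scoring a set of itemsets by total encoded size: the bits needed to describe the code table plus the bits needed to describe the data with it. It is a simplification for illustration, not Vreeken's implementation. The function names (cover, total_size), the toy database, and the shortcut of passing the code table as an already ordered list are all assumptions of mine, and candidate generation and pruning are left out entirely.

```python
from collections import Counter
from math import log2


def cover(transaction, code_table):
    """Greedily cover a transaction with itemsets, in code-table order.

    Assumes the code table ends with all singletons, so every
    transaction can be covered completely.
    """
    remaining = set(transaction)
    used = []
    for itemset in code_table:
        if itemset <= remaining:
            used.append(itemset)
            remaining -= itemset
    return used


def total_size(database, code_table):
    """Encoded size in bits: model cost plus data cost (simplified)."""
    # How often each itemset is used when covering the whole database.
    usage = Counter()
    for transaction in database:
        usage.update(cover(transaction, code_table))
    total_usage = sum(usage.values())
    # Frequently used itemsets get short codes.
    code_len = {s: -log2(u / total_usage) for s, u in usage.items()}

    # Singleton-only baseline codes, used to spell out itemsets in the model.
    item_counts = Counter(i for t in database for i in t)
    n_items = sum(item_counts.values())
    item_len = {i: -log2(c / n_items) for i, c in item_counts.items()}

    data_bits = sum(usage[s] * code_len[s] for s in usage)
    model_bits = sum(code_len[s] + sum(item_len[i] for i in s) for s in usage)
    return data_bits + model_bits


# Hypothetical toy data: one pattern dominates the database.
database = [frozenset("abcd")] * 8 + [frozenset("a"), frozenset("d")]
singletons = [frozenset(i) for i in "abcd"]

baseline = total_size(database, singletons)
with_pattern = total_size(database, [frozenset("abcd")] + singletons)
print(f"singletons only: {baseline:.1f} bits")
print(f"with pattern abcd: {with_pattern:.1f} bits")
```

On this toy data, adding the one pattern that actually describes most transactions makes the total description shorter, which is the criterion an MDL-based selection would use to keep it. A pattern that does not pay for its own entry in the code table would be rejected, which is how the approach keeps the final pattern set small.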

Readers should note that, in the reported experiments, KRIMP reduces the number of patterns by up to seven orders of magnitude. Let me say that again: up to seven orders of magnitude fewer patterns, roughly a ten-million-fold reduction. In practice, not theory.

Vreeken’s homepage has other materials of interest on this topic.

Questions:

  1. Application of “minimum description length” in library science? (report for class)
  2. How would you apply “minimum description length” techniques in library science? (3-5 pages, citations)
  3. Introduction to “Minimum Description Length For Librarians” (class presentation, examples relevant to librarians)
