A multi-Teraflop Constituency Parser using GPUs by John Canny, David Hall and Dan Klein.
Abstract:
Constituency parsing with rich grammars remains a computational challenge. Graphics Processing Units (GPUs) have previously been used to accelerate CKY chart evaluation, but gains over CPU parsers were modest. In this paper, we describe a collection of new techniques that enable chart evaluation at close to the GPU’s practical maximum speed (a Teraflop), or around a half-trillion rule evaluations per second. Net parser performance on a 4-GPU system is over 1 thousand length-30 sentences/second (1 trillion rules/sec), and 400 general sentences/second for the Berkeley Parser Grammar. The techniques we introduce include grammar compilation, recursive symbol blocking, and cache-sharing.
Just in case you are interested in parsing “unstructured” data, which mostly means what is also called “text.”
I first saw the link, BIDParse: GPU-accelerated natural language parser, at hgpu.org. Then I started looking for the paper. 😉
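If "CKY chart evaluation" in the abstract is unfamiliar, here is a minimal CPU sketch of a CKY recognizer in Python. It is not the paper's method and uses none of its GPU techniques. The grammar, lexicon, and function name are hypothetical, made up only for illustration, and the sketch checks set membership where a real parser such as the Berkeley Parser would score probabilistic rules. The innermost loop over rules gives a rough idea of the kind of work a "rule evaluation" refers to.

```python
from collections import defaultdict


def cky_recognize(words, lexicon, binary_rules, start="S"):
    """Return True if `words` can be derived from `start`.

    Assumes a grammar in Chomsky normal form:
      lexicon      maps a word to the set of symbols that can produce it
      binary_rules is a list of (parent, left, right) triples
    """
    n = len(words)
    # chart[(i, j)] holds every symbol that can span words[i:j]
    chart = defaultdict(set)

    # Fill the length-1 spans from the lexicon.
    for i, word in enumerate(words):
        chart[(i, i + 1)] |= lexicon.get(word, set())

    # Build longer spans from pairs of shorter, adjacent spans.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in binary_rules:
                    if left in chart[(i, k)] and right in chart[(k, j)]:
                        chart[(i, j)].add(parent)

    return start in chart[(0, n)]


# Toy grammar, hypothetical and far smaller than any real treebank grammar.
TOY_LEXICON = {"she": {"NP"}, "eats": {"V"}, "fish": {"NP"}}
TOY_RULES = [("S", "NP", "VP"), ("VP", "V", "NP")]

if __name__ == "__main__":
    print(cky_recognize(["she", "eats", "fish"], TOY_LEXICON, TOY_RULES))
    print(cky_recognize(["eats", "she"], TOY_LEXICON, TOY_RULES))
```

The loops over span length, start position, and split point make the work grow roughly with the cube of the sentence length, multiplied by the number of grammar rules, which is why rich grammars are expensive to parse with.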