Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

January 28, 2012

Mavuno: Hadoop-Based Text Mining Toolkit

Filed under: Mahout,Natural Language Processing — Patrick Durusau @ 10:54 pm

Mavuno: A Hadoop-Based Text Mining Toolkit

From the webpage:

Mavuno is an open source, modular, scalable text mining toolkit built upon Hadoop. It supports basic natural language processing tasks (e.g., part of speech tagging, chunking, parsing, named entity recognition), is capable of large-scale distributional similarity computations (e.g., synonym, paraphrase, and lexical variant mining), and has information extraction capabilities (e.g., instance and semantic relation mining). It can easily be adapted to new input formats and text mining tasks.

Just glancing at the documentation, I am intrigued by the support for Java regular expressions. More on that this coming week.
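In the meantime, here is a rough sketch of why regular expressions are handy for the kind of instance mining the webpage mentions. It uses only the standard `java.util.regex` classes and does not call Mavuno at all. The class name `RegexExtractSketch`, the method `extractPairs`, and the single-word "such as" pattern are assumptions made up for illustration, not anything taken from Mavuno's documentation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: plain java.util.regex, not Mavuno's API.
class RegexExtractSketch {

    // Hypothetical pattern: "<category> such as <instance>",
    // with a single word captured on each side.
    private static final Pattern SUCH_AS =
            Pattern.compile("(\\w+)\\s+such\\s+as\\s+(\\w+)");

    // Returns one "instance -> category" string per match, in text order.
    static List<String> extractPairs(String text) {
        List<String> pairs = new ArrayList<>();
        Matcher m = SUCH_AS.matcher(text);
        while (m.find()) {
            // group(1) is the category word, group(2) the instance word.
            pairs.add(m.group(2) + " -> " + m.group(1));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String text = "He studies languages such as Java and tools such as Hadoop.";
        for (String pair : extractPairs(text)) {
            System.out.println(pair);
        }
    }
}
```

A pattern this simple misses multi-word names and lists ("tools such as Hadoop, Mahout and Pig"), which is presumably where a toolkit with real NLP support earns its keep.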

I first saw this at myNoSQL.
