Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

October 22, 2011

Java Wikipedia Library (JWPL)

Filed under: Data Mining,Java,Software — Patrick Durusau @ 3:16 pm

From the post:

Lately, Wikipedia has been recognized as a promising lexical semantic resource. If Wikipedia is to be used for large-scale NLP tasks, efficient programmatic access to the knowledge therein is required.

JWPL (Java Wikipedia Library) is an open-source, Java-based application programming interface that allows access to all information contained in Wikipedia. The high-performance Wikipedia API provides structured access to information nuggets like redirects, categories, articles and link structure. It is described in our LREC 2008 paper.

JWPL contains a MediaWiki markup parser that can be used to further analyze the contents of a Wikipedia page. The parser can also be used stand-alone with other texts using MediaWiki markup.

Further, JWPL contains the tool JWPLDataMachine that can be used to create JWPL dumps from the publicly available dumps at download.wikimedia.org.

Wikipedia is a resource of growing interest. This toolkit may prove useful in mining it for topic map purposes.
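
To give a rough feel for what "analyzing MediaWiki markup" means for link structure, here is a minimal sketch in plain Java. It does not use JWPL or its parser: the class name WikiLinkSketch and the method extractLinkTargets are hypothetical, and the regular expression only handles simple internal links of the form [[Target]], [[Target|label]] and [[Target#Section]]. A real parser such as the one in JWPL also has to cope with templates, nested markup and much more.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is NOT the JWPL parser or its API.
public class WikiLinkSketch {

    // Matches [[Target]], [[Target|label]] and [[Target#Section|label]].
    // Group 1 captures the target page title only.
    private static final Pattern INTERNAL_LINK =
            Pattern.compile("\\[\\[([^\\[\\]|#]+)(?:#[^\\[\\]|]*)?(?:\\|[^\\[\\]]*)?\\]\\]");

    // Returns the titles of the pages that the markup links to, in order.
    public static List<String> extractLinkTargets(String markup) {
        List<String> targets = new ArrayList<>();
        Matcher m = INTERNAL_LINK.matcher(markup);
        while (m.find()) {
            String target = m.group(1).trim();
            // Assumption: anything with a colon is a namespaced link
            // (Category:, File:, ...) and is skipped here.
            if (!target.contains(":")) {
                targets.add(target);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        String markup = "'''Topic Maps''' is a [[Standardization|standard]] for "
                + "[[knowledge representation]]. [[Category:Knowledge representation]]";
        // prints [Standardization, knowledge representation]
        System.out.println(extractLinkTargets(markup));
    }
}
```

For topic map purposes, link targets like these are candidate associations between the subject of a page and the subjects it points to, which is where a full toolkit earns its keep over a regex.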
