Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 11, 2012

Python interface to Stanford Core NLP tools v1.3.3

Filed under: Natural Language Processing,Python,Stanford NLP — Patrick Durusau @ 5:25 am


From the README.md:

This is a Python wrapper for Stanford University’s NLP group’s Java-based CoreNLP tools. It can either be imported as a module or run as a JSON-RPC server. Because it uses many large trained models (requiring 3GB RAM on 64-bit machines and usually a few minutes loading time), most applications will probably want to run it as a server.

  • Python interface to Stanford CoreNLP tools: tagging, phrase-structure parsing, dependency parsing, named entity recognition, and coreference resolution.
  • Runs a JSON-RPC server that wraps the Java server and outputs JSON.
  • Outputs parse trees which can be used by nltk.
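To make the last point concrete, here is a minimal sketch of reading a bracketed parse tree out of a JSON response using only the standard library. The response shape (a "sentences" list whose items carry a "parsetree" string) and the sample sentence are assumptions for illustration, not output captured from the wrapper, and read_bracketed and leaves are hypothetical helper names.

```python
import json

# Assumed response shape for illustration only; the real wrapper's JSON
# may use different keys or carry more fields.
SAMPLE_RESPONSE = json.dumps({
    "sentences": [
        {
            "text": "Hello world.",
            "parsetree": "(ROOT (S (NP (UH Hello) (NN world)) (. .)))",
        }
    ]
})


def read_bracketed(tree_text):
    """Parse a Penn Treebank style bracketed string into nested lists.

    Each node becomes [label, child, child, ...]; leaves are plain strings.
    """
    tokens = tree_text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1  # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume ")"
        return [label] + children

    return read_node()


def leaves(node):
    """Collect the words of a tree in left-to-right order."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:  # node[0] is the label
        words.extend(leaves(child))
    return words


response = json.loads(SAMPLE_RESPONSE)
tree = read_bracketed(response["sentences"][0]["parsetree"])
print(tree[0], leaves(tree))
```

If nltk is installed, nltk.Tree.fromstring should accept the same bracketed string and return a full Tree object, which is likely the more convenient route in practice.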

It requires pexpect and (optionally) unidecode to handle non-ASCII text. This script includes and uses code from jsonrpc and python-progressbar.

It runs the Stanford CoreNLP jar in a separate process, communicates with the java process using its command-line interface, and makes assumptions about the output of the parser in order to parse it into a Python dict object and transfer it using JSON. The parser will break if the output changes significantly, but it has been tested on Core NLP tools version 1.3.3 released 2012-07-09.
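The general pattern described here (start a child process, talk to it over its command-line interface, parse its text output into a dict, and hand the result on as JSON) can be sketched as follows. This is not the wrapper's code: it uses the standard subprocess module rather than pexpect, a small stand-in Python child instead of the CoreNLP jar, and a made-up "[Text=... Index=...]" line format with an "END" marker. All names are hypothetical.

```python
import json
import subprocess
import sys

# Stand-in for the Java process: reads one line of text at a time and
# prints one bracketed record per word, then an END marker.
# The record format is invented for this sketch.
CHILD_SOURCE = r'''
import sys
for line in sys.stdin:
    for index, word in enumerate(line.split(), start=1):
        print("[Text=%s Index=%d]" % (word, index))
    print("END", flush=True)
'''


def parse_token_line(line):
    """Turn '[Text=Hello Index=1]' into {'Text': 'Hello', 'Index': '1'}."""
    fields = line.strip().strip("[]").split()
    return dict(field.split("=", 1) for field in fields)


def annotate(text):
    """Send one line to the child process and return its output as JSON."""
    command = [sys.executable, "-u", "-c", CHILD_SOURCE]
    with subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    ) as child:
        child.stdin.write(text + "\n")
        child.stdin.flush()
        tokens = []
        while True:
            line = child.stdout.readline()
            # Stop at the marker, or at end of output if the child died.
            if not line or line.strip() == "END":
                break
            tokens.append(parse_token_line(line))
    return json.dumps({"tokens": tokens})


print(annotate("Hello world"))
```

The fragility the README warns about is visible even in this toy: parse_token_line depends entirely on the child's output format, so any change to that format breaks the parser.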

If you have NLP requirements and work in Python, this may be of interest.
