Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

April 27, 2014

Using NLTK for Named Entity Extraction

Filed under: Entity Extraction,Named Entity Mining,NLTK,Python — Patrick Durusau @ 7:33 pm

Using NLTK for Named Entity Extraction by Emily Daniels.

From the post:

Continuing on from the previous project, I was able to augment the functions that extract character names using NLTK’s named entity module and an example I found online, building my own custom stopwords list to run against the returned names to filter out frequently used words like “Come”, “Chapter”, and “Tell” which were caught by the named entity functions as potential characters but are in fact just terms in the story.
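The workflow Daniels describes might look roughly like the sketch below. It is not her code: the function names and the stopword set are hypothetical, and the choice to drop any candidate containing a stopword token is an assumption made for illustration. It relies on NLTK's default tokenizer, tagger, and named entity chunker, whose data packages must be fetched with nltk.download() before the extraction step will run.

```python
from collections import Counter

# Hypothetical custom stopword list; the three words are the examples
# mentioned in the quoted post, lowercased for comparison.
CUSTOM_STOPWORDS = {"come", "chapter", "tell"}


def extract_person_names(text):
    """Return the PERSON entity strings found by NLTK's default chunker."""
    import nltk  # imported lazily so filter_names() works without NLTK

    names = []
    for sentence in nltk.sent_tokenize(text):
        tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
        # ne_chunk returns a tree; entity chunks are labeled subtrees.
        for subtree in nltk.ne_chunk(tagged).subtrees():
            if subtree.label() == "PERSON":
                names.append(" ".join(token for token, _tag in subtree.leaves()))
    return names


def filter_names(names, stopwords=CUSTOM_STOPWORDS):
    """Drop candidates containing a stopword token and count the rest.

    Assumption: a candidate is rejected if any of its tokens is a stopword.
    """
    kept = [
        name for name in names
        if not any(token.lower() in stopwords for token in name.split())
    ]
    return Counter(kept)
```

Calling filter_names(extract_person_names(text)) on a chapter of a novel would give a tally of candidate character names, which still deserves a human look before being trusted.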

Whether you are trusting your software or using human proofing, named entity extraction is a key task in mining data.

Once named entities are extracted, the harder task is uncovering relationships between them that might not otherwise be identified.

That is challenging with the text of Oliver Twist, but even more difficult when mining donation records and the Congressional Record.
