Using NLTK for Named Entity Extraction

Using NLTK for Named Entity Extraction by Emily Daniels.

From the post:

Continuing on from the previous project, I was able to augment the functions that extract character names using NLTK’s named entity module and an example I found online, building my own custom stopwords list to run against the returned names to filter out frequently used words like “Come”, “Chapter”, and “Tell” which were caught by the named entity functions as potential characters but are in fact just terms in the story.
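A rough sketch of the approach the post describes might look like the following. It is not the author's code: the function names and the stopword set are hypothetical (only "Come", "Chapter", and "Tell" come from the post), and dropping any candidate that contains a stopword is an assumed filtering rule. It also assumes the NLTK tokenizer, tagger, and named entity chunker data have already been fetched with `nltk.download`; the resource names vary between NLTK versions.

```python
from collections import Counter

# Hypothetical custom stopword list; the three entries are the examples from the post.
CUSTOM_STOPWORDS = {"come", "chapter", "tell"}


def extract_person_names(text):
    """Return candidate PERSON names found by NLTK's named entity chunker."""
    # Imported here so filter_names() below can be used without NLTK installed.
    import nltk

    names = []
    for sentence in nltk.sent_tokenize(text):
        tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
        tree = nltk.ne_chunk(tagged)
        # Each PERSON subtree holds (word, tag) leaves; join the words into one name.
        for subtree in tree.subtrees(filter=lambda t: t.label() == "PERSON"):
            names.append(" ".join(word for word, _tag in subtree.leaves()))
    return names


def filter_names(names, stopwords=CUSTOM_STOPWORDS):
    """Drop candidates containing a stopword (case-insensitive); count the rest."""
    kept = [
        name
        for name in names
        if not any(word.lower() in stopwords for word in name.split())
    ]
    return Counter(kept)
```

Calling `filter_names(extract_person_names(text))` on a chapter of the novel would give a count of candidate character names with the false positives removed. The stopword list still has to be built by hand from whatever the chunker gets wrong on a particular text.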

Whether you are trusting your software or using human proofing, named entity extraction is a key task in mining data.

Having extracted named entities, the harder task is uncovering relationships between them that may not be otherwise identified.

That is challenging with the text of Oliver Twist, but even more difficult when mining donation records and the Congressional Record.
