Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

June 5, 2013

Entity recognition with Scala and…

Filed under: Entity Resolution,Natural Language Processing,Scala,Stanford NLP — Patrick Durusau @ 4:05 pm

Entity recognition with Scala and Stanford NLP Named Entity Recognizer by Gary Sieling.

From the post:

The following sample will extract the contents of a court case and attempt to recognize names and locations using entity recognition software from Stanford NLP. From the samples, you can see it’s fairly good at finding nouns, but not always at identifying the type of each noun.

In this example, the entities I’d like to see are different – companies, law firms, lawyers, etc., but this test is good enough. The default examples provided let you choose different sets of things that can be recognized: {Location, Person, Organization}, {Location, Person, Organization, Misc}, and {Time, Location, Organization, Person, Money, Percent, Date}. The process of extracting PDF data and processing it takes about five seconds.

For this text, selecting different options sometimes led to the classifier picking different options for a noun – one time it’s a person, another time it’s an organization, etc. One improvement might be to run several classifiers and to allow them to vote. This classifier also loses words sometimes – if a subject is listed with a first, middle, and last name, it sometimes picks just two words. I’ve noticed similar issues with company names.

(…)

The voting on entity recognition made me curious about interactive entity resolution where a user has a voice.
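
To make the voting idea concrete, here is a minimal sketch in Scala of a majority vote over the labels that several classifiers assign to the same entity. It is not from Gary's post and does not call Stanford NLP: the `EntityVote` object, the `vote` function, and the representation of each classifier's output as a map from entity text to label are all my own assumptions for illustration.

```scala
// Hypothetical sketch: majority vote over labels proposed by several classifiers.
// EntityVote and vote are illustrative names, not part of Stanford NLP.
object EntityVote {
  type Label = String

  // Each Map is assumed to be one classifier's output: entity text -> predicted label.
  def vote(outputs: Seq[Map[String, Label]]): Map[String, Label] = {
    // Every entity that at least one classifier reported.
    val entities = outputs.flatMap(_.keys).distinct

    entities.map { entity =>
      // Labels from the classifiers that recognized this entity at all.
      val labels = outputs.flatMap(_.get(entity))

      // Count the votes for each label.
      val counts = labels.groupBy(identity).map { case (label, votes) => (label, votes.size) }

      // Most votes wins; a tie goes to the label that sorts first, so the result is deterministic.
      val winner = counts.toSeq.sortBy { case (label, n) => (-n, label) }.head._1

      entity -> winner
    }.toMap
  }
}
```

A classifier that misses an entity simply casts no vote for it. In an interactive setting, a user's judgment could be passed in as one more map, or handled separately so that it overrides the tally; which of those is appropriate is a design choice this sketch leaves open.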

See the next post.
