Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

April 16, 2014

Regular expressions unleashed

Filed under: Names,Regexes,Topic Maps — Patrick Durusau @ 2:08 pm

Regular expressions unleashed by Hans-Juergen Schoenig.

From the post:

When cleaning up some old paperwork this weekend I stumbled over a very old tutorial. In fact, I have received this little handout during a UNIX course I attended voluntarily during my first year at university. It seems that those two days have really changed my life – the price tag: 100 Austrian Schillings which translates to something like 7 Euros in today’s money.

When looking at this old thing I noticed a nice example showing how to test regular expression support in grep. Over the years I had almost forgotten this little test. Here is the idea: There is no single way to print the name of Libya’s former dictator. According to this example there are around 30 ways to do it:…

Thirty (30) sounds a bit low to me but it’s sufficient to point out that mining all thirty (30) is going to give you a number of false positives, when searching for news on the former dictator of Libya.

The regex to capture all thirty (30) variant forms in a PostgreSQL database is great but once you have it, now what?

Particularly if you have sorted out the dictator from the non-dictators and/or placed them in other categories.

Do you pass that sorting and classifying onto the next user or do you flush the knowledge toilet and all that hard work just drains away?

No Comments

No comments yet.

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

Powered by WordPress