Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

August 2, 2013

A new version of the Compact Language Detector

Filed under: Language — Patrick Durusau @ 6:36 pm

A new version of the Compact Language Detector by Mike McCandless.

From the post:

It’s been almost two years since I originally factored out the fast and accurate Compact Language Detector from the Chromium project, and the effort was clearly worthwhile: the project is popular and others have created additional bindings for languages including at least Perl, Ruby, R, JavaScript, PHP and C#/.NET.

Eric Fischer used CLD to create the colorful Twitter language map, and since then further language maps have appeared, e.g. for New York and London. What a multi-lingual world we live in!

Suddenly, just a few weeks ago, I received an out-of-the-blue email from Dick Sites, creator of CLD, with great news: he was finishing up version 2.0 of CLD and had already posted the source code on a new project.

So I’ve now reworked the Python bindings and ported the unit tests to Python (they pass!) to take advantage of the new features. It was much easier this time around since the CLD2 sources were already pulled out into their own project (thank you Dick and Google!).

Great library if you need to detect languages.

I understand some of the security agencies think use of a non-English language is a dot to be connected.
