Hyperbolic lots by Ben Zimmer.
From the post:
For the past couple of years, Google has provided automatic captioning for all YouTube videos, using a speech-recognition system similar to the one that creates transcriptions for Google Voice messages. It’s certainly a boon to the deaf and hearing-impaired. But as with Google’s other ventures in natural language processing (notably Google Translate), this is imperfect technology that is gradually becoming less imperfect over time. In the meantime, however, the imperfections can be quite entertaining.
I gave the auto-captioning an admittedly unfair challenge: the multilingual trailer that Michael Erard put together for his latest book, Babel No More: The Search for the World’s Most Extraordinary Language Learners. The trailer features a story from the book told by speakers of a variety of languages (including me), and Erard originally set it up as a contest to see who could identify the most languages. If you go to the original video on YouTube, you can enable the auto-captioning by clicking on the “CC” and selecting “Transcribe Audio” from the menu.
The transcription does a decent job with Erard’s English introduction, though I enjoyed the interpretation of “hyperpolyglots” — the subject of the book — as “hyperbolic lots.” Hyperpolyglot (evidently coined by Dick Hudson) isn’t a word you’ll find in any dictionary, and it’s not that frequent online, so it’s highly unlikely the speech-to-text system could have figured it out. But the real fun begins with the speakers of other languages.
You will find this amusing.
Ben notes the imperfections are becoming fewer.
Curious: since languages are living, social constructs, at what point do we measure the number of “imperfections”?
Or should I say, from whose perspective do we measure the number of “imperfections”?
Or should we use both of those measures and others?