Speech Recognition vs. Language Processing, by Geoffrey Pullum.
From the post:
I have stressed that we are still waiting for natural language processing (NLP). One thing that might lead you to believe otherwise is that some companies run systems that enable you to hold a conversation with a machine. But that doesn’t involve NLP, i.e. syntactic and semantic analysis of sentences. It involves automatic speech recognition (ASR), which is very different.
ASR systems deal with words and phrases rather as the song “Rawhide” recommends for cattle: “Don’t try to understand ’em; just rope and throw and brand ’em.”
Labeling noise bursts is the goal, not linguistically based understanding.
(…)
Prompting a bank customer with “Do you want to pay a bill or transfer funds between accounts?” considerably improves the chances of getting something with either “pay a bill” or “transfer funds” in it; and they sound very different.
In the latter case, no use is made by the system of the verb + object structure of the two phrases. Only the fact that the customer appears to have uttered one of them rather than the other is significant. What’s relevant about pay is not that it means “pay” but that it doesn’t sound like tran-. As I said, this isn’t about language processing; it’s about noise-burst classification.
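To make Pullum's point concrete, here is a minimal sketch of the kind of matching he describes; it is my illustration, not anything from his post. A real system scores acoustic features, but string similarity over a transcript stands in for that here. The utterance is treated as an opaque blob and matched against whichever canned phrase it most resembles; the verb + object structure of the reply is never consulted. The phrase list and function name are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical phrases the bank's prompt is designed to elicit.
EXPECTED_PHRASES = ["pay a bill", "transfer funds"]

def classify_utterance(transcript: str) -> str:
    """Pick the expected phrase the transcript most resembles.

    Pure surface similarity: no parsing, no semantics. What matters
    about "pay" is not that it means "pay" but that it doesn't
    resemble "tran-".
    """
    def similarity(phrase: str) -> float:
        return SequenceMatcher(None, transcript.lower(), phrase).ratio()
    return max(EXPECTED_PHRASES, key=similarity)

if __name__ == "__main__":
    # Noise-burst classification, not understanding: the system only
    # asks which canned phrase the utterance is closest to.
    print(classify_utterance("I'd like to pay a bill please"))  # pay a bill
    print(classify_utterance("um, transfer funds I guess"))     # transfer funds
```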
I can see why the NLP engineers dislike Pullum so intensely.
Characterizing “speech recognition” as “noise-burst classification,” while entirely accurate, is also offensive.
😉
The term “speech recognition” fools a layperson into thinking NLP is more sophisticated than it actually is.
The question for NLP engineers is: Why the pretense at sophistication?