Audio Adversarial Examples: Targeted Attacks on Speech-to-Text by Nicholas Carlini and David Wagner.
Abstract:
We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (at a rate of up to 50 characters per second). We apply our iterative optimization-based attack to Mozilla's implementation of DeepSpeech end-to-end, and show it has a 100% success rate. The feasibility of this attack introduces a new domain in which to study adversarial examples.
The code and audio data used are available at: http://nicholas.carlini.com/code/audio_adversarial_examples.
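To make the "iterative optimization-based attack" concrete, here is a minimal sketch of how such a targeted attack on a speech-to-text model can look. It assumes PyTorch and a differentiable model trained with CTC loss; the ToyASR class, the targeted_attack function, and the steps/lr/eps parameters are illustrative stand-ins, not the authors' implementation (their DeepSpeech/TensorFlow code is at the link above).

    # Sketch of an iterative optimization-based targeted attack on speech-to-text.
    # NOT the authors' code: a tiny stand-in model is used so the script runs on its own.
    import torch
    import torch.nn as nn

    ALPHABET = " abcdefghijklmnopqrstuvwxyz'"   # character labels; CTC blank is index 0, so labels are offset by 1

    class ToyASR(nn.Module):
        """Stand-in for a real end-to-end model such as DeepSpeech."""
        def __init__(self, n_chars=len(ALPHABET) + 1):   # +1 for the CTC blank symbol
            super().__init__()
            self.conv = nn.Conv1d(1, 64, kernel_size=320, stride=160)   # crude audio framing
            self.head = nn.Linear(64, n_chars)

        def forward(self, wav):                        # wav: (batch, samples)
            frames = self.conv(wav.unsqueeze(1))       # (batch, 64, frames)
            return self.head(frames.transpose(1, 2))   # (batch, frames, n_chars) per-frame logits

    def encode(phrase):
        # Map the target phrase to label indices (blank = 0, so characters start at 1).
        return torch.tensor([[ALPHABET.index(c) + 1 for c in phrase]])

    def targeted_attack(model, wav, target, steps=1000, lr=10.0, eps=200.0):
        """Find a small perturbation delta so that model(wav + delta) transcribes `target`."""
        delta = torch.zeros_like(wav, requires_grad=True)
        opt = torch.optim.Adam([delta], lr=lr)
        ctc = nn.CTCLoss(blank=0, zero_infinity=True)
        labels = encode(target)
        for _ in range(steps):
            logits = model(wav + delta)                          # (batch, frames, chars)
            log_probs = logits.log_softmax(-1).transpose(0, 1)   # (frames, batch, chars) as CTCLoss expects
            loss = ctc(log_probs, labels,
                       torch.tensor([log_probs.size(0)]),        # input length (frames)
                       torch.tensor([labels.size(1)]))           # target length (characters)
            opt.zero_grad()
            loss.backward()
            opt.step()
            with torch.no_grad():                                # keep the perturbation small (L-infinity bound)
                delta.clamp_(-eps, eps)
        return (wav + delta).detach()

    if __name__ == "__main__":
        model = ToyASR()
        wav = torch.randn(1, 16000) * 1000        # one second of 16 kHz audio on a 16-bit-style scale
        adv = targeted_attack(model, wav, "open the door")
        print("max perturbation:", (adv - wav).abs().max().item())

Note that this sketch only clamps the perturbation to a fixed L-infinity bound; the paper's attack additionally works to keep the distortion (measured in dB relative to the original signal) as small as possible while still forcing the chosen target transcription.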
This work is important not only for defeating automatic speech recognition but also for establishing that the properties of adversarial examples in audio recognition differ from those in visual recognition.
It hints that properties of adversarial examples established in one recognition domain cannot be assumed to carry over to unexplored domains.