Error Correction of Speech Recognition by Custom Phonetic Alphabet Input for Ultra-Small Devices

Abstract

Automatic speech recognition (ASR) is one of the most effective ways to input text, particularly on ultra-small devices such as smartwatches. Although ASR accuracy has been improving, recognition errors remain common. To correct a misrecognized word, the user must either use a software keyboard or speak the word again. However, entering text accurately with a software keyboard on a small display is difficult and tedious, and repeating the same phrase offers no guarantee that it will be recognized correctly the second time. To address this problem, we designed a custom phonetic alphabet optimized for ASR. It lets the user input words more accurately than spelling them out letter by letter or using the NATO phonetic alphabet, the standardized phonetic alphabet used for human-to-human speech communication in noisy conditions. Furthermore, we conducted user studies to verify our method's efficiency in correcting speech recognition errors on a small display.

Our custom phonetic alphabet consists of 27 words: one for each of the 26 letters of the alphabet, plus “space.” Because some words in the NATO phonetic alphabet are unfamiliar to non-native English speakers, we chose everyday words that are familiar to them. We also kept some words from the NATO phonetic alphabet, such as “X-ray.” The words are shorter on average than those in the NATO phonetic alphabet, and we chose words that ASR distinguishes easily. At present, the words are selected subjectively by the authors, based on recognition performance on our own speech.
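
To illustrate how such an alphabet could be turned back into text, the following minimal Python sketch decodes an ASR transcript of code words into letters. It is an illustration under stated assumptions, not our implementation: the function name `decode` and the table `PLACEHOLDER_ALPHABET` are hypothetical, and apart from “X-ray” and “space,” the code words shown are placeholders rather than the words in our alphabet.

```python
# Hypothetical, partial code-word table for illustration only.
# "x-ray" and "space" are named in the text; every other entry is a
# placeholder and NOT one of the words actually chosen by the authors.
PLACEHOLDER_ALPHABET = {
    "apple": "a",
    "bus": "b",
    "cat": "c",
    "x-ray": "x",
    "space": " ",
}


def decode(transcript, alphabet=PLACEHOLDER_ALPHABET):
    """Convert a whitespace-separated transcript of code words into text.

    Matching is case-insensitive. An unknown word raises KeyError so the
    caller can ask the user to repeat it instead of guessing a letter.
    """
    letters = []
    for token in transcript.lower().split():
        if token not in alphabet:
            raise KeyError(f"unrecognized code word: {token!r}")
        letters.append(alphabet[token])
    return "".join(letters)


if __name__ == "__main__":
    print(decode("cat apple bus space X-ray"))
```

Rejecting unknown words outright is one possible design choice; a real system might instead fall back to the closest-sounding code word.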