Tuesday, June 15, 2010

Speech Dasher: Fast Writing using Speech and Gaze (K. Vertanen & D. MacKay, 2010)

A new version of the Dasher typing interface uses speech recognition from the CMU PocketSphinx software to double typing performance, from a previous 20 words per minute (WPM) to 40 WPM, close to what a professional keyboard jockey may produce.

Speech Dasher allows writing using a combination of speech and a zooming interface. Users first speak what they want to write and then they navigate through the space of recognition hypotheses to correct any errors. Speech Dasher’s model combines information from a speech recognizer, from the user, and from a letter-based language model. This allows fast writing of anything predicted by the recognizer while also providing seamless fallback to letter-by-letter spelling for words not in the recognizer’s predictions. In a formative user study, expert users wrote at 40 (corrected) words per minute. They did this despite a recognition word error rate of 22%. Furthermore, they did this using only speech and the direction of their gaze (obtained via an eye tracker).
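To make the "seamless fallback" idea concrete, here is a minimal Python sketch of one way next-letter probabilities from a recognizer's hypotheses could be blended with a letter-based model. This is not the authors' model: the function names, the `fallback_weight` parameter, the uniform letter model, and the example hypotheses are all assumptions made for illustration only.

```python
import string
from collections import defaultdict

# Hypothetical alphabet for the illustration: lowercase letters plus space.
ALPHABET = string.ascii_lowercase + " "


def uniform_letter_lm(prefix):
    """Stand-in letter model (assumption): every letter is equally likely."""
    return {ch: 1.0 / len(ALPHABET) for ch in ALPHABET}


def next_letter_distribution(prefix, hypotheses, letter_lm, fallback_weight=0.1):
    """Blend recognizer-derived next-letter probabilities with a letter model.

    hypotheses: list of (text, probability) pairs from a recognizer.
    letter_lm: function mapping a prefix to a {letter: probability} dict.
    """
    # Collect probability mass for each next letter among the hypotheses
    # that agree with what has been written so far.
    from_speech = defaultdict(float)
    for text, prob in hypotheses:
        if text.startswith(prefix) and len(text) > len(prefix):
            from_speech[text[len(prefix)]] += prob

    fallback = letter_lm(prefix)
    total = sum(from_speech.values())
    if total == 0:
        # No hypothesis matches the prefix: fall back to letter-by-letter spelling.
        return dict(fallback)

    # Linear interpolation, so letters outside the hypotheses stay reachable.
    mixed = {}
    for letter in set(from_speech) | set(fallback):
        speech_p = from_speech.get(letter, 0.0) / total
        mixed[letter] = ((1.0 - fallback_weight) * speech_p
                         + fallback_weight * fallback.get(letter, 0.0))
    return mixed


# Made-up recognition hypotheses for a single utterance.
HYPOTHESES = [("the cat sat", 0.6), ("the hat sat", 0.3), ("a cat sat", 0.1)]

dist = next_letter_distribution("the ", HYPOTHESES, uniform_letter_lm)
for letter, p in sorted(dist.items(), key=lambda item: -item[1])[:3]:
    print(repr(letter), round(p, 3))
```

In a zooming interface like Dasher, a distribution of this kind would determine how much screen space each next letter receives, so letters backed by the recognizer are large and easy to select, while every other letter remains available in a smaller region.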

  • Speech Dasher: Fast Writing using Speech and Gaze
    Keith Vertanen and David J.C. MacKay. CHI '10: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, To appear.

