Jahanzeb Sherwani

Speech Interfaces for Information Access by Low Literate Users

Degree Type: Ph.D. in Computer Science
Advisor(s): Roni Rosenfeld, Alexander Rudnicky
Graduated: August 2009

Abstract:

In the developing world, access to critical information, such as healthcare information, can often mean the difference between life and death. While information and communications technologies enable multiple mechanisms for information access by literate users, there are limited options for information access by low literate users.

In this thesis, I investigate the use of spoken language interfaces by low literate users in the developing world, specifically health information access by community health workers in Pakistan. I present results from five user studies comparing a variety of information access interfaces for these users. I first present a comparison of audio and text comprehension by users of varying literacy levels and with diverse linguistic backgrounds. I also present a comparison of two telephony-based interfaces with different input modalities: touch-tone and speech. Based on these studies, I show that speech interfaces outperform equivalent touch-tone interfaces for both low literate and literate users, and that speech interfaces outperform text interfaces for low literate users.

A further contribution of the thesis is a novel approach for the rapid generation of speech recognition capability in resource-poor languages. Since most languages spoken in the developing world have limited speech resources, it is difficult to create speech recognizers for such languages. My approach leverages existing off-the-shelf technology to create robust, speaker-independent, small-vocabulary speech recognition capability with minimal training data requirements. I empirically show that this method reaches recognition accuracies above 90% with very little effort and, even more importantly, little speech technology skill.

The thesis concludes with an exploration of orality as a lens through which to analyze and understand low literate users, and with recommendations on the design and testing of user interfaces for such users, including an appreciation for the role of dramatic narrative in content creation for information access systems.

Keywords:
Speech recognition, speech interfaces, developing countries, emerging markets, ICTD, ICT4D