Easy access to information is one of the key factors contributing to the
economic and social growth of modern societies in the fields of
agriculture, health, nutrition, education, commerce and governance.
How to deal with a specific crop disease, where to find the closest
market with the best current price for wheat, how to make
home-made oral rehydration solution: all are questions where a
timely and correct answer can be immensely beneficial, and in some
cases may mean the difference between life and death. In the field of
community health, international public health experts have stated
that "providing access to reliable health information for health
workers in developing countries is potentially the single most cost
effective and achievable strategy for sustainable improvement in
health care".
However, most of the available means of information access, such as print media,
are useful only to the literate members of society. Other modes
like television and radio are non-interactive, and computers, although
interactive, are not suitable for the large portion of
society that is either unfamiliar with their interface and usage or
has no access to them at all. Hence information access is
restricted to literate and affluent circles. One solution to this
problem is a telephone-based speech interface for human-computer
interaction. Such systems can provide cost-effective
and natural forms of information access for large populations in the
developing world – both those who are completely non-literate and
the semi-literate, who have difficulty reading
fluently.
"Telephone-based Speech Interfaces for Access to
Information by Non-literate Users" is a joint effort of
CRULP, the LTI department of
Carnegie Mellon University, and the Aga Khan University.
The goal of this project is to investigate the use of
speech interfaces in a field-deployed system by
providing easy access to medical information to lady
health workers in Pakistan. This will be achieved by
developing a telephone-based dialog system consisting
of an Urdu Automatic Speech Recognition (ASR) system and a
Text-to-Speech (TTS) system that can interact with the health
workers to answer their queries. However, a dialog system is
much more than an ASR engine coupled with a TTS engine:
a dialog system needs to be able to mimic human
conversational abilities by providing an intuitive
conversation flow, detecting and correcting recognition
errors, and giving feedback to the caller throughout the
call.
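
As a rough illustration of the error detection, correction and
feedback described above, the following Python sketch shows one
common approach, a confidence-based confirmation strategy. It is not
the project's implementation: the Recognition type, the listen and
speak callbacks, the two thresholds and the prompt wording are all
hypothetical, and the prompts are in English only for readability.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Recognition:
    """One recognizer result (hypothetical shape, for illustration only)."""
    text: str          # best hypothesis from the recognizer
    confidence: float  # assumed score in [0, 1]


# Assumed thresholds; a real system would tune these on field data.
ACCEPT = 0.80   # at or above: accept the hypothesis without asking
CONFIRM = 0.50  # at or above (but below ACCEPT): ask the caller to confirm


def run_turn(listen: Callable[[], Recognition],
             speak: Callable[[str], None],
             max_attempts: int = 3) -> Optional[str]:
    """Collect one query, confirming or re-prompting when unsure.

    Returns the accepted query text, or None if every attempt failed.
    """
    speak("What would you like to know?")
    for attempt in range(max_attempts):
        heard = listen()
        if heard.confidence >= ACCEPT:
            # Confident enough: echo the query back as implicit feedback.
            speak(f"Looking up {heard.text}.")
            return heard.text
        if heard.confidence >= CONFIRM:
            # Uncertain: detect a possible error with an explicit question.
            speak(f"Did you say {heard.text}? Please say yes or no.")
            reply = listen()
            if reply.text == "yes" and reply.confidence >= CONFIRM:
                speak(f"Looking up {heard.text}.")
                return heard.text
        # Rejected or unconfirmed: re-prompt unless this was the last try.
        if attempt < max_attempts - 1:
            speak("Sorry, I did not understand. Please say it again.")
    speak("Sorry, I could not understand. Please try again later.")
    return None
```

Passing listen and speak in as plain callables keeps the dialog logic
separate from the recognition and synthesis engines, so the turn can
be exercised with scripted results before either engine is attached.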
The Text-to-Speech system required for this project has
already been developed by CRULP. In addition to the
problems of automatic speech recognition that are
still present even for English, the prime impediment to
the completion of this project is the lack of research
and local language resources for Urdu. At an abstract
level, the first step towards this goal is a
speaker-independent, continuous (and spontaneous)
automatic speech recognition system for local
languages, together with its adaptation to a
telephone-based interface.