Where The Wireless Things Are:
How Can Speech Technology Solve The WAP-Enabled Phone's Toughest
BY PAM RAVESI
Wireless Application Protocol (WAP) -- the industry-accepted bridge
between the Internet and wireless devices -- may have flung wide open the door
to Web-enabled wireless phones, but many potential users remain ambivalent.
Several obvious problems have not been well addressed. Chief among them, the
cumbersome keypad and restricted LCD display limit the usefulness and appeal
of the handheld Web. Speech technology addresses these limitations and promises to
make wireless Web-enabled mobile phones and small handheld computers much
easier to operate. Rather than stopping to use a stylus or hunt through the
letters and numbers on a mobile phone's keypad, users with a speech-equipped
wireless Web device may gain instant access to Web content by simply
speaking into the phone and asking for it. Mobile phone users don't need to
surf the Web -- they need fast answers to specific queries. Speech
technology lets them ask specific questions and hear instant answers.
The question remains: Is speech technology up to the challenge? Many
design project managers, developers, product managers, and engineers think
it may be. The question matters because, by designing the right
interface, a company building wireless Web devices and mobile
connectivity can deliver what customers want most: sleek, portable devices
that offer complete connectivity without the inherent limitations of a
cramped and often awkward interface.
RAPPING OVER WAP
Wireless Internet access is based on the convergence of mobile devices,
telecommunications, and Internet services. A typical wireless Internet
access setup might combine a mobile phone, a wireless data connection, and a
technology (or several) for input and output. One such input/output
methodology is a browser interface employing WAP. WAP is an open, global
standard that bridges telephony services and Internet browsing, and is
currently the main standard for delivering text and graphics to a browser on
a wireless device (much like a desktop Web browser). Speech technologies can
provide WAP-enabled devices with a voice interface for input and
text-to-speech technology for output.
Pitfalls In The Wireless Internet
The challenge: How can a pocket-sized device with a two-inch by
two-inch display process and deliver the same information provided by a
full-sized desktop computer with a monitor?
The small size of most wireless Internet access devices limits their
processing power and bandwidth. Often these devices employ a stylus and keys
the size of Chiclets. These devices also have a very small display screen,
making it difficult to view more than a line or two of data at a time.
Simply put, not all Internet content is suitable for the minimal textual and
graphical displays that WAP offers. How can these small devices best
provide Internet interaction for their users?
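To make the display constraint concrete, here is a minimal sketch of how a block of text might be cut into screen-sized pages for a handset that shows only a line or two at a time. The function name `paginate`, the 16-character width, and the two-line page size are assumptions chosen for illustration; they are not figures from any WAP specification or handset.

```python
import textwrap

def paginate(text, width=16, lines_per_page=2):
    """Split text into pages of at most `lines_per_page` lines,
    each line at most `width` characters wide (assumed screen size)."""
    lines = textwrap.wrap(text, width=width)
    return [lines[i:i + lines_per_page]
            for i in range(0, len(lines), lines_per_page)]

# Illustrative message only; not real flight data.
message = "Flight 212 to Boston departs gate B7 at 4:15 PM and is delayed 20 minutes."
pages = paginate(message)
for number, page in enumerate(pages, start=1):
    print(f"--- page {number} of {len(pages)} ---")
    print("\n".join(page))
```

Even a single sentence spreads across several pages at this size, which is the scrolling burden a spoken answer would avoid.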
For both wireline and wireless Internet access to be truly useful, they
should be available to the user in a simple-to-use form that takes into account
location, format, and language. Most would agree that the primary
purpose of wireless Internet access is to provide quick and easily read
information, and the size and portability of most wireless access devices
dictate that information searches be fast, easy, and accurate. That requires a
sophisticated set of tools that can eliminate the prolonged surfing,
linking, and browsing that traditional Web searches require. Supporting the
user's native language may no longer be optional in the coming years. Therefore,
wireless (and wireline) Internet access should also support translation
SPEECH AND LANGUAGE TECHNOLOGIES: REDEFINING USABILITY
Automatic speech recognition (ASR) technology can create a speech user
interface (SUI) that allows users to issue commands and request information
using voice only. The SUI eliminates the need for "Chiclet keys,"
the stylus, and other awkward input mechanisms. Unlike a keyboard, a SUI is
not restricted by the size of the device. It is an intuitive interface
requiring little or no training. Text-to-speech (TTS) technology can also
help eliminate the difficulties involved in trying to present reams of
information on a small-sized display. TTS technology "translates"
written text into synthetic speech, "reading" Web content aloud.
Like ASR, TTS can be employed irrespective of the size of the access
device. Employed together, ASR and TTS technologies give users a fast and
very intuitive manner of transferring information, even with the limited
size of wireless Internet devices. Users can achieve even greater
flexibility and usability when ASR and TTS are combined with natural
language processing (NLP). NLP is a form of artificial intelligence that
recognizes and processes conversational speech, eliminating the need for
specific phrases or words for commands, searches, or inquiries.
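The flow described above, speech in, interpretation, content lookup, speech out, can be sketched as a chain of four stages. This is a toy illustration under stated assumptions, not how any commercial engine works: `recognize` and `synthesize` are stand-ins for real ASR and TTS engines, `interpret` replaces true NLP with simple keyword overlap, and the intents, keywords, and canned replies are all hypothetical.

```python
import re

# Hypothetical intents and the keywords that hint at them (illustration only).
INTENT_KEYWORDS = {
    "weather": {"weather", "forecast", "rain", "temperature"},
    "stocks": {"stock", "stocks", "share", "shares", "quote"},
    "flights": {"flight", "flights", "departure", "gate"},
}

def recognize(audio):
    """Stand-in for an ASR engine: the 'audio' here is already a transcript."""
    return audio

def interpret(transcript):
    """Crude stand-in for NLP: choose the intent whose keywords overlap most,
    so the user is not tied to one fixed command phrase."""
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    best = max(INTENT_KEYWORDS, key=lambda name: len(words & INTENT_KEYWORDS[name]))
    return best if words & INTENT_KEYWORDS[best] else None

def fetch(intent):
    """Stand-in for a Web content lookup; returns placeholder text."""
    canned = {
        "weather": "Placeholder forecast text.",
        "stocks": "Placeholder stock quote text.",
        "flights": "Placeholder flight status text.",
    }
    return canned.get(intent, "Sorry, I did not understand the request.")

def synthesize(text):
    """Stand-in for a TTS engine: returns the text that would be spoken."""
    return f"[spoken] {text}"

def answer(audio):
    """Run one request through all four stages."""
    return synthesize(fetch(interpret(recognize(audio))))

print(answer("Will it rain in Boston tomorrow?"))
print(answer("What's the forecast for this weekend?"))
```

Both requests reach the same weather intent although they share no command phrase, which is the kind of flexibility NLP is meant to provide; a production system would need far more than keyword matching to achieve it.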
MANAGING CONTENT OVERLOAD FROM ANYWHERE
The job of wireless Internet devices is to provide instant access to desired
information from anywhere. The real question is whether or not speech
technology can adequately replace keyed-in commands. To do so, it must be
very good at accurately recognizing users' spoken commands, while still
maintaining the flexibility to free users from memorizing specific words and
phrases. The software must have significant memory and processing power at
its disposal for this application. The text-to-speech (TTS) read-back voice is
also a crucial element, especially if paired with multilingual voice recognition
technology. It is unlikely that a company would get very far in the
marketplace offering only a clipped, robotic TTS engine.
Global Internet access supports business and personal uses ranging from
information requests by tourists traveling in foreign countries to
academic research on international Web sites; the more targeted and precise
the results of a query, the better. One solution is to deliver search
results as summaries to handheld wireless devices, with the option to
receive full documents. Fully supported by multilingual technology and
speech-enabled protocols, such a sophisticated application of WAP is all
but the presupposed destiny of the wireless world.
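The summary-first idea can be sketched in a few lines. The version below assumes the simplest possible summarizer, one that keeps only the leading sentences of each document; real content management tools would use something more capable. The names `summarize`, `summaries`, and `full_document`, and the sample results, are hypothetical.

```python
import re

def summarize(document, max_sentences=2):
    """Naive summary: keep only the first `max_sentences` sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    return " ".join(sentences[:max_sentences])

def summaries(results):
    """What the handheld receives first: one short summary per result."""
    return {title: summarize(body) for title, body in results.items()}

def full_document(results, title):
    """Sent only when the user explicitly asks for the whole document."""
    return results[title]

# Placeholder search results for illustration.
results = {
    "Museum hours": "The museum opens at nine. It closes at five. "
                    "Guided tours run hourly. Tickets are sold at the door.",
    "Train schedule": "Trains leave every half hour. The last train is at midnight.",
}

for title, summary in summaries(results).items():
    print(f"{title}: {summary}")
```

Only the short summaries cross the wireless link by default, and the longer text is transferred when the user calls `full_document` for a specific title.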
ANYBODY KNOW WHERE WE ARE GOING?
As WAP becomes more pervasive, more information is likely to be
processed on servers and sent in a standard format to different types of
browsers. This will decrease the need for processing on the client side.
Concurrently, though, overall technology will continue to improve, providing
better client/terminal performance. The question is whether future users
will settle for the current performance level in a smaller-sized device,
push for even smaller devices and settle for less performance, or yearn
simply for a more usable format. One of the options open to developers eager
to make speech work is distributed computing. Speech and language
technologies supporting wireless Internet access may reside on the client
device, on the server, or between client and server. Vendors are currently
employing all three approaches. As with other emerging technologies, it
isn't clear which, if any, approach will become the standard.
DIALING INTO THE FUTURE
In the foreseeable future, technological enhancements such as increased
bandwidth will improve the usability of wireless Internet access. These
technology changes will also increase the chances of speech and language
technologies becoming pervasive. The increased bandwidth offered by
broadband networks -- high-speed circuit-switched data (HSCSD), general
packet radio service (GPRS), and universal mobile telecommunications system
(UMTS) -- promises the delivery of more and more information to mobile
users. The processing power of mobile terminals will increase as the
technology advances. Both of these developments will require increased use
of content management technologies to find, categorize, and summarize the
additional content. As worldwide use of the Internet continues to increase,
access services may begin to provide advanced applications of speech
technology that will allow machine translation to support multilingual use.
The promise of speech to make technology as easy to use as asking a
question or issuing a command appears more tangible with every technological
advance. Wireless Internet access is revolutionizing the manner in which
Internet information is requested, delivered, and accessed. While the
industry and its business models are still evolving, it is clear that the
number of wireless data users worldwide is growing exponentially. No one can
predict all the applications that the new technology will make possible, but
one thing is definite: The marriage of speech technologies with WAP-enabled
handhelds will be at the forefront of wireless Internet applications.
Pamela Ravesi is senior director of product management for Lernout
& Hauspie's telephony group. She is responsible for defining
L&H's telephony strategy, products, business partners and business
plans. She has successfully launched over 40 different products in nine
different languages worldwide.
[ To The November 2000 Table Of Contents ]