
June 1999
SPEECH RECOGNITION:
There's No Need To Be Apprehensive
BY MARK HOLTHOUSE
Designing and implementing a speech recognition solution may seem like a daunting task.
While detecting and interpreting DTMF tones is a complicated process, at least the
developer can be certain regarding the nature of the tones themselves. Not so with speech
recognition. Differing voices and natural language make designing user-independent speech
recognition difficult. However, the design shouldn't be unnecessarily difficult. Clear,
goal-based development procedures and ongoing user testing can help ensure that your
speech recognition solution turns out to be all you hoped it would be.
Building a telephone-based speech recognition application requires the coordination of
multiple elements throughout the system. Many IVR developers already have that experience.
However, unlike other IVR systems, over-the-telephone speech applications involve a
completely new approach to the user interface. For this reason, they require a different
application development process. If call center managers follow their old IVR models, they
are sure to be disappointed with the results. The following list outlines eight key steps
to help you in developing and deploying a successful, cost-effective speech application:
1. Understand Your Goals And Those Of Your Callers
An effective speech deployment begins with setting objectives. What do you want speech
recognition to do for your call center or business? Setting clear expectations will help
you measure your forward progress.
Next, before you simply swap out touchtone for speech, think about your callers.
Touchtone IVR applications help reduce call center costs, but they often frustrate users,
resulting in bail-outs to live agents. Why did the customer pick up the phone in the first
place? What would they like to accomplish? The answers aren't always obvious. Travelers
want to purchase tickets, for example, but they also may want to obtain the lowest fares
or the most direct routes.
You'll also need to consider different user groups. How will you distinguish between
novice and experienced users? Will business customers use the system differently than
consumers?
2. Plan For A Seamless Customer Experience
Speech can augment the systems you already have in place, or it can provide entirely new
channels for information and e-commerce. It's important to understand the full technical
capabilities of telephone speech recognition and how it works with your other systems.
You also need to evaluate your current system components and decide how speech can be
integrated with them. Should you discontinue touchtone entirely, or maintain it for
certain actions, such as entering a personal password? Most importantly, how can speech
provide "the second leg of e-commerce," complementing current and future
Internet-based services and delivering the same types of "anytime, anywhere"
applications without requiring a computer and browser?
3. Let Your Personality Shine Through
Now it's time to have some fun and think about how callers can interact with your new
speech applications. Consider some basic design principles:
- Establish your application's "personality." This appears both in the voice
quality of your recordings and in the formality of your prompts. For example, a brokerage
firm may want to project an image of security, strength, and know-how, whereas a cruise
line might stress convenience, fun, or value.
- Accommodate both experienced and novice users. Here, natural language processing (the
ability to speak in complete phrases and sentences) and "barge-in" capability
(the ability to interrupt the system) come into play. These capabilities give seasoned
callers conversational flexibility in conducting transactions. Seasoned callers might
simply say, for example, "I want to fly to Washington, D.C., next Thursday in the
morning." First-time users should still be able to follow step-by-step directions and
prompts.
- Be friendly. Use polite, conversational phrases - "I'm sorry, I didn't understand
that," as compared to the more technical, "That was an invalid entry."
4. Create Your Call Flow
Call flow is the map or model of how users will navigate through an automated system.
Start by establishing common questions and possible answers. Once you understand the call
flow, you can draft your caller prompts. These should include tips for first-time callers
and hints for using natural language in the future: "Next time, you can just say,
'transfer 200 dollars from savings to checking.'"
With first drafts in hand, use role-playing to make sure that
the prompts are clearly understood and that they can be answered unambiguously.
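As a rough illustration of the call flow described above, the map can be written down as a small table of states, prompts, and expected answers before any recognizer is involved. The sketch below is only one possible layout: the state names, prompts, and the helpers next_state and dangling_targets are hypothetical, and a real application would rely on whatever the chosen speech toolkit provides.

```python
# Hypothetical call flow for a banking line. State names, prompts, and
# answers are illustrative assumptions, not taken from any real product.
CALL_FLOW = {
    "main_menu": {
        "prompt": "Would you like balances or a transfer?",
        "answers": {"balances": "balances", "transfer": "transfer_amount"},
    },
    "transfer_amount": {
        "prompt": "How much would you like to transfer?",
        "answers": {},  # open-ended answer, left to the recognizer
        "hint": "Next time, you can just say, "
                "'transfer 200 dollars from savings to checking.'",
    },
    "balances": {
        "prompt": "Which account would you like to hear?",
        "answers": {},
    },
}


def next_state(state, answer):
    """Return the state reached by `answer`; stay put if it is not understood."""
    return CALL_FLOW[state]["answers"].get(answer.strip().lower(), state)


def dangling_targets(flow):
    """List (state, answer) pairs that point at a state missing from the map."""
    return [
        (state, answer)
        for state, node in flow.items()
        for answer, target in node["answers"].items()
        if target not in flow
    ]
```

Writing the flow as data makes it easy to walk through during role-playing and to check that every expected answer leads somewhere.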
5. Get To The Data
You've thought long and hard about your callers and how to direct them toward information
or transactional services. But the user interface will only succeed if your back-end,
enterprise database can support the transactions. Can you retrieve a summary of a caller's
accounts with a single transaction? While your application is still early in the
development process, you should perform a thorough analysis of your database. What types
of transactions is it capable of handling? What interfaces are required? Answering these
questions up front helps avoid taxing your database later on.
6. Develop The Application
In years past, the application development phase required the expertise of speech
scientists and veteran programmers, plus months of tedious work. Today, many advanced
tools are available to accelerate the process. During this phase you will:
- Build the call flow. Bring your model call flow map to life. Use prepackaged speech
modules with configurable parameters to simplify the process of creating self-service
applications. Reduce troubleshooting later on by testing component parts of your
application as you build them.
- Record prompts. Your prompts are critical, as they reflect your company's personality.
Selecting and directing your voice talent well creates a system that callers will enjoy
and use repeatedly.
- Conduct usability tests. During a first round of usability tests, you'll gather input
from real callers. One-on-one observed sessions allow you to identify any confusion,
interface glitches, or recognition issues. As you review your results, consider your
original goals. Evaluate transaction completion (the statistical rate of callers
completing the tasks they set out to do) and work toward a rate of 95 percent or better.
- Iterate rapidly. Unfortunately, even with the most careful development process, it's
unlikely that the user interface will work as you expected the first time.
Over-the-telephone speech recognition is uncharted territory for many of us - developers
and users alike. For best results, plan on loops of testing and refinements. Over time,
add new functions, make your data more realistic, and correct and improve the user
interface based on user feedback.
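The transaction completion measure mentioned above is simple arithmetic: completed tasks divided by attempted tasks, compared against the 95 percent target. A minimal sketch follows, assuming each test call has already been marked as completed or not. The function names and the list-of-booleans input are assumptions made for illustration.

```python
TARGET_RATE = 0.95  # the 95 percent goal stated in the text


def completion_rate(outcomes):
    """Fraction of calls in which the caller finished the intended task.

    `outcomes` is a list of booleans, one per call (True = task completed).
    """
    if not outcomes:
        raise ValueError("no calls to evaluate")
    return sum(outcomes) / len(outcomes)


def meets_target(outcomes, target=TARGET_RATE):
    """True when the completion rate is at or above the target."""
    return completion_rate(outcomes) >= target
```

For example, 19 completed calls out of 20 gives a rate of 0.95, which just meets the target; 18 out of 20 does not.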
7. Conduct A Pilot Test
Make your application available to a limited group of callers who can use it in realistic,
unobserved settings. The closer these people match your target users, the more accurate
the results. For a system where repeat calls are expected, try running the pilot long
enough for users to dial in multiple times. The best pilot test will include several
hundred different callers and several thousand calls. Analyze the interactions using a
variety of tools and approaches, and be sure to listen to selected calls offline for signs
of confusion or "out-of-vocabulary" utterances.
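One way to look for "out-of-vocabulary" utterances, assuming the selected calls have been transcribed by hand, is to count the words callers used that the application's vocabulary does not cover. The sketch below is an illustrative assumption rather than a feature of any particular recognizer. The function name and the plain word-list vocabulary are hypothetical.

```python
import re
from collections import Counter


def out_of_vocabulary(utterances, vocabulary):
    """Count words in transcribed utterances that are not in the vocabulary.

    Returns (word, count) pairs, most frequent first.
    """
    known = {word.lower() for word in vocabulary}
    counts = Counter()
    for text in utterances:
        # Split into lowercase words, keeping apostrophes (e.g. "i'd").
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in known:
                counts[word] += 1
    return counts.most_common()
```

Words that show up often in this list are candidates for new vocabulary entries or for a reworded prompt.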
8. Deploy The System And Check It Regularly
Bring your application to market in limited launch or beta test, while monitoring and
analyzing calls on a regular basis. This is the time to re-visit your initial objectives.
Are callers using the system as you expected? Are they conducting other types of
transactions? Most importantly, are they successfully completing the tasks they set out to
do?
Ramp up your caller volume either by expanding geographically or introducing a service
to a select group of users (e.g., frequent travelers). Then gradually make it available to
a wider market. Ongoing transaction completion rates - your goal should be at least 95
percent - can tell you a great deal about how callers are reacting to the system.
Evaluate your system every few months. Continue to look at transaction completion rates
and analyze calls. Survey your customers. The conclusions you draw from a careful periodic
assessment of the application will help you optimize the system, to the benefit of both
your callers and your company.
Mark Holthouse is senior vice president of operations for SpeechWorks
International, Inc. He can be reached for comment at mark.holthouse@speechworks.com.
SpeechWorks is a leading provider of automated speech recognition (ASR) software for
large-scale customer service solutions and speech-enabled e-commerce. For more
information, please visit their Web site at www.speechworks.com.