Service With A Smile: The Art Of Speech Recognition
Over The Phone
BY STEVE CHAMBERS AND CHRISTOPHER "BLADE" KOTELLY, SPEECHWORKS INTERNATIONAL
We've all encountered products we don't like: hard-to-hold toothbrushes,
impossible-to-program VCRs, confusing touch-tone phone systems. These
products, and thousands like them, become even more frustrating when the
technology is new and there is little or no precedent set for the best
way to interact with them. It's far better for new technology to be designed
appropriately, so that it delights the user from the first encounter
with the product.
Speech recognition technology is a good example. While speech recognition
systems over the telephone are beginning to go mainstream in companies
such as United Airlines, E*Trade and HP, many callers are still using
them for the first time, and that first user experience often sets the
tone for how callers will receive the system in the long term.
Since you, as a stakeholder in your company's customer-facing applications,
never get a second chance to make a first impression, the user interface
design must be effective right from the first call.
While much has been written about the science and technology of automated
speech recognition systems (the recognition engines, accuracy algorithms,
databases and phone networks), little has been written about the art of
the systems, or the way callers think and feel while using them. Ultimately,
the art focuses on creating an engaging and satisfying caller experience.
Speech systems should be designed to help callers with what is important
to them. Good systems excite callers and make them want to call back because
they sense that the system works quickly and intuitively to provide the
desired answers. A really good system makes callers feel as if they are
speaking with a human and not a computer.
In contrast, poorly designed systems are frustrating and induce callers
to switch to an operator immediately, resulting in increased hold times,
over-worked agents and a lower ROI.
The Three Steps Of Speech-Interface Design
There is an art to designing systems that cause callers to hang up with
a smile and choose to use the system again and again. Like many artistic
endeavors, the "art" involves a disciplined process to get the right input
and create the right interactions to which your audience will favorably
respond. With over-the-telephone speech recognition, the process falls
into three phases: researching the way callers think and speak about the
application, designing the application so that callers understand and
enjoy using it, and testing and refining the application with a focus
on caller satisfaction.
Phase 1: Research. Ocean explorer Jacques Cousteau once
said, "The best way to observe a fish is to become one." This holds true
for the speech system design team. To understand what motivates callers,
the team must know how callers are feeling when they call a company, their
motivation for using the system, how experienced the callers might be
with speech systems, how frequently they call, demographic information
(will the callers have thick accents or want to speak in a different language?)
and other aspects of the entire caller experience that are particular
to each system. This may include knowing the culture of the industry.
The team must determine what callers really need to know and when they
need to know it. For example, a caller may want to find out fare information
for inexpensive flights, but she probably won't want to know about the
kind of plane, availability of special meals and seat assignment until
she has found the best airfare.
The designer also needs to know what's motivating the company providing
the speech application. What services does the company want to provide
through the speech system? What is the task at hand: booking flights,
trading stocks, making a dinner reservation? On what kinds of callers
should the company focus? This might be frequent fliers for an airline
or high-volume stock traders for a brokerage. Why should the company focus
on a specific customer segment? How can the brand of the company be communicated
over the phone? Ultimately, the designer needs to be both the fish (callers)
and the ocean (the company).
It is vital for the design team to understand the mental state of the
caller. If designers look only for the obvious motivations of the caller
(for example, the caller wants to access his or her bank account to transfer
some funds), they might assume that the caller is relaxed and thinking
clearly. But what if that same caller is transferring funds because five
checks just bounced or his credit card was stolen? That caller might be
frustrated and angry and may not want to hear long, slowly paced prompts.
This caller would prefer to hear succinct prompts that offer helpful shortcuts
and "barge-in," or the ability to talk over the prompts to jump quickly
through the call.
During the research phase, the design team performs several tasks to
get information about the proposed system and the role of the caller.
They listen to calls coming into the company's call center, observe how
customers use the company's Web site and touch-tone systems (if available)
and role-play sample conversations. They then take the knowledge gained
from their research and leverage it in the design phase.
Phase 2: Design. After the designer gains an understanding
of both the callers' and the company's needs, the system design can be
developed. This is where the team constructs a diagram, or call flow,
which outlines the way questions are asked and describes the way the answers
are handled. There is one major difference between a traditional touch-tone
application call flow and one designed for a speech recognition system,
and it makes speech systems far more user-friendly: Personality is built in.
Personality refers to the way the computer speaks to the caller, the words
it uses, its tone of voice (the designers may use a professional voice
talent to record the audio prompts in keeping with the system's personality),
and the pacing and pauses of the automated system. Personality affects
the ability of the caller to understand and enjoy using the application.
If a system has asked the caller to answer a long series of questions
only to determine that the information the caller is seeking isn't available,
the system must inform the caller in a way that does not upset him. If
a warm, friendly system that had told the user, "I'll get that information
for you right away," suddenly breaks out of character and informs the user
in a stilted fashion that "The requested information was not retrievable.
Hang up and call again later," the user will feel stunned and confused.
A live agent would most likely express sympathy for the caller's impending
frustration, and so too should the automated system.
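As a rough illustration of the two ideas above, a call flow can be written down as a small table of states, each with a prompt in the system's voice, plus a failure message that stays in character. The sketch below is hypothetical throughout: the state names, the prompt wording, the flight-information scenario and the lookup function are assumptions made for illustration, not a description of any deployed system.

```python
# Hypothetical call flow for a flight-information line. State names,
# prompts, and transitions are illustrative assumptions.
CALL_FLOW = {
    "ask_destination": {
        "prompt": "Sure. Where would you like to fly to?",
        "next": "ask_date",
    },
    "ask_date": {
        "prompt": "Great. And what day are you leaving?",
        "next": "lookup",
    },
    "lookup": {
        "prompt": "I'll get that information for you right away.",
        "next": None,
    },
}

# The failure message stays in the same warm voice as the other prompts.
FAILURE_PROMPT = (
    "I'm sorry, I'm having trouble getting that information right now. "
    "Could you try again a little later?"
)


def run_call(answers, lookup):
    """Walk the flow and return the prompts the caller would hear.

    answers: caller replies for each question state, in order.
    lookup: hypothetical callable that takes the collected replies and
            returns a result string, or None when nothing is available.
    """
    heard = []
    collected = []
    replies = iter(answers)
    state = "ask_destination"
    while state is not None:
        step = CALL_FLOW[state]
        heard.append(step["prompt"])
        if state == "lookup":
            result = lookup(collected)
            heard.append(result if result is not None else FAILURE_PROMPT)
        else:
            collected.append(next(replies))
        state = step["next"]
    return heard
```

Keeping every prompt, including the failure message, in one place makes it easier to review the wording as a whole and to spot a line that breaks character.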
So how does the design team know when they have the call flow and the
personality just right?
Phase 3: Testing. Usually, when people think about testing
technology, they think about quality assurance (QA). But QA doesn't tell
you if people will enjoy using the application. To ensure that callers
will like the system and use it effectively, usability testing is imperative.
In this stage, the design team gathers a focus group of potential callers
and observes them using the system to complete particular tasks. After
watching just a few people use a system, issues become readily apparent.
Some of these may include callers who respond with answers the system
cannot comprehend. Most of the time, after observing a random sample within
the expected user community, the design team will understand about 95
percent of the usability issues and will know how to revise the parts of
the design that might prevent callers from having a satisfactory experience.
As an example, a system used by a bank in the Southern U.S. was tested
with participants from that region. When the system asked a "yes" or "no"
question, many callers responded with "yes, ma'am" and "no, ma'am." This
is something that would probably not have occurred with a system used
primarily in the Northern U.S. The design team quickly added "ma'am" to
the system's vocabulary and the callers were then able to complete the
interaction flawlessly.
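The kind of vocabulary fix described above can be sketched in a few lines. This is only an illustration that works on already-transcribed text; a real recognizer would be configured through its own grammar format, and the word lists and function name here are assumptions, not the bank system's actual vocabulary.

```python
import re

# Hypothetical vocabulary: these word lists are illustrative assumptions,
# not the grammar of any real recognizer.
YES_WORDS = {"yes", "yeah", "yep"}
NO_WORDS = {"no", "nope"}
COURTESY_WORDS = {"ma'am", "sir", "please"}


def interpret_yes_no(utterance):
    """Return "yes", "no", or None if the reply is not understood."""
    # Lowercase the reply and split it into words made of letters and
    # straight apostrophes, dropping punctuation such as commas.
    words = re.findall(r"[a-z']+", utterance.lower())
    # Drop courtesy words so that "yes, ma'am" reduces to "yes".
    core = [w for w in words if w not in COURTESY_WORDS]
    if len(core) != 1:
        return None
    if core[0] in YES_WORDS:
        return "yes"
    if core[0] in NO_WORDS:
        return "no"
    return None
```

With this sketch, interpret_yes_no("Yes, ma'am") and interpret_yes_no("yes") give the same answer, while a reply the lists do not cover comes back as None so the system can re-prompt.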
When conducting usability testing, the team must test the call flow
and also the personality of the system. After testing, some callers may
think the voice prompts are too slow, but what they may actually mean
is that they want the information faster. In this fairly common occurrence,
the system can be redesigned so it need not keep returning to a database
to retrieve information for the caller. It can be designed so that it
draws on database information at only one point during the call, rather
than repeatedly searching. This is another instance in which an expert
design team is critical to ensuring a successful application.
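One way to picture the single-lookup redesign is a per-call session object that queries the back end once and answers later questions from memory. The sketch below is a minimal illustration under stated assumptions: the class name, the fetch_account callable and the field names are hypothetical stand-ins for whatever database access a real application uses.

```python
class CallSession:
    """Holds data for a single call so the back end is queried only once."""

    def __init__(self, fetch_account):
        # fetch_account is a hypothetical callable standing in for a slow
        # database lookup; it takes an account id and returns a dict.
        self._fetch_account = fetch_account
        self._account = None
        self.lookups = 0  # counts how many times the back end was queried

    def load(self, account_id):
        """Query the back end once, at a single point in the call."""
        if self._account is None:
            self._account = self._fetch_account(account_id)
            self.lookups += 1
        return self._account

    def answer(self, field):
        """Answer later questions from the in-memory copy."""
        if self._account is None:
            raise RuntimeError("call load() before answer()")
        return self._account[field]
```

The trade-off is that the caller hears data as it stood at the moment of the lookup, so this approach suits information that is unlikely to change during a short call.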
There are two primary methods to determine if people enjoy using the
system. First, designers can observe the expressions on callers' faces
and how they talk while using the system. If they grimace and speak in
a strained voice, the system needs redesigning to make it more appealing.
If callers smile, which is a primary goal of any speech recognition design
team, the design is probably right on track. Second, sample callers can
complete a survey to rate their impression of the system's automated voice,
the call flow, ease of use and so on.
While testing a new system is necessary before it becomes available to
all callers, the systems may require periodic tune-ups after they go live
and throughout their lifecycles. For example, if a company launches a
new ad campaign, a different caller population might begin to use the
system and have some particular expectations which the system was not
designed to address. Or, if a company changes its name or adds service
features, the speech system will need updating to reflect the changes.
As a result, maintenance of the system is an ongoing process.
Choosing A System
Following are some things to consider when assessing a vendor. Try some
of the vendor's deployed systems yourself. Are they friendly and helpful?
Do they meet the requirements you are seeking? Does the system enhance
the caller's experience? What is the experience of the vendor's design
team and how many systems have they designed? Do their deployed systems
all sound the same or does each have its own personality? Is it important
to you to have features that allow companies to monitor the application
once it's deployed so that the design can be continually checked and improved?
Finally, is the vendor willing to work with you well after the system
is deployed to fine-tune it and make adjustments?
The important point is that the perfect system is one that meets both
your needs and your callers' needs. A system that will make your callers
smile and want to use it again makes all the difference between acquiring
a short-term customer and retaining a long-term one.
Steve Chambers is vice president of worldwide marketing for SpeechWorks
International, Inc. and Christopher "Blade" Kotelly is creative director
of interface design.