Component Standards Speed Development Of Speech-Enabled Applications
BY STEVE EHRLICH
Call center application developers are looking for something better than automated
touch-tone systems to handle the thousands of incoming calls they get every day, but
don't want to spend significant time learning new technologies and designing
applications. Corporations are accelerating the deployment of speech recognition systems
to eliminate the interminable prerecorded menus of information that often frustrate
callers, thereby improving customer service, lowering the number of callers that
"zero out" to a live operator, and offering significant cost savings.
Speech components are the standard that will promote the development of speech
applications. By using components, developers don't have to rewrite code every time
they add to or build new applications. Components also remove much of the complex
linguistic guesswork from building a speech recognition system.
SIMPLIFYING SPEECH APPS
The hardware and software technology is now mature enough that corporations can
confidently implement these speech recognition applications. With a conversational speech
interface, the caller has more control over their transaction. They can barge into the
automated system, state what they need, and say it as a natural language query.
Speech recognition applications can also be accessed from any telephone, which is a
huge consideration for companies like UPS that are looking to expand into Europe, where
rotary phones are still common. Speech rec is also valuable for companies like Charles
Schwab, whose clients often call its automated stock quote system from cellular
telephones, and want hands-free access while on the road.
Until now, companies have been hampered by the difficulty of developing and deploying
the applications that recognize spoken words. Anticipating every possible user response
and accounting for the vagaries of human expression has been a larger and more complex
task than most companies are equipped to handle. But the industry is rallying around a set
of standards that will allow even lay programmers to develop customized speech recognition
applications from preset modules. New programming architectures leverage the same concepts
that object-oriented programming brings to Internet applications: componentization,
reusability, and customization.
SPEECH COMPONENT STRUCTURE
Speech components should contain three elements that help developers build applications
more quickly and efficiently. We call them grammars, dialogs, and prompts. Dialogs
describe the entire flow of interaction between a user and an application; grammars are
lists of valid phrases a user might say in a particular portion of a dialog; and prompts
are messages played to a user, usually to elicit a response.
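
To make the three elements concrete, here is a minimal sketch in Python. It is an illustration only, not the API of any shipping product: the class name SpeechComponent, its fields, and the yes/no example are all hypothetical, and a list of strings stands in for what a real recognizer would return.

```python
from dataclasses import dataclass


@dataclass
class SpeechComponent:
    """Hypothetical reusable component: one prompt, one grammar, one small dialog."""
    prompt: str                 # message played to the caller
    grammar: dict               # valid phrase -> normalized value
    reprompt: str = "Sorry, I didn't catch that."
    max_attempts: int = 3

    def match(self, utterance):
        """Return the normalized value if the phrase is in the grammar, else None."""
        return self.grammar.get(utterance.strip().lower())

    def run(self, utterances):
        """Dialog flow: play the prompt, then reprompt until a valid answer
        arrives or the attempts run out.

        `utterances` stands in for recognizer output (an assumption of this
        sketch). Returns (value, transcript of messages played).
        """
        transcript = [self.prompt]
        for attempt, utterance in enumerate(utterances, start=1):
            value = self.match(utterance)
            if value is not None:
                return value, transcript
            if attempt >= self.max_attempts:
                break
            transcript.append(self.reprompt)
        return None, transcript


# A generic yes/no component that any application could reuse.
yes_no = SpeechComponent(
    prompt="Would you like a stock quote?",
    grammar={"yes": "yes", "yeah": "yes", "sure": "yes", "no": "no", "nope": "no"},
)
```

Calling yes_no.run(["um", "Yeah"]) plays the prompt, reprompts once after the unrecognized "um", and then returns the normalized answer "yes". The point of the building-block approach is that the grammar, the prompt wording, and the retry logic are written once and customized per application rather than rebuilt each time.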
Experienced programmers are excited by the possibility of using a building-block
approach to create software and streamline a sometimes tedious and repetitive development
process. Bob Wohlsen, the technical director of voice solutions at Charles Schwab, agrees
that "Anything that helps combine the grammar for the recognizer and the dictionary
for the recognizer, such as lists of generic information for stock quotes or city or
airport names, would be helpful. In my IVR systems, having components would certainly
simplify my development process."
For components to truly catch on as a standard for speech recognition development, they
need to be platform-independent. They are defined once and then implemented in a number of
different environments from that single definition. Portability of the components is made
possible through a set of APIs for different IVR platforms and for ActiveX and JavaBeans.
These components can then be integrated into applications built using an IVR
platform's application generator or any popular development tool, such as Visual
Basic or JBuilder. Support for component standards such as ActiveX and JavaBeans increases
the appeal and shortens the learning curve of speech recognition programming for
developers at large.
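
The define-once, deploy-anywhere idea can be sketched as a single component definition handed to per-platform adapters. This is a rough illustration in Python under stated assumptions: the adapter classes, the deploy method, and the GetCity component are hypothetical names, and the adapters only return a description where a real one would call the platform's own API.

```python
from abc import ABC, abstractmethod

# One platform-neutral definition (all names here are hypothetical).
CITY_COMPONENT = {
    "name": "GetCity",
    "prompt": "Which city?",
    "grammar": ["boston", "chicago", "denver"],
}


class PlatformAdapter(ABC):
    """Stand-in for the per-platform API layer that hosts a shared component."""

    @abstractmethod
    def deploy(self, component):
        """Make the component available in this environment."""


class IvrAdapter(PlatformAdapter):
    def deploy(self, component):
        # A real adapter would register the component with the IVR
        # platform's application generator; this one only describes it.
        return (f"IVR step '{component['name']}' accepting "
                f"{len(component['grammar'])} phrases")


class DesktopToolAdapter(PlatformAdapter):
    def deploy(self, component):
        # A real adapter would wrap the component for a visual
        # development tool; this one only describes it.
        return (f"Tool palette control '{component['name']}' "
                f"with prompt: {component['prompt']}")


# The same definition is deployed unchanged to both environments.
deployments = [
    adapter.deploy(CITY_COMPONENT)
    for adapter in (IvrAdapter(), DesktopToolAdapter())
]
```

Only the adapters know anything about their target environment, so adding a new platform means writing one new adapter rather than redefining every component.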
REAL-WORLD APPLICATIONS
Early adopters of speech recognition systems report stunning returns on their investments.
Systems often pay for themselves in weeks or months instead of years, depending on the
volume of calls companies are dealing with. The conservation of human capital alone
is staggering.
Sears, Roebuck and Co.
Since deploying a speech recognition system to route calls in its stores nationwide last
spring, Sears, Roebuck and Co. has shifted 3,000 switchboard operators to other positions,
gaining millions of dollars in human resources in the process. Plus, Sears' automated
call center can handle four calls at once, whereas human operators could only answer one
call at a time (also see "Speech Rec Slashes Sears Call Handling Costs").
The company is considering growing the system to handle employee inquiries into
payroll, benefits, and company procedures, and is committed to staying on the cutting edge
of speech recognition technology.
United Parcel Service
United Parcel Service reported similar savings in human costs. After deploying a package
tracking system last November, the company saved so many man-hours that it didn't
have to hire its usual supply of temporary workers for the Christmas holiday. On December
23, 1997 alone, UPS logged 193,000 calls on the system.
Increasingly, we will see corporations deploying speech recognition systems, as more
and more call centers in the financial and travel industries seek to cash in on the
benefits of this automation. Getting these speech recognition applications to market is
becoming easier as the application development tools become more sophisticated, as
standards are set for APIs, and as programming tasks are streamlined by using components.
Steve Ehrlich is vice president of marketing for Nuance Communications, one of the
largest U.S. suppliers of conversational speech recognition systems. Prior to joining
Nuance, Steve spent twelve years at Oracle Corporation. Nuance's SpeechObjects
product is a set of reusable components incorporating design and development standards to
significantly speed up development of speech-enabled applications. The company plans to
offer training and certification courses for SpeechObjects developers. Sears, UPS, and
Charles Schwab are all Nuance customers. For more information, contact the company at
650-847-0000, or visit their Web site at www.nuance.com.