TMCnet - World's Largest Communications and Technology Community


Feature Article
August 2000

 

Natural Language Speech Recognition: Expanding The Dimensions Of IVR

BY DR. VAL MATULA, AVAYA

The emergence of natural language speech recognition (NLSR) is a major step toward fulfilling visions we saw portrayed in popular science fiction more than 30 years ago. Astronaut Bowman pleaded with Hal to open the pod bay doors in 2001: A Space Odyssey; Dr. McCoy demanded a medical synopsis from a database in Star Trek.

More importantly, NLSR is becoming a useful and exciting tool for supporting customer relationship management (CRM) applications. At a given prompt, NLSR recognizes a sentence of words from a set of tens of thousands of words, and the technology performs additional sentence structure processing to extract specific meanings. The technology enables customers to speak naturally into telephones and computers in order to send and retrieve information and even complete entire transactions. It is ideal for IVR call flows in which callers need to provide more complex information than can be captured by pressing keys on a telephone key pad. Furthermore, as an innovative way of using networks to conduct business, NLSR is at the forefront of the e-business movement.

NLSR represents a logical progression from "basic" speech recognition, which businesses have been implementing with increasing regularity over the last few years. Today, as consumers, we barely notice a prompt to "press or say one." But if prompted to say something like, "I want to speak to someone in sporting goods," we consciously approve. By giving customers this option to communicate naturally, businesses and organizations are creating favorable impressions and increasing the likelihood customers will return. Additionally, NLSR provides opportunities to automate new types of transactions, potentially reducing costs and increasing revenues.

While NLSR presents great opportunities, organizations should approach it with the same care they would use while examining any leading-edge technology. Importing "technology for technology's sake" is never a good idea. Compounding the considerations for NLSR are so-called "human factors," which must attempt to anticipate customers' logical thoughts and assumptions during the process of making a transaction.

When we're conversing with organizations about possibly adopting NLSR, we strongly recommend the following application development cycle. It helps avoid the "common pitfalls" companies run into when dealing with technology -- i.e., it may "work great," but it's not helping your business.

Project Definition
This is the first step in any telecommunications/IT project. It establishes the scope of the project, the anticipated business case with estimated call volumes, the development community and the general timeline. A good candidate for an enterprise's first use of NLSR would be an application that handles substantial call volumes, does not demand too much technical complexity and has a payback period of no more than 6 to 12 months.
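As a rough illustration of the payback test, the arithmetic can be sketched in a few lines of Python. Every figure and parameter name below is a hypothetical assumption chosen for the example, not data from an actual deployment.

```python
def payback_months(upfront_cost, monthly_calls, automated_share,
                   saving_per_call, monthly_running_cost=0.0):
    """Months needed for net monthly savings to repay the upfront cost.

    All inputs are illustrative assumptions:
      automated_share  - fraction of calls the application completes unaided
      saving_per_call  - cost avoided each time an agent is not needed
    Returns None when the application never pays for itself.
    """
    net_monthly_saving = (monthly_calls * automated_share * saving_per_call
                          - monthly_running_cost)
    if net_monthly_saving <= 0:
        return None
    return upfront_cost / net_monthly_saving


# Hypothetical project: 40,000 calls a month, a quarter of them automated.
months = payback_months(upfront_cost=150_000, monthly_calls=40_000,
                        automated_share=0.25, saving_per_call=2.00,
                        monthly_running_cost=5_000)
print(f"Estimated payback: {months:.1f} months")
```

With these assumed inputs the project repays itself in 10 months, inside the 6-to-12-month window suggested above. Halving the automated share would push net savings to zero and the function would return None.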

Define Call Flow And System Architecture
Here, we examine which questions will be asked of the caller, and what results will be presented back. Human factor elements are important considerations in defining this flow. For example, if account number or customer information is to be transferred to CTI applications, SNA hosts, etc., the call center infrastructure must be in place to support this transfer. Self-service applications require access to corporate data servers. Asking a caller for his or her account number "for better routing" should also transfer the account number to the target agent so the agent need not ask for the number again before assisting the caller. If NLSR is to be implemented using a client/server architecture with the IVR system(s), then the local area network infrastructure must be specified to assure the bandwidth between the IVR system and the NLSR engine servers.

Customer Interaction Testing
Organizations can begin creating the basic flow of an application by listening to live agents handle calls. A great deal can be learned about the types of questions agents must ask to elicit information from callers, as well as the different questions and vocabulary that customers might use when making a request or discussing an issue with an agent.

Offline testing with human factors experts can be valuable in determining the best way to ask questions of callers. For example, an appliance manufacturer we worked with wanted to use voice response to separate calls at a help desk/repair center. Originally, the first question was, "Do you have a large or small appliance?" Human factors testing found that consumers could consistently place almost all products into one of those two categories, the only exception being microwaves. Using focus groups, the company determined that a better question was, "Do you have a large appliance, a small appliance or a microwave?" The revised question significantly increased the percentage of correct user responses.

Other tools and methods are available for customer interaction testing. For example, designers frequently do not know how callers will react to an NLSR-enabled IVR application. Furthermore, senior management may not have a feel for what a proposed application might sound like, or what the potential call flow might be. Situations like these can be addressed by creating a mock-up of the proposed application.

Using PC-based tools, organizations can create a live storyboard of the proposed application flow, and record prompts accordingly. With a common telephone interface card in the PC, the application works with a live agent controlling the flow. When a call is brought to the PC, the agent uses a mouse and clicks to play the first prompt to the caller. The caller hears the prompt(s) and responds as if it were a completely automatic IVR system. The agent hears the caller through a headset, analyzes the response and then chooses the next prompt to play to the caller. Along the way, the caller's responses are recorded for later analysis.

This method enables testing and demonstration of a process without significant IVR application development. Senior management can get a feel for the application, designers can re-record prompts to be more precise, and application flow can be modified for smoother operation, all before full application development.

This step can be skipped for simple applications, but it can provide significant benefits to more complex applications and assist in the decision-making process before proceeding with a project.
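The agent-driven mock-up described above can be pictured as a small storyboard of prompts plus a loop in which a person, not a recognizer, picks the next prompt. The Python sketch below is only an illustration under that assumption: the step names, the prompt wording for the transfer steps and the three callback functions are hypothetical stand-ins for the telephone card, the headset and the agent's mouse clicks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    prompt: str             # text of the recorded prompt
    next_steps: tuple = ()  # step ids the agent may pick next; empty ends the call


# Hypothetical storyboard built around the appliance question discussed earlier.
STORYBOARD = {
    "start": Step("Do you have a large appliance, a small appliance or a microwave?",
                  ("large", "small", "microwave", "start")),
    "large": Step("One moment while we connect you to large appliance repair."),
    "small": Step("One moment while we connect you to small appliance repair."),
    "microwave": Step("One moment while we connect you to microwave repair."),
}


def run_mockup(storyboard, start, play_prompt, hear_caller, pick_next):
    """Run one mock call and return the recorded (step id, caller response) pairs.

    play_prompt(text)        - plays a prompt to the caller
    hear_caller()            - returns what the caller said
    pick_next(said, options) - the live agent's choice of the next step
    """
    log = []
    step_id = start
    while True:
        step = storyboard[step_id]
        play_prompt(step.prompt)
        if not step.next_steps:      # closing prompt: nothing more to record
            return log
        said = hear_caller()
        log.append((step_id, said))  # kept for later analysis
        choice = pick_next(said, step.next_steps)
        if choice not in step.next_steps:
            raise ValueError(f"{choice!r} is not reachable from {step_id!r}")
        step_id = choice
```

Because the flow lives in a plain dictionary, prompts can be reworded or steps added between test calls without any IVR development, which is the point of the exercise.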

IVR Application Development
In this step, the script or call flow is specified, and the IVR application is developed. Host interfaces or CTI links are programmed, if appropriate. If the customer interaction testing phase was executed, the prompts recorded during that phase are used in the application development -- if not, prompts are recorded.

In addition to these "standard IVR" activities, the vocabulary and semantics for responses to application prompts must be specified to the system. Anticipated caller responses are determined from the recordings captured during customer interaction testing or are estimated from past experience. In either case, tools available from the IVR provider or from the NLSR engine provider will be used to specify the grammar and vocabulary for the engine's recognizer.
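To make the idea of specifying vocabulary and semantics concrete, here is a deliberately simplified Python sketch that maps phrases a caller might say to the meaning the application needs. It is a text-only keyword matcher written for illustration; it is not the grammar format of any IVR or NLSR product, and the department names, synonyms and routing codes are invented for the example.

```python
import re

# Hypothetical vocabulary: spoken phrase -> meaning used by the call flow.
DEPARTMENTS = {
    "sporting goods": "SPORTING_GOODS",
    "sports": "SPORTING_GOODS",
    "housewares": "HOUSEWARES",
    "kitchen": "HOUSEWARES",
}

# Try longer phrases first so a multi-word phrase wins over a shorter one.
_phrases = sorted(DEPARTMENTS, key=len, reverse=True)
_PATTERN = re.compile(r"\b(" + "|".join(re.escape(p) for p in _phrases) + r")\b")


def interpret(utterance):
    """Return the routing code for a transcribed utterance, or None if the
    utterance contains nothing from the specified vocabulary."""
    match = _PATTERN.search(utterance.lower())
    return DEPARTMENTS[match.group(1)] if match else None


print(interpret("I want to speak to someone in sporting goods"))
print(interpret("Can I get jet-ski parts?"))
```

The first call yields SPORTING_GOODS; the second yields None, the kind of out-of-vocabulary response that the recordings from customer interaction testing are meant to surface before the application goes live.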

First-Use Customer Trial
Generally, IVR applications benefit from a first-use trial with customers in which callers' reactions to the application are monitored. If necessary, changes are made to the call flow, prompts and options presented. It's true that many DTMF or basic speech applications shorten this step due to budget or project timeline constraints. While this may still result in an acceptable (if lower) success rate with a DTMF application, general industry experience at this time is that NLSR-based applications strongly benefit from this step. It helps ensure that the general calling population interprets the prompts as designed, that the responses fall within the specified vocabulary and are therefore recognizable, and that the overall application works as intended.

It is not unusual to begin this phase with a success rate for an application at a relatively low number, perhaps 50 percent, and see it raised to 60, 70 or even 80 percent within a matter of a few weeks.
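Tracking that improvement only requires counting, week by week, how many trial calls were completed without agent help. A minimal Python sketch follows; the (week, completed) record layout is an assumption about how a trial log might be kept, and the sample records are made up.

```python
from collections import defaultdict


def weekly_success_rates(calls):
    """Percentage of calls completed by the application in each trial week.

    `calls` is an iterable of (week_number, completed) pairs, where
    `completed` is True when the caller finished without an agent.
    """
    totals = defaultdict(lambda: [0, 0])  # week -> [completed, all calls]
    for week, completed in calls:
        totals[week][1] += 1
        if completed:
            totals[week][0] += 1
    return {week: 100.0 * ok / n for week, (ok, n) in sorted(totals.items())}


# Made-up trial log: two calls in week 1, four in week 2.
trial_log = [(1, True), (1, False),
             (2, True), (2, True), (2, True), (2, False)]
for week, rate in weekly_success_rates(trial_log).items():
    print(f"week {week}: {rate:.0f}% success")
```

Reviewing the failed calls from each week alongside these figures shows whether a prompt or vocabulary change actually moved the rate.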

Customer Lifecycle Maintenance
Many IVR applications, once installed, are not strongly monitored or updated. While again, this may be acceptable for DTMF applications, there is some need with NLSR-based applications to monitor and, if necessary, adjust the application to maintain a continued high success rate for users. Changes that can occur that will affect the application may include the following scenarios.

  • As users become more familiar with the application, they may provide shorter responses and may use barge-in (speaking over a prompt) more frequently.
  • Changes in the company's catalog, billing format or product descriptions may confuse callers or may require changes in the acceptable vocabularies.
  • Callers may assume that additional services beyond those offered (because the company has acquired new businesses, offered new products, etc.) are available through the application, and request them by name.
  • Through marketing and market expansion, new sets of users from different parts of the country or different demographics may engage the system. This can change the vocabulary presented to the recognizers. A simple example: A bank may have customers call about "car loans." If a bank expands into the Gulf Coast region, or if time passes and winter turns to summer, the bank may now receive calls for loans on jet-skis.
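One simple way to notice this kind of drift is to scan transcripts of calls the application could not handle for words that recur but are missing from the current vocabulary. The Python sketch below assumes such transcripts are available as plain text; the vocabulary, the sample utterances and the `min_count` threshold are all hypothetical.

```python
import re
from collections import Counter


def out_of_vocabulary(utterances, vocabulary, min_count=2):
    """Return (word, number of calls) pairs for words that are not in
    `vocabulary` and appear in at least `min_count` utterances."""
    counts = Counter()
    for utterance in utterances:
        # Count each word once per call; keep hyphens and apostrophes inside words.
        words = set(re.findall(r"[a-z][a-z'-]*", utterance.lower()))
        counts.update(w for w in words if w not in vocabulary)
    return [(w, n) for w, n in counts.most_common() if n >= min_count]


# Hypothetical bank vocabulary and a handful of failed-call transcripts.
vocabulary = {"i", "need", "want", "a", "loan", "for", "my", "on", "car"}
failed_calls = [
    "I need a loan for my jet-ski",
    "loan on a jet-ski",
    "car loan",
    "I want a boat loan",
]
print(out_of_vocabulary(failed_calls, vocabulary))
```

Here "jet-ski" surfaces because it appears in two calls, while the single mention of "boat" stays below the threshold. Words flagged this way are candidates for the recognizer's vocabulary or for a reworded prompt.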

Natural language speech recognition is an exciting capability with myriad positive consequences for e-business and CRM. It can nourish the bottom line by controlling costs and increasing revenue. It can attract new customers and encourage existing customers to return. As with any leading-edge technology, organizations must be prudent in their approaches to evaluation and implementation. But NLSR is gaining enough marketplace traction that there are useful levels of insight and experience about deploying the technology. Data exist to provide a reliable road map, and any organization that handles high volumes of customer interactions and wishes to serve its customers well could benefit by examining NLSR's possibilities.

Dr. Val Matula is a distinguished member of the technical staff at Avaya (formerly Lucent Technologies Enterprise Networks Group). He is responsible for advanced speech and systems architecture planning, which includes speech recognition, text-to-speech and speech coding. Previous work at Lucent's Bell Labs unit has supported systems engineering and planning for ACD management systems.







